[AI #11] hello-rnn / char-seq-rnn / char-seq-softmax-only / rnn_long_char
This post is about implementing artificial intelligence.
It is organized as follows.
================================================
1. #lab-12-1-hello-rnn
Applied when the output of the previous step affects the output of the next step
- e.g., words, related-search queries
2. # lab-12-2-char-seq-rnn
RNN applied ==> high accuracy
- 49 loss: 0.000650434 Prediction: if you want you
- ground truth (y): if you want you
3. #lab-12-3-char-seq-softmax-only
RNN not applied ==> poor accuracy
- 2999 loss: 0.277323 Prediction: yf you yant you
- ground truth (y): if you want you
4. # lab-12-4-rnn_long_char
error: from __future__ import print_function ==> commented out because it cannot run here (a __future__ import must be the first statement in a module, but import os comes first)
Stacking several layers with MultiRNNCell improves accuracy
A reshape is performed for the softmax (FC) layer
5. Code exploration (additional)
==> lab-12-5-rnn_stock_prediction
lab-13-1-mnist_using_scope
lab-13-2-mnist_tensorboard
lab-13-3-mnist_save_restore
6. References
=================================================
[ #lab-12-1-hello-rnn ]
#lab-12-1-hello-rnn
"""
Applied when the output of the previous step affects the output of the next step
- e.g., words, related-search queries
"""
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# Lab 12 RNN
import tensorflow as tf
import numpy as np
tf.set_random_seed(777) # reproducibility
idx2char = ['h', 'i', 'e', 'l', 'o']
# Teach hello: hihell -> ihello
x_data = [[0, 1, 0, 2, 3, 3]] # hihell
x_one_hot = [[[1, 0, 0, 0, 0],   # h 0
              [0, 1, 0, 0, 0],   # i 1
              [1, 0, 0, 0, 0],   # h 0
              [0, 0, 1, 0, 0],   # e 2
              [0, 0, 0, 1, 0],   # l 3
              [0, 0, 0, 1, 0]]]  # l 3
y_data = [[1, 0, 2, 3, 3, 4]] # ihello
num_classes = 5
input_dim = 5 # one-hot size
hidden_size = 5 # output from the LSTM. 5 to directly predict one-hot
batch_size = 1 # one sentence
sequence_length = 6 # |ihello| == 6
learning_rate = 0.1
X = tf.placeholder(
    tf.float32, [None, sequence_length, input_dim])  # X one-hot
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
    cell, X, initial_state=initial_state, dtype=tf.float32)
# FC layer
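# flatten outputs to (batch_size * sequence_length, hidden_size) so the FC layer runs on every time step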
X_for_fc = tf.reshape(outputs, [-1, hidden_size])
# fc_w = tf.get_variable("fc_w", [hidden_size, num_classes])
# fc_b = tf.get_variable("fc_b", [num_classes])
# outputs = tf.matmul(X_for_fc, fc_w) + fc_b
outputs = tf.contrib.layers.fully_connected(
    inputs=X_for_fc, num_outputs=num_classes, activation_fn=None)
# reshape out for sequence_loss
outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
prediction = tf.argmax(outputs, axis=2)  # most likely class index at each time step
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(50):
        l, _ = sess.run([loss, train], feed_dict={X: x_one_hot, Y: y_data})
        result = sess.run(prediction, feed_dict={X: x_one_hot})
        print(i, "loss:", l, "prediction: ", result, "true Y: ", y_data)

        # print char using dic
        result_str = [idx2char[c] for c in np.squeeze(result)]
        print("\tPrediction str: ", ''.join(result_str))
'''
0 loss: 1.71584 prediction: [[2 2 2 3 3 2]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: eeelle
1 loss: 1.56447 prediction: [[3 3 3 3 3 3]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: llllll
2 loss: 1.46284 prediction: [[3 3 3 3 3 3]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: llllll
3 loss: 1.38073 prediction: [[3 3 3 3 3 3]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: llllll
4 loss: 1.30603 prediction: [[3 3 3 3 3 3]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: llllll
5 loss: 1.21498 prediction: [[3 3 3 3 3 3]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: llllll
6 loss: 1.1029 prediction: [[3 0 3 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: lhlllo
7 loss: 0.982386 prediction: [[1 0 3 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: ihlllo
8 loss: 0.871259 prediction: [[1 0 3 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: ihlllo
9 loss: 0.774338 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: ihello
10 loss: 0.676005 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]]
Prediction str: ihello
...
'''
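As a side note, the hand-written x_one_hot above can also be generated from x_data with tf.one_hot, the same helper lab-12-2 below uses; a minimal standalone sketch:

import tensorflow as tf

x_data = [[0, 1, 0, 2, 3, 3]]  # hihell as indices
# tf.one_hot expands the index tensor into shape (batch, sequence_length, num_classes)
x_one_hot_op = tf.one_hot(x_data, depth=5)
with tf.Session() as sess:
    print(sess.run(x_one_hot_op))  # identical values to the manual x_one_hot above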
[# lab-12-2-char-seq-rnn ]
# lab-12-2-char-seq-rnn
"""
RNN applied ==> high accuracy
- 49 loss: 0.000650434 Prediction: if you want you
- ground truth (y): if you want you
"""
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# Lab 12 Character Sequence RNN
import tensorflow as tf
import numpy as np
tf.set_random_seed(777) # reproducibility
sample = " if you want you"
idx2char = list(set(sample)) # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)} # char -> index
# hyper parameters
dic_size = len(char2idx) # RNN input size (one hot size)
hidden_size = len(char2idx) # RNN output size
num_classes = len(char2idx) # final output size (RNN or softmax, etc.)
batch_size = 1 # one sample data, one batch
sequence_length = len(sample) - 1 # number of lstm rollings (unit #)
learning_rate = 0.1
sample_idx = [char2idx[c] for c in sample] # char to index
x_data = [sample_idx[:-1]] # X data sample (0 ~ n-1) hello: hell
y_data = [sample_idx[1:]] # Y label sample (1 ~ n) hello: ello
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
x_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
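# x_one_hot has shape (batch, sequence_length, num_classes); with x_data fed, that is (1, 15, 10)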
cell = tf.contrib.rnn.BasicLSTMCell(
    num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
    cell, x_one_hot, initial_state=initial_state, dtype=tf.float32)
# FC layer
X_for_fc = tf.reshape(outputs, [-1, hidden_size])
outputs = tf.contrib.layers.fully_connected(X_for_fc, num_classes, activation_fn=None)
# reshape out for sequence_loss
outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(50):
        l, _ = sess.run([loss, train], feed_dict={X: x_data, Y: y_data})
        result = sess.run(prediction, feed_dict={X: x_data})

        # print char using dic
        result_str = [idx2char[c] for c in np.squeeze(result)]
        print(i, "loss:", l, "Prediction:", ''.join(result_str))
'''
0 loss: 2.35377 Prediction: uuuuuuuuuuuuuuu
1 loss: 2.21383 Prediction: yy you y you
2 loss: 2.04317 Prediction: yy yoo ou
3 loss: 1.85869 Prediction: yy ou uou
4 loss: 1.65096 Prediction: yy you a you
5 loss: 1.40243 Prediction: yy you yan you
6 loss: 1.12986 Prediction: yy you wann you
7 loss: 0.907699 Prediction: yy you want you
8 loss: 0.687401 Prediction: yf you want you
9 loss: 0.508868 Prediction: yf you want you
10 loss: 0.379423 Prediction: yf you want you
11 loss: 0.282956 Prediction: if you want you
12 loss: 0.208561 Prediction: if you want you
...
'''
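To query the trained model with a different seed, the prediction op can be run again while the session is still open. A minimal sketch, assuming it is placed inside the with-block above after the training loop; the seed string here is a hypothetical choice and must be exactly sequence_length (15) characters, all drawn from sample:

    seed = " if you want yo"  # hypothetical 15-char seed; every char must exist in char2idx
    x = [[char2idx[c] for c in seed]]
    result = sess.run(prediction, feed_dict={X: x})
    print(''.join(idx2char[c] for c in np.squeeze(result)))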
[#lab-12-3-char-seq-softmax-only]
#lab-12-3-char-seq-softmax-only
"""
RNN not applied ==> poor accuracy
- 2999 loss: 0.277323 Prediction: yf you yant you
- ground truth (y): if you want you
"""
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# Lab 12 Character Sequence Softmax only
import tensorflow as tf
import numpy as np
tf.set_random_seed(777) # reproducibility
sample = " if you want you"
idx2char = list(set(sample)) # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)} # char -> index
# hyper parameters
dic_size = len(char2idx) # RNN input size (one hot size)
rnn_hidden_size = len(char2idx) # RNN output size
num_classes = len(char2idx) # final output size (RNN or softmax, etc.)
batch_size = 1 # one sample data, one batch
sequence_length = len(sample) - 1 # number of lstm rollings (unit #)
learning_rate = 0.1
sample_idx = [char2idx[c] for c in sample] # char to index
x_data = [sample_idx[:-1]] # X data sample (0 ~ n-1) hello: hell
y_data = [sample_idx[1:]] # Y label sample (1 ~ n) hello: ello
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
# flatten the data (ignore batches for now). No effect if the batch size is 1
X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
X_for_softmax = tf.reshape(X_one_hot, [-1, rnn_hidden_size])
# softmax layer (rnn_hidden_size -> num_classes)
softmax_w = tf.get_variable("softmax_w", [rnn_hidden_size, num_classes])
softmax_b = tf.get_variable("softmax_b", [num_classes])
outputs = tf.matmul(X_for_softmax, softmax_w) + softmax_b
# expand the data (revive the batches)
outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])
weights = tf.ones([batch_size, sequence_length])
# Compute sequence cost/loss
sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss) # mean all sequence loss
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(3000):
        l, _ = sess.run([loss, train], feed_dict={X: x_data, Y: y_data})
        result = sess.run(prediction, feed_dict={X: x_data})

        # print char using dic
        result_str = [idx2char[c] for c in np.squeeze(result)]
        print(i, "loss:", l, "Prediction:", ''.join(result_str))
'''
0 loss: 2.29513 Prediction: yu yny y y oyny
1 loss: 2.10156 Prediction: yu ynu y y oynu
2 loss: 1.92344 Prediction: yu you y u you
..
2997 loss: 0.277323 Prediction: yf you yant you
2998 loss: 0.277323 Prediction: yf you yant you
2999 loss: 0.277323 Prediction: yf you yant you
'''
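Why softmax-only falls short: with no recurrent state, every occurrence of the same input character must receive the same prediction. In this sample the space character is followed by 'i', 'w', and 'y' at different positions, so the model can learn only one of them (here 'y'), which is exactly why the prediction reads "yf you yant you". A small standalone sketch that surfaces these conflicts:

sample = " if you want you"
next_chars = {}
# collect, for every character, the set of characters that follow it
for cur, nxt in zip(sample[:-1], sample[1:]):
    next_chars.setdefault(cur, set()).add(nxt)
for c, outs in sorted(next_chars.items()):
    if len(outs) > 1:
        print(repr(c), 'is followed by', sorted(outs))
# prints: ' ' is followed by ['i', 'w', 'y'] -- a context-free model must pick just one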
[# lab-12-4-rnn_long_char]
# lab-12-4-rnn_long_char
"""
error: from __future__ import print_function ==> commented out because it cannot run here
(a __future__ import must be the first statement in a module, but import os comes first)
# Stacking several layers with MultiRNNCell improves accuracy
# A reshape is performed for the softmax (FC) layer
"""
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# from __future__ import print_function
import tensorflow as tf
import numpy as np
from tensorflow.contrib import rnn
tf.set_random_seed(777) # reproducibility
sentence = ("if you want to build a ship, don't drum up people together to "
"collect wood and don't assign them tasks and work, but rather "
"teach them to long for the endless immensity of the sea.")
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
sequence_length = 10 # Any arbitrary number
learning_rate = 0.1
dataX = []
dataY = []
for i in range(0, len(sentence) - sequence_length):
    x_str = sentence[i:i + sequence_length]
    y_str = sentence[i + 1: i + sequence_length + 1]
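    # e.g., the first window (i = 0) is x_str = "if you wan" -> y_str = "f you want"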
    print(i, x_str, '->', y_str)
    x = [char_dic[c] for c in x_str]  # x str to index
    y = [char_dic[c] for c in y_str]  # y str to index
    dataX.append(x)
    dataY.append(y)
batch_size = len(dataX)
X = tf.placeholder(tf.int32, [None, sequence_length])
Y = tf.placeholder(tf.int32, [None, sequence_length])
# One-hot encoding
X_one_hot = tf.one_hot(X, num_classes)
print(X_one_hot) # check out the shape
# Make a lstm cell with hidden_size (each unit output vector size)
def lstm_cell():
    cell = rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)
    return cell
multi_cells = rnn.MultiRNNCell([lstm_cell() for _ in range(2)], state_is_tuple=True)
# Stacking several layers with MultiRNNCell, as above, improves accuracy
# outputs: unfolding size x hidden size, state = hidden size
outputs, _states = tf.nn.dynamic_rnn(multi_cells, X_one_hot, dtype=tf.float32)
# reshape for the softmax (FC) layer
# FC layer
X_for_fc = tf.reshape(outputs, [-1, hidden_size])
outputs = tf.contrib.layers.fully_connected(X_for_fc, num_classes, activation_fn=None)
# reshape out for sequence_loss
outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])
# All weights are 1 (equal weights)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
mean_loss = tf.reduce_mean(sequence_loss)
train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(mean_loss)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(500):
    _, l, results = sess.run(
        [train_op, mean_loss, outputs], feed_dict={X: dataX, Y: dataY})
    for j, result in enumerate(results):
        index = np.argmax(result, axis=1)
        print(i, j, ''.join([char_set[t] for t in index]), l)
# Let's print the last char of each result to check it works
results = sess.run(outputs, feed_dict={X: dataX})
for j, result in enumerate(results):
    index = np.argmax(result, axis=1)
    if j == 0:  # print all for the first result to make a sentence
        print(''.join([char_set[t] for t in index]), end='')
    else:
        print(char_set[index[-1]], end='')
'''
0 167 tttttttttt 3.23111
0 168 tttttttttt 3.23111
0 169 tttttttttt 3.23111
…
499 167 of the se 0.229616
499 168 tf the sea 0.229616
499 169 the sea. 0.229616
g you want to build a ship, don't drum up people together to collect wood and don't assign them tasks and work, but rather teach them to long for the endless immensity of the sea.
'''
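The stack depth is hard-coded to 2 in the MultiRNNCell call above. A minimal variant sketch that turns the depth into a parameter; num_layers is a hypothetical name, not in the original code:

num_layers = 3  # hypothetical knob; per the note above, deeper stacks tend to raise accuracy (at extra cost)
multi_cells = rnn.MultiRNNCell([lstm_cell() for _ in range(num_layers)],
                               state_is_tuple=True)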
[# lab-12-5-rnn_stock_prediction]
[References]
https://www.inflearn.com/course/기본적인-머신러닝-딥러닝-강좌
https://github.com/hunkim/deeplearningzerotoall
https://www.tensorflow.org/api_docs/python/tf/layers