TechTogetWorld

This is a post about implementing artificial intelligence.

The post is organized as follows.


================================================

1. Learning rate, overfitting, and regularization tips

2. Train/test data set, learning rate, normalization (new)

 - It is reasonable to keep training data and test data separate

 - Introduction to MNIST

3. References

=================================================


[learning rate, overfitting, regularization tips]


1. If the COST value does not decrease but instead increases, the learning rate should be made smaller.

   Conversely, if the cost decreases too slowly or stalls partway, the learning rate should be made larger.

   A common starting point is 0.01, adjusted up or down from there.


2. When the x data values differ greatly in scale, the cost barely decreases or learning fails to happen; in that case, apply NORMALIZATION.

3. OVERFITTING remedies

  - More TRAINING DATA

  - REDUCE FEATURES

  - REGULARIZATION ==> penalizes large weights so the decision boundary does not bend too sharply
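The three tips above can be sketched with plain NumPy (a minimal illustration, separate from the lab code below; the toy data, learning rates, and the `l2` coefficient are chosen just for the demo): gradient descent diverges on badly scaled inputs, min-max normalization fixes it, and an L2 penalty on the weight implements regularization.

```python
import numpy as np

def gradient_descent(x, y, lr, steps, l2=0.0):
    """One-feature linear regression; l2 > 0 adds an L2 (regularization) penalty."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        # cost = mean(err^2) + l2 * w^2; the l2 term penalizes large weights
        w -= lr * (2 * np.mean(err * x) + 2 * l2 * w)
        b -= lr * 2 * np.mean(err)
    return w, b

# Inputs on the order of 1e6 (like the volume column in the lab data below):
# with lr=0.01 every update overshoots, and w blows up to inf/nan.
x_raw = np.array([908100.0, 1828100.0, 1438100.0, 1008100.0])
y = np.array([831.66, 828.07, 824.16, 819.24])
with np.errstate(over="ignore", invalid="ignore"):
    w_bad, _ = gradient_descent(x_raw, y, lr=0.01, steps=50)
print("unscaled w:", w_bad)  # inf or nan: the cost diverged

# Min-max normalization maps x into [0, 1]; the very same lr now converges.
x_norm = (x_raw - x_raw.min()) / (x_raw.max() - x_raw.min() + 1e-7)
w_ok, b_ok = gradient_descent(x_norm, y, lr=0.01, steps=2000, l2=0.01)
print("normalized fit:", w_ok, b_ok)
```

The rule of thumb from tip 1 applies directly: if the cost oscillates or grows, lower the learning rate; if it barely moves, raise it.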


[train/test data set, learning rate, normalization (new)]


# <lab-07-1-learning_rate_and_evaluation>


import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# Lab 7 Learning rate and Evaluation

import tensorflow as tf

tf.set_random_seed(777)  # for reproducibility


x_data = [[1, 2, 1],

          [1, 3, 2],

          [1, 3, 4],

          [1, 5, 5],

          [1, 7, 5],

          [1, 2, 5],

          [1, 6, 6],

          [1, 7, 7]]

y_data = [[0, 0, 1],

          [0, 0, 1],

          [0, 0, 1],

          [0, 1, 0],

          [0, 1, 0],

          [0, 1, 0],

          [1, 0, 0],

          [1, 0, 0]]


# Evaluate our model using this test dataset


x_test = [[2, 1, 1],

          [3, 1, 2],

          [3, 3, 4]]

y_test = [[0, 0, 1],

          [0, 0, 1],

          [0, 0, 1]]


X = tf.placeholder("float", [None, 3])

Y = tf.placeholder("float", [None, 3])


W = tf.Variable(tf.random_normal([3, 3]))

b = tf.Variable(tf.random_normal([3]))


# tf.nn.softmax computes softmax activations

# softmax = exp(logits) / reduce_sum(exp(logits), dim)

hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)


# Cross entropy cost/loss

cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

# Try to change learning_rate to small numbers

optimizer = tf.train.GradientDescentOptimizer(

    learning_rate=0.1).minimize(cost)


# If the learning rate is pushed up to 1.5, H(y) diverges (overshooting)


# Correct prediction Test model

prediction = tf.argmax(hypothesis, 1)

is_correct = tf.equal(prediction, tf.argmax(Y, 1))

accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))


# Launch graph

with tf.Session() as sess:

    # Initialize TensorFlow variables

    sess.run(tf.global_variables_initializer())


    for step in range(201):

        cost_val, W_val, _ = sess.run(

            [cost, W, optimizer], feed_dict={X: x_data, Y: y_data})

        print(step, cost_val, W_val)


# From here on, evaluation runs on the test data

# From TensorFlow's point of view, this is data (x) it has never seen before.


    # predict

    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))

    # Calculate the accuracy

    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))


'''

when lr = 1.5

0 5.73203 [[-0.30548954  1.22985029 -0.66033536]

 [-4.39069986  2.29670858  2.99386835]

 [-3.34510708  2.09743214 -0.80419564]]

1 23.1494 [[ 0.06951046  0.29449689 -0.0999819 ]

 [-1.95319986 -1.63627958  4.48935604]

 [-0.90760708 -1.65020132  0.50593793]]

2 27.2798 [[ 0.44451016  0.85699677 -1.03748143]

 [ 0.48429942  0.98872018 -0.57314301]

 [ 1.52989244  1.16229868 -4.74406147]]

3 8.668 [[ 0.12396193  0.61504567 -0.47498202]

 [ 0.22003263 -0.2470119   0.9268558 ]

 [ 0.96035379  0.41933775 -3.43156195]]

4 5.77111 [[-0.9524312   1.13037777  0.08607888]

 [-3.78651619  2.26245379  2.42393875]

 [-3.07170963  3.14037919 -2.12054014]]

5 inf [[ nan  nan  nan]

 [ nan  nan  nan]

 [ nan  nan  nan]]

6 nan [[ nan  nan  nan]

 [ nan  nan  nan]

 [ nan  nan  nan]]


 ...

Prediction: [0 0 0]

Accuracy:  0.0


-------------------------------------------------

When lr = 1e-10

0 5.73203 [[ 0.80269563  0.67861295 -1.21728313]

 [-0.3051686  -0.3032113   1.50825703]

 [ 0.75722361 -0.7008909  -2.10820389]]

1 5.73203 [[ 0.80269563  0.67861295 -1.21728313]

 [-0.3051686  -0.3032113   1.50825703]

 [ 0.75722361 -0.7008909  -2.10820389]]

2 5.73203 [[ 0.80269563  0.67861295 -1.21728313]

 [-0.3051686  -0.3032113   1.50825703]

 [ 0.75722361 -0.7008909  -2.10820389]]

...


198 5.73203 [[ 0.80269563  0.67861295 -1.21728313]

 [-0.3051686  -0.3032113   1.50825703]

 [ 0.75722361 -0.7008909  -2.10820389]]

199 5.73203 [[ 0.80269563  0.67861295 -1.21728313]

 [-0.3051686  -0.3032113   1.50825703]

 [ 0.75722361 -0.7008909  -2.10820389]]

200 5.73203 [[ 0.80269563  0.67861295 -1.21728313]

 [-0.3051686  -0.3032113   1.50825703]

 [ 0.75722361 -0.7008909  -2.10820389]]

Prediction: [0 0 0]

Accuracy:  0.0

-------------------------------------------------

When lr = 0.1


0 5.73203 [[ 0.72881663  0.71536207 -1.18015325]

 [-0.57753736 -0.12988332  1.60729778]

 [ 0.48373488 -0.51433605 -2.02127004]]

1 3.318 [[ 0.66219079  0.74796319 -1.14612854]

 [-0.81948912  0.03000021  1.68936598]

 [ 0.23214608 -0.33772916 -1.94628811]]

2 2.0218 [[ 0.64342022  0.74127686 -1.12067163]

 [-0.81161296 -0.00900121  1.72049117]

 [ 0.2086665  -0.35079569 -1.909742  ]]


...


199 0.672261 [[-1.15377033  0.28146935  1.13632679]

 [ 0.37484586  0.18958236  0.33544877]

 [-0.35609841 -0.43973011 -1.25604188]]

200 0.670909 [[-1.15885413  0.28058422  1.14229572]

 [ 0.37609792  0.19073224  0.33304682]

 [-0.35536593 -0.44033223 -1.2561723 ]]

Prediction: [2 2 2]

Accuracy:  1.0

'''


# < lab-07-2-linear_regression_without_min_max >


# When the data values vary wildly in scale, training does not work well


import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import tensorflow as tf

import numpy as np

tf.set_random_seed(777)  # for reproducibility



xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973],

               [823.02002, 828.070007, 1828100, 821.655029, 828.070007],

               [819.929993, 824.400024, 1438100, 818.97998, 824.159973],

               [816, 820.958984, 1008100, 815.48999, 819.23999],

               [819.359985, 823, 1188100, 818.469971, 818.97998],

               [819, 823, 1198100, 816, 820.450012],

               [811.700012, 815.25, 1098100, 809.780029, 813.669983],

               [809.51001, 816.659973, 1398100, 804.539978, 809.559998]])


x_data = xy[:, 0:-1]

y_data = xy[:, [-1]]


# placeholders for a tensor that will be always fed.

X = tf.placeholder(tf.float32, shape=[None, 4])

Y = tf.placeholder(tf.float32, shape=[None, 1])


W = tf.Variable(tf.random_normal([4, 1]), name='weight')

b = tf.Variable(tf.random_normal([1]), name='bias')


# Hypothesis

hypothesis = tf.matmul(X, W) + b


# Simplified cost/loss function

cost = tf.reduce_mean(tf.square(hypothesis - Y))


# Minimize

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)

train = optimizer.minimize(cost)


# Launch the graph in a session.

sess = tf.Session()

# Initializes global variables in the graph.

sess.run(tf.global_variables_initializer())


for step in range(101):

    cost_val, hy_val, _ = sess.run(

        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})

    print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)



'''

0 Cost:  2.45533e+12

Prediction:

 [[-1104436.375]

 [-2224342.75 ]

 [-1749606.75 ]

 [-1226179.375]

 [-1445287.125]

 [-1457459.5  ]

 [-1335740.5  ]

 [-1700924.625]]

1 Cost:  2.69762e+27

Prediction:

 [[  3.66371490e+13]

 [  7.37543360e+13]

 [  5.80198785e+13]

 [  4.06716290e+13]

 [  4.79336847e+13]

 [  4.83371348e+13]

 [  4.43026590e+13]

 [  5.64060907e+13]]

2 Cost:  inf

Prediction:

 [[ -1.21438790e+21]

 [ -2.44468702e+21]

 [ -1.92314724e+21]

 [ -1.34811610e+21]

 [ -1.58882674e+21]

 [ -1.60219962e+21]

 [ -1.46847142e+21]

 [ -1.86965602e+21]]

3 Cost:  inf

Prediction:

 [[  4.02525216e+28]

 [  8.10324465e+28]

 [  6.37453079e+28]

 [  4.46851237e+28]

 [  5.26638074e+28]

 [  5.31070676e+28]

 [  4.86744608e+28]

 [  6.19722623e+28]]

4 Cost:  inf

Prediction:

 [[ -1.33422428e+36]

 [ -2.68593010e+36]

 [ -2.11292430e+36]

 [ -1.48114879e+36]

 [ -1.74561303e+36]

 [ -1.76030542e+36]

 [ -1.61338091e+36]

 [ -2.05415459e+36]]

5 Cost:  inf

Prediction:

 [[ inf]

 [ inf]

 [ inf]

 [ inf]

 [ inf]

 [ inf]

 [ inf]

 [ inf]]

6 Cost:  nan

Prediction:

 [[ nan]

 [ nan]

 [ nan]

 [ nan]

 [ nan]

 [ nan]

 [ nan]

 [ nan]]

'''



# <lab-07-3-linear_regression_min_max >


# Use the MinMaxScaler function to normalize the data into the 0-1 range

import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import tensorflow as tf

import numpy as np

tf.set_random_seed(777)  # for reproducibility



def MinMaxScaler(data):

    numerator = data - np.min(data, 0)

    denominator = np.max(data, 0) - np.min(data, 0)

    # noise term prevents the zero division

    return numerator / (denominator + 1e-7)



xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973],

               [823.02002, 828.070007, 1828100, 821.655029, 828.070007],

               [819.929993, 824.400024, 1438100, 818.97998, 824.159973],

               [816, 820.958984, 1008100, 815.48999, 819.23999],

               [819.359985, 823, 1188100, 818.469971, 818.97998],

               [819, 823, 1198100, 816, 820.450012],

               [811.700012, 815.25, 1098100, 809.780029, 813.669983],

               [809.51001, 816.659973, 1398100, 804.539978, 809.559998]])


# very important. It does not work without it.

xy = MinMaxScaler(xy)

print(xy)


x_data = xy[:, 0:-1]

y_data = xy[:, [-1]]


# placeholders for a tensor that will be always fed.

X = tf.placeholder(tf.float32, shape=[None, 4])

Y = tf.placeholder(tf.float32, shape=[None, 1])


W = tf.Variable(tf.random_normal([4, 1]), name='weight')

b = tf.Variable(tf.random_normal([1]), name='bias')


# Hypothesis

hypothesis = tf.matmul(X, W) + b


# Simplified cost/loss function

cost = tf.reduce_mean(tf.square(hypothesis - Y))


# Minimize

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)

train = optimizer.minimize(cost)


# Launch the graph in a session.

sess = tf.Session()

# Initializes global variables in the graph.

sess.run(tf.global_variables_initializer())


for step in range(101):

    cost_val, hy_val, _ = sess.run(

        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})

    print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)


'''

100 Cost:  0.152254

Prediction:

 [[ 1.63450289]

 [ 0.06628087]

 [ 0.35014752]

 [ 0.67070574]
 ...
'''


# <lab-07-4-mnist_introduction>


import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# Lab 7 Learning rate and Evaluation

import tensorflow as tf

import random

# import matplotlib.pyplot as plt

tf.set_random_seed(777)  # for reproducibility


from tensorflow.examples.tutorials.mnist import input_data

# Check out https://www.tensorflow.org/get_started/mnist/beginners for

# more information about the mnist dataset

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


nb_classes = 10


# MNIST data image of shape 28 * 28 = 784

X = tf.placeholder(tf.float32, [None, 784])

# 0 - 9 digits recognition = 10 classes

Y = tf.placeholder(tf.float32, [None, nb_classes])


W = tf.Variable(tf.random_normal([784, nb_classes]))

b = tf.Variable(tf.random_normal([nb_classes]))


# Hypothesis (using softmax)

hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)


cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)


# Test model

is_correct = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))

# Calculate accuracy

accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))


# parameters

training_epochs = 15

batch_size = 100


with tf.Session() as sess:

    # Initialize TensorFlow variables

    sess.run(tf.global_variables_initializer())

    # Training cycle

    for epoch in range(training_epochs):

        avg_cost = 0

        total_batch = int(mnist.train.num_examples / batch_size)


        for i in range(total_batch):

            batch_xs, batch_ys = mnist.train.next_batch(batch_size)

            c, _ = sess.run([cost, optimizer], feed_dict={

                            X: batch_xs, Y: batch_ys})

            avg_cost += c / total_batch


        print('Epoch:', '%04d' % (epoch + 1),

              'cost =', '{:.9f}'.format(avg_cost))


    print("Learning finished")


    # Test the model using test sets

    print("Accuracy: ", accuracy.eval(session=sess, feed_dict={

          X: mnist.test.images, Y: mnist.test.labels}))


    # Get one and predict

    r = random.randint(0, mnist.test.num_examples - 1)

    print("Label: ", sess.run(tf.argmax(mnist.test.labels[r:r + 1], 1)))

    print("Prediction: ", sess.run(

        tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1]}))


    # Plotting the image fails; the cause needs to be investigated later; commented out with # for now

    # don't know why this makes Travis Build error.

    # plt.imshow(

    #     mnist.test.images[r:r + 1].reshape(28, 28),

    #     cmap='Greys',

    #     interpolation='nearest')

    # plt.show()



'''

Epoch: 0001 cost = 2.868104637

Epoch: 0002 cost = 1.134684615

Epoch: 0003 cost = 0.908220728

Epoch: 0004 cost = 0.794199896

Epoch: 0005 cost = 0.721815854

Epoch: 0006 cost = 0.670184430

Epoch: 0007 cost = 0.630576546

Epoch: 0008 cost = 0.598888191

Epoch: 0009 cost = 0.573027079

Epoch: 0010 cost = 0.550497213
...
'''


[References]

https://www.inflearn.com/course/기본적인-머신러닝-딥러닝-강좌

https://github.com/hunkim/deeplearningzerotoall