[AI #5] Logistic classification hypothesis function

Projects/AI 2017. 7. 30. 21:27

 daviduino.co.kr techtogetworld.com
 david201207.blog.me cafe.naver.com/3dpservicedavid

This post is about Python code.
I am organizing notes on Python in preparation for writing AI code with TensorFlow.
The post is organized as follows:
==========================================
1. Logistic classification hypothesis function
  ==> used to separate two classes such as pass and fail
2. Diabetes prediction
3. Next step
  ==> Multinomial: computes the probability of each of several classes.
4. References
============================================

[ Logistic classification hypothesis function ]

import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # suppress TensorFlow log messages

# Lab 5 Logistic Regression Classifier (TensorFlow 1.x API)
import tensorflow as tf

tf.set_random_seed(777)  # for reproducibility

x_data = [[1, 2],
          [2, 3],
          [3, 1],
          [4, 3],
          [5, 3],
          [6, 2]]

# Predict the y data, given as fail/pass (0/1), from the x data
# Pass/fail data as a function of hours studied
# (one label per x row; the printed hypothesis and cost match 0, 0, 0, 1, 1, 1)
y_data = [[0],
          [0],
          [0],
          [1],
          [1],
          [1]]

# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])

W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

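# Use the sigmoid function
#   ==> h(x) takes values between 0 and 1.
#   ==> cost(W) is built so that it grows large when h(x) is far from y and
#       approaches 0 when h(x) is close to y. The two cases (y = 1 and y = 0)
#       are combined into a single formula.
#   - With the earlier squared-error cost, the sigmoid turns the cost into a
#     bumpy, non-convex curve. Gradient descent then runs into many points
#     where the slope is 0 and may never reach the global minimum.
#   - A different cost function is therefore needed; the one below is that alternative.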

# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(-(tf.matmul(X, W) + b)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) *
                       tf.log(1 - hypothesis))

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))

# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())

    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, cost_val)

    # Accuracy report ("Correct (Y)" prints the thresholded predictions)
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a)

'''
0 1.73078
200 0.571512
400 0.507414
600 0.471824
800 0.447585
...
9200 0.159066
9400 0.15656
9600 0.154132
9800 0.151778
10000 0.149496

Hypothesis: [[ 0.03074029]
 [ 0.15884677]
 [ 0.30486736]
 [ 0.78138196]
 [ 0.93957496]
 [ 0.98016882]]
(predicted probability of y = 1 for each sample)

Correct (Y): [[ 0.]
 [ 0.]
 [ 0.]
 [ 1.]
 [ 1.]
 [ 1.]]
(predicted class: 1 if the hypothesis is greater than 0.5, otherwise 0)

Accuracy: 1.0
(prediction accuracy of 100%)
'''
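
To make the notes on the sigmoid and the cost concrete, here is a minimal sketch in plain Python (no TensorFlow) that recomputes the cost and the accuracy from the hypothesis values printed above. The helper names sigmoid and cross_entropy are only illustrative, and the labels are assumed to be three fails followed by three passes.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(h_values, y_values):
    """Mean of -[y*log(h) + (1-y)*log(1-h)] over all samples."""
    total = 0.0
    for h, y in zip(h_values, y_values):
        total += -(y * math.log(h) + (1 - y) * math.log(1 - h))
    return total / len(h_values)


# Hypothesis values copied from the printed output above.
h_printed = [0.03074029, 0.15884677, 0.30486736,
             0.78138196, 0.93957496, 0.98016882]
# Assumed labels: three fails followed by three passes.
y_true = [0, 0, 0, 1, 1, 1]

cost = cross_entropy(h_printed, y_true)
predicted = [1 if h > 0.5 else 0 for h in h_printed]
accuracy = sum(p == y for p, y in zip(predicted, y_true)) / len(y_true)

print(sigmoid(0.0))         # 0.5, the decision boundary
print(cost)                 # compare with the cost printed at step 10000
print(predicted, accuracy)

# A confident wrong answer is punished far more than a confident right one.
print(cross_entropy([0.01], [1]))
print(cross_entropy([0.99], [1]))
```

The recomputed cost lands very close to the 0.149496 printed at step 10000. The last two lines show why this cost suits classification: predicting 0.01 for a true 1 costs about 4.6, while predicting 0.99 costs about 0.01.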

[ Diabetes prediction ]

import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # suppress TensorFlow log messages

# Lab 5 Logistic Regression Classifier (TensorFlow 1.x API)
import tensorflow as tf
import numpy as np

tf.set_random_seed(777)  # for reproducibility

# Every column except the last is a feature; the last column is the 0/1 label
xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]

print(x_data.shape, y_data.shape)

# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 8])
Y = tf.placeholder(tf.float32, shape=[None, 1])

W = tf.Variable(tf.random_normal([8, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(-(tf.matmul(X, W) + b)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) *
                       tf.log(1 - hypothesis))

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))

# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())

    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, cost_val)

    # Accuracy report ("Correct (Y)" prints the thresholded predictions)
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a)

'''
0 0.82794
200 0.755181
400 0.726355
600 0.705179
800 0.686631
...
9600 0.492056
9800 0.491396
10000 0.490767
...
 [ 1.]
 [ 1.]]

Accuracy: 0.762846
'''

[ References ]

https://www.inflearn.com/course/기본적인-머신러닝-딥러닝-강좌/
http://agiantmind.tistory.com/176
https://www.tensorflow.org/install/
https://github.com/hunkim/deeplearningzerotoall
http://www.derivative-calculator.net/
http://terms.naver.com/entry.nhn?docId=3350391&cid=58247&categoryId=58247 ==> differentiation rules and formulas
http://matplotlib.org/users/installing.html ==> installing matplotlib
https://www.kaggle.com/ ==> a site where sample data sets can be found
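
As a cross-check on what GradientDescentOptimizer does in the two scripts above, here is a rough sketch of the same training loop in plain Python on the six-sample pass/fail data. It relies on the standard gradient of the mean cross-entropy for a sigmoid hypothesis, which is the average of (h - y) * x. Starting from zero weights is my assumption (the scripts draw them with tf.random_normal), so the numbers will not match the printed logs exactly. The labels are assumed to be 0, 0, 0, 1, 1, 1, and the helper names predict and mean_cost are made up for illustration.

```python
import math

# Six-sample pass/fail data; labels are an assumption (0, 0, 0, 1, 1, 1).
x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [0, 0, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Sigmoid hypothesis h(x) = sigmoid(w . x + b) for one two-feature sample."""
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


def mean_cost(w, b):
    """Mean cross-entropy over the whole data set."""
    total = 0.0
    for x, y in zip(x_data, y_data):
        h = predict(w, b, x)
        total += -(y * math.log(h) + (1 - y) * math.log(1 - h))
    return total / len(x_data)


# Assumption: zero initialization instead of tf.random_normal.
w, b = [0.0, 0.0], 0.0
learning_rate = 0.01
n = len(x_data)
initial_cost = mean_cost(w, b)

for step in range(10001):
    # Accumulate the gradient: sum of (h - y) * x per weight, (h - y) for the bias.
    grad_w, grad_b = [0.0, 0.0], 0.0
    for x, y in zip(x_data, y_data):
        err = predict(w, b, x) - y
        grad_w[0] += err * x[0]
        grad_w[1] += err * x[1]
        grad_b += err
    # One gradient-descent step using the averaged gradient.
    w = [w[0] - learning_rate * grad_w[0] / n,
         w[1] - learning_rate * grad_w[1] / n]
    b -= learning_rate * grad_b / n

final_cost = mean_cost(w, b)
predicted = [1 if predict(w, b, x) > 0.5 else 0 for x in x_data]
accuracy = sum(p == y for p, y in zip(predicted, y_data)) / len(y_data)
print(initial_cost, final_cost)
print(predicted, accuracy)
```

With zero weights every hypothesis starts at 0.5, so the initial cost is ln 2 (about 0.693), and each step should lower it from there.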

License: Attribution, NonCommercial, NoDerivatives
