I ran a simple linear model and found that the Keras implementation and the native TensorFlow implementation produce somewhat different results: with the same learning rate and the same number of iterations, the outcomes differ noticeably.

Native TensorFlow model
```python
import tensorflow as tf  # TensorFlow 1.x API
import numpy as np

# Synthetic data: y = 0.6 * x + 0.8
x_data = np.random.rand(50)
y_data = x_data * 0.6 + 0.8

# Trainable parameters, initialized to zero
w = tf.Variable(0.)
b = tf.Variable(0.)

y = w * x_data + b
loss = tf.reduce_mean(tf.square(y_data - y))
optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for step in range(1001):
        sess.run(train)
        if step % 50 == 0:
            print(step, sess.run([w, b]))
```
0 [0.09048238, 0.20768635]
50 [0.4752352, 0.85256165]
100 [0.52949804, 0.82970226]
150 [0.5601607, 0.81678414]
200 [0.5774875, 0.8094844]
250 [0.5872785, 0.80535954]
300 [0.5928114, 0.8030285]
350 [0.59593767, 0.80171144]
400 [0.5977045, 0.8009671]
450 [0.59870297, 0.80054647]
500 [0.5992671, 0.80030876]
550 [0.59958595, 0.8001745]
600 [0.5997661, 0.8000986]
650 [0.5998678, 0.8000557]
700 [0.59992516, 0.8000316]
750 [0.5999576, 0.8000179]
800 [0.59997594, 0.80001026]
850 [0.5999863, 0.80000585]
900 [0.5999923, 0.80000323]
950 [0.5999953, 0.80000204]
1000 [0.59999704, 0.8000013]
The result is quite close to [0.6, 0.8]. If the learning rate is changed to 0.01, this number of iterations is not enough, and the result ends up at something like 1000 [0.54564947, 0.8279966].
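The update performed above can be reproduced with plain NumPy, which makes the learning-rate effect easy to check independently of either framework. This is a sketch under assumptions of my own (a fixed seed and the same zero initialization as the TF script), not code from the original post:

```python
import numpy as np

def fit_linear(x, y, lr, steps, w=0.0, b=0.0):
    """Full-batch gradient descent on MSE, mirroring GradientDescentOptimizer."""
    for _ in range(steps):
        err = y - (w * x + b)
        w += lr * 2 * np.mean(err * x)  # w -= lr * dL/dw
        b += lr * 2 * np.mean(err)      # b -= lr * dL/db
    return w, b

rng = np.random.RandomState(0)
x_data = rng.rand(50)
y_data = x_data * 0.6 + 0.8

print(fit_linear(x_data, y_data, lr=0.1, steps=1000))   # close to (0.6, 0.8)
print(fit_linear(x_data, y_data, lr=0.01, steps=1000))  # still noticeably off
```

At lr=0.1 the slowest direction of the loss surface contracts fast enough to converge within 1000 steps; at lr=0.01 it does not, which matches the 0.01 run described above.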
Linear model implemented with Keras
```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers

x_data = np.random.rand(50)
y_data = x_data * 0.6 + 0.8

# A single Dense unit with one input is exactly y = w * x + b
model = Sequential()
model.add(Dense(units=1, input_dim=1))
model.compile(loss='mse', optimizer='sgd')

for step in range(1001):
    cost = model.train_on_batch(x_data, y_data)
    if step % 50 == 0:
        # print(cost)
        w, b = model.layers[0].get_weights()
        print('{} {}, {}'.format(step, w, b))
```
0 [[-0.7188647]], [0.0306965]
50 [[-0.18009827]], [0.8542211]
100 [[-0.01156109]], [1.0535711]
150 [[0.0583386]], [1.0905893]
200 [[0.10079067]], [1.0860826]
250 [[0.13452986]], [1.0716068]
300 [[0.16455613]], [1.0553639]
350 [[0.19226888]], [1.0394464]
400 [[0.21811923]], [1.0243533]
450 [[0.24230473]], [1.0101675]
500 [[0.26495165]], [0.9968669]
550 [[0.28616306]], [0.9844051]
600 [[0.30603108]], [0.97273123]
650 [[0.32464132]], [0.96179634]
700 [[0.34207347]], [0.95155346]
750 [[0.3584019]], [0.9419592]
800 [[0.3736968]], [0.932972]
850 [[0.38802335]], [0.92455405]
900 [[0.40144295]], [0.916669]
950 [[0.41401306]], [0.90928304]
1000 [[0.42578715]], [0.90236473]
Without specifying a learning rate, i.e. using the default, 1000 iterations give a rather poor result.

Specifying a learning rate of 0.1:
```python
sgd = optimizers.SGD(lr=0.1)
model.compile(loss='mse', optimizer=sgd)
```
1000 [[0.5999974]], [0.8000014]
When the learning rate is set to 0.01:
1000 [[0.6899499]], [0.7515403]
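One reason the Keras run at lr=0.01 lands somewhere different from the TF run at lr=0.01 is the starting point: Dense initializes its weight randomly, while the TF script starts from zero. A sketch with the same full-batch update rule from two starting points (the -0.72 start is an illustrative value mimicking the random initialization seen in the first Keras printout, not a number taken from this post's internals):

```python
import numpy as np

def fit_linear(x, y, lr, steps, w, b):
    """Full-batch gradient descent on MSE (the same update SGD performs here)."""
    for _ in range(steps):
        err = y - (w * x + b)
        w += lr * 2 * np.mean(err * x)
        b += lr * 2 * np.mean(err)
    return w, b

rng = np.random.RandomState(0)
x_data = rng.rand(50)
y_data = x_data * 0.6 + 0.8

# Zero start (like the TF script) vs. a random-looking start (like Dense):
# at lr=0.01 neither run has converged after 1000 steps, and they end up
# at visibly different parameter values.
print(fit_linear(x_data, y_data, lr=0.01, steps=1000, w=0.0, b=0.0))
print(fit_linear(x_data, y_data, lr=0.01, steps=1000, w=-0.72, b=0.0))
```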
There is also a variant that uses fit:
```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

x_data = np.random.rand(50)
y_data = x_data * 0.6 + 0.8

model = Sequential()
model.add(Dense(units=1, input_dim=1))
model.compile(loss='mse', optimizer='sgd')
model.fit(x_data, y_data, epochs=20, batch_size=1)

w, b = model.layers[0].get_weights()
# The original printed `step` here as well; in a standalone script that name
# is undefined (the "1000" in the output below is its leftover value from the
# earlier loop), so only w and b are printed.
print('{}, {}'.format(w, b))
```
…earlier output omitted
50/50 [==============================] - 0s 2ms/step - loss: 0.0020
1000 [[0.46180877]], [0.87417203]
With this approach, the number of epochs also has to be large enough before the result gets close to the true values; the 20 used here are not enough (with batch_size=1, 20 epochs over 50 samples amount to 20 × 50 = 1000 weight updates, but each update sees only a single sample).

If the learning rate and the number of iterations are not chosen well, the result can be far off.

My understanding is that the default learning rate is 0.01, so the default should behave much the same as explicitly specifying 0.01; any difference is just because, at this learning rate and step count, individual runs on this data vary quite a bit anyway.
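The default can be inspected directly on the optimizer object. This is a quick check rather than part of the original post, and the config key differs across Keras versions ('lr' in older releases, 'learning_rate' in newer ones):

```python
from keras import optimizers

sgd = optimizers.SGD()  # no learning rate passed, so the default applies
cfg = sgd.get_config()
# older Keras stores the value under 'lr', newer versions under 'learning_rate'
print(cfg.get('lr', cfg.get('learning_rate')))
```

Both old and new releases report 0.01, consistent with the claim above.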
Author: 帅得不敢出门