This project uses a text convolutional neural network and the MovieLens dataset to build a movie recommender.
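Recommender systems are everywhere in everyday web applications: online shopping, online bookstores, news apps, social networks, music sites, movie sites, and so on. Wherever there are people, there are recommendations. They deliver personalized content based on an individual's preferences and on the habits of people with similar tastes. Open a news app, for example, and because the content is personalized, every user sees a different front page.
This is clearly useful. In today's information explosion there are many ways and channels to get information. What costs people the most time is no longer where to find information but picking out what interests them from the flood. This is the information overload problem, and recommender systems emerged to address it.
Collaborative filtering is one of the most widely used recommendation techniques. It collects users' histories and preferences, computes each user's similarity to other users, and uses the ratings of similar users to predict how much the target user will like a particular item. Its advantage is that it can recommend items the user has never browsed. Its drawback is the cold-start problem: a new user has no interaction records and no stated preferences, so the model cannot find similar users or items.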
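To make the idea concrete, here is a minimal sketch of user-based collaborative filtering with NumPy. The rating matrix, the function names, and the choice of cosine similarity are illustrative assumptions, not part of this project's model.

```python
import numpy as np

# Toy user-item rating matrix (rows: users, columns: items); 0 means "not rated".
# The numbers are made up for illustration only.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 3, 1],
    [1, 2, 5, 4],
], dtype=float)

def cosine_similarity(a, b):
    """Cosine similarity between two rating vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def predict_rating(ratings, user, item):
    """Similarity-weighted average of the other users' ratings for `item`."""
    num, den = 0.0, 0.0
    for other in range(ratings.shape[0]):
        if other == user or ratings[other, item] == 0:
            continue  # skip the target user and users who have not rated the item
        sim = cosine_similarity(ratings[user], ratings[other])
        num += sim * ratings[other, item]
        den += abs(sim)
    return num / den if den else 0.0

# User 0 has not rated item 2; predict it from users 1 and 2.
predicted = predict_rating(ratings, user=0, item=2)

# Cold start: a brand-new user with no ratings is similar to nobody,
# so the prediction falls back to 0.0.
with_new_user = np.vstack([ratings, np.zeros(4)])
cold_start = predict_rating(with_new_user, user=3, item=2)
```

User 0 rates items much like user 1, so the prediction (about 3.6) sits closer to user 1's rating of 3 than to user 2's rating of 5. The all-zero row shows why cold start hurts: with no history there is nothing to compute a similarity from.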
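A common fix for cold start is to ask newly registered users to first pick the topics, groups, products, personality traits, favorite music genres, and so on that interest them, as Douban FM does.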
Downloading the dataset
Run the code below to download the dataset.
import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
from collections import Counter
import tensorflow as tf
import os
import pickle
import re
from tensorflow.python.ops import math_ops
from urllib.request import urlretrieve
from os.path import isfile, isdir
from tqdm import tqdm
import zipfile
import hashlib
import shutil  # used by download_extract to clean up after a failed extraction
def _unzip(save_path, _, database_name, data_path):
    """
    Unzip the archive
    :param save_path: The path of the zip file
    :param database_name: Name of database
    :param data_path: Path to extract to
    :param _: HACK - Used to have the same interface as _ungzip
    """
    print('Extracting {}...'.format(database_name))
    with zipfile.ZipFile(save_path) as zf:
        zf.extractall(data_path)
def download_extract(database_name, data_path):
    """
    Download and extract the dataset
    :param database_name: Database name
    :param data_path: Directory to download and extract into
    """
    DATASET_ML1M = 'ml-1m'
    if database_name == DATASET_ML1M:
        url = 'http://files.grouplens.org/datasets/movielens/ml-1m.zip'
        hash_code = 'c4d9eecfca2ab87c1945afe126590906'
        extract_path = os.path.join(data_path, 'ml-1m')
        save_path = os.path.join(data_path, 'ml-1m.zip')
        extract_fn = _unzip
    if os.path.exists(extract_path):
        print('Found {} Data'.format(database_name))
        return
    if not os.path.exists(data_path):
        os.makedirs(data_path)
    if not os.path.exists(save_path):
        with DLProgress(unit='B', unit_scale=True, miniters=1, desc='Downloading {}'.format(database_name)) as pbar:
            urlretrieve(
                url,
                save_path,
                pbar.hook)
    # Verify the MD5 checksum of the downloaded archive
    assert hashlib.md5(open(save_path, 'rb').read()).hexdigest() == hash_code, \
        '{} file is corrupted. Remove the file and try again.'.format(save_path)
    os.makedirs(extract_path)
    try:
        extract_fn(save_path, extract_path, database_name, data_path)
    except Exception as err:
        shutil.rmtree(extract_path)  # Remove extraction folder if there is an error
        raise err
    print('Done.')
    # Remove compressed data
    # os.remove(save_path)
class DLProgress(tqdm):
    """
    Progress bar for the download
    """
    last_block = 0
    def hook(self, block_num=1, block_size=1, total_size=None):
        """
        A hook function that will be called once on establishment of the network connection and
        once after each block read thereafter.
        :param block_num: A count of blocks transferred so far
        :param block_size: Block size in bytes
        :param total_size: The total size of the file. This may be -1 on older FTP servers which do not return
                           a file size in response to a retrieval request.
        """
        self.total = total_size
        self.update((block_num - self.last_block) * block_size)
        self.last_block = block_num
data_dir = './'
download_extract('ml-1m', data_dir)
Extracting ml-1m...
Done.
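A first look at the data
This project uses the MovieLens 1M dataset, which contains about 1 million ratings of nearly 4,000 movies from about 6,000 users.
The dataset consists of three files:
- User data: users.dat
- Movie data: movies.dat
- Rating data: ratings.dat
User data
- User ID
- Gender
- Age
- Occupation ID
- Zip code
Format in the data file: UserID::Gender::Age::Occupation::Zip-code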
- Gender is denoted by a "M" for male and "F" for female
- Age is chosen from the following ranges:
  - 1: "Under 18"
  - 18: "18-24"
  - 25: "25-34"
  - 35: "35-44"
  - 45: "45-49"
  - 50: "50-55"
  - 56: "56+"
- Occupation is chosen from the following choices:
  - 0: "other" or not specified
  - 1: "academic/educator"
  - 2: "artist"
  - 3: "clerical/admin"
  - 4: "college/grad student"
  - 5: "customer service"
  - 6: "doctor/health care"
  - 7: "executive/managerial"
  - 8: "farmer"
  - 9: "homemaker"
  - 10: "K-12 student"
  - 11: "lawyer"
  - 12: "programmer"
  - 13: "retired"
  - 14: "sales/marketing"
  - 15: "scientist"
  - 16: "self-employed"
  - 17: "technician/engineer"
  - 18: "tradesman/craftsman"
  - 19: "unemployed"
  - 20: "writer"
users_title = ['UserID', 'Gender', 'Age', 'OccupationID', 'Zip-code']
users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=users_title, engine = 'python')
users.head()
 | UserID | Gender | Age | OccupationID | Zip-code |
---|---|---|---|---|---|
0 | 1 | F | 1 | 10 | 48067 |
1 | 2 | M | 56 | 16 | 70072 |
2 | 3 | M | 25 | 15 | 55117 |
3 | 4 | M | 45 | 7 | 02460 |
4 | 5 | M | 25 | 20 | 55455 |
As you can see, UserID, Gender, Age, and Occupation are all categorical fields. We do not use the zip code field.
Movie data
- Movie ID
- Title
- Genres
Format in the data file: MovieID::Title::Genres
- Titles are identical to titles provided by the IMDB (including year of release)
- Genres are pipe-separated and are selected from the following genres:
  - Action
  - Adventure
  - Animation
  - Children's
  - Comedy
  - Crime
  - Documentary
  - Drama
  - Fantasy
  - Film-Noir
  - Horror
  - Musical
  - Mystery
  - Romance
  - Sci-Fi
  - Thriller
  - War
  - Western
movies_title = ['MovieID', 'Title', 'Genres']
movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=movies_title, engine = 'python')
movies.head()
 | MovieID | Title | Genres |
---|---|---|---|
0 | 1 | Toy Story (1995) | Animation|Children's|Comedy |
1 | 2 | Jumanji (1995) | Adventure|Children's|Fantasy |
2 | 3 | Grumpier Old Men (1995) | Comedy|Romance |
3 | 4 | Waiting to Exhale (1995) | Comedy|Drama |
4 | 5 | Father of the Bride Part II (1995) | Comedy |
MovieID is a categorical field, Title is text, and Genres is also a categorical field.
Rating data
- User ID
- Movie ID
- Rating
- Timestamp
Format in the data file: UserID::MovieID::Rating::Timestamp
- UserIDs range between 1 and 6040
- MovieIDs range between 1 and 3952
- Ratings are made on a 5-star scale (whole-star ratings only)
- Timestamp is represented in seconds since the epoch as returned by time(2)
- Each user has at least 20 ratings
ratings_title = ['UserID','MovieID', 'Rating', 'timestamps']
ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=ratings_title, engine = 'python')
ratings.head()
 | UserID | MovieID | Rating | timestamps |
---|---|---|---|---|
0 | 1 | 1193 | 5 | 978300760 |
1 | 1 | 661 | 3 | 978302109 |
2 | 1 | 914 | 3 | 978301968 |
3 | 1 | 3408 | 4 | 978300275 |
4 | 1 | 2355 | 5 | 978824291 |
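The Rating field is the target we want to learn. We do not use the timestamp field.
Data preprocessing
- UserID, Occupation, and MovieID are left unchanged.
- Gender: 'F' and 'M' are converted to 0 and 1.
- Age: converted to 7 consecutive integers, 0 to 6.
- Genres: a categorical field that must be converted to numbers. First build a dictionary from genre string to integer, then convert each movie's Genres field to a list of integers, because some movies combine several genres.
- Title: handled the same way as Genres. First build a dictionary from word to integer, then convert the words in each title to a list of integers. The year in the title is also removed.
- Genres and Title are padded to a uniform length so they are easy to handle in the neural network. The empty positions are filled with the integer that corresponds to '<PAD>'.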
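The Title steps above can be sketched on two sample titles as follows. This is a minimal illustration in plain Python: the helper names `strip_year` and `encode_and_pad` are hypothetical, and the vocabulary here is built in sorted order, so the integer IDs will differ from the ones the actual preprocessing produces.

```python
import re

PAD = '<PAD>'
TITLE_LEN = 15  # fixed title length used in this post

def strip_year(title):
    """Remove a trailing '(YYYY)' from a MovieLens title, if present."""
    match = re.match(r'^(.*)\((\d+)\)$', title)
    return match.group(1).strip() if match else title

def encode_and_pad(tokens, vocab, length):
    """Map tokens to integer IDs and right-pad with the <PAD> ID up to `length`.
    Lists longer than `length` are returned unpadded and untruncated."""
    ids = [vocab[t] for t in tokens]
    return ids + [vocab[PAD]] * (length - len(ids))

titles = ['Toy Story (1995)', 'Jumanji (1995)']
# Build a word-to-integer dictionary; <PAD> gets ID 0 in this sketch.
words = sorted({w for t in titles for w in strip_year(t).split()})
vocab = {w: i for i, w in enumerate([PAD] + words)}
encoded = [encode_and_pad(strip_year(t).split(), vocab, TITLE_LEN) for t in titles]
```

Every encoded title has the same length, so a batch of titles can be stacked into a single integer matrix and fed to an embedding layer.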
Implementing the preprocessing
def load_data():
    """
    Load the dataset from file
    """
    # Read the user data
    users_title = ['UserID', 'Gender', 'Age', 'JobID', 'Zip-code']
    users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=users_title, engine = 'python')
    users = users.filter(regex='UserID|Gender|Age|JobID')
    users_orig = users.values
    # Map gender and age in the user data to integers
    gender_map = {'F':0, 'M':1}
    users['Gender'] = users['Gender'].map(gender_map)
    age_map = {val:ii for ii,val in enumerate(set(users['Age']))}
    users['Age'] = users['Age'].map(age_map)
    # Read the movie data
    movies_title = ['MovieID', 'Title', 'Genres']
    movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=movies_title, engine = 'python')
    movies_orig = movies.values
    # Strip the year from Title
    pattern = re.compile(r'^(.*)\((\d+)\)$')
    title_map = {val:pattern.match(val).group(1) for ii,val in enumerate(set(movies['Title']))}
    movies['Title'] = movies['Title'].map(title_map)
    # Dictionary from genre to integer
    genres_set = set()
    for val in movies['Genres'].str.split('|'):
        genres_set.update(val)
    genres_set.add('<PAD>')
    genres2int = {val:ii for ii, val in enumerate(genres_set)}
    # Convert genres to equal-length lists of integers; the length is 18
    genres_map = {val:[genres2int[row] for row in val.split('|')] for ii,val in enumerate(set(movies['Genres']))}
    for key in genres_map:
        for cnt in range(max(genres2int.values()) - len(genres_map[key])):
            genres_map[key].insert(len(genres_map[key]) + cnt,genres2int['<PAD>'])
    movies['Genres'] = movies['Genres'].map(genres_map)
    # Dictionary from title word to integer
    title_set = set()
    for val in movies['Title'].str.split():
        title_set.update(val)
    title_set.add('<PAD>')
    title2int = {val:ii for ii, val in enumerate(title_set)}
    # Convert titles to equal-length lists of integers; the length is 15
    title_count = 15
    title_map = {val:[title2int[row] for row in val.split()] for ii,val in enumerate(set(movies['Title']))}
    for key in title_map:
        for cnt in range(title_count - len(title_map[key])):
            title_map[key].insert(len(title_map[key]) + cnt,title2int['<PAD>'])
    movies['Title'] = movies['Title'].map(title_map)
    # Read the rating data
    ratings_title = ['UserID','MovieID', 'ratings', 'timestamps']
    ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=ratings_title, engine = 'python')
    ratings = ratings.filter(regex='UserID|MovieID|ratings')
    # Merge the three tables
    data = pd.merge(pd.merge(ratings, users), movies)
    # Split the data into two tables, X and y
    target_fields = ['ratings']
    features_pd, targets_pd = data.drop(target_fields, axis=1), data[target_fields]
    features = features_pd.values
    targets_values = targets_pd.values
    return title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig
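Loading the data and saving it locally
- title_count: length of the Title field (15)
- title_set: the set of words in the titles
- genres2int: dictionary from genre to integer
- features: the input X
- targets_values: the learning target y
- ratings: the rating dataset as a Pandas object
- users: the user dataset as a Pandas object
- movies: the movie dataset as a Pandas object
- data: the three datasets merged into one Pandas object
- movies_orig: the original movie data before preprocessing
- users_orig: the original user data before preprocessing
# Load the data
title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig = load_data()
# Save it to a file
pickle.dump((title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig), open('preprocess.p', 'wb'))
The preprocessed data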
users.head()
 | UserID | Gender | Age | JobID |
---|---|---|---|---|
0 | 1 | 0 | 0 | 10 |
1 | 2 | 1 | 5 | 16 |
2 | 3 | 1 | 6 | 15 |
3 | 4 | 1 | 2 | 7 |
4 | 5 | 1 | 6 | 20 |
movies.head()
 | MovieID | Title | Genres |
---|---|---|---|
0 | 1 | [310, 2184, 634, 634, 634, 634, 634, 634, 634,... | [0, 18, 7, 17, 17, 17, 17, 17, 17, 17, 17, 17,... |
1 | 2 | [1182, 634, 634, 634, 634, 634, 634, 634, 634,... | [3, 18, 8, 17, 17, 17, 17, 17, 17, 17, 17, 17,... |
2 | 3 | [5011, 4744, 2629, 634, 634, 634, 634, 634, 63... | [7, 9, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,... |
3 | 4 | [4095, 1535, 1886, 634, 634, 634, 634, 634, 63... | [7, 5, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,... |
4 | 5 | [3563, 1725, 3790, 3727, 838, 343, 634, 634, 6... | [7, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17... |
movies.values[0]
array([1,
list([310, 2184, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634]),
list([0, 18, 7, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17])],
dtype=object)
Reading the data back from disk
title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig = pickle.load(open('preprocess.p', mode='rb'))
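Model design
Looking at the field types in the dataset, we find that several are categorical. The usual treatment is one-hot encoding, but fields such as UserID and MovieID would become extremely sparse and the input dimension would balloon. We want to avoid that; after all, my little laptop is no match for the big companies that routinely handle inputs with hundreds of millions of dimensions :)
So during preprocessing these fields were converted to integers, which we use as indices into embedding matrices. The first layer of the network is an embedding layer, with embedding matrices of shape (N, 32) for UserID and MovieID and (N, 16) for gender, age, and occupation, where N is the number of distinct values of the field.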
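A small NumPy check shows why an integer index is enough: looking up a row of the embedding matrix gives the same result as multiplying a one-hot vector by that matrix, without ever building the sparse vector. The sizes below are tiny, made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_ids, embed_dim = 6, 4  # tiny illustrative sizes, not the ones used in the model
embed_matrix = rng.uniform(-1, 1, size=(num_ids, embed_dim))

idx = 3
one_hot = np.zeros(num_ids)
one_hot[idx] = 1.0

via_matmul = one_hot @ embed_matrix  # dense one-hot vector times the matrix
via_lookup = embed_matrix[idx]       # plain row indexing, what an embedding lookup does
```

With around 6,000 users the one-hot route would multiply a 6,000-wide vector that is almost entirely zeros, while the lookup reads a single row.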
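Movie genres need one extra step. A movie can have several genres, so indexing the embedding matrix yields an (n, 32) matrix with one row per genre. We sum the rows to reduce it to a (1, 32) vector.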
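The lookup-then-sum step can be sketched in NumPy as below. The genre IDs are hypothetical, and the embedding values are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)
num_genres, embed_dim = 19, 32  # 18 genres plus <PAD>, embedding width 32
genre_embed = rng.uniform(-1, 1, size=(num_genres, embed_dim))

genre_ids = np.array([0, 18, 7])               # hypothetical IDs for one movie's genres
looked_up = genre_embed[genre_ids]             # shape (3, 32): one row per genre
summed = looked_up.sum(axis=0, keepdims=True)  # shape (1, 32): a single genre vector
```

Note that in this post's pipeline every genre list is padded to length 18, so the `<PAD>` embedding rows are part of the sum as well.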
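The movie title is handled differently: instead of a recurrent neural network, it uses a text convolutional network, described below.
After the features are looked up from the embedding layers, each is passed through a fully connected layer, and the outputs are passed through another fully connected layer, finally yielding two (1, 200) feature vectors: the user feature and the movie feature.
Our goal is to train the user features and movie features, which are used later to implement recommendations. Once we have these two features, the rating can be fit in any way we like. I tried two approaches. The first, the one drawn in the model diagram, takes the dot product of the two feature vectors and regresses the result onto the true rating with an MSE loss, since this is essentially a regression problem. The second feeds the two features into another fully connected layer that outputs a single value, which is regressed onto the true rating, again with an MSE loss.
In practice, after 5 epochs the MSE loss is around 0.8 for the second approach and around 1 for the first.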
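The first approach (dot product plus MSE) can be sketched in NumPy as follows. The feature values and target ratings are random or made up; only the shapes mirror the model.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, dim = 4, 200
user_feat = rng.normal(size=(batch, dim)) * 0.1   # stand-in for the user features
movie_feat = rng.normal(size=(batch, dim)) * 0.1  # stand-in for the movie features
targets = np.array([[5.0], [3.0], [4.0], [1.0]])  # made-up true ratings

# Row-wise dot product: one predicted rating per (user, movie) pair, shape (4, 1)
pred = np.sum(user_feat * movie_feat, axis=1, keepdims=True)

# Mean squared error between predictions and true ratings
mse = float(np.mean((pred - targets) ** 2))

# A full matrix product scores every user against every movie, shape (4, 4);
# only its diagonal corresponds to the pairs in the batch.
all_pairs = user_feat @ movie_feat.T
```

The element-wise multiply followed by a sum over the feature axis keeps the output at one value per row, which is what the loss needs during training. The all-pairs product is the form that becomes useful later, when scoring one user against many movies.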
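Text convolutional network
The network architecture follows the figure in Yoon Kim's paper, Convolutional Neural Networks for Sentence Classification.
For more on applying convolutional neural networks to text, I recommend reading Understanding Convolutional Neural Networks for NLP.
The first layer of the network is the word embedding layer, an embedding matrix made up of the embedding vector of each word. The next layer convolves the embedding matrix with several kernels of different sizes (window sizes), where the window size is the number of words covered by each convolution. This differs from convolution on images, which usually uses sizes such as 2x2, 3x3, or 5x5. A text convolution has to cover the whole embedding vector of each word, so the kernel size is (number of words, embedding dimension), sliding over, say, 3, 4, or 5 words at a time. The third layer is max pooling, whose outputs are combined into one long vector. Finally, dropout is applied for regularization, which gives the feature of the movie title.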
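The convolution and max-pooling steps can be written out by hand in NumPy to make the shapes visible. This is a sketch with random weights and a hypothetical helper name, `text_conv_maxpool`; it is not the TensorFlow implementation used in this post, and it leaves out dropout.

```python
import numpy as np

def text_conv_maxpool(embedded, filters, bias):
    """Convolve over word windows, apply ReLU, then max-pool over positions.
    embedded: (seq_len, embed_dim); filters: (window, embed_dim, n_filters)."""
    window, _, n_filters = filters.shape
    n_pos = embedded.shape[0] - window + 1  # number of valid window positions
    conv = np.empty((n_pos, n_filters))
    for pos in range(n_pos):
        patch = embedded[pos:pos + window]  # (window, embed_dim): whole word vectors
        conv[pos] = np.tensordot(patch, filters, axes=([0, 1], [0, 1])) + bias
    relu = np.maximum(conv, 0.0)
    return relu.max(axis=0)                 # (n_filters,): max over positions

rng = np.random.default_rng(0)
seq_len, embed_dim, n_filters = 15, 32, 8
embedded_title = rng.uniform(-1, 1, size=(seq_len, embed_dim))

# One group of filters per window size; concatenate the pooled outputs.
title_features = np.concatenate([
    text_conv_maxpool(
        embedded_title,
        rng.normal(scale=0.1, size=(w, embed_dim, n_filters)),
        np.full(n_filters, 0.1),
    )
    for w in (2, 3, 4, 5)
])
```

With 4 window sizes and 8 filters each, the concatenated title feature has 32 values, regardless of how many window positions each size produced.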
Helper functions
import tensorflow as tf
import os
import pickle
def save_params(params):
    """
    Save the parameters to a file
    """
    pickle.dump(params, open('params.p', 'wb'))
def load_params():
    """
    Load the parameters from a file
    """
    return pickle.load(open('params.p', mode='rb'))
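Implementation
# Embedding dimension
embed_dim = 32
# Number of user IDs
uid_max = max(features.take(0,1)) + 1 # 6040 + 1 = 6041
# Number of genders
gender_max = max(features.take(2,1)) + 1 # 1 + 1 = 2
# Number of age categories
age_max = max(features.take(3,1)) + 1 # 6 + 1 = 7
# Number of occupations
job_max = max(features.take(4,1)) + 1 # 20 + 1 = 21
# Number of movie IDs
movie_id_max = max(features.take(1,1)) + 1 # 3952 + 1 = 3953
# Number of movie genres
movie_categories_max = max(genres2int.values()) + 1 # 18 + 1 = 19
# Number of words in movie titles
movie_title_max = len(title_set) # 5216
# Flag for combining the genre embedding vectors by summing; mean was considered but is not implemented
combiner = "sum"
# Movie title length
sentences_size = title_count # = 15
# Text convolution sliding windows, covering 2, 3, 4, and 5 words
window_sizes = {2, 3, 4, 5}
# Number of text convolution filters
filter_num = 8
# Dictionary from movie ID to row index; the two do not match in the dataset, e.g. the movie in row 5 does not necessarily have ID 5
movieid2idx = {val[0]:i for i, val in enumerate(movies.values)}
Hyperparameters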
# Number of Epochs
num_epochs = 5
# Batch Size
batch_size = 256
dropout_keep = 0.5
# Learning Rate
learning_rate = 0.0001
# Show stats for every n number of batches
show_every_n_batches = 20
save_dir = './save'
Inputs
Define the input placeholders
def get_inputs():
    uid = tf.placeholder(tf.int32, [None, 1], name="uid")
    user_gender = tf.placeholder(tf.int32, [None, 1], name="user_gender")
    user_age = tf.placeholder(tf.int32, [None, 1], name="user_age")
    user_job = tf.placeholder(tf.int32, [None, 1], name="user_job")
    movie_id = tf.placeholder(tf.int32, [None, 1], name="movie_id")
    movie_categories = tf.placeholder(tf.int32, [None, 18], name="movie_categories")
    movie_titles = tf.placeholder(tf.int32, [None, 15], name="movie_titles")
    targets = tf.placeholder(tf.int32, [None, 1], name="targets")
    LearningRate = tf.placeholder(tf.float32, name = "LearningRate")
    dropout_keep_prob = tf.placeholder(tf.float32, name = "dropout_keep_prob")
    return uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, LearningRate, dropout_keep_prob
Building the neural network
Define the user embedding matrices
def get_user_embedding(uid, user_gender, user_age, user_job):
    with tf.name_scope("user_embedding"):
        uid_embed_matrix = tf.Variable(tf.random_uniform([uid_max, embed_dim], -1, 1), name = "uid_embed_matrix")
        uid_embed_layer = tf.nn.embedding_lookup(uid_embed_matrix, uid, name = "uid_embed_layer")
        gender_embed_matrix = tf.Variable(tf.random_uniform([gender_max, embed_dim // 2], -1, 1), name= "gender_embed_matrix")
        gender_embed_layer = tf.nn.embedding_lookup(gender_embed_matrix, user_gender, name = "gender_embed_layer")
        age_embed_matrix = tf.Variable(tf.random_uniform([age_max, embed_dim // 2], -1, 1), name="age_embed_matrix")
        age_embed_layer = tf.nn.embedding_lookup(age_embed_matrix, user_age, name="age_embed_layer")
        job_embed_matrix = tf.Variable(tf.random_uniform([job_max, embed_dim // 2], -1, 1), name = "job_embed_matrix")
        job_embed_layer = tf.nn.embedding_lookup(job_embed_matrix, user_job, name = "job_embed_layer")
    return uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer
Pass the user embeddings through fully connected layers to produce the user feature
def get_user_feature_layer(uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer):
    with tf.name_scope("user_fc"):
        # First fully connected layer
        uid_fc_layer = tf.layers.dense(uid_embed_layer, embed_dim, name = "uid_fc_layer", activation=tf.nn.relu)
        gender_fc_layer = tf.layers.dense(gender_embed_layer, embed_dim, name = "gender_fc_layer", activation=tf.nn.relu)
        age_fc_layer = tf.layers.dense(age_embed_layer, embed_dim, name ="age_fc_layer", activation=tf.nn.relu)
        job_fc_layer = tf.layers.dense(job_embed_layer, embed_dim, name = "job_fc_layer", activation=tf.nn.relu)
        # Second fully connected layer
        user_combine_layer = tf.concat([uid_fc_layer, gender_fc_layer, age_fc_layer, job_fc_layer], 2)  #(?, 1, 128)
        user_combine_layer = tf.contrib.layers.fully_connected(user_combine_layer, 200, tf.tanh)  #(?, 1, 200)
        user_combine_layer_flat = tf.reshape(user_combine_layer, [-1, 200])
    return user_combine_layer, user_combine_layer_flat
Define the movie ID embedding matrix
def get_movie_id_embed_layer(movie_id):
    with tf.name_scope("movie_embedding"):
        movie_id_embed_matrix = tf.Variable(tf.random_uniform([movie_id_max, embed_dim], -1, 1), name = "movie_id_embed_matrix")
        movie_id_embed_layer = tf.nn.embedding_lookup(movie_id_embed_matrix, movie_id, name = "movie_id_embed_layer")
    return movie_id_embed_layer
Sum the embedding vectors of a movie's genres
def get_movie_categories_layers(movie_categories):
    with tf.name_scope("movie_categories_layers"):
        movie_categories_embed_matrix = tf.Variable(tf.random_uniform([movie_categories_max, embed_dim], -1, 1), name = "movie_categories_embed_matrix")
        movie_categories_embed_layer = tf.nn.embedding_lookup(movie_categories_embed_matrix, movie_categories, name = "movie_categories_embed_layer")
        if combiner == "sum":
            movie_categories_embed_layer = tf.reduce_sum(movie_categories_embed_layer, axis=1, keep_dims=True)
        # elif combiner == "mean":
    return movie_categories_embed_layer
Text convolutional network for the movie title
def get_movie_cnn_layer(movie_titles):
    # Look up the embedding vector of each word in the movie title
    with tf.name_scope("movie_embedding"):
        movie_title_embed_matrix = tf.Variable(tf.random_uniform([movie_title_max, embed_dim], -1, 1), name = "movie_title_embed_matrix")
        movie_title_embed_layer = tf.nn.embedding_lookup(movie_title_embed_matrix, movie_titles, name = "movie_title_embed_layer")
        movie_title_embed_layer_expand = tf.expand_dims(movie_title_embed_layer, -1)
    # Convolve the text embedding with kernels of different sizes, then max-pool
    pool_layer_lst = []
    for window_size in window_sizes:
        with tf.name_scope("movie_txt_conv_maxpool_{}".format(window_size)):
            filter_weights = tf.Variable(tf.truncated_normal([window_size, embed_dim, 1, filter_num],stddev=0.1),name = "filter_weights")
            filter_bias = tf.Variable(tf.constant(0.1, shape=[filter_num]), name="filter_bias")
            conv_layer = tf.nn.conv2d(movie_title_embed_layer_expand, filter_weights, [1,1,1,1], padding="VALID", name="conv_layer")
            relu_layer = tf.nn.relu(tf.nn.bias_add(conv_layer,filter_bias), name ="relu_layer")
            maxpool_layer = tf.nn.max_pool(relu_layer, [1,sentences_size - window_size + 1 ,1,1], [1,1,1,1], padding="VALID", name="maxpool_layer")
            pool_layer_lst.append(maxpool_layer)
    # Dropout layer (dropout_keep_prob is the placeholder created by get_inputs)
    with tf.name_scope("pool_dropout"):
        pool_layer = tf.concat(pool_layer_lst, 3, name ="pool_layer")
        max_num = len(window_sizes) * filter_num
        pool_layer_flat = tf.reshape(pool_layer , [-1, 1, max_num], name = "pool_layer_flat")
        dropout_layer = tf.nn.dropout(pool_layer_flat, dropout_keep_prob, name = "dropout_layer")
    return pool_layer_flat, dropout_layer
Pass the movie layers through fully connected layers together
def get_movie_feature_layer(movie_id_embed_layer, movie_categories_embed_layer, dropout_layer):
    with tf.name_scope("movie_fc"):
        # First fully connected layer
        movie_id_fc_layer = tf.layers.dense(movie_id_embed_layer, embed_dim, name = "movie_id_fc_layer", activation=tf.nn.relu)
        movie_categories_fc_layer = tf.layers.dense(movie_categories_embed_layer, embed_dim, name = "movie_categories_fc_layer", activation=tf.nn.relu)
        # Second fully connected layer
        movie_combine_layer = tf.concat([movie_id_fc_layer, movie_categories_fc_layer, dropout_layer], 2)  #(?, 1, 96)
        movie_combine_layer = tf.contrib.layers.fully_connected(movie_combine_layer, 200, tf.tanh)  #(?, 1, 200)
        movie_combine_layer_flat = tf.reshape(movie_combine_layer, [-1, 200])
    return movie_combine_layer, movie_combine_layer_flat
Building the computation graph
tf.reset_default_graph()
train_graph = tf.Graph()
with train_graph.as_default():
    # Get the input placeholders
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob = get_inputs()
    # Get the 4 user embedding vectors
    uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer = get_user_embedding(uid, user_gender, user_age, user_job)
    # Get the user feature
    user_combine_layer, user_combine_layer_flat = get_user_feature_layer(uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer)
    # Get the movie ID embedding vector
    movie_id_embed_layer = get_movie_id_embed_layer(movie_id)
    # Get the movie genre embedding vector
    movie_categories_embed_layer = get_movie_categories_layers(movie_categories)
    # Get the movie title feature vector
    pool_layer_flat, dropout_layer = get_movie_cnn_layer(movie_titles)
    # Get the movie feature
    movie_combine_layer, movie_combine_layer_flat = get_movie_feature_layer(movie_id_embed_layer,
                                                                            movie_categories_embed_layer,
                                                                            dropout_layer)
    # Compute the rating. Note that the two approaches give the inference tensor different names;
    # the recommendation code later fetches the tensor by name
    with tf.name_scope("inference"):
        # Approach: feed the user and movie features into a fully connected layer that outputs one value
        # inference_layer = tf.concat([user_combine_layer_flat, movie_combine_layer_flat], 1)  #(?, 400)
        # inference = tf.layers.dense(inference_layer, 1,
        #                             kernel_initializer=tf.truncated_normal_initializer(stddev=0.01),
        #                             kernel_regularizer=tf.nn.l2_loss, name="inference")
        # Approach: simply take the dot product of the user and movie features as the predicted rating
        # inference = tf.matmul(user_combine_layer_flat, tf.transpose(movie_combine_layer_flat))
        inference = tf.reduce_sum(user_combine_layer_flat * movie_combine_layer_flat, axis=1)
        inference = tf.expand_dims(inference, axis=1)
    with tf.name_scope("loss"):
        # MSE loss: regress the computed value onto the rating
        cost = tf.losses.mean_squared_error(targets, inference )
        loss = tf.reduce_mean(cost)
    # Optimize the loss
    # train_op = tf.train.AdamOptimizer(lr).minimize(loss)  #cost
    global_step = tf.Variable(0, name="global_step", trainable=False)
    optimizer = tf.train.AdamOptimizer(lr)
    gradients = optimizer.compute_gradients(loss)  #cost
    train_op = optimizer.apply_gradients(gradients, global_step=global_step)
WARNING:tensorflow:From <ipython-input-20-559a1ee9ce9e>:6: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
inference
<tf.Tensor 'inference/ExpandDims:0' shape=(?, 1) dtype=float32>
Getting batches
def get_batches(Xs, ys, batch_size):
    for start in range(0, len(Xs), batch_size):
        end = min(start + batch_size, len(Xs))
        yield Xs[start:end], ys[start:end]
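A quick check of how this generator behaves on toy data (the lists and the batch size of 4 are made up for illustration). The function is repeated so the snippet runs on its own.

```python
def get_batches(Xs, ys, batch_size):
    for start in range(0, len(Xs), batch_size):
        end = min(start + batch_size, len(Xs))
        yield Xs[start:end], ys[start:end]

Xs = list(range(10))
ys = [x * 2 for x in Xs]

# The generator yields a final, shorter batch when the data does not divide evenly.
batch_sizes = [len(x) for x, _ in get_batches(Xs, ys, batch_size=4)]

# A loop that runs len(Xs) // batch_size times only ever sees the full batches.
full_batches = len(Xs) // 4
```

Here `batch_sizes` is `[4, 4, 2]` and `full_batches` is 2. The training loop in this post iterates `len(train_X) // batch_size` times, so the short final batch is never consumed, which is what lets it reshape every batch to exactly `batch_size` rows.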
Training the network
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import time
import datetime
losses = {'train':[], 'test':[]}
with tf.Session(graph=train_graph) as sess:
    # Collect data for TensorBoard
    # Keep track of gradient values and sparsity
    grad_summaries = []
    for g, v in gradients:
        if g is not None:
            grad_hist_summary = tf.summary.histogram("{}/grad/hist".format(v.name.replace(':', '_')), g)
            sparsity_summary = tf.summary.scalar("{}/grad/sparsity".format(v.name.replace(':', '_')), tf.nn.zero_fraction(g))
            grad_summaries.append(grad_hist_summary)
            grad_summaries.append(sparsity_summary)
    grad_summaries_merged = tf.summary.merge(grad_summaries)
    # Output directory for models and summaries
    timestamp = str(int(time.time()))
    out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
    print("Writing to {}\n".format(out_dir))
    # Summaries for loss and accuracy
    loss_summary = tf.summary.scalar("loss", loss)
    # Train Summaries
    train_summary_op = tf.summary.merge([loss_summary, grad_summaries_merged])
    train_summary_dir = os.path.join(out_dir, "summaries", "train")
    train_summary_writer = tf.summary.FileWriter(train_summary_dir, sess.graph)
    # Inference summaries
    inference_summary_op = tf.summary.merge([loss_summary])
    inference_summary_dir = os.path.join(out_dir, "summaries", "inference")
    inference_summary_writer = tf.summary.FileWriter(inference_summary_dir, sess.graph)
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    for epoch_i in range(num_epochs):
        # Split the dataset into training and test sets; random_state is fixed at 0, so every epoch uses the same split
        train_X,test_X, train_y, test_y = train_test_split(features,
                                                           targets_values,
                                                           test_size = 0.2,
                                                           random_state = 0)
        train_batches = get_batches(train_X, train_y, batch_size)
        test_batches = get_batches(test_X, test_y, batch_size)
        # Training iterations; record the training loss
        for batch_i in range(len(train_X) // batch_size):
            x, y = next(train_batches)
            # Column 6 holds the padded genre lists, column 5 the padded title lists
            categories = np.zeros([batch_size, 18])
            for i in range(batch_size):
                categories[i] = x.take(6,1)[i]
            titles = np.zeros([batch_size, sentences_size])
            for i in range(batch_size):
                titles[i] = x.take(5,1)[i]
            feed = {
                uid: np.reshape(x.take(0,1), [batch_size, 1]),
                user_gender: np.reshape(x.take(2,1), [batch_size, 1]),
                user_age: np.reshape(x.take(3,1), [batch_size, 1]),
                user_job: np.reshape(x.take(4,1), [batch_size, 1]),
                movie_id: np.reshape(x.take(1,1), [batch_size, 1]),
                movie_categories: categories,  #x.take(6,1)
                movie_titles: titles,  #x.take(5,1)
                targets: np.reshape(y, [batch_size, 1]),
                dropout_keep_prob: dropout_keep,  #dropout_keep
                lr: learning_rate}
            step, train_loss, summaries, _ = sess.run([global_step, loss, train_summary_op, train_op], feed)  #cost
            losses['train'].append(train_loss)
            train_summary_writer.add_summary(summaries, step)  #
            # Show every <show_every_n_batches> batches
            if (epoch_i * (len(train_X) // batch_size) + batch_i) % show_every_n_batches == 0:
                time_str = datetime.datetime.now().isoformat()
                print('{}: Epoch {:>3} Batch {:>4}/{}   train_loss = {:.3f}'.format(
                    time_str,
                    epoch_i,
                    batch_i,
                    (len(train_X) // batch_size),
                    train_loss))
        # Iterations over the test data
        for batch_i in range(len(test_X) // batch_size):
            x, y = next(test_batches)
            categories = np.zeros([batch_size, 18])
            for i in range(batch_size):
                categories[i] = x.take(6,1)[i]
            titles = np.zeros([batch_size, sentences_size])
            for i in range(batch_size):
                titles[i] = x.take(5,1)[i]
            feed = {
                uid: np.reshape(x.take(0,1), [batch_size, 1]),
                user_gender: np.reshape(x.take(2,1), [batch_size, 1]),
                user_age: np.reshape(x.take(3,1), [batch_size, 1]),
                user_job: np.reshape(x.take(4,1), [batch_size, 1]),
                movie_id: np.reshape(x.take(1,1), [batch_size, 1]),
                movie_categories: categories,  #x.take(6,1)
                movie_titles: titles,  #x.take(5,1)
                targets: np.reshape(y, [batch_size, 1]),
                dropout_keep_prob: 1,
                lr: learning_rate}
            step, test_loss, summaries = sess.run([global_step, loss, inference_summary_op], feed)  #cost
            # Record the test loss
            losses['test'].append(test_loss)
            inference_summary_writer.add_summary(summaries, step)  #
            time_str = datetime.datetime.now().isoformat()
            if (epoch_i * (len(test_X) // batch_size) + batch_i) % show_every_n_batches == 0:
                print('{}: Epoch {:>3} Batch {:>4}/{}   test_loss = {:.3f}'.format(
                    time_str,
                    epoch_i,
                    batch_i,
                    (len(test_X) // batch_size),
                    test_loss))
    # Save Model
    saver.save(sess, save_dir)  #, global_step=epoch_i
    print('Model Trained and Saved')
Writing to F:jupyterworkmovie_recommender-master\runs\1554780412
2019-04-09T11:26:53.633627: Epoch 0 Batch 0/3125 train_loss = 8.810
2019-04-09T11:26:54.052240: Epoch 0 Batch 20/3125 train_loss = 3.457
2019-04-09T11:26:54.466181: Epoch 0 Batch 40/3125 train_loss = 2.563
2019-04-09T11:26:54.890814: Epoch 0 Batch 60/3125 train_loss = 1.962
2019-04-09T11:26:55.315803: Epoch 0 Batch 80/3125 train_loss = 1.852
2019-04-09T11:26:55.730125: Epoch 0 Batch 100/3125 train_loss = 1.826
2019-04-09T11:26:56.146734: Epoch 0 Batch 120/3125 train_loss = 1.781
2019-04-09T11:26:56.559145: Epoch 0 Batch 140/3125 train_loss = 1.630
2019-04-09T11:26:56.971689: Epoch 0 Batch 160/3125 train_loss = 1.652
2019-04-09T11:26:57.394125: Epoch 0 Batch 180/3125 train_loss = 1.361
2019-04-09T11:26:57.810824: Epoch 0 Batch 200/3125 train_loss = 1.715
2019-04-09T11:26:58.227455: Epoch 0 Batch 220/3125 train_loss = 1.430
2019-04-09T11:26:58.643714: Epoch 0 Batch 240/3125 train_loss = 1.342
2019-04-09T11:26:59.056816: Epoch 0 Batch 260/3125 train_loss = 1.512
2019-04-09T11:26:59.468409: Epoch 0 Batch 280/3125 train_loss = 1.678
2019-04-09T11:26:59.882126: Epoch 0 Batch 300/3125 train_loss = 1.482
2019-04-09T11:27:00.294685: Epoch 0 Batch 320/3125 train_loss = 1.463
2019-04-09T11:27:00.826546: Epoch 0 Batch 340/3125 train_loss = 1.333
2019-04-09T11:27:01.239302: Epoch 0 Batch 360/3125 train_loss = 1.318
2019-04-09T11:27:01.652219: Epoch 0 Batch 380/3125 train_loss = 1.253
2019-04-09T11:27:02.067588: Epoch 0 Batch 400/3125 train_loss = 1.155
2019-04-09T11:27:02.483490: Epoch 0 Batch 420/3125 train_loss = 1.341
2019-04-09T11:27:02.892079: Epoch 0 Batch 440/3125 train_loss = 1.429
2019-04-09T11:27:03.305331: Epoch 0 Batch 460/3125 train_loss = 1.315
2019-04-09T11:27:03.721028: Epoch 0 Batch 480/3125 train_loss = 1.351
2019-04-09T11:27:04.130622: Epoch 0 Batch 500/3125 train_loss = 1.043
2019-04-09T11:27:04.549775: Epoch 0 Batch 520/3125 train_loss = 1.340
2019-04-09T11:27:04.963936: Epoch 0 Batch 540/3125 train_loss = 1.258
2019-04-09T11:27:05.378772: Epoch 0 Batch 560/3125 train_loss = 1.474
2019-04-09T11:27:05.790245: Epoch 0 Batch 580/3125 train_loss = 1.399
2019-04-09T11:27:06.202342: Epoch 0 Batch 600/3125 train_loss = 1.374
2019-04-09T11:27:06.616239: Epoch 0 Batch 620/3125 train_loss = 1.429
2019-04-09T11:27:07.027259: Epoch 0 Batch 640/3125 train_loss = 1.346
2019-04-09T11:27:07.443480: Epoch 0 Batch 660/3125 train_loss = 1.377
2019-04-09T11:27:07.857450: Epoch 0 Batch 680/3125 train_loss = 1.191
2019-04-09T11:27:08.269326: Epoch 0 Batch 700/3125 train_loss = 1.302
2019-04-09T11:27:08.685203: Epoch 0 Batch 720/3125 train_loss = 1.171
2019-04-09T11:27:09.098769: Epoch 0 Batch 740/3125 train_loss = 1.403
2019-04-09T11:27:09.519383: Epoch 0 Batch 760/3125 train_loss = 1.369
2019-04-09T11:27:09.931100: Epoch 0 Batch 780/3125 train_loss = 1.402
2019-04-09T11:27:10.343018: Epoch 0 Batch 800/3125 train_loss = 1.250
2019-04-09T11:27:10.755994: Epoch 0 Batch 820/3125 train_loss = 1.292
2019-04-09T11:27:11.169596: Epoch 0 Batch 840/3125 train_loss = 1.215
2019-04-09T11:27:11.583017: Epoch 0 Batch 860/3125 train_loss = 1.201
2019-04-09T11:27:11.997121: Epoch 0 Batch 880/3125 train_loss = 1.189
2019-04-09T11:27:12.411392: Epoch 0 Batch 900/3125 train_loss = 1.240
2019-04-09T11:27:12.824492: Epoch 0 Batch 920/3125 train_loss = 1.220
2019-04-09T11:27:13.238173: Epoch 0 Batch 940/3125 train_loss = 1.414
2019-04-09T11:27:13.649014: Epoch 0 Batch 960/3125 train_loss = 1.332
2019-04-09T11:27:14.058947: Epoch 0 Batch 980/3125 train_loss = 1.345
2019-04-09T11:27:14.491861: Epoch 0 Batch 1000/3125 train_loss = 1.275
2019-04-09T11:27:14.920000: Epoch 0 Batch 1020/3125 train_loss = 1.341
2019-04-09T11:27:15.337096: Epoch 0 Batch 1040/3125 train_loss = 1.281
2019-04-09T11:27:15.760618: Epoch 0 Batch 1060/3125 train_loss = 1.478
2019-04-09T11:27:16.174406: Epoch 0 Batch 1080/3125 train_loss = 1.158
2019-04-09T11:27:16.591839: Epoch 0 Batch 1100/3125 train_loss = 1.268
2019-04-09T11:27:17.013498: Epoch 0 Batch 1120/3125 train_loss = 1.270
2019-04-09T11:27:17.438626: Epoch 0 Batch 1140/3125 train_loss = 1.280
2019-04-09T11:27:17.852226: Epoch 0 Batch 1160/3125 train_loss = 1.205
2019-04-09T11:27:18.273478: Epoch 0 Batch 1180/3125 train_loss = 1.274
2019-04-09T11:27:18.696339: Epoch 0 Batch 1200/3125 train_loss = 1.284
2019-04-09T11:27:19.117179: Epoch 0 Batch 1220/3125 train_loss = 1.155
2019-04-09T11:27:19.524543: Epoch 0 Batch 1240/3125 train_loss = 1.143
2019-04-09T11:27:19.938738: Epoch 0 Batch 1260/3125 train_loss = 1.247
2019-04-09T11:27:20.350656: Epoch 0 Batch 1280/3125 train_loss = 1.223
2019-04-09T11:27:20.761388: Epoch 0 Batch 1300/3125 train_loss = 1.267
2019-04-09T11:27:21.177496: Epoch 0 Batch 1320/3125 train_loss = 1.183
2019-04-09T11:27:21.590091: Epoch 0 Batch 1340/3125 train_loss = 1.047
2019-04-09T11:27:22.004788: Epoch 0 Batch 1360/3125 train_loss = 1.149
2019-04-09T11:27:22.414416: Epoch 0 Batch 1380/3125 train_loss = 1.114
2019-04-09T11:27:22.827015: Epoch 0 Batch 1400/3125 train_loss = 1.282
2019-04-09T11:27:23.236719: Epoch 0 Batch 1420/3125 train_loss = 1.256
2019-04-09T11:27:23.645758: Epoch 0 Batch 1440/3125 train_loss = 1.174
2019-04-09T11:27:24.063386: Epoch 0 Batch 1460/3125 train_loss = 1.251
2019-04-09T11:27:24.477184: Epoch 0 Batch 1480/3125 train_loss = 1.180
2019-04-09T11:27:24.890286: Epoch 0 Batch 1500/3125 train_loss = 1.322
2019-04-09T11:27:25.300422: Epoch 0 Batch 1520/3125 train_loss = 1.277
2019-04-09T11:27:25.709640: Epoch 0 Batch 1540/3125 train_loss = 1.270
2019-04-09T11:27:26.122241: Epoch 0 Batch 1560/3125 train_loss = 1.122
2019-04-09T11:27:26.534862: Epoch 0 Batch 1580/3125 train_loss = 1.138
2019-04-09T11:27:26.947461: Epoch 0 Batch 1600/3125 train_loss = 1.274
2019-04-09T11:27:27.359900: Epoch 0 Batch 1620/3125 train_loss = 1.169
2019-04-09T11:27:27.769969: Epoch 0 Batch 1640/3125 train_loss = 1.235
2019-04-09T11:27:28.180519: Epoch 0 Batch 1660/3125 train_loss = 1.282
2019-04-09T11:27:28.592653: Epoch 0 Batch 1680/3125 train_loss = 1.174
2019-04-09T11:27:29.003519: Epoch 0 Batch 1700/3125 train_loss = 1.009
2019-04-09T11:27:29.414262: Epoch 0 Batch 1720/3125 train_loss = 1.149
2019-04-09T11:27:29.828869: Epoch 0 Batch 1740/3125 train_loss = 1.221
2019-04-09T11:27:30.238773: Epoch 0 Batch 1760/3125 train_loss = 1.288
2019-04-09T11:27:30.648342: Epoch 0 Batch 1780/3125 train_loss = 1.067
2019-04-09T11:27:31.188925: Epoch 0 Batch 1800/3125 train_loss = 1.196
2019-04-09T11:27:31.603231: Epoch 0 Batch 1820/3125 train_loss = 1.142
2019-04-09T11:27:32.010926: Epoch 0 Batch 1840/3125 train_loss = 1.256
2019-04-09T11:27:32.425741: Epoch 0 Batch 1860/3125 train_loss = 1.345
2019-04-09T11:27:32.839345: Epoch 0 Batch 1880/3125 train_loss = 1.215
2019-04-09T11:27:33.248900: Epoch 0 Batch 1900/3125 train_loss = 1.048
2019-04-09T11:27:33.663116: Epoch 0 Batch 1920/3125 train_loss = 1.211
2019-04-09T11:27:34.074400: Epoch 0 Batch 1940/3125 train_loss = 1.070
2019-04-09T11:27:34.484302: Epoch 0 Batch 1960/3125 train_loss = 1.131
2019-04-09T11:27:34.894396: Epoch 0 Batch 1980/3125 train_loss = 1.196
2019-04-09T11:27:35.306864: Epoch 0 Batch 2000/3125 train_loss = 1.347
2019-04-09T11:27:35.722043: Epoch 0 Batch 2020/3125 train_loss = 1.297
2019-04-09T11:27:36.135143: Epoch 0 Batch 2040/3125 train_loss = 1.180
2019-04-09T11:27:36.543475: Epoch 0 Batch 2060/3125 train_loss = 1.025
2019-04-09T11:27:36.953066: Epoch 0 Batch 2080/3125 train_loss = 1.265
2019-04-09T11:27:37.370478: Epoch 0 Batch 2100/3125 train_loss = 1.094
2019-04-09T11:27:37.782974: Epoch 0 Batch 2120/3125 train_loss = 1.069
2019-04-09T11:27:38.190560: Epoch 0 Batch 2140/3125 train_loss = 1.132
2019-04-09T11:27:38.604746: Epoch 0 Batch 2160/3125 train_loss = 1.122
2019-04-09T11:27:39.019245: Epoch 0 Batch 2180/3125 train_loss = 1.166
2019-04-09T11:27:39.431946: Epoch 0 Batch 2200/3125 train_loss = 1.137
2019-04-09T11:27:39.847258: Epoch 0 Batch 2220/3125 train_loss = 1.118
2019-04-09T11:27:40.256398: Epoch 0 Batch 2240/3125 train_loss = 1.011
2019-04-09T11:27:40.665478: Epoch 0 Batch 2260/3125 train_loss = 1.160
2019-04-09T11:27:41.078758: Epoch 0 Batch 2280/3125 train_loss = 1.164
2019-04-09T11:27:41.489744: Epoch 0 Batch 2300/3125 train_loss = 1.163
2019-04-09T11:27:41.901845: Epoch 0 Batch 2320/3125 train_loss = 1.288
2019-04-09T11:27:42.312713: Epoch 0 Batch 2340/3125 train_loss = 1.177
2019-04-09T11:27:42.725320: Epoch 0 Batch 2360/3125 train_loss = 1.130
2019-04-09T11:27:43.132848: Epoch 0 Batch 2380/3125 train_loss = 1.163
2019-04-09T11:27:43.541373: Epoch 0 Batch 2400/3125 train_loss = 1.231
2019-04-09T11:27:43.947189: Epoch 0 Batch 2420/3125 train_loss = 1.133
2019-04-09T11:27:44.355782: Epoch 0 Batch 2440/3125 train_loss = 1.272
2019-04-09T11:27:44.768420: Epoch 0 Batch 2460/3125 train_loss = 1.128
2019-04-09T11:27:45.177740: Epoch 0 Batch 2480/3125 train_loss = 1.184
2019-04-09T11:27:45.584471: Epoch 0 Batch 2500/3125 train_loss = 1.161
2019-04-09T11:27:45.993960: Epoch 0 Batch 2520/3125 train_loss = 1.055
2019-04-09T11:27:46.402164: Epoch 0 Batch 2540/3125 train_loss = 1.108
2019-04-09T11:27:46.812056: Epoch 0 Batch 2560/3125 train_loss = 0.977
2019-04-09T11:27:47.230169: Epoch 0 Batch 2580/3125 train_loss = 1.101
2019-04-09T11:27:47.639261: Epoch 0 Batch 2600/3125 train_loss = 1.141
2019-04-09T11:27:48.047294: Epoch 0 Batch 2620/3125 train_loss = 1.098
2019-04-09T11:27:48.457188: Epoch 0 Batch 2640/3125 train_loss = 1.096
2019-04-09T11:27:48.870683: Epoch 0 Batch 2660/3125 train_loss = 1.241
2019-04-09T11:27:49.282413: Epoch 0 Batch 2680/3125 train_loss = 1.001
2019-04-09T11:27:49.690957: Epoch 0 Batch 2700/3125 train_loss = 1.266
2019-04-09T11:27:50.103555: Epoch 0 Batch 2720/3125 train_loss = 1.158
2019-04-09T11:27:50.514897: Epoch 0 Batch 2740/3125 train_loss = 1.210
2019-04-09T11:27:50.924909: Epoch 0 Batch 2760/3125 train_loss = 1.234
2019-04-09T11:27:51.336251: Epoch 0 Batch 2780/3125 train_loss = 1.121
2019-04-09T11:27:51.748175: Epoch 0 Batch 2800/3125 train_loss = 1.377
2019-04-09T11:27:52.164028: Epoch 0 Batch 2820/3125 train_loss = 1.417
2019-04-09T11:27:52.583020: Epoch 0 Batch 2840/3125 train_loss = 1.146
2019-04-09T11:27:53.001214: Epoch 0 Batch 2860/3125 train_loss = 1.067
2019-04-09T11:27:53.413084: Epoch 0 Batch 2880/3125 train_loss = 1.160
2019-04-09T11:27:53.830194: Epoch 0 Batch 2900/3125 train_loss = 1.134
2019-04-09T11:27:54.242290: Epoch 0 Batch 2920/3125 train_loss = 1.188
2019-04-09T11:27:54.657395: Epoch 0 Batch 2940/3125 train_loss = 1.103
2019-04-09T11:27:55.066253: Epoch 0 Batch 2960/3125 train_loss = 1.222
2019-04-09T11:27:55.476481: Epoch 0 Batch 2980/3125 train_loss = 1.197
2019-04-09T11:27:55.891054: Epoch 0 Batch 3000/3125 train_loss = 1.123
2019-04-09T11:27:56.299092: Epoch 0 Batch 3020/3125 train_loss = 1.213
2019-04-09T11:27:56.709737: Epoch 0 Batch 3040/3125 train_loss = 1.128
2019-04-09T11:27:57.121834: Epoch 0 Batch 3060/3125 train_loss = 1.174
2019-04-09T11:27:57.537893: Epoch 0 Batch 3080/3125 train_loss = 1.253
2019-04-09T11:27:57.945981: Epoch 0 Batch 3100/3125 train_loss = 1.169
2019-04-09T11:27:58.355315: Epoch 0 Batch 3120/3125 train_loss = 1.011
2019-04-09T11:27:58.525868: Epoch 0 Batch 0/781 test_loss = 1.003
2019-04-09T11:27:58.655211: Epoch 0 Batch 20/781 test_loss = 1.118
2019-04-09T11:27:58.785057: Epoch 0 Batch 40/781 test_loss = 0.975
2019-04-09T11:27:58.914903: Epoch 0 Batch 60/781 test_loss = 1.317
2019-04-09T11:27:59.043746: Epoch 0 Batch 80/781 test_loss = 1.261
2019-04-09T11:27:59.172589: Epoch 0 Batch 100/781 test_loss = 1.333
2019-04-09T11:27:59.301431: Epoch 0 Batch 120/781 test_loss = 1.186
2019-04-09T11:27:59.429434: Epoch 0 Batch 140/781 test_loss = 1.192
2019-04-09T11:27:59.557775: Epoch 0 Batch 160/781 test_loss = 1.259
2019-04-09T11:27:59.685114: Epoch 0 Batch 180/781 test_loss = 1.189
2019-04-09T11:27:59.813455: Epoch 0 Batch 200/781 test_loss = 1.093
2019-04-09T11:27:59.939791: Epoch 0 Batch 220/781 test_loss = 0.963
2019-04-09T11:28:00.066629: Epoch 0 Batch 240/781 test_loss = 1.173
2019-04-09T11:28:00.194468: Epoch 0 Batch 260/781 test_loss = 1.160
2019-04-09T11:28:00.321306: Epoch 0 Batch 280/781 test_loss = 1.354
2019-04-09T11:28:00.448551: Epoch 0 Batch 300/781 test_loss = 1.140
2019-04-09T11:28:00.576892: Epoch 0 Batch 320/781 test_loss = 1.270
2019-04-09T11:28:00.705735: Epoch 0 Batch 340/781 test_loss = 0.836
2019-04-09T11:28:00.832572: Epoch 0 Batch 360/781 test_loss = 1.297
2019-04-09T11:28:00.961415: Epoch 0 Batch 380/781 test_loss = 1.141
2019-04-09T11:28:01.090257: Epoch 0 Batch 400/781 test_loss = 1.135
2019-04-09T11:28:01.217095: Epoch 0 Batch 420/781 test_loss = 0.986
2019-04-09T11:28:01.344936: Epoch 0 Batch 440/781 test_loss = 1.153
2019-04-09T11:28:01.472184: Epoch 0 Batch 460/781 test_loss = 1.084
2019-04-09T11:28:01.599021: Epoch 0 Batch 480/781 test_loss = 1.101
2019-04-09T11:28:01.726862: Epoch 0 Batch 500/781 test_loss = 0.917
2019-04-09T11:28:01.854702: Epoch 0 Batch 520/781 test_loss = 1.127
2019-04-09T11:28:01.980536: Epoch 0 Batch 540/781 test_loss = 1.025
2019-04-09T11:28:02.108377: Epoch 0 Batch 560/781 test_loss = 1.267
2019-04-09T11:28:02.235214: Epoch 0 Batch 580/781 test_loss = 1.131
2019-04-09T11:28:02.362552: Epoch 0 Batch 600/781 test_loss = 1.179
2019-04-09T11:28:02.490387: Epoch 0 Batch 620/781 test_loss = 1.140
2019-04-09T11:28:02.617224: Epoch 0 Batch 640/781 test_loss = 1.194
2019-04-09T11:28:02.744563: Epoch 0 Batch 660/781 test_loss = 1.135
2019-04-09T11:28:02.875411: Epoch 0 Batch 680/781 test_loss = 1.403
2019-04-09T11:28:03.002248: Epoch 0 Batch 700/781 test_loss = 1.109
2019-04-09T11:28:03.130089: Epoch 0 Batch 720/781 test_loss = 1.243
2019-04-09T11:28:03.256926: Epoch 0 Batch 740/781 test_loss = 1.118
2019-04-09T11:28:03.383769: Epoch 0 Batch 760/781 test_loss = 1.098
2019-04-09T11:28:03.510695: Epoch 0 Batch 780/781 test_loss = 1.155
2019-04-09T11:28:04.289124: Epoch 1 Batch 15/3125 train_loss = 1.266
2019-04-09T11:28:04.711410: Epoch 1 Batch 35/3125 train_loss = 1.142
2019-04-09T11:28:05.124010: Epoch 1 Batch 55/3125 train_loss = 1.165
2019-04-09T11:28:05.539135: Epoch 1 Batch 75/3125 train_loss = 1.079
2019-04-09T11:28:05.955033: Epoch 1 Batch 95/3125 train_loss = 0.929
2019-04-09T11:28:06.374924: Epoch 1 Batch 115/3125 train_loss = 1.166
2019-04-09T11:28:06.784549: Epoch 1 Batch 135/3125 train_loss = 1.015
2019-04-09T11:28:07.202663: Epoch 1 Batch 155/3125 train_loss = 1.129
2019-04-09T11:28:07.622296: Epoch 1 Batch 175/3125 train_loss = 1.051
2019-04-09T11:28:08.044004: Epoch 1 Batch 195/3125 train_loss = 1.215
2019-04-09T11:28:08.464873: Epoch 1 Batch 215/3125 train_loss = 1.127
2019-04-09T11:28:08.882758: Epoch 1 Batch 235/3125 train_loss = 1.092
2019-04-09T11:28:09.302399: Epoch 1 Batch 255/3125 train_loss = 1.211
2019-04-09T11:28:09.718143: Epoch 1 Batch 275/3125 train_loss = 1.005
2019-04-09T11:28:10.135755: Epoch 1 Batch 295/3125 train_loss = 0.973
2019-04-09T11:28:10.556105: Epoch 1 Batch 315/3125 train_loss = 1.039
2019-04-09T11:28:10.968219: Epoch 1 Batch 335/3125 train_loss = 0.990
2019-04-09T11:28:11.382497: Epoch 1 Batch 355/3125 train_loss = 1.110
2019-04-09T11:28:11.792475: Epoch 1 Batch 375/3125 train_loss = 1.187
2019-04-09T11:28:12.203571: Epoch 1 Batch 395/3125 train_loss = 1.056
2019-04-09T11:28:12.616848: Epoch 1 Batch 415/3125 train_loss = 1.314
2019-04-09T11:28:13.031510: Epoch 1 Batch 435/3125 train_loss = 1.136
2019-04-09T11:28:13.442848: Epoch 1 Batch 455/3125 train_loss = 1.054
2019-04-09T11:28:13.860246: Epoch 1 Batch 475/3125 train_loss = 1.144
2019-04-09T11:28:14.274154: Epoch 1 Batch 495/3125 train_loss = 1.056
2019-04-09T11:28:14.692507: Epoch 1 Batch 515/3125 train_loss = 1.161
2019-04-09T11:28:15.109092: Epoch 1 Batch 535/3125 train_loss = 1.140
2019-04-09T11:28:15.524725: Epoch 1 Batch 555/3125 train_loss = 1.257
2019-04-09T11:28:15.938088: Epoch 1 Batch 575/3125 train_loss = 1.070
2019-04-09T11:28:16.350862: Epoch 1 Batch 595/3125 train_loss = 1.285
2019-04-09T11:28:16.761759: Epoch 1 Batch 615/3125 train_loss = 1.101
2019-04-09T11:28:17.182378: Epoch 1 Batch 635/3125 train_loss = 1.138
2019-04-09T11:28:17.599235: Epoch 1 Batch 655/3125 train_loss = 1.057
2019-04-09T11:28:18.019362: Epoch 1 Batch 675/3125 train_loss = 0.876
2019-04-09T11:28:18.438108: Epoch 1 Batch 695/3125 train_loss = 1.045
2019-04-09T11:28:18.849900: Epoch 1 Batch 715/3125 train_loss = 1.098
2019-04-09T11:28:19.261195: Epoch 1 Batch 735/3125 train_loss = 0.914
2019-04-09T11:28:19.812365: Epoch 1 Batch 755/3125 train_loss = 1.162
2019-04-09T11:28:20.222217: Epoch 1 Batch 775/3125 train_loss = 0.998
2019-04-09T11:28:20.645987: Epoch 1 Batch 795/3125 train_loss = 1.218
2019-04-09T11:28:21.064302: Epoch 1 Batch 815/3125 train_loss = 1.102
2019-04-09T11:28:21.482799: Epoch 1 Batch 835/3125 train_loss = 1.071
2019-04-09T11:28:21.907954: Epoch 1 Batch 855/3125 train_loss = 1.297
2019-04-09T11:28:22.327483: Epoch 1 Batch 875/3125 train_loss = 1.248
2019-04-09T11:28:22.741550: Epoch 1 Batch 895/3125 train_loss = 1.080
2019-04-09T11:28:23.157659: Epoch 1 Batch 915/3125 train_loss = 1.059
2019-04-09T11:28:23.571202: Epoch 1 Batch 935/3125 train_loss = 1.163
2019-04-09T11:28:23.984586: Epoch 1 Batch 955/3125 train_loss = 1.102
2019-04-09T11:28:24.396511: Epoch 1 Batch 975/3125 train_loss = 1.100
2019-04-09T11:28:24.824835: Epoch 1 Batch 995/3125 train_loss = 0.890
2019-04-09T11:28:25.242948: Epoch 1 Batch 1015/3125 train_loss = 1.077
2019-04-09T11:28:25.659444: Epoch 1 Batch 1035/3125 train_loss = 1.090
2019-04-09T11:28:26.076601: Epoch 1 Batch 1055/3125 train_loss = 1.154
2019-04-09T11:28:26.489531: Epoch 1 Batch 1075/3125 train_loss = 1.004
2019-04-09T11:28:26.897455: Epoch 1 Batch 1095/3125 train_loss = 1.012
2019-04-09T11:28:27.320553: Epoch 1 Batch 1115/3125 train_loss = 1.165
2019-04-09T11:28:27.739517: Epoch 1 Batch 1135/3125 train_loss = 1.029
2019-04-09T11:28:28.156628: Epoch 1 Batch 1155/3125 train_loss = 1.117
2019-04-09T11:28:28.570595: Epoch 1 Batch 1175/3125 train_loss = 1.103
2019-04-09T11:28:28.980586: Epoch 1 Batch 1195/3125 train_loss = 1.250
2019-04-09T11:28:29.393619: Epoch 1 Batch 1215/3125 train_loss = 0.930
2019-04-09T11:28:29.809238: Epoch 1 Batch 1235/3125 train_loss = 1.077
2019-04-09T11:28:30.219331: Epoch 1 Batch 1255/3125 train_loss = 1.089
2019-04-09T11:28:30.627580: Epoch 1 Batch 1275/3125 train_loss = 1.000
2019-04-09T11:28:31.035136: Epoch 1 Batch 1295/3125 train_loss = 1.006
2019-04-09T11:28:31.448626: Epoch 1 Batch 1315/3125 train_loss = 1.210
2019-04-09T11:28:31.948769: Epoch 1 Batch 1335/3125 train_loss = 1.045
2019-04-09T11:28:32.356933: Epoch 1 Batch 1355/3125 train_loss = 1.058
2019-04-09T11:28:32.771030: Epoch 1 Batch 1375/3125 train_loss = 1.110
2019-04-09T11:28:33.184133: Epoch 1 Batch 1395/3125 train_loss = 1.008
2019-04-09T11:28:33.596132: Epoch 1 Batch 1415/3125 train_loss = 1.086
2019-04-09T11:28:34.007114: Epoch 1 Batch 1435/3125 train_loss = 1.221
2019-04-09T11:28:34.419967: Epoch 1 Batch 1455/3125 train_loss = 1.241
2019-04-09T11:28:34.829988: Epoch 1 Batch 1475/3125 train_loss = 1.154
2019-04-09T11:28:35.241458: Epoch 1 Batch 1495/3125 train_loss = 1.102
2019-04-09T11:28:35.650228: Epoch 1 Batch 1515/3125 train_loss = 0.990
2019-04-09T11:28:36.060708: Epoch 1 Batch 1535/3125 train_loss = 0.907
2019-04-09T11:28:36.472293: Epoch 1 Batch 1555/3125 train_loss = 1.079
2019-04-09T11:28:36.880701: Epoch 1 Batch 1575/3125 train_loss = 0.986
2019-04-09T11:28:37.298235: Epoch 1 Batch 1595/3125 train_loss = 1.052
2019-04-09T11:28:37.710706: Epoch 1 Batch 1615/3125 train_loss = 1.025
2019-04-09T11:28:38.118793: Epoch 1 Batch 1635/3125 train_loss = 1.146
2019-04-09T11:28:38.533452: Epoch 1 Batch 1655/3125 train_loss = 1.123
2019-04-09T11:28:38.948779: Epoch 1 Batch 1675/3125 train_loss = 0.976
2019-04-09T11:28:39.359489: Epoch 1 Batch 1695/3125 train_loss = 1.035
2019-04-09T11:28:39.766989: Epoch 1 Batch 1715/3125 train_loss = 0.945
2019-04-09T11:28:40.179589: Epoch 1 Batch 1735/3125 train_loss = 1.174
2019-04-09T11:28:40.590375: Epoch 1 Batch 1755/3125 train_loss = 1.027
2019-04-09T11:28:40.998865: Epoch 1 Batch 1775/3125 train_loss = 1.026
2019-04-09T11:28:41.408017: Epoch 1 Batch 1795/3125 train_loss = 0.981
2019-04-09T11:28:41.821620: Epoch 1 Batch 1815/3125 train_loss = 0.966
2019-04-09T11:28:42.229169: Epoch 1 Batch 1835/3125 train_loss = 1.074
2019-04-09T11:28:42.642918: Epoch 1 Batch 1855/3125 train_loss = 0.959
2019-04-09T11:28:43.154530: Epoch 1 Batch 1875/3125 train_loss = 1.213
2019-04-09T11:28:43.560385: Epoch 1 Batch 1895/3125 train_loss = 0.935
2019-04-09T11:28:43.974210: Epoch 1 Batch 1915/3125 train_loss = 0.973
2019-04-09T11:28:44.393618: Epoch 1 Batch 1935/3125 train_loss = 1.016
2019-04-09T11:28:44.808725: Epoch 1 Batch 1955/3125 train_loss = 1.006
2019-04-09T11:28:45.224542: Epoch 1 Batch 1975/3125 train_loss = 1.036
2019-04-09T11:28:45.638372: Epoch 1 Batch 1995/3125 train_loss = 1.130
2019-04-09T11:28:46.050876: Epoch 1 Batch 2015/3125 train_loss = 1.092
2019-04-09T11:28:46.466638: Epoch 1 Batch 2035/3125 train_loss = 1.163
2019-04-09T11:28:46.877782: Epoch 1 Batch 2055/3125 train_loss = 0.961
2019-04-09T11:28:47.297977: Epoch 1 Batch 2075/3125 train_loss = 1.154
2019-04-09T11:28:47.707362: Epoch 1 Batch 2095/3125 train_loss = 1.007
2019-04-09T11:28:48.119961: Epoch 1 Batch 2115/3125 train_loss = 1.150
2019-04-09T11:28:48.536958: Epoch 1 Batch 2135/3125 train_loss = 1.026
2019-04-09T11:28:48.955579: Epoch 1 Batch 2155/3125 train_loss = 1.008
2019-04-09T11:28:49.371992: Epoch 1 Batch 2175/3125 train_loss = 1.028
2019-04-09T11:28:49.785513: Epoch 1 Batch 2195/3125 train_loss = 1.013
2019-04-09T11:28:50.199116: Epoch 1 Batch 2215/3125 train_loss = 1.034
2019-04-09T11:28:50.609969: Epoch 1 Batch 2235/3125 train_loss = 1.184
2019-04-09T11:28:51.023581: Epoch 1 Batch 2255/3125 train_loss = 1.135
2019-04-09T11:28:51.436197: Epoch 1 Batch 2275/3125 train_loss = 0.936
2019-04-09T11:28:51.854318: Epoch 1 Batch 2295/3125 train_loss = 1.230
2019-04-09T11:28:52.266593: Epoch 1 Batch 2315/3125 train_loss = 1.180
2019-04-09T11:28:53.027310: Epoch 1 Batch 2335/3125 train_loss = 1.068
2019-04-09T11:28:53.443572: Epoch 1 Batch 2355/3125 train_loss = 1.021
2019-04-09T11:28:53.859233: Epoch 1 Batch 2375/3125 train_loss = 1.241
2019-04-09T11:28:54.268702: Epoch 1 Batch 2395/3125 train_loss = 1.022
2019-04-09T11:28:54.684586: Epoch 1 Batch 2415/3125 train_loss = 1.062
2019-04-09T11:28:55.104188: Epoch 1 Batch 2435/3125 train_loss = 0.978
2019-04-09T11:28:55.517661: Epoch 1 Batch 2455/3125 train_loss = 1.075
2019-04-09T11:28:55.940375: Epoch 1 Batch 2475/3125 train_loss = 0.997
2019-04-09T11:28:56.355446: Epoch 1 Batch 2495/3125 train_loss = 0.991
2019-04-09T11:28:56.767784: Epoch 1 Batch 2515/3125 train_loss = 1.057
2019-04-09T11:28:57.185487: Epoch 1 Batch 2535/3125 train_loss = 1.064
2019-04-09T11:28:57.599402: Epoch 1 Batch 2555/3125 train_loss = 0.883
2019-04-09T11:28:58.012436: Epoch 1 Batch 2575/3125 train_loss = 0.914
2019-04-09T11:28:58.427098: Epoch 1 Batch 2595/3125 train_loss = 0.934
2019-04-09T11:28:58.836389: Epoch 1 Batch 2615/3125 train_loss = 1.151
2019-04-09T11:28:59.262074: Epoch 1 Batch 2635/3125 train_loss = 1.017
2019-04-09T11:28:59.680762: Epoch 1 Batch 2655/3125 train_loss = 1.036
2019-04-09T11:29:00.094884: Epoch 1 Batch 2675/3125 train_loss = 0.960
2019-04-09T11:29:00.510614: Epoch 1 Batch 2695/3125 train_loss = 1.031
2019-04-09T11:29:00.925679: Epoch 1 Batch 2715/3125 train_loss = 1.011
2019-04-09T11:29:01.343105: Epoch 1 Batch 2735/3125 train_loss = 0.876
2019-04-09T11:29:01.762199: Epoch 1 Batch 2755/3125 train_loss = 1.087
2019-04-09T11:29:02.171790: Epoch 1 Batch 2775/3125 train_loss = 1.101
2019-04-09T11:29:02.585480: Epoch 1 Batch 2795/3125 train_loss = 1.064
2019-04-09T11:29:02.995887: Epoch 1 Batch 2815/3125 train_loss = 0.981
2019-04-09T11:29:03.414306: Epoch 1 Batch 2835/3125 train_loss = 1.123
2019-04-09T11:29:03.824405: Epoch 1 Batch 2855/3125 train_loss = 1.069
2019-04-09T11:29:04.236239: Epoch 1 Batch 2875/3125 train_loss = 1.006
2019-04-09T11:29:04.644747: Epoch 1 Batch 2895/3125 train_loss = 1.013
2019-04-09T11:29:05.058545: Epoch 1 Batch 2915/3125 train_loss = 0.985
2019-04-09T11:29:05.473539: Epoch 1 Batch 2935/3125 train_loss = 1.152
2019-04-09T11:29:05.881997: Epoch 1 Batch 2955/3125 train_loss = 1.015
2019-04-09T11:29:06.294405: Epoch 1 Batch 2975/3125 train_loss = 0.977
2019-04-09T11:29:06.707933: Epoch 1 Batch 2995/3125 train_loss = 0.928
2019-04-09T11:29:07.122537: Epoch 1 Batch 3015/3125 train_loss = 1.033
2019-04-09T11:29:07.534921: Epoch 1 Batch 3035/3125 train_loss = 1.097
2019-04-09T11:29:07.945410: Epoch 1 Batch 3055/3125 train_loss = 1.058
2019-04-09T11:29:08.355520: Epoch 1 Batch 3075/3125 train_loss = 1.009
2019-04-09T11:29:08.775390: Epoch 1 Batch 3095/3125 train_loss = 0.946
2019-04-09T11:29:09.190497: Epoch 1 Batch 3115/3125 train_loss = 0.919
2019-04-09T11:29:09.605177: Epoch 1 Batch 19/781 test_loss = 1.005
2019-04-09T11:29:09.737030: Epoch 1 Batch 39/781 test_loss = 0.844
2019-04-09T11:29:09.863600: Epoch 1 Batch 59/781 test_loss = 0.955
2019-04-09T11:29:09.991439: Epoch 1 Batch 79/781 test_loss = 0.980
2019-04-09T11:29:10.118778: Epoch 1 Batch 99/781 test_loss = 0.997
2019-04-09T11:29:10.246117: Epoch 1 Batch 119/781 test_loss = 0.996
2019-04-09T11:29:10.374962: Epoch 1 Batch 139/781 test_loss = 0.988
2019-04-09T11:29:10.503975: Epoch 1 Batch 159/781 test_loss = 0.970
2019-04-09T11:29:10.630812: Epoch 1 Batch 179/781 test_loss = 0.950
2019-04-09T11:29:10.758151: Epoch 1 Batch 199/781 test_loss = 0.939
2019-04-09T11:29:10.885992: Epoch 1 Batch 219/781 test_loss = 0.993
2019-04-09T11:29:11.014332: Epoch 1 Batch 239/781 test_loss = 1.237
2019-04-09T11:29:11.141671: Epoch 1 Batch 259/781 test_loss = 0.976
2019-04-09T11:29:11.270013: Epoch 1 Batch 279/781 test_loss = 1.069
2019-04-09T11:29:11.399713: Epoch 1 Batch 299/781 test_loss = 1.209
2019-04-09T11:29:11.531062: Epoch 1 Batch 319/781 test_loss = 0.913
2019-04-09T11:29:11.661408: Epoch 1 Batch 339/781 test_loss = 0.906
2019-04-09T11:29:11.787744: Epoch 1 Batch 359/781 test_loss = 0.924
2019-04-09T11:29:11.914581: Epoch 1 Batch 379/781 test_loss = 1.030
2019-04-09T11:29:12.043424: Epoch 1 Batch 399/781 test_loss = 0.912
2019-04-09T11:29:12.171264: Epoch 1 Batch 419/781 test_loss = 0.959
2019-04-09T11:29:12.300107: Epoch 1 Batch 439/781 test_loss = 1.026
2019-04-09T11:29:12.428123: Epoch 1 Batch 459/781 test_loss = 1.085
2019-04-09T11:29:12.553965: Epoch 1 Batch 479/781 test_loss = 1.054
2019-04-09T11:29:12.683302: Epoch 1 Batch 499/781 test_loss = 0.919
2019-04-09T11:29:12.810139: Epoch 1 Batch 519/781 test_loss = 1.083
2019-04-09T11:29:12.939483: Epoch 1 Batch 539/781 test_loss = 0.888
2019-04-09T11:29:13.066822: Epoch 1 Batch 559/781 test_loss = 1.165
2019-04-09T11:29:13.195164: Epoch 1 Batch 579/781 test_loss = 1.014
2019-04-09T11:29:13.321500: Epoch 1 Batch 599/781 test_loss = 0.975
2019-04-09T11:29:13.449045: Epoch 1 Batch 619/781 test_loss = 1.152
2019-04-09T11:29:13.578390: Epoch 1 Batch 639/781 test_loss = 0.881
2019-04-09T11:29:13.706229: Epoch 1 Batch 659/781 test_loss = 1.086
2019-04-09T11:29:13.834069: Epoch 1 Batch 679/781 test_loss = 1.149
2019-04-09T11:29:13.964416: Epoch 1 Batch 699/781 test_loss = 0.888
2019-04-09T11:29:14.094763: Epoch 1 Batch 719/781 test_loss = 0.940
2019-04-09T11:29:14.223606: Epoch 1 Batch 739/781 test_loss = 1.001
2019-04-09T11:29:14.350443: Epoch 1 Batch 759/781 test_loss = 0.925
2019-04-09T11:29:14.479091: Epoch 1 Batch 779/781 test_loss = 0.786
2019-04-09T11:29:15.169929: Epoch 2 Batch 10/3125 train_loss = 0.962
2019-04-09T11:29:15.585033: Epoch 2 Batch 30/3125 train_loss = 0.921
2019-04-09T11:29:16.090936: Epoch 2 Batch 50/3125 train_loss = 1.098
2019-04-09T11:29:16.504056: Epoch 2 Batch 70/3125 train_loss = 1.066
2019-04-09T11:29:16.916616: Epoch 2 Batch 90/3125 train_loss = 1.065
2019-04-09T11:29:17.335995: Epoch 2 Batch 110/3125 train_loss = 0.908
2019-04-09T11:29:17.744923: Epoch 2 Batch 130/3125 train_loss = 0.927
2019-04-09T11:29:18.156518: Epoch 2 Batch 150/3125 train_loss = 1.094
2019-04-09T11:29:18.572814: Epoch 2 Batch 170/3125 train_loss = 1.062
2019-04-09T11:29:18.979180: Epoch 2 Batch 190/3125 train_loss = 1.043
2019-04-09T11:29:19.392758: Epoch 2 Batch 210/3125 train_loss = 0.920
2019-04-09T11:29:19.806360: Epoch 2 Batch 230/3125 train_loss = 0.990
2019-04-09T11:29:20.213864: Epoch 2 Batch 250/3125 train_loss = 0.956
2019-04-09T11:29:20.624843: Epoch 2 Batch 270/3125 train_loss = 0.816
2019-04-09T11:29:21.034399: Epoch 2 Batch 290/3125 train_loss = 1.029
2019-04-09T11:29:21.450506: Epoch 2 Batch 310/3125 train_loss = 1.039
2019-04-09T11:29:21.860168: Epoch 2 Batch 330/3125 train_loss = 0.981
2019-04-09T11:29:22.268774: Epoch 2 Batch 350/3125 train_loss = 0.927
2019-04-09T11:29:22.681125: Epoch 2 Batch 370/3125 train_loss = 1.157
2019-04-09T11:29:23.092834: Epoch 2 Batch 390/3125 train_loss = 1.131
2019-04-09T11:29:23.503543: Epoch 2 Batch 410/3125 train_loss = 0.945
2019-04-09T11:29:23.913894: Epoch 2 Batch 430/3125 train_loss = 1.121
2019-04-09T11:29:24.324622: Epoch 2 Batch 450/3125 train_loss = 0.925
2019-04-09T11:29:24.740883: Epoch 2 Batch 470/3125 train_loss = 0.952
2019-04-09T11:29:25.150474: Epoch 2 Batch 490/3125 train_loss = 1.031
2019-04-09T11:29:25.566388: Epoch 2 Batch 510/3125 train_loss = 1.045
2019-04-09T11:29:25.981499: Epoch 2 Batch 530/3125 train_loss = 0.936
2019-04-09T11:29:26.427824: Epoch 2 Batch 550/3125 train_loss = 1.041
2019-04-09T11:29:26.844394: Epoch 2 Batch 570/3125 train_loss = 1.175
2019-04-09T11:29:27.262411: Epoch 2 Batch 590/3125 train_loss = 1.093
2019-04-09T11:29:27.677138: Epoch 2 Batch 610/3125 train_loss = 0.941
2019-04-09T11:29:28.088132: Epoch 2 Batch 630/3125 train_loss = 1.067
2019-04-09T11:29:28.504546: Epoch 2 Batch 650/3125 train_loss = 1.015
2019-04-09T11:29:28.919901: Epoch 2 Batch 670/3125 train_loss = 0.921
2019-04-09T11:29:29.332525: Epoch 2 Batch 690/3125 train_loss = 0.946
2019-04-09T11:29:29.752401: Epoch 2 Batch 710/3125 train_loss = 0.958
2019-04-09T11:29:30.169512: Epoch 2 Batch 730/3125 train_loss = 0.833
2019-04-09T11:29:30.581918: Epoch 2 Batch 750/3125 train_loss = 0.983
2019-04-09T11:29:30.990078: Epoch 2 Batch 770/3125 train_loss = 0.882
2019-04-09T11:29:31.401819: Epoch 2 Batch 790/3125 train_loss = 0.922
2019-04-09T11:29:31.821438: Epoch 2 Batch 810/3125 train_loss = 0.843
2019-04-09T11:29:32.231582: Epoch 2 Batch 830/3125 train_loss = 0.875
2019-04-09T11:29:32.646142: Epoch 2 Batch 850/3125 train_loss = 1.077
2019-04-09T11:29:33.064808: Epoch 2 Batch 870/3125 train_loss = 0.952
2019-04-09T11:29:33.477008: Epoch 2 Batch 890/3125 train_loss = 0.888
2019-04-09T11:29:33.887466: Epoch 2 Batch 910/3125 train_loss = 1.012
2019-04-09T11:29:34.298086: Epoch 2 Batch 930/3125 train_loss = 0.959
2019-04-09T11:29:34.715677: Epoch 2 Batch 950/3125 train_loss = 0.975
2019-04-09T11:29:35.130281: Epoch 2 Batch 970/3125 train_loss = 1.050
2019-04-09T11:29:35.544737: Epoch 2 Batch 990/3125 train_loss = 0.864
2019-04-09T11:29:35.958160: Epoch 2 Batch 1010/3125 train_loss = 1.084
2019-04-09T11:29:36.371777: Epoch 2 Batch 1030/3125 train_loss = 0.946
2019-04-09T11:29:36.780334: Epoch 2 Batch 1050/3125 train_loss = 1.009
2019-04-09T11:29:37.193936: Epoch 2 Batch 1070/3125 train_loss = 0.981
2019-04-09T11:29:37.603917: Epoch 2 Batch 1090/3125 train_loss = 1.081
2019-04-09T11:29:38.014688: Epoch 2 Batch 1110/3125 train_loss = 1.080
2019-04-09T11:29:38.435423: Epoch 2 Batch 1130/3125 train_loss = 0.920
2019-04-09T11:29:38.848851: Epoch 2 Batch 1150/3125 train_loss = 0.949
2019-04-09T11:29:39.260649: Epoch 2 Batch 1170/3125 train_loss = 0.944
2019-04-09T11:29:39.676982: Epoch 2 Batch 1190/3125 train_loss = 1.046
2019-04-09T11:29:40.089421: Epoch 2 Batch 1210/3125 train_loss = 0.873
2019-04-09T11:29:40.501075: Epoch 2 Batch 1230/3125 train_loss = 0.862
2019-04-09T11:29:40.912917: Epoch 2 Batch 1250/3125 train_loss = 0.963
2019-04-09T11:29:41.331306: Epoch 2 Batch 1270/3125 train_loss = 1.041
2019-04-09T11:29:41.745589: Epoch 2 Batch 1290/3125 train_loss = 0.935
2019-04-09T11:29:42.155682: Epoch 2 Batch 1310/3125 train_loss = 1.011
2019-04-09T11:29:42.565230: Epoch 2 Batch 1330/3125 train_loss = 1.089
2019-04-09T11:29:42.972821: Epoch 2 Batch 1350/3125 train_loss = 0.929
2019-04-09T11:29:43.384313: Epoch 2 Batch 1370/3125 train_loss = 0.871
2019-04-09T11:29:43.800679: Epoch 2 Batch 1390/3125 train_loss = 1.056
2019-04-09T11:29:44.212277: Epoch 2 Batch 1410/3125 train_loss = 0.956
2019-04-09T11:29:44.622595: Epoch 2 Batch 1430/3125 train_loss = 0.991
2019-04-09T11:29:45.030926: Epoch 2 Batch 1450/3125 train_loss = 1.019
2019-04-09T11:29:45.446118: Epoch 2 Batch 1470/3125 train_loss = 1.018
2019-04-09T11:29:45.858249: Epoch 2 Batch 1490/3125 train_loss = 1.025
2019-04-09T11:29:46.264877: Epoch 2 Batch 1510/3125 train_loss = 0.987
2019-04-09T11:29:46.680210: Epoch 2 Batch 1530/3125 train_loss = 1.077
2019-04-09T11:29:47.097122: Epoch 2 Batch 1550/3125 train_loss = 0.871
2019-04-09T11:29:47.505701: Epoch 2 Batch 1570/3125 train_loss = 0.963
2019-04-09T11:29:47.915740: Epoch 2 Batch 1590/3125 train_loss = 0.935
2019-04-09T11:29:48.325191: Epoch 2 Batch 1610/3125 train_loss = 1.024
2019-04-09T11:29:48.741050: Epoch 2 Batch 1630/3125 train_loss = 1.033
2019-04-09T11:29:49.303637: Epoch 2 Batch 1650/3125 train_loss = 0.892
2019-04-09T11:29:49.716688: Epoch 2 Batch 1670/3125 train_loss = 0.828
2019-04-09T11:29:50.127782: Epoch 2 Batch 1690/3125 train_loss = 0.886
2019-04-09T11:29:50.541466: Epoch 2 Batch 1710/3125 train_loss = 1.033
2019-04-09T11:29:50.952638: Epoch 2 Batch 1730/3125 train_loss = 0.990
2019-04-09T11:29:51.366254: Epoch 2 Batch 1750/3125 train_loss = 0.851
2019-04-09T11:29:51.779575: Epoch 2 Batch 1770/3125 train_loss = 1.130
2019-04-09T11:29:52.189667: Epoch 2 Batch 1790/3125 train_loss = 0.970
2019-04-09T11:29:52.600989: Epoch 2 Batch 1810/3125 train_loss = 1.004
2019-04-09T11:29:53.010986: Epoch 2 Batch 1830/3125 train_loss = 1.035
2019-04-09T11:29:53.428213: Epoch 2 Batch 1850/3125 train_loss = 0.935
2019-04-09T11:29:53.847839: Epoch 2 Batch 1870/3125 train_loss = 1.039
2019-04-09T11:29:54.260999: Epoch 2 Batch 1890/3125 train_loss = 0.822
2019-04-09T11:29:54.670587: Epoch 2 Batch 1910/3125 train_loss = 0.885
2019-04-09T11:29:55.079904: Epoch 2 Batch 1930/3125 train_loss = 1.038
2019-04-09T11:29:55.492941: Epoch 2 Batch 1950/3125 train_loss = 0.887
2019-04-09T11:29:55.909977: Epoch 2 Batch 1970/3125 train_loss = 0.998
2019-04-09T11:29:56.321669: Epoch 2 Batch 1990/3125 train_loss = 0.864
2019-04-09T11:29:56.731912: Epoch 2 Batch 2010/3125 train_loss = 0.792
2019-04-09T11:29:57.143008: Epoch 2 Batch 2030/3125 train_loss = 0.907
2019-04-09T11:29:57.555451: Epoch 2 Batch 2050/3125 train_loss = 0.952
2019-04-09T11:29:57.967763: Epoch 2 Batch 2070/3125 train_loss = 0.882
2019-04-09T11:29:58.396253: Epoch 2 Batch 2090/3125 train_loss = 0.831
2019-04-09T11:29:58.810290: Epoch 2 Batch 2110/3125 train_loss = 1.050
2019-04-09T11:29:59.220382: Epoch 2 Batch 2130/3125 train_loss = 0.973
2019-04-09T11:29:59.638177: Epoch 2 Batch 2150/3125 train_loss = 1.009
2019-04-09T11:30:00.054094: Epoch 2 Batch 2170/3125 train_loss = 0.862
2019-04-09T11:30:00.465054: Epoch 2 Batch 2190/3125 train_loss = 0.967
2019-04-09T11:30:00.875581: Epoch 2 Batch 2210/3125 train_loss = 0.950
2019-04-09T11:30:01.283669: Epoch 2 Batch 2230/3125 train_loss = 0.843
2019-04-09T11:30:01.702253: Epoch 2 Batch 2250/3125 train_loss = 0.933
2019-04-09T11:30:02.116357: Epoch 2 Batch 2270/3125 train_loss = 0.917
2019-04-09T11:30:02.530943: Epoch 2 Batch 2290/3125 train_loss = 0.856
2019-04-09T11:30:02.942953: Epoch 2 Batch 2310/3125 train_loss = 0.851
2019-04-09T11:30:03.360388: Epoch 2 Batch 2330/3125 train_loss = 1.097
2019-04-09T11:30:03.770799: Epoch 2 Batch 2350/3125 train_loss = 0.989
2019-04-09T11:30:04.189427: Epoch 2 Batch 2370/3125 train_loss = 0.886
2019-04-09T11:30:04.602910: Epoch 2 Batch 2390/3125 train_loss = 1.017
2019-04-09T11:30:05.013193: Epoch 2 Batch 2410/3125 train_loss = 1.025
2019-04-09T11:30:05.426136: Epoch 2 Batch 2430/3125 train_loss = 0.885
2019-04-09T11:30:05.834048: Epoch 2 Batch 2450/3125 train_loss = 0.968
2019-04-09T11:30:06.246209: Epoch 2 Batch 2470/3125 train_loss = 1.042
2019-04-09T11:30:06.661647: Epoch 2 Batch 2490/3125 train_loss = 1.003
2019-04-09T11:30:07.071296: Epoch 2 Batch 2510/3125 train_loss = 1.084
2019-04-09T11:30:07.490192: Epoch 2 Batch 2530/3125 train_loss = 0.793
2019-04-09T11:30:07.904515: Epoch 2 Batch 2550/3125 train_loss = 0.954
2019-04-09T11:30:08.315032: Epoch 2 Batch 2570/3125 train_loss = 0.957
2019-04-09T11:30:08.733158: Epoch 2 Batch 2590/3125 train_loss = 0.984
2019-04-09T11:30:09.146760: Epoch 2 Batch 2610/3125 train_loss = 1.043
2019-04-09T11:30:09.564414: Epoch 2 Batch 2630/3125 train_loss = 0.660
2019-04-09T11:30:09.977708: Epoch 2 Batch 2650/3125 train_loss = 0.913
2019-04-09T11:30:10.392227: Epoch 2 Batch 2670/3125 train_loss = 1.051
2019-04-09T11:30:10.803323: Epoch 2 Batch 2690/3125 train_loss = 0.980
2019-04-09T11:30:11.221892: Epoch 2 Batch 2710/3125 train_loss = 0.845
2019-04-09T11:30:11.636832: Epoch 2 Batch 2730/3125 train_loss = 1.067
2019-04-09T11:30:12.048855: Epoch 2 Batch 2750/3125 train_loss = 1.020
2019-04-09T11:30:12.466622: Epoch 2 Batch 2770/3125 train_loss = 0.894
2019-04-09T11:30:12.877228: Epoch 2 Batch 2790/3125 train_loss = 0.881
2019-04-09T11:30:13.292940: Epoch 2 Batch 2810/3125 train_loss = 0.958
2019-04-09T11:30:13.707370: Epoch 2 Batch 2830/3125 train_loss = 0.816
2019-04-09T11:30:14.115458: Epoch 2 Batch 2850/3125 train_loss = 1.005
2019-04-09T11:30:14.527402: Epoch 2 Batch 2870/3125 train_loss = 0.792
2019-04-09T11:30:14.941006: Epoch 2 Batch 2890/3125 train_loss = 0.779
2019-04-09T11:30:15.351115: Epoch 2 Batch 2910/3125 train_loss = 1.007
2019-04-09T11:30:15.761429: Epoch 2 Batch 2930/3125 train_loss = 0.813
2019-04-09T11:30:16.174529: Epoch 2 Batch 2950/3125 train_loss = 1.069
2019-04-09T11:30:16.592845: Epoch 2 Batch 2970/3125 train_loss = 0.993
2019-04-09T11:30:17.005062: Epoch 2 Batch 2990/3125 train_loss = 0.862
2019-04-09T11:30:17.425470: Epoch 2 Batch 3010/3125 train_loss = 0.936
2019-04-09T11:30:17.837640: Epoch 2 Batch 3030/3125 train_loss = 0.968
2019-04-09T11:30:18.248424: Epoch 2 Batch 3050/3125 train_loss = 0.980
2019-04-09T11:30:18.666115: Epoch 2 Batch 3070/3125 train_loss = 0.896
2019-04-09T11:30:19.074163: Epoch 2 Batch 3090/3125 train_loss = 0.774
2019-04-09T11:30:19.491628: Epoch 2 Batch 3110/3125 train_loss = 0.837
2019-04-09T11:30:19.895275: Epoch 2 Batch 18/781 test_loss = 0.808
2019-04-09T11:30:20.023969: Epoch 2 Batch 38/781 test_loss = 0.915
2019-04-09T11:30:20.152310: Epoch 2 Batch 58/781 test_loss = 0.851
2019-04-09T11:30:20.280151: Epoch 2 Batch 78/781 test_loss = 0.905
2019-04-09T11:30:20.408187: Epoch 2 Batch 98/781 test_loss = 0.903
2019-04-09T11:30:20.536028: Epoch 2 Batch 118/781 test_loss = 0.884
2019-04-09T11:30:20.663366: Epoch 2 Batch 138/781 test_loss = 1.000
2019-04-09T11:30:20.791206: Epoch 2 Batch 158/781 test_loss = 0.904
2019-04-09T11:30:20.918545: Epoch 2 Batch 178/781 test_loss = 0.785
2019-04-09T11:30:21.045884: Epoch 2 Batch 198/781 test_loss = 0.922
2019-04-09T11:30:21.177736: Epoch 2 Batch 218/781 test_loss = 0.997
2019-04-09T11:30:21.310087: Epoch 2 Batch 238/781 test_loss = 0.998
2019-04-09T11:30:21.437625: Epoch 2 Batch 258/781 test_loss = 0.959
2019-04-09T11:30:21.565465: Epoch 2 Batch 278/781 test_loss = 1.074
2019-04-09T11:30:21.692804: Epoch 2 Batch 298/781 test_loss = 0.915
2019-04-09T11:30:21.821646: Epoch 2 Batch 318/781 test_loss = 0.889
2019-04-09T11:30:21.952495: Epoch 2 Batch 338/781 test_loss = 0.941
2019-04-09T11:30:22.081338: Epoch 2 Batch 358/781 test_loss = 0.913
2019-04-09T11:30:22.210686: Epoch 2 Batch 378/781 test_loss = 0.890
2019-04-09T11:30:22.344036: Epoch 2 Batch 398/781 test_loss = 0.833
2019-04-09T11:30:22.471957: Epoch 2 Batch 418/781 test_loss = 0.941
2019-04-09T11:30:22.599296: Epoch 2 Batch 438/781 test_loss = 1.013
2019-04-09T11:30:22.728139: Epoch 2 Batch 458/781 test_loss = 0.919
2019-04-09T11:30:22.855992: Epoch 2 Batch 478/781 test_loss = 0.965
2019-04-09T11:30:22.982816: Epoch 2 Batch 498/781 test_loss = 0.813
2019-04-09T11:30:23.110155: Epoch 2 Batch 518/781 test_loss = 0.919
2019-04-09T11:30:23.238497: Epoch 2 Batch 538/781 test_loss = 0.795
2019-04-09T11:30:23.366838: Epoch 2 Batch 558/781 test_loss = 0.830
2019-04-09T11:30:23.495883: Epoch 2 Batch 578/781 test_loss = 0.915
2019-04-09T11:30:23.623225: Epoch 2 Batch 598/781 test_loss = 1.055
2019-04-09T11:30:23.751062: Epoch 2 Batch 618/781 test_loss = 0.850
2019-04-09T11:30:23.879905: Epoch 2 Batch 638/781 test_loss = 0.845
2019-04-09T11:30:24.007243: Epoch 2 Batch 658/781 test_loss = 1.026
2019-04-09T11:30:24.138091: Epoch 2 Batch 678/781 test_loss = 0.926
2019-04-09T11:30:24.266433: Epoch 2 Batch 698/781 test_loss = 0.875
2019-04-09T11:30:24.395604: Epoch 2 Batch 718/781 test_loss = 1.006
2019-04-09T11:30:24.523445: Epoch 2 Batch 738/781 test_loss = 0.850
2019-04-09T11:30:24.651786: Epoch 2 Batch 758/781 test_loss = 0.892
2019-04-09T11:30:24.779626: Epoch 2 Batch 778/781 test_loss = 0.913
2019-04-09T11:30:25.360700: Epoch 3 Batch 5/3125 train_loss = 0.900
2019-04-09T11:30:25.776594: Epoch 3 Batch 25/3125 train_loss = 0.995
2019-04-09T11:30:26.190195: Epoch 3 Batch 45/3125 train_loss = 0.823
2019-04-09T11:30:26.605221: Epoch 3 Batch 65/3125 train_loss = 0.936
2019-04-09T11:30:27.017575: Epoch 3 Batch 85/3125 train_loss = 0.811
2019-04-09T11:30:27.433325: Epoch 3 Batch 105/3125 train_loss = 0.735
2019-04-09T11:30:27.845489: Epoch 3 Batch 125/3125 train_loss = 0.883
2019-04-09T11:30:28.255902: Epoch 3 Batch 145/3125 train_loss = 0.946
2019-04-09T11:30:28.676186: Epoch 3 Batch 165/3125 train_loss = 0.907
2019-04-09T11:30:29.086028: Epoch 3 Batch 185/3125 train_loss = 0.843
2019-04-09T11:30:29.498049: Epoch 3 Batch 205/3125 train_loss = 0.782
2019-04-09T11:30:29.910137: Epoch 3 Batch 225/3125 train_loss = 0.818
2019-04-09T11:30:30.321717: Epoch 3 Batch 245/3125 train_loss = 1.094
2019-04-09T11:30:30.732822: Epoch 3 Batch 265/3125 train_loss = 0.907
2019-04-09T11:30:31.144919: Epoch 3 Batch 285/3125 train_loss = 0.899
2019-04-09T11:30:31.564878: Epoch 3 Batch 305/3125 train_loss = 0.886
2019-04-09T11:30:31.986450: Epoch 3 Batch 325/3125 train_loss = 0.900
2019-04-09T11:30:32.402943: Epoch 3 Batch 345/3125 train_loss = 0.966
2019-04-09T11:30:32.817756: Epoch 3 Batch 365/3125 train_loss = 0.897
2019-04-09T11:30:33.231358: Epoch 3 Batch 385/3125 train_loss = 0.854
2019-04-09T11:30:33.642523: Epoch 3 Batch 405/3125 train_loss = 0.854
2019-04-09T11:30:34.052009: Epoch 3 Batch 425/3125 train_loss = 0.950
2019-04-09T11:30:34.463651: Epoch 3 Batch 445/3125 train_loss = 0.963
2019-04-09T11:30:34.877612: Epoch 3 Batch 465/3125 train_loss = 0.840
2019-04-09T11:30:35.291041: Epoch 3 Batch 485/3125 train_loss = 1.043
2019-04-09T11:30:35.701510: Epoch 3 Batch 505/3125 train_loss = 0.820
2019-04-09T11:30:36.113107: Epoch 3 Batch 525/3125 train_loss = 0.977
2019-04-09T11:30:36.526067: Epoch 3 Batch 545/3125 train_loss = 0.785
2019-04-09T11:30:36.938504: Epoch 3 Batch 565/3125 train_loss = 1.138
2019-04-09T11:30:37.354627: Epoch 3 Batch 585/3125 train_loss = 0.877
2019-04-09T11:30:37.769480: Epoch 3 Batch 605/3125 train_loss = 0.865
2019-04-09T11:30:38.180576: Epoch 3 Batch 625/3125 train_loss = 0.931
2019-04-09T11:30:38.595414: Epoch 3 Batch 645/3125 train_loss = 1.007
2019-04-09T11:30:39.007112: Epoch 3 Batch 665/3125 train_loss = 0.960
2019-04-09T11:30:39.427161: Epoch 3 Batch 685/3125 train_loss = 0.908
2019-04-09T11:30:39.841768: Epoch 3 Batch 705/3125 train_loss = 1.001
2019-04-09T11:30:40.258352: Epoch 3 Batch 725/3125 train_loss = 0.888
2019-04-09T11:30:40.672977: Epoch 3 Batch 745/3125 train_loss = 0.834
2019-04-09T11:30:41.090307: Epoch 3 Batch 765/3125 train_loss = 0.864
2019-04-09T11:30:41.504196: Epoch 3 Batch 785/3125 train_loss = 1.046
2019-04-09T11:30:41.912423: Epoch 3 Batch 805/3125 train_loss = 0.816
2019-04-09T11:30:42.328090: Epoch 3 Batch 825/3125 train_loss = 0.904
2019-04-09T11:30:42.740677: Epoch 3 Batch 845/3125 train_loss = 0.932
2019-04-09T11:30:43.153777: Epoch 3 Batch 865/3125 train_loss = 1.004
2019-04-09T11:30:43.566946: Epoch 3 Batch 885/3125 train_loss = 0.968
2019-04-09T11:30:43.981050: Epoch 3 Batch 905/3125 train_loss = 0.998
2019-04-09T11:30:44.394270: Epoch 3 Batch 925/3125 train_loss = 0.896
2019-04-09T11:30:44.807669: Epoch 3 Batch 945/3125 train_loss = 0.978
2019-04-09T11:30:45.224278: Epoch 3 Batch 965/3125 train_loss = 0.731
2019-04-09T11:30:45.644716: Epoch 3 Batch 985/3125 train_loss = 1.003
2019-04-09T11:30:46.056218: Epoch 3 Batch 1005/3125 train_loss = 0.794
2019-04-09T11:30:46.465616: Epoch 3 Batch 1025/3125 train_loss = 0.879
2019-04-09T11:30:46.878718: Epoch 3 Batch 1045/3125 train_loss = 1.127
2019-04-09T11:30:47.297579: Epoch 3 Batch 1065/3125 train_loss = 0.875
2019-04-09T11:30:47.709534: Epoch 3 Batch 1085/3125 train_loss = 0.834
2019-04-09T11:30:48.125642: Epoch 3 Batch 1105/3125 train_loss = 0.842
2019-04-09T11:30:48.538103: Epoch 3 Batch 1125/3125 train_loss = 0.859
2019-04-09T11:30:48.952197: Epoch 3 Batch 1145/3125 train_loss = 0.905
2019-04-09T11:30:49.366261: Epoch 3 Batch 1165/3125 train_loss = 0.964
2019-04-09T11:30:49.774853: Epoch 3 Batch 1185/3125 train_loss = 0.869
2019-04-09T11:30:50.190392: Epoch 3 Batch 1205/3125 train_loss = 0.836
2019-04-09T11:30:50.605998: Epoch 3 Batch 1225/3125 train_loss = 1.002
2019-04-09T11:30:51.020181: Epoch 3 Batch 1245/3125 train_loss = 1.006
2019-04-09T11:30:51.434899: Epoch 3 Batch 1265/3125 train_loss = 0.896
2019-04-09T11:30:51.850872: Epoch 3 Batch 1285/3125 train_loss = 0.960
2019-04-09T11:30:52.265731: Epoch 3 Batch 1305/3125 train_loss = 0.802
2019-04-09T11:30:53.236710: Epoch 3 Batch 1325/3125 train_loss = 0.886
2019-04-09T11:30:53.650278: Epoch 3 Batch 1345/3125 train_loss = 0.928
2019-04-09T11:30:54.066153: Epoch 3 Batch 1365/3125 train_loss = 0.761
2019-04-09T11:30:54.481716: Epoch 3 Batch 1385/3125 train_loss = 0.779
2019-04-09T11:30:54.890807: Epoch 3 Batch 1405/3125 train_loss = 0.857
2019-04-09T11:30:55.303205: Epoch 3 Batch 1425/3125 train_loss = 1.106
2019-04-09T11:30:55.713796: Epoch 3 Batch 1445/3125 train_loss = 1.002
2019-04-09T11:30:56.127899: Epoch 3 Batch 1465/3125 train_loss = 0.887
2019-04-09T11:30:56.544126: Epoch 3 Batch 1485/3125 train_loss = 0.920
2019-04-09T11:30:56.952476: Epoch 3 Batch 1505/3125 train_loss = 0.745
2019-04-09T11:30:57.370433: Epoch 3 Batch 1525/3125 train_loss = 0.759
2019-04-09T11:30:57.781531: Epoch 3 Batch 1545/3125 train_loss = 0.843
2019-04-09T11:30:58.194632: Epoch 3 Batch 1565/3125 train_loss = 0.983
2019-04-09T11:30:58.613587: Epoch 3 Batch 1585/3125 train_loss = 0.827
2019-04-09T11:30:59.029585: Epoch 3 Batch 1605/3125 train_loss = 0.971
2019-04-09T11:30:59.443109: Epoch 3 Batch 1625/3125 train_loss = 0.950
2019-04-09T11:30:59.862969: Epoch 3 Batch 1645/3125 train_loss = 0.978
2019-04-09T11:31:00.280054: Epoch 3 Batch 1665/3125 train_loss = 0.916
2019-04-09T11:31:00.697972: Epoch 3 Batch 1685/3125 train_loss = 0.893
2019-04-09T11:31:01.120406: Epoch 3 Batch 1705/3125 train_loss = 0.883
2019-04-09T11:31:01.540523: Epoch 3 Batch 1725/3125 train_loss = 0.834
2019-04-09T11:31:01.957635: Epoch 3 Batch 1745/3125 train_loss = 0.775
2019-04-09T11:31:02.372311: Epoch 3 Batch 1765/3125 train_loss = 0.825
2019-04-09T11:31:02.786676: Epoch 3 Batch 1785/3125 train_loss = 1.015
2019-04-09T11:31:03.204288: Epoch 3 Batch 1805/3125 train_loss = 0.958
2019-04-09T11:31:03.616851: Epoch 3 Batch 1825/3125 train_loss = 1.031
2019-04-09T11:31:04.029497: Epoch 3 Batch 1845/3125 train_loss = 0.922
2019-04-09T11:31:04.442097: Epoch 3 Batch 1865/3125 train_loss = 0.753
2019-04-09T11:31:04.856887: Epoch 3 Batch 1885/3125 train_loss = 0.986
2019-04-09T11:31:05.271825: Epoch 3 Batch 1905/3125 train_loss = 0.799
2019-04-09T11:31:05.688152: Epoch 3 Batch 1925/3125 train_loss = 0.830
2019-04-09T11:31:06.097059: Epoch 3 Batch 1945/3125 train_loss = 0.865
2019-04-09T11:31:06.510931: Epoch 3 Batch 1965/3125 train_loss = 0.867
2019-04-09T11:31:06.924666: Epoch 3 Batch 1985/3125 train_loss = 0.840
2019-04-09T11:31:07.341276: Epoch 3 Batch 2005/3125 train_loss = 0.881
2019-04-09T11:31:07.755738: Epoch 3 Batch 2025/3125 train_loss = 0.951
2019-04-09T11:31:08.168337: Epoch 3 Batch 2045/3125 train_loss = 0.754
2019-04-09T11:31:08.583280: Epoch 3 Batch 2065/3125 train_loss = 0.727
2019-04-09T11:31:08.998421: Epoch 3 Batch 2085/3125 train_loss = 1.058
2019-04-09T11:31:09.415818: Epoch 3 Batch 2105/3125 train_loss = 0.891
2019-04-09T11:31:09.827917: Epoch 3 Batch 2125/3125 train_loss = 0.976
2019-04-09T11:31:10.237408: Epoch 3 Batch 2145/3125 train_loss = 1.002
2019-04-09T11:31:10.652222: Epoch 3 Batch 2165/3125 train_loss = 0.862
2019-04-09T11:31:11.061610: Epoch 3 Batch 2185/3125 train_loss = 0.948
2019-04-09T11:31:11.476691: Epoch 3 Batch 2205/3125 train_loss = 0.958
2019-04-09T11:31:11.893028: Epoch 3 Batch 2225/3125 train_loss = 0.811
2019-04-09T11:31:12.428069: Epoch 3 Batch 2245/3125 train_loss = 0.798
2019-04-09T11:31:12.840171: Epoch 3 Batch 2265/3125 train_loss = 0.896
2019-04-09T11:31:13.254127: Epoch 3 Batch 2285/3125 train_loss = 1.099
2019-04-09T11:31:13.671868: Epoch 3 Batch 2305/3125 train_loss = 0.812
2019-04-09T11:31:14.083559: Epoch 3 Batch 2325/3125 train_loss = 0.788
2019-04-09T11:31:14.499758: Epoch 3 Batch 2345/3125 train_loss = 0.885
2019-04-09T11:31:14.912859: Epoch 3 Batch 2365/3125 train_loss = 0.702
2019-04-09T11:31:15.331776: Epoch 3 Batch 2385/3125 train_loss = 0.915
2019-04-09T11:31:15.749019: Epoch 3 Batch 2405/3125 train_loss = 0.908
2019-04-09T11:31:16.161618: Epoch 3 Batch 2425/3125 train_loss = 0.875
2019-04-09T11:31:16.583581: Epoch 3 Batch 2445/3125 train_loss = 1.002
2019-04-09T11:31:17.000198: Epoch 3 Batch 2465/3125 train_loss = 0.748
2019-04-09T11:31:17.420234: Epoch 3 Batch 2485/3125 train_loss = 0.880
2019-04-09T11:31:17.834288: Epoch 3 Batch 2505/3125 train_loss = 0.852
2019-04-09T11:31:18.247812: Epoch 3 Batch 2525/3125 train_loss = 0.849
2019-04-09T11:31:18.663700: Epoch 3 Batch 2545/3125 train_loss = 1.010
2019-04-09T11:31:19.076134: Epoch 3 Batch 2565/3125 train_loss = 0.851
2019-04-09T11:31:19.490451: Epoch 3 Batch 2585/3125 train_loss = 0.768
2019-04-09T11:31:19.905388: Epoch 3 Batch 2605/3125 train_loss = 0.867
2019-04-09T11:31:20.318355: Epoch 3 Batch 2625/3125 train_loss = 1.004
2019-04-09T11:31:20.732786: Epoch 3 Batch 2645/3125 train_loss = 0.906
2019-04-09T11:31:21.146894: Epoch 3 Batch 2665/3125 train_loss = 0.984
2019-04-09T11:31:21.566102: Epoch 3 Batch 2685/3125 train_loss = 0.920
2019-04-09T11:31:21.981681: Epoch 3 Batch 2705/3125 train_loss = 0.784
2019-04-09T11:31:22.399609: Epoch 3 Batch 2725/3125 train_loss = 0.916
2019-04-09T11:31:22.817940: Epoch 3 Batch 2745/3125 train_loss = 0.925
2019-04-09T11:31:23.266133: Epoch 3 Batch 2765/3125 train_loss = 0.837
2019-04-09T11:31:23.679262: Epoch 3 Batch 2785/3125 train_loss = 0.935
2019-04-09T11:31:24.097862: Epoch 3 Batch 2805/3125 train_loss = 0.839
2019-04-09T11:31:24.511944: Epoch 3 Batch 2825/3125 train_loss = 0.844
2019-04-09T11:31:24.926787: Epoch 3 Batch 2845/3125 train_loss = 0.858
2019-04-09T11:31:25.347381: Epoch 3 Batch 2865/3125 train_loss = 0.853
2019-04-09T11:31:25.764592: Epoch 3 Batch 2885/3125 train_loss = 0.939
2019-04-09T11:31:26.184209: Epoch 3 Batch 2905/3125 train_loss = 0.969
2019-04-09T11:31:26.601925: Epoch 3 Batch 2925/3125 train_loss = 0.868
2019-04-09T11:31:27.016711: Epoch 3 Batch 2945/3125 train_loss = 0.900
2019-04-09T11:31:27.435058: Epoch 3 Batch 2965/3125 train_loss = 0.939
2019-04-09T11:31:27.848061: Epoch 3 Batch 2985/3125 train_loss = 0.843
2019-04-09T11:31:28.261955: Epoch 3 Batch 3005/3125 train_loss = 0.860
2019-04-09T11:31:28.677308: Epoch 3 Batch 3025/3125 train_loss = 0.917
2019-04-09T11:31:29.091668: Epoch 3 Batch 3045/3125 train_loss = 0.883
2019-04-09T11:31:29.505770: Epoch 3 Batch 3065/3125 train_loss = 0.864
2019-04-09T11:31:29.920149: Epoch 3 Batch 3085/3125 train_loss = 0.867
2019-04-09T11:31:30.335191: Epoch 3 Batch 3105/3125 train_loss = 0.929
2019-04-09T11:31:30.978022: Epoch 3 Batch 17/781 test_loss = 0.866
2019-04-09T11:31:31.112380: Epoch 3 Batch 37/781 test_loss = 0.868
2019-04-09T11:31:31.248741: Epoch 3 Batch 57/781 test_loss = 0.894
2019-04-09T11:31:31.387784: Epoch 3 Batch 77/781 test_loss = 0.898
2019-04-09T11:31:31.519144: Epoch 3 Batch 97/781 test_loss = 0.790
2019-04-09T11:31:31.648478: Epoch 3 Batch 117/781 test_loss = 0.950
2019-04-09T11:31:31.787347: Epoch 3 Batch 137/781 test_loss = 0.922
2019-04-09T11:31:31.934742: Epoch 3 Batch 157/781 test_loss = 0.919
2019-04-09T11:31:32.076115: Epoch 3 Batch 177/781 test_loss = 0.873
2019-04-09T11:31:32.206462: Epoch 3 Batch 197/781 test_loss = 0.928
2019-04-09T11:31:32.347500: Epoch 3 Batch 217/781 test_loss = 0.699
2019-04-09T11:31:32.483362: Epoch 3 Batch 237/781 test_loss = 0.752
2019-04-09T11:31:32.612205: Epoch 3 Batch 257/781 test_loss = 1.014
2019-04-09T11:31:32.754584: Epoch 3 Batch 277/781 test_loss = 0.979
2019-04-09T11:31:32.897965: Epoch 3 Batch 297/781 test_loss = 0.961
2019-04-09T11:31:33.031821: Epoch 3 Batch 317/781 test_loss = 1.030
2019-04-09T11:31:33.166680: Epoch 3 Batch 337/781 test_loss = 0.906
2019-04-09T11:31:33.308477: Epoch 3 Batch 357/781 test_loss = 0.883
2019-04-09T11:31:33.450355: Epoch 3 Batch 377/781 test_loss = 0.932
2019-04-09T11:31:33.580701: Epoch 3 Batch 397/781 test_loss = 0.918
2019-04-09T11:31:33.721075: Epoch 3 Batch 417/781 test_loss = 0.842
2019-04-09T11:31:33.859944: Epoch 3 Batch 437/781 test_loss = 0.808
2019-04-09T11:31:33.988286: Epoch 3 Batch 457/781 test_loss = 0.690
2019-04-09T11:31:34.116627: Epoch 3 Batch 477/781 test_loss = 0.923
2019-04-09T11:31:34.256500: Epoch 3 Batch 497/781 test_loss = 0.807
2019-04-09T11:31:34.394868: Epoch 3 Batch 517/781 test_loss = 0.805
2019-04-09T11:31:34.522207: Epoch 3 Batch 537/781 test_loss = 0.802
2019-04-09T11:31:34.650046: Epoch 3 Batch 557/781 test_loss = 1.050
2019-04-09T11:31:34.792425: Epoch 3 Batch 577/781 test_loss = 0.912
2019-04-09T11:31:34.930292: Epoch 3 Batch 597/781 test_loss = 0.875
2019-04-09T11:31:35.058634: Epoch 3 Batch 617/781 test_loss = 0.862
2019-04-09T11:31:35.184973: Epoch 3 Batch 637/781 test_loss = 0.781
2019-04-09T11:31:35.314815: Epoch 3 Batch 657/781 test_loss = 1.008
2019-04-09T11:31:35.444363: Epoch 3 Batch 677/781 test_loss = 0.931
2019-04-09T11:31:35.578721: Epoch 3 Batch 697/781 test_loss = 0.907
2019-04-09T11:31:35.712076: Epoch 3 Batch 717/781 test_loss = 0.812
2019-04-09T11:31:35.841921: Epoch 3 Batch 737/781 test_loss = 0.764
2019-04-09T11:31:35.983800: Epoch 3 Batch 757/781 test_loss = 1.099
2019-04-09T11:31:36.119660: Epoch 3 Batch 777/781 test_loss = 0.960
2019-04-09T11:31:36.666392: Epoch 4 Batch 0/3125 train_loss = 0.960
2019-04-09T11:31:37.108038: Epoch 4 Batch 20/3125 train_loss = 0.848
2019-04-09T11:31:37.523644: Epoch 4 Batch 40/3125 train_loss = 0.929
2019-04-09T11:31:37.940279: Epoch 4 Batch 60/3125 train_loss = 0.729
2019-04-09T11:31:38.360397: Epoch 4 Batch 80/3125 train_loss = 0.870
2019-04-09T11:31:38.783226: Epoch 4 Batch 100/3125 train_loss = 0.972
2019-04-09T11:31:39.208774: Epoch 4 Batch 120/3125 train_loss = 1.008
2019-04-09T11:31:39.670500: Epoch 4 Batch 140/3125 train_loss = 0.932
2019-04-09T11:31:40.130223: Epoch 4 Batch 160/3125 train_loss = 0.786
2019-04-09T11:31:40.578223: Epoch 4 Batch 180/3125 train_loss = 0.829
2019-04-09T11:31:40.994831: Epoch 4 Batch 200/3125 train_loss = 1.105
2019-04-09T11:31:41.423976: Epoch 4 Batch 220/3125 train_loss = 0.862
2019-04-09T11:31:41.847103: Epoch 4 Batch 240/3125 train_loss = 0.981
2019-04-09T11:31:42.273237: Epoch 4 Batch 260/3125 train_loss = 0.926
2019-04-09T11:31:42.696015: Epoch 4 Batch 280/3125 train_loss = 0.991
2019-04-09T11:31:43.118928: Epoch 4 Batch 300/3125 train_loss = 1.056
2019-04-09T11:31:43.543558: Epoch 4 Batch 320/3125 train_loss = 0.991
2019-04-09T11:31:43.963668: Epoch 4 Batch 340/3125 train_loss = 0.723
2019-04-09T11:31:44.405001: Epoch 4 Batch 360/3125 train_loss = 0.811
2019-04-09T11:31:44.837830: Epoch 4 Batch 380/3125 train_loss = 0.903
2019-04-09T11:31:45.256898: Epoch 4 Batch 400/3125 train_loss = 0.788
2019-04-09T11:31:45.684205: Epoch 4 Batch 420/3125 train_loss = 0.845
2019-04-09T11:31:46.114850: Epoch 4 Batch 440/3125 train_loss = 0.845
2019-04-09T11:31:46.554569: Epoch 4 Batch 460/3125 train_loss = 0.917
2019-04-09T11:31:46.990729: Epoch 4 Batch 480/3125 train_loss = 0.982
2019-04-09T11:31:47.417146: Epoch 4 Batch 500/3125 train_loss = 0.671
2019-04-09T11:31:47.851802: Epoch 4 Batch 520/3125 train_loss = 0.905
2019-04-09T11:31:48.283919: Epoch 4 Batch 540/3125 train_loss = 0.806
2019-04-09T11:31:48.718582: Epoch 4 Batch 560/3125 train_loss = 1.032
2019-04-09T11:31:49.138201: Epoch 4 Batch 580/3125 train_loss = 0.989
2019-04-09T11:31:49.559825: Epoch 4 Batch 600/3125 train_loss = 0.909
2019-04-09T11:31:49.989670: Epoch 4 Batch 620/3125 train_loss = 0.941
2019-04-09T11:31:50.406780: Epoch 4 Batch 640/3125 train_loss = 0.862
2019-04-09T11:31:50.859348: Epoch 4 Batch 660/3125 train_loss = 0.912
2019-04-09T11:31:51.275455: Epoch 4 Batch 680/3125 train_loss = 0.932
2019-04-09T11:31:51.691919: Epoch 4 Batch 700/3125 train_loss = 0.911
2019-04-09T11:31:52.107926: Epoch 4 Batch 720/3125 train_loss = 0.782
2019-04-09T11:31:52.527656: Epoch 4 Batch 740/3125 train_loss = 0.911
2019-04-09T11:31:52.969684: Epoch 4 Batch 760/3125 train_loss = 0.782
2019-04-09T11:31:53.409955: Epoch 4 Batch 780/3125 train_loss = 0.905
2019-04-09T11:31:53.832580: Epoch 4 Batch 800/3125 train_loss = 0.798
2019-04-09T11:31:54.247683: Epoch 4 Batch 820/3125 train_loss = 0.871
2019-04-09T11:31:54.668933: Epoch 4 Batch 840/3125 train_loss = 0.808
2019-04-09T11:31:55.088550: Epoch 4 Batch 860/3125 train_loss = 0.828
2019-04-09T11:31:55.506010: Epoch 4 Batch 880/3125 train_loss = 0.811
2019-04-09T11:31:55.953370: Epoch 4 Batch 900/3125 train_loss = 0.888
2019-04-09T11:31:56.475762: Epoch 4 Batch 920/3125 train_loss = 0.953
2019-04-09T11:31:56.895627: Epoch 4 Batch 940/3125 train_loss = 0.898
2019-04-09T11:31:57.314926: Epoch 4 Batch 960/3125 train_loss = 0.927
2019-04-09T11:31:57.736404: Epoch 4 Batch 980/3125 train_loss = 1.019
2019-04-09T11:31:58.155519: Epoch 4 Batch 1000/3125 train_loss = 0.972
2019-04-09T11:31:58.571659: Epoch 4 Batch 1020/3125 train_loss = 0.885
2019-04-09T11:31:58.987239: Epoch 4 Batch 1040/3125 train_loss = 0.766
2019-04-09T11:31:59.407857: Epoch 4 Batch 1060/3125 train_loss = 0.975
2019-04-09T11:31:59.827189: Epoch 4 Batch 1080/3125 train_loss = 0.890
2019-04-09T11:32:00.250485: Epoch 4 Batch 1100/3125 train_loss = 0.794
2019-04-09T11:32:00.665686: Epoch 4 Batch 1120/3125 train_loss = 0.830
2019-04-09T11:32:01.076280: Epoch 4 Batch 1140/3125 train_loss = 0.850
2019-04-09T11:32:01.495207: Epoch 4 Batch 1160/3125 train_loss = 0.826
2019-04-09T11:32:01.909009: Epoch 4 Batch 1180/3125 train_loss = 0.813
2019-04-09T11:32:02.325685: Epoch 4 Batch 1200/3125 train_loss = 1.011
2019-04-09T11:32:02.747689: Epoch 4 Batch 1220/3125 train_loss = 0.964
2019-04-09T11:32:03.171817: Epoch 4 Batch 1240/3125 train_loss = 0.782
2019-04-09T11:32:03.593569: Epoch 4 Batch 1260/3125 train_loss = 0.848
2019-04-09T11:32:04.011798: Epoch 4 Batch 1280/3125 train_loss = 0.908
2019-04-09T11:32:04.430913: Epoch 4 Batch 1300/3125 train_loss = 0.794
2019-04-09T11:32:04.846453: Epoch 4 Batch 1320/3125 train_loss = 0.872
2019-04-09T11:32:05.263562: Epoch 4 Batch 1340/3125 train_loss = 0.716
2019-04-09T11:32:05.679810: Epoch 4 Batch 1360/3125 train_loss = 0.847
2019-04-09T11:32:06.099427: Epoch 4 Batch 1380/3125 train_loss = 0.831
2019-04-09T11:32:06.515033: Epoch 4 Batch 1400/3125 train_loss = 0.932
2019-04-09T11:32:06.932977: Epoch 4 Batch 1420/3125 train_loss = 0.911
2019-04-09T11:32:07.349584: Epoch 4 Batch 1440/3125 train_loss = 0.767
2019-04-09T11:32:07.768391: Epoch 4 Batch 1460/3125 train_loss = 0.885
2019-04-09T11:32:08.186503: Epoch 4 Batch 1480/3125 train_loss = 0.855
2019-04-09T11:32:08.610562: Epoch 4 Batch 1500/3125 train_loss = 0.890
2019-04-09T11:32:09.027935: Epoch 4 Batch 1520/3125 train_loss = 0.807
2019-04-09T11:32:09.448052: Epoch 4 Batch 1540/3125 train_loss = 0.970
2019-04-09T11:32:09.864802: Epoch 4 Batch 1560/3125 train_loss = 0.786
2019-04-09T11:32:10.279906: Epoch 4 Batch 1580/3125 train_loss = 0.913
2019-04-09T11:32:10.694227: Epoch 4 Batch 1600/3125 train_loss = 0.830
2019-04-09T11:32:11.113843: Epoch 4 Batch 1620/3125 train_loss = 0.764
2019-04-09T11:32:11.535264: Epoch 4 Batch 1640/3125 train_loss = 0.948
2019-04-09T11:32:11.951873: Epoch 4 Batch 1660/3125 train_loss = 1.003
2019-04-09T11:32:12.368324: Epoch 4 Batch 1680/3125 train_loss = 0.899
2019-04-09T11:32:12.877578: Epoch 4 Batch 1700/3125 train_loss = 0.787
2019-04-09T11:32:13.293848: Epoch 4 Batch 1720/3125 train_loss = 0.872
2019-04-09T11:32:13.710885: Epoch 4 Batch 1740/3125 train_loss = 0.929
2019-04-09T11:32:14.120976: Epoch 4 Batch 1760/3125 train_loss = 0.887
2019-04-09T11:32:14.538451: Epoch 4 Batch 1780/3125 train_loss = 0.851
2019-04-09T11:32:14.959239: Epoch 4 Batch 1800/3125 train_loss = 0.820
2019-04-09T11:32:15.374844: Epoch 4 Batch 1820/3125 train_loss = 0.807
2019-04-09T11:32:15.787555: Epoch 4 Batch 1840/3125 train_loss = 0.903
2019-04-09T11:32:16.206090: Epoch 4 Batch 1860/3125 train_loss = 0.977
2019-04-09T11:32:16.620547: Epoch 4 Batch 1880/3125 train_loss = 0.887
2019-04-09T11:32:17.036185: Epoch 4 Batch 1900/3125 train_loss = 0.734
2019-04-09T11:32:17.454960: Epoch 4 Batch 1920/3125 train_loss = 0.883
2019-04-09T11:32:17.870896: Epoch 4 Batch 1940/3125 train_loss = 0.792
2019-04-09T11:32:18.287611: Epoch 4 Batch 1960/3125 train_loss = 0.756
2019-04-09T11:32:18.708944: Epoch 4 Batch 1980/3125 train_loss = 0.856
2019-04-09T11:32:19.124550: Epoch 4 Batch 2000/3125 train_loss = 0.989
2019-04-09T11:32:19.539524: Epoch 4 Batch 2020/3125 train_loss = 0.987
2019-04-09T11:32:19.955392: Epoch 4 Batch 2040/3125 train_loss = 0.793
2019-04-09T11:32:20.373002: Epoch 4 Batch 2060/3125 train_loss = 0.851
2019-04-09T11:32:20.788365: Epoch 4 Batch 2080/3125 train_loss = 0.980
2019-04-09T11:32:21.207642: Epoch 4 Batch 2100/3125 train_loss = 0.782
2019-04-09T11:32:21.628621: Epoch 4 Batch 2120/3125 train_loss = 0.808
2019-04-09T11:32:22.042255: Epoch 4 Batch 2140/3125 train_loss = 0.840
2019-04-09T11:32:22.456976: Epoch 4 Batch 2160/3125 train_loss = 0.829
2019-04-09T11:32:22.867969: Epoch 4 Batch 2180/3125 train_loss = 0.917
2019-04-09T11:32:23.281501: Epoch 4 Batch 2200/3125 train_loss = 0.803
2019-04-09T11:32:23.696260: Epoch 4 Batch 2220/3125 train_loss = 0.832
2019-04-09T11:32:24.112367: Epoch 4 Batch 2240/3125 train_loss = 0.797
2019-04-09T11:32:24.528127: Epoch 4 Batch 2260/3125 train_loss = 0.872
2019-04-09T11:32:24.944427: Epoch 4 Batch 2280/3125 train_loss = 0.880
2019-04-09T11:32:25.362539: Epoch 4 Batch 2300/3125 train_loss = 0.847
2019-04-09T11:32:25.776624: Epoch 4 Batch 2320/3125 train_loss = 0.908
2019-04-09T11:32:26.191315: Epoch 4 Batch 2340/3125 train_loss = 0.849
2019-04-09T11:32:26.607493: Epoch 4 Batch 2360/3125 train_loss = 0.881
2019-04-09T11:32:27.021723: Epoch 4 Batch 2380/3125 train_loss = 0.835
2019-04-09T11:32:27.440410: Epoch 4 Batch 2400/3125 train_loss = 0.915
2019-04-09T11:32:27.850694: Epoch 4 Batch 2420/3125 train_loss = 0.794
2019-04-09T11:32:28.265448: Epoch 4 Batch 2440/3125 train_loss = 0.800
2019-04-09T11:32:28.684222: Epoch 4 Batch 2460/3125 train_loss = 0.852
2019-04-09T11:32:29.103336: Epoch 4 Batch 2480/3125 train_loss = 0.954
2019-04-09T11:32:29.520448: Epoch 4 Batch 2500/3125 train_loss = 0.811
2019-04-09T11:32:29.941087: Epoch 4 Batch 2520/3125 train_loss = 0.885
2019-04-09T11:32:30.357195: Epoch 4 Batch 2540/3125 train_loss = 0.845
2019-04-09T11:32:30.780301: Epoch 4 Batch 2560/3125 train_loss = 0.665
2019-04-09T11:32:31.195065: Epoch 4 Batch 2580/3125 train_loss = 0.825
2019-04-09T11:32:31.604654: Epoch 4 Batch 2600/3125 train_loss = 0.868
2019-04-09T11:32:32.018648: Epoch 4 Batch 2620/3125 train_loss = 0.813
2019-04-09T11:32:32.435706: Epoch 4 Batch 2640/3125 train_loss = 0.826
2019-04-09T11:32:32.853230: Epoch 4 Batch 2660/3125 train_loss = 1.017
2019-04-09T11:32:33.270841: Epoch 4 Batch 2680/3125 train_loss = 0.769
2019-04-09T11:32:33.692010: Epoch 4 Batch 2700/3125 train_loss = 0.922
2019-04-09T11:32:34.136192: Epoch 4 Batch 2720/3125 train_loss = 0.796
2019-04-09T11:32:34.551979: Epoch 4 Batch 2740/3125 train_loss = 0.870
2019-04-09T11:32:34.968683: Epoch 4 Batch 2760/3125 train_loss = 0.799
2019-04-09T11:32:35.385795: Epoch 4 Batch 2780/3125 train_loss = 0.842
2019-04-09T11:32:35.803009: Epoch 4 Batch 2800/3125 train_loss = 1.050
2019-04-09T11:32:36.220554: Epoch 4 Batch 2820/3125 train_loss = 1.034
2019-04-09T11:32:36.638668: Epoch 4 Batch 2840/3125 train_loss = 0.822
2019-04-09T11:32:37.057100: Epoch 4 Batch 2860/3125 train_loss = 0.789
2019-04-09T11:32:37.477429: Epoch 4 Batch 2880/3125 train_loss = 0.858
2019-04-09T11:32:37.894122: Epoch 4 Batch 2900/3125 train_loss = 0.833
2019-04-09T11:32:38.309463: Epoch 4 Batch 2920/3125 train_loss = 0.849
2019-04-09T11:32:38.727701: Epoch 4 Batch 2940/3125 train_loss = 0.879
2019-04-09T11:32:39.142808: Epoch 4 Batch 2960/3125 train_loss = 0.877
2019-04-09T11:32:39.560118: Epoch 4 Batch 2980/3125 train_loss = 0.827
2019-04-09T11:32:39.978247: Epoch 4 Batch 3000/3125 train_loss = 0.920
2019-04-09T11:32:40.396863: Epoch 4 Batch 3020/3125 train_loss = 1.001
2019-04-09T11:32:40.812059: Epoch 4 Batch 3040/3125 train_loss = 0.956
2019-04-09T11:32:41.228167: Epoch 4 Batch 3060/3125 train_loss = 0.814
2019-04-09T11:32:41.643774: Epoch 4 Batch 3080/3125 train_loss = 1.017
2019-04-09T11:32:42.059833: Epoch 4 Batch 3100/3125 train_loss = 1.032
2019-04-09T11:32:42.478235: Epoch 4 Batch 3120/3125 train_loss = 0.816
2019-04-09T11:32:42.674176: Epoch 4 Batch 16/781 test_loss = 0.830
2019-04-09T11:32:42.806027: Epoch 4 Batch 36/781 test_loss = 0.903
2019-04-09T11:32:42.936875: Epoch 4 Batch 56/781 test_loss = 0.934
2019-04-09T11:32:43.067222: Epoch 4 Batch 76/781 test_loss = 0.974
2019-04-09T11:32:43.197569: Epoch 4 Batch 96/781 test_loss = 1.000
2019-04-09T11:32:43.326913: Epoch 4 Batch 116/781 test_loss = 0.887
2019-04-09T11:32:43.457535: Epoch 4 Batch 136/781 test_loss = 0.811
2019-04-09T11:32:43.588383: Epoch 4 Batch 156/781 test_loss = 0.876
2019-04-09T11:32:43.716224: Epoch 4 Batch 176/781 test_loss = 0.865
2019-04-09T11:32:43.846583: Epoch 4 Batch 196/781 test_loss = 0.786
2019-04-09T11:32:43.975413: Epoch 4 Batch 216/781 test_loss = 0.974
2019-04-09T11:32:44.105258: Epoch 4 Batch 236/781 test_loss = 0.793
2019-04-09T11:32:44.235605: Epoch 4 Batch 256/781 test_loss = 0.827
2019-04-09T11:32:44.367456: Epoch 4 Batch 276/781 test_loss = 1.097
2019-04-09T11:32:44.496146: Epoch 4 Batch 296/781 test_loss = 0.813
2019-04-09T11:32:44.625489: Epoch 4 Batch 316/781 test_loss = 0.820
2019-04-09T11:32:44.754834: Epoch 4 Batch 336/781 test_loss = 0.760
2019-04-09T11:32:44.884178: Epoch 4 Batch 356/781 test_loss = 0.885
2019-04-09T11:32:45.013021: Epoch 4 Batch 376/781 test_loss = 0.872
2019-04-09T11:32:45.141362: Epoch 4 Batch 396/781 test_loss = 0.807
2019-04-09T11:32:45.273213: Epoch 4 Batch 416/781 test_loss = 0.935
2019-04-09T11:32:45.402058: Epoch 4 Batch 436/781 test_loss = 0.955
2019-04-09T11:32:45.533244: Epoch 4 Batch 456/781 test_loss = 0.735
2019-04-09T11:32:45.666098: Epoch 4 Batch 476/781 test_loss = 0.931
2019-04-09T11:32:45.795442: Epoch 4 Batch 496/781 test_loss = 0.966
2019-04-09T11:32:45.925789: Epoch 4 Batch 516/781 test_loss = 0.760
2019-04-09T11:32:46.054130: Epoch 4 Batch 536/781 test_loss = 0.990
2019-04-09T11:32:46.183474: Epoch 4 Batch 556/781 test_loss = 0.868
2019-04-09T11:32:46.312818: Epoch 4 Batch 576/781 test_loss = 0.940
2019-04-09T11:32:46.441522: Epoch 4 Batch 596/781 test_loss = 0.959
2019-04-09T11:32:46.571869: Epoch 4 Batch 616/781 test_loss = 0.930
2019-04-09T11:32:46.702216: Epoch 4 Batch 636/781 test_loss = 0.809
2019-04-09T11:32:46.831560: Epoch 4 Batch 656/781 test_loss = 0.876
2019-04-09T11:32:46.960904: Epoch 4 Batch 676/781 test_loss = 1.057
2019-04-09T11:32:47.092254: Epoch 4 Batch 696/781 test_loss = 0.856
2019-04-09T11:32:47.222600: Epoch 4 Batch 716/781 test_loss = 0.852
2019-04-09T11:32:47.351944: Epoch 4 Batch 736/781 test_loss = 1.075
2019-04-09T11:32:47.480286: Epoch 4 Batch 756/781 test_loss = 0.809
2019-04-09T11:32:47.610131: Epoch 4 Batch 776/781 test_loss = 0.753
Model Trained and Saved
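The per-batch losses printed above are noisy, so it can help to average them per epoch. The following is a minimal sketch that assumes the log is available as plain text in the format shown above. The helper name `mean_loss_per_epoch` and the regular expression are illustrative and not part of the original project.

```python
import re
from collections import defaultdict

# Matches lines such as
# "2019-04-09T11:32:47.610131: Epoch 4 Batch 776/781 test_loss = 0.753"
LOG_LINE = re.compile(r"Epoch\s+(\d+)\s+Batch\s+\d+/\d+\s+(train|test)_loss = ([\d.]+)")

def mean_loss_per_epoch(log_text):
    """Return {(epoch, phase): mean loss} for every epoch/phase found in log_text."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for match in LOG_LINE.finditer(log_text):
        key = (int(match.group(1)), match.group(2))
        sums[key] += float(match.group(3))
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Three lines copied from the log above, used only as sample input.
sample = """\
2019-04-09T11:32:47.351944: Epoch 4 Batch 736/781 test_loss = 1.075
2019-04-09T11:32:47.480286: Epoch 4 Batch 756/781 test_loss = 0.809
2019-04-09T11:32:47.610131: Epoch 4 Batch 776/781 test_loss = 0.753
"""
for (epoch, phase), value in sorted(mean_loss_per_epoch(sample).items()):
    print(f"epoch {epoch} {phase}: {value:.3f}")
```

Passing the full log text instead of `sample` would give one averaged train value and one averaged test value per epoch, which is easier to compare than individual batches.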
View the visualization in TensorBoard
tensorboard --logdir=/PATH_TO_CODE/runs/1513402825/summaries/
Save parameters
Save save_dir so that it can be used when generating predictions.
save_params(save_dir)
load_dir = load_params()
Plot the training loss
plt.plot(losses['train'], label='Training loss')
plt.legend()
_ = plt.ylim()
Plot the test loss
With more training iterations, the downward trend would become more apparent.
plt.plot(losses['test'], label='Test loss')
plt.legend()
_ = plt.ylim()
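Because the per-batch loss fluctuates a lot, the trend is easier to see on a smoothed curve. Below is a minimal sketch of a simple moving average, assuming the losses are a plain list of floats like `losses['test']`. The function name `moving_average`, the window size, and the synthetic data are all illustrative assumptions, not part of the original code.

```python
import numpy as np

def moving_average(values, window=50):
    """Smooth a 1-D sequence with a simple moving average of the given window."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fully overlaps the data
    return np.convolve(values, kernel, mode="valid")

# Made-up noisy curve standing in for losses['test']
rng = np.random.default_rng(0)
fake_losses = 1.0 - 0.0002 * np.arange(1000) + rng.normal(0, 0.1, size=1000)
smoothed = moving_average(fake_losses, window=50)
print(len(fake_losses), len(smoothed))
```

With the real data, something like `plt.plot(moving_average(losses['test']), label='Test loss (smoothed)')` could be drawn next to the raw curve.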
Get tensors
Use get_tensor_by_name() to fetch the tensors from loaded_graph; the recommendation functions below need them.
def get_tensors(loaded_graph):
    uid = loaded_graph.get_tensor_by_name("uid:0")
    user_gender = loaded_graph.get_tensor_by_name("user_gender:0")
    user_age = loaded_graph.get_tensor_by_name("user_age:0")
    user_job = loaded_graph.get_tensor_by_name("user_job:0")
    movie_id = loaded_graph.get_tensor_by_name("movie_id:0")
    movie_categories = loaded_graph.get_tensor_by_name("movie_categories:0")
    movie_titles = loaded_graph.get_tensor_by_name("movie_titles:0")
    targets = loaded_graph.get_tensor_by_name("targets:0")
    dropout_keep_prob = loaded_graph.get_tensor_by_name("dropout_keep_prob:0")
    lr = loaded_graph.get_tensor_by_name("LearningRate:0")
    # The two alternative ways of computing the predicted rating use different tensor names for inference
    # inference = loaded_graph.get_tensor_by_name("inference/inference/BiasAdd:0")
    # This was MatMul:0 before; the inference code changed, so the name here had to change too. Thanks to reader @清歌 for pointing this out.
    inference = loaded_graph.get_tensor_by_name("inference/ExpandDims:0")
    movie_combine_layer_flat = loaded_graph.get_tensor_by_name("movie_fc/Reshape:0")
    user_combine_layer_flat = loaded_graph.get_tensor_by_name("user_fc/Reshape:0")
    return uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, inference, movie_combine_layer_flat, user_combine_layer_flat
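The strings passed to get_tensor_by_name() follow the TensorFlow 1.x naming convention: the name of the op that produces the tensor (including any name scope such as `movie_fc/`), then a colon, then the index of that op's output. The following pure-Python sketch only illustrates how such a name decomposes. `split_tensor_name` is a hypothetical helper written for this explanation, not a TensorFlow function.

```python
def split_tensor_name(tensor_name):
    """Split 'scope/op_name:index' into (op_name_with_scope, output_index)."""
    op_name, sep, index = tensor_name.rpartition(":")
    if not sep or not index.isdigit():
        raise ValueError(f"not a tensor name of the form 'op_name:index': {tensor_name!r}")
    return op_name, int(index)

for name in ["uid:0", "movie_fc/Reshape:0", "inference/ExpandDims:0"]:
    op_name, output_index = split_tensor_name(name)
    # Everything before the last "/" is the name scope, if there is one
    scope = op_name.rsplit("/", 1)[0] if "/" in op_name else "(none)"
    print(f"{name!r}: op={op_name!r}, scope={scope!r}, output={output_index}")
```

This is also why the tensor name has to be updated whenever the op that computes the prediction changes, as the comment about `MatMul:0` in get_tensors notes.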
Rate a movie for a given user
This step runs a forward pass through the network to compute the predicted rating.
def rating_movie(user_id_val, movie_id_val):
    loaded_graph = tf.Graph()
    with tf.Session(graph=loaded_graph) as sess:
        # Load saved model
        loader = tf.train.import_meta_graph(load_dir + '.meta')
        loader.restore(sess, load_dir)
        # Get Tensors from loaded model
        uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, inference, _, __ = get_tensors(loaded_graph)
        categories = np.zeros([1, 18])
        categories[0] = movies.values[movieid2idx[movie_id_val]][2]
        titles = np.zeros([1, sentences_size])
        titles[0] = movies.values[movieid2idx[movie_id_val]][1]
        feed = {
            uid: np.reshape(users.values[user_id_val-1][0], [1, 1]),
            user_gender: np.reshape(users.values[user_id_val-1][1], [1, 1]),
            user_age: np.reshape(users.values[user_id_val-1][2], [1, 1]),
            user_job: np.reshape(users.values[user_id_val-1][3], [1, 1]),
            movie_id: np.reshape(movies.values[movieid2idx[movie_id_val]][0], [1, 1]),
            movie_categories: categories,  # x.take(6,1)
            movie_titles: titles,  # x.take(5,1)
            dropout_keep_prob: 1}
        # Get Prediction
        inference_val = sess.run([inference], feed)
        return (inference_val)
rating_movie(234, 1401)
INFO:tensorflow:Restoring parameters from ./save
[array([[3.1157281]], dtype=float32)]
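rating_movie finds a user with `user_id_val-1` but finds a movie through `movieid2idx`. The sketch below illustrates why the two lookups differ, assuming that movie IDs have gaps (the sample output later in this post shows row 3009 holding movie 3078). The toy `movie_ids` list, the helper names, and the way the map is built are illustrative assumptions, not necessarily how the project's own `movieid2idx` is constructed.

```python
# Hypothetical toy data: movie IDs with gaps, listed in table order.
movie_ids = [1, 2, 5, 9]

# Map each movie ID to its 0-based row in the table.
movieid2idx = {movie_id: row for row, movie_id in enumerate(movie_ids)}


def movie_row(movie_id):
    """Return the table row for a movie ID (raises KeyError for an unknown ID)."""
    return movieid2idx[movie_id]


def user_row(user_id):
    """User IDs are assumed contiguous and 1-based, so the row is simply ID - 1."""
    return user_id - 1


print(movie_row(5))   # movie 5 sits in row 2, not row 4
print(user_row(234))  # user 234 sits in row 233
```

Using `movie_id - 1` as a row index would only be safe if no movie ID were ever skipped, which is why a dictionary lookup is the more defensive choice for movies.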
Generating the movie feature matrix
Combine the trained movie features into a movie feature matrix and save it to disk.
loaded_graph = tf.Graph()
movie_matrics = []
with tf.Session(graph=loaded_graph) as sess:
    # Load the saved model
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)

    # Get tensors from the loaded model
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, _, movie_combine_layer_flat, __ = get_tensors(loaded_graph)

    # Run every movie through the movie half of the network
    for item in movies.values:
        categories = np.zeros([1, 18])
        categories[0] = item.take(2)

        titles = np.zeros([1, sentences_size])
        titles[0] = item.take(1)

        feed = {
            movie_id: np.reshape(item.take(0), [1, 1]),
            movie_categories: categories,
            movie_titles: titles,
            dropout_keep_prob: 1}

        movie_combine_layer_flat_val = sess.run([movie_combine_layer_flat], feed)
        movie_matrics.append(movie_combine_layer_flat_val)

# Stack the results into one 200-dimensional feature vector per movie, save, and reload
with open('movie_matrics.p', 'wb') as f:
    pickle.dump(np.array(movie_matrics).reshape(-1, 200), f)
with open('movie_matrics.p', mode='rb') as f:
    movie_matrics = pickle.load(f)
INFO:tensorflow:Restoring parameters from ./save
Generating the user feature matrix
Combine the trained user features into a user feature matrix and save it to disk.
loaded_graph = tf.Graph()
users_matrics = []
with tf.Session(graph=loaded_graph) as sess:
    # Load the saved model
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)

    # Get tensors from the loaded model
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, _, __, user_combine_layer_flat = get_tensors(loaded_graph)

    # Run every user through the user half of the network
    for item in users.values:
        feed = {
            uid: np.reshape(item.take(0), [1, 1]),
            user_gender: np.reshape(item.take(1), [1, 1]),
            user_age: np.reshape(item.take(2), [1, 1]),
            user_job: np.reshape(item.take(3), [1, 1]),
            dropout_keep_prob: 1}

        user_combine_layer_flat_val = sess.run([user_combine_layer_flat], feed)
        users_matrics.append(user_combine_layer_flat_val)

# Stack the results into one 200-dimensional feature vector per user, save, and reload
with open('users_matrics.p', 'wb') as f:
    pickle.dump(np.array(users_matrics).reshape(-1, 200), f)
with open('users_matrics.p', mode='rb') as f:
    users_matrics = pickle.load(f)
INFO:tensorflow:Restoring parameters from ./save
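The save-then-reload pattern used for both feature matrices can be tried in isolation. This is a minimal sketch in which a made-up 3 x 4 list of floats stands in for the real matrix with 200 columns; the helper names `save_matrix` and `load_matrix` and the file name are hypothetical.

```python
import os
import pickle
import tempfile


def save_matrix(matrix, path):
    """Pickle the matrix; the with-block closes the file even if dump fails."""
    with open(path, 'wb') as f:
        pickle.dump(matrix, f)


def load_matrix(path):
    """Read a matrix back from a pickle file."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Hypothetical stand-in for a feature matrix (one row per item).
toy_matrix = [[0.1, 0.2, 0.3, 0.4],
              [0.5, 0.6, 0.7, 0.8],
              [0.9, 1.0, 1.1, 1.2]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'toy_matrics.p')
    save_matrix(toy_matrix, path)
    restored = load_matrix(path)

print(restored == toy_matrix)
```

Keeping the matrices on disk means the recommendation functions can start from the pickled features without running every user and movie through the network again.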
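Recommending movies
Use the generated user feature matrix and movie feature matrix to recommend movies.
Recommending movies of the same type
The idea is to compute the cosine similarity between the feature vector of the movie currently being watched and the whole movie feature matrix, and to keep the top_k most similar movies. Some random selection is mixed in so that the recommendations differ slightly from one call to the next.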
def recommend_same_type_movie(movie_id_val, top_k = 20):
    loaded_graph = tf.Graph()
    with tf.Session(graph=loaded_graph) as sess:
        # Load the saved model
        loader = tf.train.import_meta_graph(load_dir + '.meta')
        loader.restore(sess, load_dir)

        # L2-normalize every row of the movie feature matrix
        norm_movie_matrics = tf.sqrt(tf.reduce_sum(tf.square(movie_matrics), 1, keep_dims=True))
        normalized_movie_matrics = movie_matrics / norm_movie_matrics

        # Recommend movies of the same type.
        # The query row is left unnormalized: that scales every similarity by the
        # same factor, so neither the ranking nor the normalized probabilities change.
        probs_embeddings = (movie_matrics[movieid2idx[movie_id_val]]).reshape([1, 200])
        probs_similarity = tf.matmul(probs_embeddings, tf.transpose(normalized_movie_matrics))
        sim = (probs_similarity.eval())

        print("The movie you watched: {}".format(movies_orig[movieid2idx[movie_id_val]]))
        print("Recommended for you:")
        p = np.squeeze(sim)
        # Zero out everything except the top_k scores, then turn them into sampling probabilities
        p[np.argsort(p)[:-top_k]] = 0
        p = p / np.sum(p)
        results = set()
        # Keep drawing until 5 distinct movies have been picked
        while len(results) != 5:
            c = np.random.choice(len(p), 1, p=p)[0]
            results.add(c)
        for val in (results):
            print(val)
            print(movies_orig[val])

        return results
recommend_same_type_movie(1401, 20)
INFO:tensorflow:Restoring parameters from ./save
The movie you watched: [1401 'Ghosts of Mississippi (1996)' 'Drama']
Recommended for you:
3009
[3078 'Liberty Heights (1999)' 'Drama']
1123
[1139 'Everything Relative (1996)' 'Drama']
2919
[2988 'Melvin and Howard (1980)' 'Drama']
3122
[3191 'Quarry, The (1998)' 'Drama']
405
[409 'Above the Rim (1994)' 'Drama']
{405, 1123, 2919, 3009, 3122}
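The similarity step in recommend_same_type_movie can be sketched without TensorFlow. The example below computes cosine similarity between one row and every row of a small matrix and returns the best-scoring row indices. It is written in plain Python so that it runs anywhere; the function names and the 2-dimensional toy vectors are assumptions made for illustration (the real feature vectors have 200 dimensions).

```python
import math


def cosine_similarities(matrix, query_row):
    """Cosine similarity between matrix[query_row] and every row of matrix."""
    def norm(vec):
        return math.sqrt(sum(x * x for x in vec))

    query = matrix[query_row]
    query_norm = norm(query)
    sims = []
    for vec in matrix:
        dot = sum(a * b for a, b in zip(query, vec))
        sims.append(dot / (query_norm * norm(vec)))
    return sims


def top_k_rows(scores, top_k):
    """Row indices of the top_k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


# Hypothetical 2-dimensional "movie features".
toy_movies = [[1.0, 0.0],
              [2.0, 0.1],
              [0.0, 1.0],
              [1.0, 1.0]]

sims = cosine_similarities(toy_movies, query_row=0)
print(top_k_rows(sims, top_k=2))
```

In this sketch the query movie scores 1.0 against itself, so it is always part of the candidate set unless it is filtered out explicitly.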
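Recommending movies you may like
The idea is to use the user's feature vector and the movie feature matrix to compute a rating for every movie, and to keep the top_k highest-rated ones. Some random selection is added here as well.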
def recommend_your_favorite_movie(user_id_val, top_k = 10):
    loaded_graph = tf.Graph()
    with tf.Session(graph=loaded_graph) as sess:
        # Load the saved model
        loader = tf.train.import_meta_graph(load_dir + '.meta')
        loader.restore(sess, load_dir)

        # Recommend movies you may like: score every movie as the dot product of
        # this user's feature vector (row user_id - 1) with the movie's feature vector
        probs_embeddings = (users_matrics[user_id_val-1]).reshape([1, 200])
        probs_similarity = tf.matmul(probs_embeddings, tf.transpose(movie_matrics))
        sim = (probs_similarity.eval())

        print("Recommended for you:")
        p = np.squeeze(sim)
        # Zero out everything except the top_k scores, then turn them into sampling probabilities
        p[np.argsort(p)[:-top_k]] = 0
        p = p / np.sum(p)
        results = set()
        # Keep drawing until 5 distinct movies have been picked
        while len(results) != 5:
            c = np.random.choice(len(p), 1, p=p)[0]
            results.add(c)
        for val in (results):
            print(val)
            print(movies_orig[val])

        return results
recommend_your_favorite_movie(234, 10)
INFO:tensorflow:Restoring parameters from ./save
Recommended for you:
523
[527 "Schindler's List (1993)" 'Drama|War']
910
[922 'Sunset Blvd. (a.k.a. Sunset Boulevard) (1950)' 'Film-Noir']
315
[318 'Shawshank Redemption, The (1994)' 'Drama']
892
[904 'Rear Window (1954)' 'Mystery|Thriller']
763
[773 'Touki Bouki (Journey of the Hyena) (1973)' 'Drama']
{315, 523, 763, 892, 910}
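The scoring and the "random selection" parts of recommend_your_favorite_movie can be sketched in plain Python. The example scores every movie as a dot product with the user vector, keeps the top_k rows, and draws distinct rows with probability proportional to their scores. The function names, the toy vectors, and the explicit check for positive scores are assumptions added for this sketch; the original code relies on NumPy and does not perform that check.

```python
import random


def score_all_movies(user_vec, movie_matrix):
    """Dot product of the user vector with every movie vector."""
    return [sum(u * m for u, m in zip(user_vec, movie_vec))
            for movie_vec in movie_matrix]


def sample_from_top_k(scores, top_k, n_picks, rng):
    """Draw n_picks distinct row indices from the top_k rows, weighted by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = order[:top_k]
    weights = [scores[i] for i in top]
    if any(w <= 0 for w in weights):
        raise ValueError("scores used as sampling weights must be positive")
    n_picks = min(n_picks, len(top))  # never ask for more rows than exist
    picked = set()
    while len(picked) < n_picks:
        picked.add(rng.choices(top, weights=weights, k=1)[0])
    return picked


# Hypothetical 2-dimensional features; the real vectors have 200 dimensions.
user_vec = [1.0, 2.0]
movie_matrix = [[0.5, 0.5], [1.0, 1.0], [0.1, 0.0], [2.0, 1.0], [0.3, 0.2]]

scores = score_all_movies(user_vec, movie_matrix)
picks = sample_from_top_k(scores, top_k=3, n_picks=2, rng=random.Random(0))
print(sorted(picks))
```

Capping `n_picks` at the number of candidates guarantees that the loop terminates, which a fixed target of 5 does not when top_k is smaller than 5.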
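Which other movies people who watched this movie also watched (liked)
- First pick the top_k users who like a given movie and get their user feature vectors.
- Then compute these users' ratings for every movie.
- Take each user's highest-rated movie as a recommendation.
- Random selection is added here as well.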
import random
def recommend_other_favorite_movie(movie_id_val, top_k = 20):
    loaded_graph = tf.Graph()
    with tf.Session(graph=loaded_graph) as sess:
        # Load the saved model
        loader = tf.train.import_meta_graph(load_dir + '.meta')
        loader.restore(sess, load_dir)

        # Score every user against this movie and keep the top_k highest-scoring users
        probs_movie_embeddings = (movie_matrics[movieid2idx[movie_id_val]]).reshape([1, 200])
        probs_user_favorite_similarity = tf.matmul(probs_movie_embeddings, tf.transpose(users_matrics))
        favorite_user_id = np.argsort(probs_user_favorite_similarity.eval())[0][-top_k:]

        print("The movie you watched: {}".format(movies_orig[movieid2idx[movie_id_val]]))

        # argsort yields 0-based row indices into users_matrics, which line up
        # with the rows of users_orig, so they are used directly as indices
        print("People who like this movie: {}".format(users_orig[favorite_user_id]))
        probs_users_embeddings = (users_matrics[favorite_user_id]).reshape([-1, 200])
        probs_similarity = tf.matmul(probs_users_embeddings, tf.transpose(movie_matrics))
        sim = (probs_similarity.eval())

        # Each selected user's single highest-scoring movie
        p = np.argmax(sim, 1)
        print("People who like this movie also like:")

        results = set()
        # Keep drawing until 5 distinct movies have been picked
        while len(results) != 5:
            c = p[random.randrange(top_k)]
            results.add(c)
        for val in (results):
            print(val)
            print(movies_orig[val])

        return results
recommend_other_favorite_movie(1401, 20)
INFO:tensorflow:Restoring parameters from ./save
The movie you watched: [1401 'Ghosts of Mississippi (1996)' 'Drama']
People who like this movie: [[1568 'F' 1 10]
[4814 'M' 18 14]
[5217 'M' 25 17]
[1745 'M' 45 0]
[1763 'M' 35 7]
[5861 'F' 50 1]
[493 'M' 50 7]
[3031 'M' 18 4]
[2144 'M' 18 0]
[1644 'M' 18 12]
[3833 'M' 25 1]
[5678 'M' 35 17]
[1701 'F' 25 4]
[3297 'M' 18 4]
[4800 'M' 18 4]
[1109 'M' 18 10]
[2496 'M' 50 1]
[100 'M' 35 17]
[2154 'M' 25 12]
[4085 'F' 25 6]]
People who like this movie also like:
1132
[1148 'Wrong Trousers, The (1993)' 'Animation|Comedy']
1133
[1149 'JLG/JLG - autoportrait de décembre (1994)' 'Documentary|Drama']
847
[858 'Godfather, The (1972)' 'Action|Crime|Drama']
763
[773 'Touki Bouki (Journey of the Hyena) (1973)' 'Drama']
1950
[2019
'Seven Samurai (The Magnificent Seven) (Shichinin no samurai) (1954)'
'Action|Drama']
{763, 847, 1132, 1133, 1950}
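The four steps of this recommendation can be sketched end to end in plain Python. The function name `also_liked`, the toy matrices, and the helper `dot` are assumptions for illustration. Unlike the original loop, which redraws until it has 5 distinct movies, this sketch samples uniformly from the distinct favorites and caps the number of picks, so it always terminates even when the selected users share fewer than 5 favorite movies.

```python
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def also_liked(movie_row, movie_matrix, user_matrix, top_k, n_picks, rng):
    """Return (fan rows, picked movie rows) for the movie in movie_row."""
    movie_vec = movie_matrix[movie_row]

    # Step 1: the top_k users whose feature vectors score highest for this movie.
    user_scores = [dot(movie_vec, user_vec) for user_vec in user_matrix]
    order = sorted(range(len(user_matrix)), key=lambda i: user_scores[i], reverse=True)
    fans = order[:top_k]

    # Steps 2 and 3: score every movie for each fan and keep the fan's best one.
    favorites = []
    for fan in fans:
        scores = [dot(user_matrix[fan], vec) for vec in movie_matrix]
        favorites.append(max(range(len(scores)), key=lambda j: scores[j]))

    # Step 4: random picks among the distinct favorites, capped at what exists.
    distinct = sorted(set(favorites))
    picks = set(rng.sample(distinct, min(n_picks, len(distinct))))
    return fans, picks


# Hypothetical 2-dimensional features; the real vectors have 200 dimensions.
toy_movies = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_users = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0], [0.5, 0.0]]

fans, picks = also_liked(0, toy_movies, toy_users, top_k=2, n_picks=5,
                         rng=random.Random(0))
print(fans, sorted(picks))
```

Here the fans are 0-based row indices, so they index the user matrix directly without any offset.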
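Conclusion
These are the common recommendation features implemented here: the network is trained as a regression problem, and the resulting user feature matrix and movie feature matrix are then used to make recommendations.
Further reading
If you are interested in personalized recommendation, the following material is worth a look:
That's all for today. Feedback is welcome!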