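AI100 Machine Learning Daily 2017-10-14
Produced by @好东西传送门, operated by @AI100. Past issues: http://ai100.com.cn
Subscribe: follow the WeChat official account AI100 (ID: rgznai100) and reply "机器学习日报" to be added to the daily digest group.
Today's Highlights (5)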
爱可可-爱生活 2017-10-14 06:52
Natural Language Processing
[A curated list of natural language processing (NLP) tasks and reference resources] "Natural Language Processing Tasks and References" by Kyubyong. GitHub: https://github.com/Kyubyong/nlp_tasks
wx:Coldwings 2017-10-15 03:27
Deep Learning, Algorithms, Applications, Resources, Natural Language Processing, Max Iterations, Max Pooling, Python, Yann Lecun, Code, Visualization, Courses, Neural Networks, 王蓁, Prediction
"Dev | How to supervise your TF training through WeChat"
AI科技评论 editor's note: this article was written by Coldwings and is published by AI科技评论 with the author's permission.

Some time ago I answered the question "While a machine learning model is training, which takes anywhere from tens of minutes to a few hours, what does everyone do while waiting for the experiment?" and mentioned that you can use WeChat to keep an eye on training, so there is no need to sit and watch it. I did not expect the answer to be so popular.

My answer under the original question was as follows:

I don't know how many of you are writing Python under TF/keras/chainer/mxnet and similar frameworks, but this is Python, after all. Use itchat: set up a WeChat account and add yourself as a friend (or simply message yourself), and have the training progress sent to you as messages along the way. If you have visualizations, send the plots along too. Then you can sleep, go shopping, go on a date, or write answers in peace. Frankly, even simple parameter adjustments can be made from your phone in the same way.

Of course, this can be made more complete. The most reliable approach is to build a proper HTTP service or an RPC endpoint, but that is often too much trouble. In the spirit of keeping things simple and efficient, a few lines of code that do the job are best, and hooking into WeChat or the web is a good choice. If you only need to look at the results, TensorBoard is fine; if you want custom operations, you have to build something yourself. A web page built with echat.js or a WeChat service built with itchat are both decent options.

The article itself follows.

Here is a worked example. We take the program from the TensorFlow examples that uses a CNN on MNIST and make a few small changes. First, the finished code:

    #!/usr/bin/env python
    # coding: utf-8

    '''
    A Convolutional Network implementation example using TensorFlow library.
    This example is using the MNIST database of handwritten digits
    (http://yann.lecun.com/exdb/mnist/)

    Author: Aymeric Damien
    Project: https://github.com/aymericdamien/TensorFlow-Examples/

    Add a itchat controller with multi thread
    '''

    from __future__ import print_function

    import tensorflow as tf

    # Import MNIST data
    from tensorflow.examples.tutorials.mnist import input_data

    # Import itchat & threading
    import itchat
    import threading

    # Create a running status flag
    lock = threading.Lock()
    running = False

    # Parameters
    learning_rate = 0.001
    training_iters = 200000
    batch_size = 128
    display_step = 10


    def nn_train(wechat_name, param):
        global lock, running
        # Lock
        with lock:
            running = True

        # mnist data reading
        mnist = input_data.read_data_sets("data/", one_hot=True)

        # Parameters
        # learning_rate = 0.001
        # training_iters = 200000
        # batch_size = 128
        # display_step = 10
        learning_rate, training_iters, batch_size, display_step = param

        # Network Parameters
        n_input = 784  # MNIST data input (img shape: 28*28)
        n_classes = 10  # MNIST total classes (0-9 digits)
        dropout = 0.75  # Dropout, probability to keep units

        # tf Graph input
        x = tf.placeholder(tf.float32, [None, n_input])
        y = tf.placeholder(tf.float32, [None, n_classes])
        keep_prob = tf.placeholder(tf.float32)  # dropout (keep probability)

        # Create some wrappers for simplicity
        def conv2d(x, W, b, strides=1):
            # Conv2D wrapper, with bias and relu activation
            x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
            x = tf.nn.bias_add(x, b)
            return tf.nn.relu(x)

        def maxpool2d(x, k=2):
            # MaxPool2D wrapper
            return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1],
                                  padding='SAME')

        # Create model
        def conv_net(x, weights, biases, dropout):
            # Reshape input picture
            x = tf.reshape(x, shape=[-1, 28, 28, 1])

            # Convolution Layer
            conv1 = conv2d(x, weights['wc1'], biases['bc1'])
            # Max Pooling (down-sampling)
            conv1 = maxpool2d(conv1, k=2)

            # Convolution Layer
            conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
            # Max Pooling (down-sampling)
            conv2 = maxpool2d(conv2, k=2)

            # Fully connected layer
            # Reshape conv2 output to fit fully connected layer input
            fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
            fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
            fc1 = tf.nn.relu(fc1)
            # Apply Dropout
            fc1 = tf.nn.dropout(fc1, dropout)

            # Output, class prediction
            out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
            return out

        # Store layers weight & bias
        weights = {
            # 5x5 conv, 1 input, 32 outputs
            'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
            # 5x5 conv, 32 inputs, 64 outputs
            'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
            # fully connected, 7*7*64 inputs, 1024 outputs
            'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
            # 1024 inputs, 10 outputs (class prediction)
            'out': tf.Variable(tf.random_normal([1024, n_classes]))
        }

        biases = {
            'bc1': tf.Variable(tf.random_normal([32])),
            'bc2': tf.Variable(tf.random_normal([64])),
            'bd1': tf.Variable(tf.random_normal([1024])),
            'out': tf.Variable(tf.random_normal([n_classes]))
        }

        # Construct model
        pred = conv_net(x, weights, biases, keep_prob)

        # Define loss and optimizer
        cost = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
        optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

        # Evaluate model
        correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

        # Initializing the variables
        init = tf.global_variables_initializer()

        # Launch the graph
        with tf.Session() as sess:
            sess.run(init)
            step = 1
            # Keep training until reach max iterations
            print('Wait for lock')
            with lock:
                run_state = running
            print('Start')
            while step * batch_size < training_iters and run_state:
                batch_x, batch_y = mnist.train.next_batch(batch_size)
                # Run optimization op (backprop)
                sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                               keep_prob: dropout})
                if step % display_step == 0:
                    # Calculate batch loss and accuracy
                    loss, acc = sess.run([cost, accuracy],
                                         feed_dict={x: batch_x, y: batch_y,
                                                    keep_prob: 1.})
                    print("Iter " + str(step*batch_size) + ", Minibatch Loss= " +
                          "{:.6f}".format(loss) + ", Training Accuracy= " +
                          "{:.5f}".format(acc))
                    itchat.send("Iter " + str(step*batch_size) +
                                ", Minibatch Loss= " + "{:.6f}".format(loss) +
                                ", Training Accuracy= " + "{:.5f}".format(acc),
                                wechat_name)
                step += 1
                # Re-read the switch after every step so a stop request takes effect
                with lock:
                    run_state = running
            print("Optimization Finished!")
            itchat.send("Optimization Finished!", wechat_name)

            # Calculate accuracy for 256 mnist test images (evaluated once, reported twice)
            test_acc = sess.run(accuracy,
                                feed_dict={x: mnist.test.images[:256],
                                           y: mnist.test.labels[:256],
                                           keep_prob: 1.})
            print("Testing Accuracy:", test_acc)
            itchat.send("Testing Accuracy: %s" % test_acc, wechat_name)

        with lock:
            running = False


    @itchat.msg_register([itchat.content.TEXT])
    def chat_trigger(msg):
        global lock, running, learning_rate, training_iters, batch_size, display_step
        if msg['Text'] == u'开始':  # "start"
            print('Starting')
            with lock:
                run_state = running
            if not run_state:
                try:
                    threading.Thread(target=nn_train,
                                     args=(msg['FromUserName'],
                                           (learning_rate, training_iters,
                                            batch_size, display_step))).start()
                except:
                    msg.reply('Running')
        elif msg['Text'] == u'停止':  # "stop"
            print('Stopping')
            with lock:
                running = False
        elif msg['Text'] == u'参数':  # "parameters"
            itchat.send('lr=%f, ti=%d, bs=%d, ds=%d' % (learning_rate, training_iters,
                                                        batch_size, display_step),
                        msg['FromUserName'])
        else:
            # Any other message is treated as "<key> <value>", e.g. "lr 0.01"
            try:
                param = msg['Text'].split()
                key, value = param
                print(key, value)
                if key == 'lr':
                    learning_rate = float(value)
                elif key == 'ti':
                    training_iters = int(value)
                elif key == 'bs':
                    batch_size = int(value)
                elif key == 'ds':
                    display_step = int(value)
            except:
                pass


    if __name__ == '__main__':
        itchat.auto_login(hotReload=True)
        itchat.run()

The main changes I made in this code are:

0. Imported itchat and threading.
1. Moved the network-construction and training part of the original script into a function, nn_train:

    def nn_train(wechat_name, param):
        # body identical to nn_train in the full listing above

Most of this is the same as the original code. The differences: every print is now accompanied by an itchat.send that forwards the same log line; a lock-protected state variable, running, serves as the run switch; and some of the parameters are passed in through the function argument.

Next, I wrote an itchat handler:

    @itchat.msg_register([itchat.content.TEXT])
    def chat_trigger(msg):
        global lock, running, learning_rate, training_iters, batch_size, display_step
        if msg['Text'] == u'开始':
            print('Starting')
            with lock:
                run_state = running
            if not run_state:
                try:
                    threading.Thread(target=nn_train,
                                     args=(msg['FromUserName'],
                                           (learning_rate, training_iters,
                                            batch_size, display_step))).start()
                except:
                    msg.reply('Running')

When a WeChat message with the content "开始" (start) arrives, the handler runs the training function. To avoid blocking, training runs in a separate thread.

Finally, the main flow of the script logs in to WeChat through itchat and starts the itchat service, which gives us basic control:

    if __name__ == '__main__':
        itchat.auto_login(hotReload=True)
        itchat.run()

But that is not enough: I also want some control over the process and the ability to change parameters. So chat_trigger gets three more branches after the "开始" branch, which stays unchanged:

        elif msg['Text'] == u'停止':
            print('Stopping')
            with lock:
                running = False
        elif msg['Text'] == u'参数':
            itchat.send('lr=%f, ti=%d, bs=%d, ds=%d' % (learning_rate, training_iters,
                                                        batch_size, display_step),
                        msg['FromUserName'])
        else:
            try:
                param = msg['Text'].split()
                key, value = param
                print(key, value)
                if key == 'lr':
                    learning_rate = float(value)
                elif key == 'ti':
                    training_iters = int(value)
                elif key == 'bs':
                    batch_size = int(value)
                elif key == 'ds':
                    display_step = int(value)
            except:
                pass

With this we can stop in the middle of an epoch (nn_train checks the running flag after every step to decide whether to stop), and we can adjust learning_rate and the other three parameters before training starts. Because the parameters are copied into nn_train when the thread is created, changes sent during a run only apply to the next run. It really is that simple.

via: http://mp.weixin.qq.com/s?__biz= ... e=0#wechat_redirect
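The start/stop switch in the article above comes down to a worker thread that polls a shared flag between steps. The following is a minimal sketch of that pattern on its own, with no TensorFlow or itchat dependency. It swaps the author's Lock-plus-boolean for `threading.Event`, which is an assumption of this sketch and not what the article uses; `train_loop`, `notify`, and `run_demo` are hypothetical names, and the `time.sleep` call merely stands in for one optimization step.

```python
import threading
import time


def train_loop(stop_event, max_steps, notify, display_step=10):
    """Hypothetical stand-in for the training loop; returns the steps completed."""
    step = 0
    # The flag is checked once per step, so a stop request takes effect
    # at the next step boundary rather than immediately.
    while step < max_steps and not stop_event.is_set():
        time.sleep(0.001)  # placeholder for one optimization step
        step += 1
        if step % display_step == 0:
            notify("step %d" % step)  # the article calls itchat.send here
    notify("finished after %d steps" % step)
    return step


def run_demo():
    """Start the loop in a thread, request a stop shortly after, and wait."""
    messages = []
    result = {}
    stop_event = threading.Event()

    def target():
        result["steps"] = train_loop(stop_event, 100000, messages.append)

    worker = threading.Thread(target=target)
    worker.start()
    time.sleep(0.05)   # let a few steps run
    stop_event.set()   # plays the role of the "停止" message
    worker.join()
    return result["steps"], messages


if __name__ == "__main__":
    steps, messages = run_demo()
    print(steps, messages[-1])
```

An `Event` carries its own internal locking, so the reader and writer do not need a separate `Lock`; the article's explicit lock achieves the same effect for a plain boolean.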
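南京轻搜 2017-10-14 10:15
Transfer Learning
[OpenAI competitive self-play: complex agents from simple environments] OpenAI recently reported that competitive multi-agent training through self-play can produce behaviors far more complex than the environment itself. Building on its Dota 2 self-play results, the work further examines the characteristics and advantages of the mechanism. OpenAI also indicates that self-play helps achieve transfer learning and will become a core part of AI systems... Full text: http://m.weibo.cn/5897818869/4162695060348394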
爱可可-爱生活 2017-10-14 08:57
Lessons Learned, Algorithms, Emil Wallner, Blog, Neural Networks
[Colorizing black-and-white photos with neural networks] "Colorizing B&W Photos with Neural Networks" by Emil Wallner http://t.cn/RO0sYT1
爱可可-爱生活 2017-10-09 06:18
Jason Brownlee
[A gentle introduction to the bag-of-words model] "A Gentle Introduction to the Bag-of-Words Model | Machine Learning Mastery" by Jason Brownlee http://t.cn/ROqUz2B
爱可可-爱生活 reposted on 2017-10-14 09:31
Chinese translation, "词袋模型的通俗介绍", via @阿里云云栖社区 http://t.cn/ROO2ZeE
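For readers who have not met the bag-of-words model named in the item above, here is a minimal sketch of the idea: each document becomes a vector of word counts over a fixed vocabulary, and word order is discarded. This is an illustration written for this digest, not code from the linked article; the whitespace tokenizer and the function names `build_vocabulary` and `bag_of_words` are assumptions of the sketch.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(document, vocabulary):
    """Return a count vector; word order is lost and unknown words are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


docs = ["the cat sat", "the cat saw the dog"]
vocab = build_vocabulary(docs)          # columns: cat, dog, sat, saw, the
vectors = [bag_of_words(d, vocab) for d in docs]

if __name__ == "__main__":
    for doc, vec in zip(docs, vectors):
        print(doc, vec)
```

Real pipelines usually add tokenization rules, stop-word removal, and weighting such as TF-IDF on top of these raw counts.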
Latest Updates
wx: 2017-10-14 21:12
Conferences & Events, Deep Learning, Vision, Algorithms, Eric Jang, ICLR, 常佩琦, Google, Industry News, Conferences, Neural Networks
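"[Reading a hotly debated paper] A Google engineer answers why Bengio's deep learning paper matters"
[AI WORLD 2017 World AI Conference: 25 days to go] Tickets: http://www.huodongxing.com/event/2405852054900?td=4231978320026 Conference site: http://www.aiworld2017.com
Compiled by 新智元. Source: Quora. Translator: 常佩琦

[新智元 introduction] "Why is the paper 'Understanding deep learning requires rethinking generalization' so important?" The question was first discussed on Quora. This piece is the answer given by Google Brain engineer Eric Jang.

In 2017, many machine learning researchers are trying to answer one question: how do deep neural networks work, and why do they solve practical problems so well?

Even for people who care little about theoretical analysis and algebra, understanding how deep learning works helps us apply it more widely in real life.

The paper "Understanding deep learning requires rethinking generalization" demonstrates some interesting properties of neural networks. In particular, neural networks have enough capacity to memorize randomized inputs. In the SGD optimization setting, training-set error can be driven all the way to zero, even on ImageNet-sized datasets.

This runs against the classic narrative: "Deep learning miraculously discovers low-, mid-, and high-level features, much as the V1 system of the mammalian brain does when it learns to compress data."

Between 2012 and 2015, many researchers used "inductive bias" to explain how deep networks reduce test error, implying some form of generalization.

But if a deep network can memorize random data, then inductive biases (for example convolution/pooling architectures, or regularizers such as dropout and batchnorm) are also compatible with memorization, and so cannot fully explain generalization.

Part of the reason the paper drew so much attention is that it received a "perfect score" in the ICLR reviews and the ICLR 2017 best paper award. That set off heated discussion, so there is something of a feedback loop. I think it is a good paper because it asks a question nobody had asked and provides strong experimental evidence for some very interesting results.

However, I think it will take another one to two years for the deep learning community to reach agreement on whether a paper is important, especially for conclusions that are empirical rather than analytical.

Tapabrata Ghosh points out that some researchers believe that, although deep networks are able to memorize, this may not be what they do in practice. The reason is that "memorizing" a semantically meaningful dataset takes less time than memorizing random data, which suggests that deep networks exploit semantic regularities already present in the training set.

I think Zhang et al. 2016 may become an important signpost for understanding how deep networks work, but it does not solve the problem of generalization in deep networks. Someone may soon challenge the paper's claims. That is the nature of experimental science.

In short, the paper is considered very important because it shows that deep learning learns random datasets by memorizing them. It then raises the question of how deep networks learn non-random datasets.

Here is my own view on generalization:

High-capacity parametric models with well-behaved optimization objectives soak up data like a sponge. I think the optimization objectives of deep networks are very "lazy" but powerful: given the right model biases, compatible with the input data, a deep network can learn a semantically meaningful hierarchy of features. But when that is inconvenient for the optimizer, the network will fit the objective by simply memorizing the data.

What we lack right now are methods for controlling the degree of memorization versus generalization, beyond blunt tools such as weight regularization and dropout.

Original: https://www.forbes.com/sites/quo ... -date/#78f1c9b55531
via: http://mp.weixin.qq.com/s?__biz= ... e=0#wechat_redirect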
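The gap between memorization and generalization discussed in the answer above can be seen in a toy setting. The sketch below is not the paper's experiment and involves no neural network: it uses a 1-nearest-neighbour classifier, chosen here only because it is a high-capacity model that fits in a few lines of standard-library Python. Inputs and labels are drawn independently at random, so there is nothing to generalize from; the data sizes, seed, and function names are assumptions of the sketch.

```python
import random


def nearest_label(x, train_x, train_y):
    """Predict the label of the closest training point (squared Euclidean)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, train_x[i])))
    return train_y[best]


def accuracy(xs, ys, train_x, train_y):
    """Fraction of points in (xs, ys) whose predicted label matches."""
    hits = sum(nearest_label(x, train_x, train_y) == y for x, y in zip(xs, ys))
    return hits / len(xs)


rng = random.Random(0)


def random_points(n, dim=5):
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


# Labels are drawn independently of the inputs: pure noise.
train_x = random_points(200)
train_y = [rng.randrange(10) for _ in train_x]
test_x = random_points(200)
test_y = [rng.randrange(10) for _ in test_x]

train_acc = accuracy(train_x, train_y, train_x, train_y)  # memorized perfectly
test_acc = accuracy(test_x, test_y, train_x, train_y)     # near chance (about 0.1)

if __name__ == "__main__":
    print("train accuracy:", train_acc)
    print("test accuracy:", test_acc)
```

Perfect training accuracy here says nothing about test accuracy, which is the same observation the paper makes for deep networks trained on randomized labels.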
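机器之心Synced 2017-10-14 18:11
Vision, Python
[Luminoth: an open-source computer vision toolkit built on TensorFlow] Luminoth is an open-source computer vision toolkit that currently supports object detection and image classification, with more to come. It is built in Python on top of TensorFlow and Sonnet, and is designed to be easy to use, easy to train, and easy to interpret. http://t.cn/ROlf7Hj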
网路冷眼 2017-10-14 09:18
Deep Learning, Algorithms, Neural Networks
[New Theory Cracks Open the Black Box of Deep Neural Networks] http://t.cn/ROOZQE3
python爱好者 2017-10-14 09:12
Beginner, Resources, PDF, Educational Sites
Lecture notes on the mathematical foundations of machine learning: 18.657 Mathematics of Machine Learning http://t.cn/ROO71DC
爱可可-爱生活 2017-10-14 07:27
Resources, Robert Tibshirani, Trevor Hastie, Educational Sites, Courses, Statistics
[Two-day course on "Statistical Learning and Data Mining"] "Statistical Learning and Data Mining - November 2-3, 2017" by Trevor Hastie, Robert Tibshirani [Stanford University] http://t.cn/RO0uaoV
爱可可-爱生活 2017-10-14 05:38
Papers
《Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo》R Ge, H Lee, A Risteski [Duke University & Princeton University & MIT] (2017) http://t.cn/RO0Qpw2
爱可可-爱生活 2017-10-14 05:29
Algorithms, SVM, Papers
《Self-Taught Support Vector Machine》P Razzaghi [Institute for Advanced Studies in Basic Sciences & Institute for Research in Fundamental Sciences] (2017) http://t.cn/RO0QMmF