Paper Recommendation for October 28 (with Download Link)

Paper title:

Mastering the game of Go with deep neural networks and tree search

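Source: Nature

Why we recommend it:

This paper describes the techniques behind the Go program AlphaGo. Go is hard for AI for two main reasons: the search space is enormous, and board positions and moves are difficult to evaluate. AlphaGo uses two kinds of deep neural networks: value networks to evaluate board positions and policy networks to select the next move. The networks are trained with a combination of supervised learning and reinforcement learning, and are brought together through Monte Carlo tree search (MCTS). At the time of publication, AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the European champion 5 games to 0.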

Abstract

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
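To make the idea of combining policy networks, value networks, and Monte Carlo simulation a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names `select_move` and `mixed_leaf_value`, the constant `c_puct`, the mixing weight `lam`, and the dictionary layout are all hypothetical, and the network outputs are assumed to be supplied as plain numbers.

```python
import math


def select_move(stats, c_puct=1.0):
    """Pick the move with the highest Q + U score at one tree node.

    `stats` maps each move to a dict with (hypothetical) keys:
      "P": prior probability, assumed to come from a policy network
      "N": number of times the move has been visited in the search
      "W": sum of the values backed up through the move
    """
    total_visits = sum(s["N"] for s in stats.values())
    best_move, best_score = None, -math.inf
    for move, s in stats.items():
        # Q: average value observed so far (0 for an unvisited move).
        q = s["W"] / s["N"] if s["N"] > 0 else 0.0
        # U: exploration bonus, large for high-prior, rarely visited moves.
        u = c_puct * s["P"] * math.sqrt(total_visits) / (1 + s["N"])
        if q + u > best_score:
            best_move, best_score = move, q + u
    return best_move


def mixed_leaf_value(value_net_estimate, rollout_outcome, lam=0.5):
    """Blend a value-network estimate with a Monte Carlo rollout result."""
    return (1 - lam) * value_net_estimate + lam * rollout_outcome


# Toy node: "a" looks better so far, but "b" has barely been explored.
node = {
    "a": {"P": 0.6, "N": 10, "W": 5.0},
    "b": {"P": 0.4, "N": 1, "W": 0.2},
}
chosen = select_move(node)
leaf_value = mixed_leaf_value(0.2, 1.0, lam=0.5)
```

In this toy node the exploration bonus outweighs the higher average value of "a", so the rarely visited move "b" is chosen. As visit counts grow the bonus shrinks and the averaged values dominate the choice.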

Paper download link

https://www.aminer.cn/archive/mastering-the-game-of-go-with-deep-neural-networks-and-tree-search/56ab70cd0cf2c98bf5bc717a
