
2018 Geoffrey Hinton

Posted on 2022-4-23 23:31:15


GEOFFREY E HINTON
Canada – 2018
For conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.

When Geoffrey Everest Hinton decided to study science he was following in the tradition of ancestors such as George Boole, the Victorian logician whose work underpins the study of computer science and probability. Geoffrey’s great grandfather, the mathematician and bigamist Charles Hinton, coined the word “tesseract” and popularized the idea of higher dimensions, while his father, Howard Everest Hinton, was a distinguished entomologist. Their shared middle name, Everest, celebrates a relative after whom the mountain was also named (to commemorate his service as Surveyor General of India).

Having begun his time at Cambridge University with plans to study physiology and physics, before dabbling in philosophy on his way to receiving a degree in experimental psychology in 1970, Hinton concluded that none of these sciences had yet done much to explain human thought. He made a brief career shift into carpentry, in search of more tangible satisfactions, before being drawn back to academia in 1972 by the promise of artificial intelligence, which he studied at the University of Edinburgh.

By the mid-1970s an “AI winter” of high-profile failures had reduced funding and enthusiasm for artificial intelligence research. Hinton was drawn to a particularly unfashionable area: the development of networks of simulated neural nodes to mimic the capabilities of human thought. This willingness to ignore conventional wisdom was to characterize his career. As he put it, “If you think it’s a really good idea and other people tell you it’s complete nonsense then you know you are really onto something.”[1]

The relationship of computers to brains had captivated many computer pioneers of the 1940s, including John von Neumann, who used biological terms such as “memory,” “organ” and “neuron” when first describing the crucial architectural concepts of modern computing in the “First Draft of a Report on the EDVAC.” This was influenced by the emerging cybernetics movement, particularly the efforts of Warren McCulloch and Walter Pitts to equate networks of stylized neurons with statements in Boolean logic. That inspired the idea that similar networks might, like human brains, be able to learn to recognize objects or carry out other tasks. Interest in this approach had declined after Turing Award winner Marvin Minsky, working with Seymour Papert, demonstrated that a heavily promoted class of neural networks, in which inputs were connected directly to outputs, had severe limits on its capabilities.

Graduating in 1978, Hinton followed in the footsteps of many of his forebears by seeking opportunities in the United States, joining a group of cognitive psychologists as a Sloan Foundation postdoctoral researcher at the University of California, San Diego. Their work on neural networks drew on a broad shift in the decades after the Second World War towards Bayesian approaches to statistics, which treat probabilities as degrees of belief, updating estimates as data accumulates.

Most work on neural networks relies on what is now called a “supervised learning” approach, exposing an initially random network configuration to a “training set” of input data. Its initial responses would have no systematic relationship to the features of the input data, but the algorithm would reconfigure the network as each guess was scored against the labels provided. Thus, for example, a network trained on a large set of photographs of different species of fish might develop a reliable ability to recognize whether a new picture showed a carp or a tuna. This required a learning algorithm to automatically reconfigure the network to identify “features” in the input data that correlated with correct outputs.
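The supervised loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fish measurements are invented, the "network" is a single unit with inputs wired directly to the output, and the update rule is the classic perceptron rule rather than anything attributed to Hinton. The names `TRAINING_SET`, `predict` and `train` are hypothetical.

```python
import random

# Hypothetical labelled training set: (length in m, weight in units of 100 kg).
# Label 0 = carp, 1 = tuna. The numbers are invented for illustration.
TRAINING_SET = [
    ((0.4, 0.03), 0), ((0.5, 0.05), 0), ((0.6, 0.08), 0), ((0.7, 0.10), 0),
    ((1.5, 0.60), 1), ((1.8, 1.00), 1), ((2.0, 1.50), 1), ((2.5, 2.50), 1),
]


def predict(weights, features):
    """Guess a label: 1 (tuna) if the weighted sum is positive, else 0 (carp)."""
    # weights[-1] is the bias; zip stops after the two feature weights.
    score = weights[-1] + sum(w * f for w, f in zip(weights, features))
    return 1 if score > 0 else 0


def train(training_set, learning_rate=0.1, max_epochs=1000, seed=0):
    """Start from random weights and adjust them after every wrong guess."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(3)]  # two inputs + bias
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in training_set:
            error = label - predict(weights, features)  # score guess against label
            if error != 0:
                mistakes += 1
                for i, f in enumerate(features):
                    weights[i] += learning_rate * error * f
                weights[-1] += learning_rate * error
        if mistakes == 0:  # every training example is now classified correctly
            break
    return weights
```

Calling `train(TRAINING_SET)` returns a weight vector that `predict` can then apply to measurements of a fish it has never seen. A unit this simple can only draw a straight boundary between the two species, which is the kind of limitation that motivated networks with hidden layers.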

Working with David Rumelhart and Ronald J. Williams, Hinton popularized what they termed a “back-propagation” algorithm in a pair of landmark papers published in 1986. The term reflected a phase in which the algorithm propagated measures of the errors produced by the network’s guesses backwards through its neurons, starting with those directly connected to the outputs. This allowed networks with intermediate “hidden” neurons between input and output layers to learn efficiently, overcoming the limitations noted by Minsky and Papert.
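To make the backwards flow of error concrete, here is a minimal sketch of back-propagation for a network with one hidden layer. It is not the code from the 1986 papers: the sigmoid units, squared-error loss, full-batch gradient descent, network size and XOR training data are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, seed=0):
    """Random starting weights; the last entry of each weight list is a bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(net, x):
    """Return the hidden activations and the single output for input list x."""
    hidden, output = net
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h + [1.0])))
    return h, y


def loss(net, data):
    return 0.5 * sum((forward(net, x)[1] - t) ** 2 for x, t in data)


def gradients(net, data):
    """Back-propagate the output error to obtain the gradient of every weight."""
    hidden, output = net
    g_hidden = [[0.0] * len(row) for row in hidden]
    g_output = [0.0] * len(output)
    for x, t in data:
        h, y = forward(net, x)
        # Error signal at the output unit.
        delta_out = (y - t) * y * (1.0 - y)
        for j, v in enumerate(h + [1.0]):
            g_output[j] += delta_out * v
        # Pass the error signal backwards through the output weights.
        for i in range(len(hidden)):
            delta_h = delta_out * output[i] * h[i] * (1.0 - h[i])
            for j, v in enumerate(x + [1.0]):
                g_hidden[i][j] += delta_h * v
    return g_hidden, g_output


def train(net, data, lr=0.5, epochs=2000):
    """Plain full-batch gradient descent; modifies the network in place."""
    hidden, output = net
    for _ in range(epochs):
        g_hidden, g_output = gradients(net, data)
        for i, row in enumerate(hidden):
            for j in range(len(row)):
                row[j] -= lr * g_hidden[i][j]
        for j in range(len(output)):
            output[j] -= lr * g_output[j]
    return net


# XOR: a task that a network without hidden units cannot represent.
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
```

The two `delta` values are the "measures of the errors" from the description: the output unit's error is computed first, then reused to assign a share of the blame to each hidden unit. Whether a run from a particular random start fully solves XOR depends on the seed and the number of epochs, so the sketch makes no claim about the final accuracy.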

Their paper describes the use of the technique to perform tasks including logical and arithmetic operations, shape recognition, and sequence generation. Others had worked independently along similar lines, including Paul J. Werbos, without much impact. Hinton attributes the impact of his work with Rumelhart and Williams to the publication of a summary of their work in Nature, and the efforts they made to provide compelling demonstrations of the power of the new approach. Their findings began to revive enthusiasm for the neural network approach, which has increasingly challenged other approaches to AI such as the symbol processing work of Turing Award winners John McCarthy and Marvin Minsky and the rule-based expert systems championed by Edward Feigenbaum.

By the time the papers with Rumelhart and Williams were published, Hinton had begun his first faculty position, in Carnegie-Mellon’s computer science department. This was one of the leading computer science programs, with a particular focus on artificial intelligence going back to the work of Herb Simon and Allen Newell in the 1950s. But after five years there Hinton left the United States in part because of his opposition to the “Star Wars” missile defense initiative. The Defense Advanced Research Projects Agency was a major sponsor of work on AI, including Carnegie-Mellon projects on speech recognition, computer vision, and autonomous vehicles. Hinton first became a fellow of the Canadian Institute for Advanced Research (CIFAR) and moved to the Department of Computer Science at the University of Toronto. He spent three years from 1998 until 2001 setting up the Gatsby Computational Neuroscience Unit at University College London and then returned to Toronto.

Hinton’s research group in Toronto made a string of advances in what came to be known as “deep learning”, named as such because it relied on neural networks with multiple layers of hidden neurons to extract higher level features from input data. Hinton, working with David Ackley and Terry Sejnowski, had previously introduced a class of network known as the Boltzmann machine, which in a restricted form was particularly well-suited to this layered approach. His ongoing work to develop machine learning algorithms spanned a broad range of approaches to improve the power and efficiency of systems for probabilistic inference. In particular, his joint work with Radford Neal and Richard Zemel in the early 1990s introduced variational methods to the machine learning community.

Hinton carried this work out with dozens of Ph.D. students and post-doctoral collaborators, many of whom went on to distinguished careers in their own right. He shared the Turing Award with one of them, Yann LeCun, who spent 1987-88 as a post-doctoral fellow in Toronto after Hinton served as the external examiner on his Ph.D. in Paris. From 2004 until 2013 he was the director of the program on "Neural Computation and Adaptive Perception" funded by the Canadian Institute for Advanced Research. That program included LeCun and his other coawardee, Yoshua Bengio. The three met regularly to share ideas as part of a small group. Hinton has advocated for the importance of senior researchers continuing to do hands-on programming work to effectively supervise student teams.

Hinton has long been recognized as a leading researcher in his field, receiving his first honorary doctorate from the University of Edinburgh in 2001, three years after he became a fellow of the Royal Society. In the 2010s his career began to shift from academia to practice as the group’s breakthroughs underpinned new capabilities for object classification and speech recognition appearing in widely used systems produced by cloud computing companies such as Google and Facebook. Their potential was vividly demonstrated in 2012 when a program developed by Hinton with his students Alex Krizhevsky and Ilya Sutskever greatly outperformed all other entrants to ImageNet, an image recognition competition involving a thousand different object types. It used graphics processor chips to run code combining several of the group’s techniques in a network of “60 million parameters and 650,000 neurons” composed of “five convolutional layers, some of which are followed by max-pooling layers, and three globally-connected layers with a final 1000-way softmax.”[2] The “convolutional layers” were an approach originally conceived of by LeCun, to which Hinton’s team had made substantial improvements.
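Two of the building blocks named in that quotation, max-pooling and the final softmax, are simple enough to sketch in plain Python. The version below is a simplification for illustration, not the configuration of the 2012 program: it pools over non-overlapping 2x2 blocks of a small grid and applies softmax to a short list of scores instead of 1000. The function names are hypothetical.

```python
import math


def max_pool_2x2(image):
    """Keep the largest value in each non-overlapping 2x2 block of a 2-D grid."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Pooling shrinks each feature map while keeping its strongest responses, and an N-way softmax assigns one probability to each of N candidate classes, so the highest-scoring class receives the largest probability.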

This success prompted Google to acquire a company, DNNresearch, founded by Hinton and the two students to commercialize their achievements. The system allowed Google to greatly improve its automatic classification of photographs. Following the acquisition, Hinton became a vice president and engineering fellow at Google. In 2014 he retired from teaching at the university to establish a Toronto branch of Google Brain. Since 2017, he has held a volunteer position as chief scientific advisor to Toronto’s Vector Institute for the application of machine learning in Canadian health care and other industries. Hinton thinks that in the future teaching people how to train computers to perform tasks will be at least as important as teaching them how to program computers.

Hinton has been increasingly vocal in advocating for his long-standing belief in the potential of “unsupervised” training systems, in which the learning algorithm attempts to identify features without being provided large numbers of labelled examples. As well as being useful these unsupervised learning methods have, Hinton believes, brought us closer to understanding the learning mechanisms used by human brains.

Author: Thomas Haigh

[1] Hinton's discussion with Andrew Ng for the Coursera "Neural Networks and Deep Learning" class.

[2] Results of the 2012 ImageNet competition.


