[Translation]
How the Brain Recognizes Objects
Researchers at MIT's McGovern Institute for Brain Research have built a new mathematical model to describe how the human brain visually identifies objects. The model accurately predicts human performance on certain visual-perception tasks, which suggests that it is a good indication of what actually happens in the brain, and it could also help improve computer object-recognition systems.
A new computational model of how the primate brain recognizes objects creates a map of "interesting" features (right) for a given image. The model's predictions of which parts of the image will attract a viewer's attention (green clouds, left) accord well with experimental data (yellow and red dots).
The model was designed to reflect neurological evidence that in the primate brain, object identification (deciding what an object is) and object location (deciding where it is) are handled separately. "Although what and where are processed in two separate parts of the brain, they are integrated during perception to analyze the image," says Sharat Chikkerur, lead author of a paper describing the work that appears this week in the journal Vision Research. "The model that we have tries to explain how this information is integrated."
The mechanism of integration, the researchers argue, is attention. According to their model, when the brain is confronted by a scene containing a number of different objects, it cannot keep track of all of them at once. So instead it creates a rough map of the scene that simply identifies some regions as being more visually interesting than others. If it is then called upon to determine whether the scene contains an object of a particular type, it begins by searching, that is, turning its attention toward, the regions of greatest interest.

Chikkerur and Tomaso Poggio, the Eugene McDermott Professor in the Department of Brain and Cognitive Sciences and at the Computer Science and Artificial Intelligence Laboratory, together with graduate student Cheston Tan and former postdoc Thomas Serre, implemented the model in software and then tested its predictions against data from experiments with human subjects. The subjects were asked first to simply look at a street scene shown on a computer screen, then to count the cars in the scene, and then to count the pedestrians, while an eye-tracking system recorded their eye movements. The software predicted with great accuracy which regions of the image the subjects would attend to during each task.
The video below is excerpted from a recording of a small demo built with OpenCV, along with the bit of code used to make it.
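The original demo snippet is not reproduced in this post. As a stand-in, here is a minimal, self-contained sketch of the kind of saliency map such an OpenCV demo could compute. It uses the spectral-residual method (Hou & Zhang, 2007) rather than the paper's model, and the image path "street.jpg" is a placeholder.

```python
# A hedged sketch of a bottom-up saliency map, in the spirit of the demo.
# Spectral-residual method, NOT the MIT model; "street.jpg" is a placeholder.
import cv2
import numpy as np

def spectral_residual_saliency(gray):
    """Return a saliency map in [0, 1]: regions that 'pop out' score high."""
    small = cv2.resize(gray, (64, 64)).astype(np.float32)
    spectrum = np.fft.fft2(small)
    log_amp = np.log1p(np.abs(spectrum))        # log amplitude spectrum
    phase = np.angle(spectrum)
    # Spectral residual = log amplitude minus its local average.
    residual = log_amp - cv2.blur(log_amp, (3, 3))
    recon = np.fft.ifft2(np.exp(residual) * np.exp(1j * phase))
    sal = (np.abs(recon) ** 2).astype(np.float32)
    sal = cv2.GaussianBlur(sal, (9, 9), 2.5)    # smooth into blob-like regions
    sal = cv2.resize(sal, (gray.shape[1], gray.shape[0]))
    return cv2.normalize(sal, None, 0, 1, cv2.NORM_MINMAX)

img = cv2.imread("street.jpg")                  # placeholder image path
sal = spectral_residual_saliency(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))
cv2.imshow("saliency", sal)
cv2.waitKey(0)
```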
[Original]
Researchers at MIT’s McGovern Institute for Brain Research have developed a new mathematical model to describe how the human brain visually identifies objects. The model accurately predicts human performance on certain visual-perception tasks, which suggests that it’s a good indication of what actually happens in the brain, and it could also help improve computer object-recognition systems.
A new computational model of how the primate brain recognizes objects creates a map of “interesting” features (right) for a given image. The model’s predictions of which parts of the image will attract a viewer’s attention (green clouds, left) accord well with experimental data (yellow and red dots).
The model was designed to reflect neurological evidence that in the primate brain, object identification — deciding what an object is — and object location — deciding where it is — are handled separately. “Although what and where are processed in two separate parts of the brain, they are integrated during perception to analyze the image,” says Sharat Chikkerur, lead author on a paper appearing this week in the journal Vision Research, which describes the work. “The model that we have tries to explain how this information is integrated.”
The mechanism of integration, the researchers argue, is attention. According to their model, when the brain is confronted by a scene containing a number of different objects, it can’t keep track of all of them at once. So instead it creates a rough map of the scene that simply identifies some regions as being more visually interesting than others. If it’s then called upon to determine whether the scene contains an object of a particular type, it begins by searching — turning its attention toward — the regions of greatest interest.
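The search-by-attention loop described above is easy to sketch in code. The snippet below is a minimal illustration under assumed interfaces, not the paper's implementation: saliency_map is taken to be a 2D array of "interest" values, and detect is a hypothetical callback standing in for the "what" pathway's check at a location.

```python
# Hedged sketch of saliency-guided search: visit the most "interesting"
# regions first, suppressing each one after inspection (inhibition of return).
import numpy as np

def attend_and_search(saliency_map, detect, patch=32, max_fixations=5):
    """Visit the most salient patches in order; stop when the target is found.

    detect(y, x) is an assumed callback that returns True if the sought
    object type is recognized at that location."""
    sal = saliency_map.copy()
    for _ in range(max_fixations):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)  # most salient spot
        if detect(y, x):                 # ask the 'what' pathway here
            return (y, x)
        # Suppress the visited region so attention moves to the
        # next-most-interesting area on the following fixation.
        sal[max(0, y - patch):y + patch, max(0, x - patch):x + patch] = 0
    return None
```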
Chikkerur and Tomaso Poggio, the Eugene McDermott Professor in the Department of Brain and Cognitive Sciences and at the Computer Science and Artificial Intelligence Laboratory, together with graduate student Cheston Tan and former postdoc Thomas Serre, implemented the model in software, then tested its predictions against data from experiments with human subjects. The subjects were asked first to simply regard a street scene depicted on a computer screen, then to count the cars in the scene, and then to count the pedestrians, while an eye-tracking system recorded their eye movements. The software predicted with great accuracy which regions of the image the subjects would attend to during each task.
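For readers curious how "predicted with great accuracy" might be quantified, a standard measure in the saliency literature is normalized scanpath saliency (NSS): the model's z-scored response averaged over the recorded fixation points. The paper's exact evaluation may differ; this is only an illustrative sketch.

```python
# Hedged sketch of scoring a predicted attention map against eye-tracking
# data. NSS is a standard saliency metric, not necessarily the paper's.
import numpy as np

def nss(saliency_map, fixations):
    """Mean model response at human fixation points, in z-score units.

    fixations: list of (row, col) gaze positions from the eye tracker.
    Values well above 0 mean fixations landed on high-saliency regions."""
    z = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    return float(np.mean([z[r, c] for r, c in fixations]))
```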