zoukankan      html  css  js  c++  java
  • Transforming Auto-encoders

    http://www.cs.toronto.edu/~hinton/absps/transauto6.pdf

    The artificial neural networks that are used to recognize shapes typically use one or more layers of learned feature detectors that produce scalar outputs. By contrast, the computer vision community uses complicated, hand-engineered features, like SIFT [6], that producea whole vector of outputs including an explicit representation of the pose of the feature. We show how neural networks can be used to learn features that output a whole vector of instantiation parameters and we argue that this is a much more promising way of dealing with variations in position,orientation, scale and lighting than the methods currently employed in the neural networks community. It is also more promising than the hand-engineered features currently used in computer vision because it provides an efficient way of adapting the features to the domain.

    Current methods for recognizing objects in images perform poorly and use methods that are intellectually unsatisfying. Some of the best computer vision systems use histograms of oriented gradients as “visual words” and model the spatial distribution of these elements using a crude spatial pyramid. Such methods can recognize objects correctly without knowing exactly where they are – an ability that is used to diagnose brain damage in humans. The best artifical neural net-works [4, 5, 10] use hand-coded weight-sharing schemes to reduce the number of free parameters and they achieve local translational invariance by subsampling the activities of local pools of translated replicas of the same kernel. This method of dealing with the changes in images caused by changes in view point is much better than no method at all, but it is clearly incapable of dealing with recognition tasks, such as facial identity recognition, that require knowledge of the precise spatial relationships between high-level parts like a nose and a mouth.After several stages of subsampling in a convolutional net, high-level feature shave a lot of uncertainty in their poses. This is generally regarded as a desireable property because it amounts to invariance to pose over some limited range,but it makes it impossible to compute precise spatial relationships.

    【This paper argues that convolutional neural networks are misguided in what they are trying to achieve.】

    【capsules】

    This paper argues that convolutional neural networks are misguided in what they are trying to achieve. Instead of aiming for viewpoint invariance in the activities of “neurons” that use a single scalar output to summarize the activities of a local pool of replicated feature detectors, artifical neural networks should use local “capsules” that perform some quite complicated internal computations on their inputs and then encapsulate the results of these computations into a small vector of highly informative outputs. Each capsule learns to recognize an implicitly defined visual entity over a limited domain of viewing conditions and deformations and it outputs both the probability that the entity is present within its limited domain and a set of “instantiation parameters” that may include the precise pose, lighting and deformation of the visual entity relative to an implicitly defined canonical version of that entity. When the capsule is working properly, the probability of the visual entity being present is locally invariant – it does not change as the entity moves over the manifold of possible appearances within the limited domain covered by the capsule. The instantiation parameters, however, are “equivariant” – as the viewing conditions change and the entity moves over the appearance manifold, the instantiation parameters change by a corresponding amount because they are representing the intrinsic coordinates of the entity on the appearance manifold.

  • 相关阅读:
    程序人生2008年(49)
    多种方式实现字符串/无符号数反向输出_栈_递归_头尾指针
    Ebusiness suite system service management ( EBS服务管理)
    文件系统FatFsR0.09a翻译(三):ff.h
    cocurrent request,program,process 并发请求,程序,进程的概念
    Laravel 5.* 执行seeder命令出现错误的解决方法
    Laravel修改配置后一定要清理缓存 "php artisan config:clear"!
    laravel构造函数和中间件执行顺序问题
    Laravel5.3使用学习笔记中间件
    laravel 是怎么做到运行 composer dumpautoload 不清空 classmap 映射关系的呢?
  • 原文地址:https://www.cnblogs.com/rsapaper/p/7767316.html
Copyright © 2011-2022 走看看