  • Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission

    Solemn declaration: for the original text, see the paper named in the title. In case of infringement, please contact the author and this post will be withdrawn!

    Seung, H.S. Neuron 40, no. 6 (2003): 1063-1073

    Summary

      Chemical synaptic transmission is well known to be an unreliable process, but the function of this unreliability remains unclear. Here I consider the hypothesis that the randomness of synaptic transmission is harnessed by the brain for learning, in a manner analogous to the way Darwinian evolution exploits genetic mutation. This is possible if synapses are "hedonistic", responding to a global reward signal by increasing their probability of vesicle release or of failure, depending on which action immediately preceded the reward. Hedonistic synapses learn by computing a stochastic approximation to the gradient of the average reward. They are compatible with synaptic dynamics such as short-term facilitation and depression, and with the complexities of dendritic integration and action potential generation. A network of hedonistic synapses can be trained to perform a desired computation by administering reward appropriately, as shown here through numerical simulations of integrate-and-fire (IF) model neurons.
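      A minimal sketch of the rule just described (my own Python illustration, not the paper's simulation code): a single Bernoulli release site whose release probability is parameterized through a sigmoid and nudged by a global reward signal via a REINFORCE-style eligibility term. The learning rate, the sigmoid parameterization, and the toy reward schedule are assumptions made only for this example.

          import numpy as np

          rng = np.random.default_rng(0)

          class HedonisticSynapse:
              """One stochastic release site with a reward-modulated release probability."""

              def __init__(self, q=0.0, lr=0.1):
                  self.q = q            # underlying parameter; release probability p = sigmoid(q)
                  self.lr = lr          # learning rate (assumed value)
                  self.eligibility = 0.0

              @property
              def p_release(self):
                  return 1.0 / (1.0 + np.exp(-self.q))

              def transmit(self):
                  # On a presynaptic spike, release a vesicle with probability p_release.
                  released = rng.random() < self.p_release
                  # Eligibility is (1 - p) after a release and (-p) after a failure,
                  # i.e. the REINFORCE term for a Bernoulli variable.
                  self.eligibility = (1.0 if released else 0.0) - self.p_release
                  return released

              def reinforce(self, reward):
                  # A global reward signal pushes the release probability up after releases
                  # that preceded reward, and down after releases that preceded punishment.
                  self.q += self.lr * reward * self.eligibility

          # Toy usage: reward release itself, so p_release should climb toward 1.
          syn = HedonisticSynapse()
          for _ in range(2000):
              released = syn.transmit()
              reward = 1.0 if released else -1.0   # hypothetical reward schedule, for illustration only
              syn.reinforce(reward)
          print(f"learned release probability: {syn.p_release:.2f}")

      In expectation this update follows the gradient of the average reward with respect to the synaptic parameter, which is the sense in which hedonistic synapses perform stochastic gradient learning.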

    Introduction

      Many kinds of learning can be viewed as optimization. For example, operant conditioning can be regarded as a process in which an animal adapts its actions so as to maximize reward. The maxim "practice makes perfect" refers to the repeated refinement of complex motor skills, such as playing the piano or playing tennis. It is widely believed that learning is based, at least in part, on plasticity in the synaptic organization of the brain. It therefore seems likely that there exist forms of synaptic plasticity tailored to optimizing the function of neural circuits.

      What specific form could such synaptic plasticity take? To stimulate the imagination, it is helpful to draw inspiration from evolution, the best-known example of an optimization process in biology. One fascinating aspect of evolution is that it depends on imperfect replication of genes. Such unreliability might seem undesirable in other respects, but random mutation and recombination are in fact essential for generating the variation that allows evolution to search for improved genotypes.

    Results

    Training a Multilayer Network

    Release-Failure Antagonism

    The Matching Law

    Dynamic Synapses

    Postsynaptic Voltage Dependence

    Postsynaptic Locus of Plasticity

    Temporal Antagonism

    Discussion

    Hedonistic synapses are just a mechanism for stochastic gradient learning, a topic that has been studied extensively in the field of neural networks. What is new here?

    I'm a synaptic physiologist. How can I look for hedonistic synapses in the brain?

    There are many sources of randomness in the brain. Why do you single out stochastic vesicle release as the basis for stochastic gradient learning?

    You've demonstrated learning with hedonistic synapses for some toy problems, but it will never scale up to really large networks.

    Even if stochastic gradient learning were applied to small neural circuits in the brain, wouldn't it still be too slow?

    What if the reward signal is delayed in time? Won't that be catastrophic for the learning time of hedonistic synapses?

    Do you believe that hedonistic synapses are the explanation of operant conditioning?

    Could temporal antagonism alone be sufficient for stochastic gradient learning?

    In your example, two out of three synapses would change in the right direction. Couldn't the circuit end up increasing its average reward anyway, in spite of the errant synapse?

    Do hedonistic synapses require that reward and punishment be balanced so that the average reinforcement is zero?

    What do you regard as the greatest weakness of your model?

    Experimental Procedures

    Hedonistic Synapse Model

    REINFORCE Learning

    REINFORCE Learning for Stochastic Synapses

    REINFORCE as Stochastic Gradient Learning

    Bias-Variance Tradeoff

    Related Learning Rules

    Variable-Interval Reward Schedule

    Numerical Simulations
