  • Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission

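    Disclaimer: the original paper is identified in the title. If this post infringes any rights, please contact the author and it will be taken down.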

    Neuron, no. 6 (2003): 1063-1073

    Summary

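      It is well known that chemical synaptic transmission is an unreliable process, but the function of this unreliability remains unclear. Here I consider the hypothesis that the brain harnesses the randomness of synaptic transmission for learning, in analogy to the way Darwinian evolution exploits genetic mutation. This is possible if synapses are "hedonistic": they respond to a global reward signal by increasing their probability of vesicle release or of failure, depending on which of the two actions preceded the reward. Hedonistic synapses learn by computing a stochastic approximation to the gradient of the average reward. They are compatible with synaptic dynamics such as short-term facilitation and depression, and with the intricacies of dendritic integration and action potential generation. A network of hedonistic synapses can be trained to perform a desired computation by administering reward appropriately, as illustrated here by numerical simulations of integrate-and-fire (IF) model neurons.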
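      As a rough illustration of the rule described in the summary, the sketch below trains a single stochastic synapse with a REINFORCE-style update. It is a simplified toy, not the paper's full model: the logistic parameterization `sigmoid(q)`, the names `train_synapse`, `q`, and `eta`, the reward values, and the one-synapse task are all assumptions made for this example, and eligibility-trace decay, synaptic dynamics, and integrate-and-fire neurons are left out.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def train_synapse(reward_for_release, reward_for_failure,
                  trials=5000, eta=0.1, seed=0):
    """Train one stochastic synapse and return its final release probability.

    q is a hypothetical internal parameter; the release probability is
    assumed to be sigmoid(q). Each trial the synapse either releases or
    fails, then receives a scalar reward that depends on which happened.
    """
    rng = random.Random(seed)
    q = 0.0  # release probability starts at 0.5
    for _ in range(trials):
        p = sigmoid(q)
        released = rng.random() < p
        # Eligibility is positive after a release and negative after a
        # failure; this is d/dq of the log-probability of the action taken.
        eligibility = (1.0 - p) if released else -p
        reward = reward_for_release if released else reward_for_failure
        # Reward multiplied by eligibility makes the rewarded action
        # more probable on later trials.
        q += eta * reward * eligibility
    return sigmoid(q)


if __name__ == "__main__":
    print("release rewarded:", train_synapse(1.0, 0.0))
    print("failure rewarded:", train_synapse(0.0, 1.0))
```

      When release is the rewarded action the release probability drifts toward 1, and when failure is rewarded it drifts toward 0. Averaged over trials, the update follows the gradient of the expected reward with respect to `q`, which is the sense in which the rule is a stochastic gradient method.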

    Introduction

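      Many types of learning can be viewed as optimization. For example, operant conditioning can be regarded as a process by which an animal adapts its actions to maximize reward. The maxim "practice makes perfect" refers to the iterative improvement of complex motor skills, such as playing the piano or tennis. It is widely believed that learning is based, at least in part, on plasticity of the brain's synaptic organization. It therefore seems likely that there are types of synaptic plasticity tailored to optimizing the function of neural circuits.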

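      What specific forms could such synaptic plasticity take? To stimulate the imagination, it helps to draw inspiration from evolution, the best-known example of a biological optimization process. A fascinating aspect of evolution is that it requires imperfect replication of genes. Such unreliability might otherwise seem undesirable, but random mutation and recombination are in fact essential for generating variation, and variation is what allows evolution to search for improved genotypes.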

    Results

    Training a Multilayer Network

    Release-Failure Antagonism

    The Matching Law

    Dynamic Synapses

    Postsynaptic Voltage Dependence

    Postsynaptic Locus of Plasticity

    Temporal Antagonism

    Discussion

    Hedonistic synapses are just a mechanism for stochastic gradient learning, a topic that has been studied extensively in the field of neural networks. What is new here?

    I'm a synaptic physiologist. How can I look for hedonistic synapses in the brain?

    There are many sources of randomness in the brain. Why do you single out stochastic vesicle release as the basis for stochastic gradient learning?

    You've demonstrated learning with hedonistic synapses for some toy problems, but it will never scale up to really large networks.

    Even if stochastic gradient learning were applied to small neural circuits in the brain, wouldn't it still be too slow?

    What if the reward signal is delayed in time? Won't that be catastrophic for the learning time of hedonistic synapses?

    Do you believe that hedonistic synapses are the explanation of operant conditioning?

    Could temporal antagonism alone be sufficient for stochastic gradient learning?

    In your example, two out of three synapses would change in the right direction. Couldn't the circuit end up increasing its average reward anyway, in spite of the errant synapse?

    Do hedonistic synapses require that reward and punishment be balanced so that the average reinforcement is zero?

    What do you regard as the greatest weakness of your model?

    Experimental Procedures

    Hedonistic Synapse Model

    REINFORCE Learning

    REINFORCE Learning for Stochastic Synapses

    REINFORCE as Stochastic Gradient Learning

    Bias-Variance Tradeoff

    Related Learning Rules

    Variable-Interval Reward Schedule

    Numerical Simulations

  • Original post: https://www.cnblogs.com/lucifer1997/p/14457013.html