  • Random Thoughts on Deep Reinforcement Learning

    About model-based and model-free

    • Model-free methods cannot be the future of reinforcement learning, even though they currently outperform model-based methods. The fatal flaw is the lack of interpretability. We cannot trust a policy without knowing why it takes a specific action, especially since some of its actions inevitably look stupid and obviously wrong to us. Model-based methods relieve this concern to some extent, because the model gives us some knowledge about future states and outcomes. However, in most cases the model has to be learned, and it cannot be as accurate as the real environment. One way to deal with this is planning, especially tree search methods such as Monte Carlo Tree Search (MCTS). Tree search can reduce the variance that comes from the learned model by backing up value estimates at each node, which is similar to bootstrapping in TD methods. It also gives us better interpretability, which is critical.

    • Another point is generalization. My view is that a learned model generalizes better than a learned policy. When we learn a policy in one environment and apply it to another, it tends to collapse, because the policy is usually overfitted to the training environment and any wrong action in a trajectory can mess up everything that follows. But if we learn a model in one environment and use it to predict in a similar environment, it usually performs well, because this is just a case of supervised learning and data augmentation methods can be applied easily. So, in my view, combining model-based methods with tree search can improve interpretability and generalization at the same time.
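
To make the "model learning is just supervised learning" point concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the one-dimensional dynamics in `true_step`, the linear model form, and the sign-flip symmetry used for augmentation. It is not taken from any particular algorithm or library.

```python
import random


def true_step(state, action, rng):
    """Hypothetical ground-truth dynamics, unknown to the learner."""
    return state + 0.5 * action + rng.gauss(0.0, 0.01)


def collect(n, rng):
    """Gather (s, a, s') transitions, using positive actions only."""
    data = []
    for _ in range(n):
        s = rng.uniform(0.0, 1.0)
        a = rng.uniform(0.1, 1.0)
        data.append((s, a, true_step(s, a, rng)))
    return data


def mirror(data):
    """Data augmentation, assuming the dynamics are symmetric under a sign flip."""
    return data + [(-s, -a, -s2) for s, a, s2 in data]


def fit(data):
    """Ordinary least squares for delta = k * a + b, where delta = s' - s."""
    n = len(data)
    xs = [a for _, a, _ in data]
    ys = [s2 - s for s, _, s2 in data]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    k = cov / var
    b = mean_y - k * mean_x
    return k, b


def predict(params, state, action):
    """One-step prediction of the next state with the fitted model."""
    k, b = params
    return state + k * action + b


if __name__ == "__main__":
    rng = random.Random(0)
    params = fit(mirror(collect(200, rng)))
    # Query a region the raw data never covered: a negative action.
    print("fitted (k, b):", params)
    print("predicted next state for s=0, a=-1:", predict(params, 0.0, -1.0))
```

The learner only ever sees regression targets, so standard supervised tools (augmentation here, but also validation splits or regularization) apply directly. How well this carries over to a different environment still depends on how similar its dynamics really are.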
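
The tree search idea from the first point can be sketched in the same spirit. The code below is a minimal UCT-style MCTS that plans only through a model function. The toy task (walk along a line to `GOAL`), the hand-written `learned_model` standing in for a trained network, and all parameter values are assumptions for illustration. Each node keeps a running mean of the returns backed up through it, and the root statistics are what one would inspect to see why an action was chosen.

```python
import math
import random

GOAL = 5            # hypothetical toy task: reach position GOAL on a line
ACTIONS = (-1, 1)   # step left or right


def learned_model(state, action):
    """Stand-in for a learned model: returns (next_state, reward, done)."""
    next_state = max(0, min(GOAL, state + action))
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


class Node:
    """Per-state statistics: visit count and mean return for each action."""

    def __init__(self):
        self.n = {a: 0 for a in ACTIONS}
        self.q = {a: 0.0 for a in ACTIONS}
        self.children = {}  # action -> Node (the toy model is deterministic)


def rollout(state, depth, gamma, rng):
    """Discounted return of a random policy simulated inside the model."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        state, reward, done = learned_model(state, rng.choice(ACTIONS))
        total += discount * reward
        discount *= gamma
        if done:
            break
    return total


def simulate(node, state, depth, gamma, c, rng):
    """One MCTS simulation: select, expand, evaluate, back up."""
    if depth == 0:
        return 0.0
    untried = [a for a in ACTIONS if node.n[a] == 0]
    if untried:
        action = rng.choice(untried)
    else:
        log_total = math.log(sum(node.n.values()))
        action = max(
            ACTIONS,
            key=lambda a: node.q[a] + c * math.sqrt(log_total / node.n[a]),
        )
    next_state, reward, done = learned_model(state, action)
    if done:
        ret = reward
    elif action not in node.children:
        node.children[action] = Node()
        ret = reward + gamma * rollout(next_state, depth - 1, gamma, rng)
    else:
        child = node.children[action]
        ret = reward + gamma * simulate(child, next_state, depth - 1, gamma, c, rng)
    # Back up: incremental mean of the returns seen through this action.
    node.n[action] += 1
    node.q[action] += (ret - node.q[action]) / node.n[action]
    return ret


def plan(state, simulations=500, depth=10, gamma=0.9, c=1.4, seed=0):
    """Run MCTS from `state` and return the root node with its statistics."""
    rng = random.Random(seed)
    root = Node()
    for _ in range(simulations):
        simulate(root, state, depth, gamma, c, rng)
    return root


def best_action(root):
    """Pick the most visited root action."""
    return max(ACTIONS, key=lambda a: root.n[a])


if __name__ == "__main__":
    root = plan(state=3)
    for a in ACTIONS:
        print(f"action {a:+d}: visits={root.n[a]}, mean return={root.q[a]:.3f}")
    print("chosen action:", best_action(root))
```

Unlike a model-free policy, the planner exposes visit counts and value estimates per action, so a questionable choice can be traced back to the simulated futures that produced it.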

  • Original article: https://www.cnblogs.com/initial-h/p/12208038.html