  • Information Entropy

    That transfer of information, from what we don’t know about the system to what we know, represents a change in entropy. Insight decreases the entropy of the system. Get information, reduce entropy. This is information gain. And yes, this type of entropy is subjective, in that it depends on what we know about the system at hand. (Fwiw, information gain is synonymous with Kullback-Leibler divergence, which we explored briefly in this tutorial on restricted Boltzmann machines.)
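    A minimal numeric sketch of that idea (numpy only; the 50/50 labels and the group sizes below are hypothetical numbers, not from the post): the entropy of the labels before a split, minus the weighted entropy after it, is the information gain.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution, in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                 # treat 0 * log(0) as 0
    return -np.sum(p * np.log2(p))

# Before we learn anything the labels are 50/50: maximal uncertainty.
h_before = entropy([0.5, 0.5])

# Suppose observing a binary feature splits the samples into
# group A (60% of the data, labels 90/10) and
# group B (40% of the data, labels 10/90).
h_after = 0.6 * entropy([0.9, 0.1]) + 0.4 * entropy([0.1, 0.9])

print(f"entropy before the split: {h_before:.3f} bits")   # 1.000
print(f"entropy after the split:  {h_after:.3f} bits")    # ~0.469
print(f"information gain:         {h_before - h_after:.3f} bits")
```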

    So each principal component cutting through the scatterplot represents a decrease in the system’s entropy, in its unpredictability.

    It so happens that explaining the shape of the data one principal component at a time, beginning with the component that accounts for the most variance, is similar to walking data through a decision tree. The first component of PCA, like the first if-then-else split in a properly formed decision tree, will be along the dimension that reduces unpredictability the most.
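    A minimal sketch of that ordering (assuming numpy and scikit-learn are available; the correlated 2-D data is synthetic):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic 2-D data whose variance lies mostly along one direction.
x = rng.normal(0.0, 3.0, size=500)
y = 0.5 * x + rng.normal(0.0, 1.0, size=500)
data = np.column_stack([x, y])

pca = PCA(n_components=2).fit(data)

# Components come back sorted by explained variance: the first one,
# like the first split of a well-formed decision tree, removes the
# most unpredictability from the data.
print(pca.explained_variance_ratio_)   # roughly [0.94, 0.06]
```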

    Definition of KL divergence (relative entropy):
    $$D_{KL}(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$$

    Cross-entropy formula:
    $$H(p, q) = -\sum_x p(x) \log q(x)$$

    Definition of information entropy:
    $$H(p) = -\sum_x p(x) \log p(x)$$

    When the relative entropy reaches its minimum, the cross-entropy reaches its minimum as well, because the true distribution $p(x)$ is assumed fixed: since $H(p, q) = H(p) + D_{KL}(p \| q)$ and $H(p)$ is then a constant, the two differ only by that constant.
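    A small check of that claim (numpy only; p and q are made-up distributions): since $H(p)$ does not depend on the model q, minimizing the cross-entropy over q reduces the KL divergence by exactly the same amount.

```python
import numpy as np

def entropy(p):
    """H(p) = -sum p log p (natural log)."""
    return -np.sum(p * np.log(p))

def cross_entropy(p, q):
    """H(p, q) = -sum p log q."""
    return -np.sum(p * np.log(q))

def kl_divergence(p, q):
    """D_KL(p || q) = sum p log(p / q)."""
    return np.sum(p * np.log(p / q))

p = np.array([0.7, 0.2, 0.1])   # "true" distribution (hypothetical)
q = np.array([0.5, 0.3, 0.2])   # model distribution (hypothetical)

# H(p, q) and H(p) + D_KL(p || q) agree, so with p held fixed,
# whatever q minimizes one also minimizes the other.
print(cross_entropy(p, q))                   # ~0.8869
print(entropy(p) + kl_divergence(p, q))      # same value
```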

    Planning to translate these:

    https://www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained

    https://www.zhihu.com/question/41252833

  • Original post: https://www.cnblogs.com/xinping-study/p/7058803.html