Reading Notes: Game Theory, An Introduction

Reading Notes: Game Theory, An Introduction - 12 - Static Games of Incomplete Information: Bayesian Games

Bayesian Games

These are study notes on Game Theory: An Introduction (by Steven Tadelis).

Static Games of Incomplete Information

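In a game of incomplete information, some elements of the game are not common knowledge among the players: a player may not know the opponents' action sets, outcome sets, or payoff functions.
The approach to handling games of incomplete information is due to Harsanyi.
He introduced two concepts to solve this problem.
type space: the opponents' hidden information (action sets, outcome sets, payoff functions, and so on) is recast as a set of types; within each type, all of this information is known.
belief: since a player does not know which type an opponent actually is, a probability distribution describes how likely each of the opponent's types is.
Expected payoffs can then be computed by weighting the payoffs with these probabilities.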
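A minimal sketch of that computation, assuming a hypothetical situation in which player 1 faces an opponent who is either "tough" or "soft". The type names, probabilities, and payoffs below are made up for illustration and do not come from the book.

```python
from fractions import Fraction

# Hypothetical belief of player 1 over the opponent's type (illustrative numbers).
belief = {"tough": Fraction(1, 4), "soft": Fraction(3, 4)}

# Hypothetical payoffs to player 1 for each of its actions, by opponent type.
payoff = {
    "enter": {"tough": -1, "soft": 2},
    "stay_out": {"tough": 0, "soft": 0},
}

def expected_payoff(action):
    """Belief-weighted average of player 1's payoff for one action."""
    return sum(prob * payoff[action][t] for t, prob in belief.items())

for action in payoff:
    print(action, expected_payoff(action))
```

Exact fractions are used here only so that the weighted sums are free of floating-point rounding.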
- Normal-form representation of a static Bayesian game of incomplete information
\[
\begin{aligned}
& \left\langle N, \{ A_i \}_{i=1}^n, \{ \Theta_i \}_{i=1}^n, \{ v_i(\cdot; \theta_i), \theta_i \in \Theta_i \}_{i=1}^n, \{ \phi_i \}_{i=1}^n \right\rangle \\
& \text{where} \\
& N = \{ 1, 2, \cdots, n \} \text{ : the set of players} \\
& A_i \text{ : the action set of player i} \\
& \Theta_i \text{ : the type space of player i} \\
& v_i : A \times \Theta_i \to \mathbb{R} \text{ : the type-dependent payoff function of player i} \\
& \phi_i \text{ : the belief of player i with respect to the uncertainty over the other players' types} \\
& \phi_i(\theta_{-i} | \theta_i) \text{ : the posterior conditional distribution on } \theta_{-i}
\end{aligned}
\]
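- Timing of a static Bayesian game of incomplete information:
  
  Nature chooses a profile of types \((\theta_1, \theta_2, \cdots, \theta_n)\).
  
  Each player learns his own type \(\theta_i\) and uses his prior \(\phi_i\) to form a probability distribution over the other players' types.
  
  The players choose their actions simultaneously.
  
  Given the players' actions \(a = (a_1, a_2, \cdots, a_n)\), each player i obtains the payoff \(v_i(a; \theta_i)\).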
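The stages above can be sketched as a small simulation. This is only an illustration under stated assumptions: the two-player game, the type names `low` and `high`, the prior, the strategies, and the payoff rule are all hypothetical. The belief-forming stage is left implicit because the strategies are fixed in advance.

```python
import random

# Hypothetical two-player game; every number and name is an illustrative assumption.
PRIOR = {  # Nature's joint distribution over type profiles (theta_1, theta_2)
    ("low", "low"): 0.4, ("low", "high"): 0.1,
    ("high", "low"): 0.1, ("high", "high"): 0.4,
}
# A pure strategy maps a player's own type to an action.
STRATEGIES = (
    {"low": "A", "high": "B"},  # player 1
    {"low": "A", "high": "B"},  # player 2
)

def payoff(actions, own_type):
    """v_i(a; theta_i): coordinating pays 1 to a low type and 2 to a high type."""
    if actions[0] != actions[1]:
        return 0
    return 2 if own_type == "high" else 1

def play_once(rng):
    # Nature draws a type profile from the prior.
    profiles = list(PRIOR)
    theta = rng.choices(profiles, weights=[PRIOR[p] for p in profiles])[0]
    # Each player observes only their own type and acts on it.
    actions = tuple(STRATEGIES[i][theta[i]] for i in range(2))
    # Payoffs are realized from the action profile and each player's own type.
    payoffs = tuple(payoff(actions, theta[i]) for i in range(2))
    return theta, actions, payoffs

rng = random.Random(0)  # seeded so repeated runs draw the same sequence
for _ in range(3):
    print(play_once(rng))
```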
- Conditional probability
  The conditional probability that event H occurs given that event S has occurred (with \(\Pr\{S\} > 0\)) is:
\[ \Pr\{H|S\} = \frac{\Pr\{S \land H\}}{\Pr\{S\}} \]
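As a rough illustration of how this formula turns a prior into a posterior belief, the sketch below conditions a hypothetical joint prior over two players' types on player 1's own type. The prior and the helper name `conditional` are assumptions made for this example.

```python
from fractions import Fraction

# Hypothetical joint prior over type profiles (theta_1, theta_2).
prior = {
    ("low", "low"): Fraction(2, 5),
    ("low", "high"): Fraction(1, 10),
    ("high", "low"): Fraction(1, 10),
    ("high", "high"): Fraction(2, 5),
}

def conditional(prior, event_h, event_s):
    """Pr{H | S} = Pr{S and H} / Pr{S}; events are predicates on outcomes."""
    pr_s = sum(p for outcome, p in prior.items() if event_s(outcome))
    if pr_s == 0:
        raise ValueError("conditioning event has probability zero")
    pr_s_and_h = sum(
        p for outcome, p in prior.items()
        if event_s(outcome) and event_h(outcome)
    )
    return pr_s_and_h / pr_s

# Player 1 learns that their own type is "high" (event S) and asks how
# likely it is that player 2 is also "high" (event H).
posterior = conditional(
    prior,
    event_h=lambda theta: theta[1] == "high",
    event_s=lambda theta: theta[0] == "high",
)
print(posterior)
```

With these numbers the types are positively correlated, so learning one's own type shifts the belief about the opponent away from the unconditional probability of one half.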
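- Static Bayesian game of incomplete information - pure strategies
\[ \left\langle N, \{ A_i \}_{i=1}^n, \{ \Theta_i \}_{i=1}^n, \{ v_i(\cdot; \theta_i), \theta_i \in \Theta_i \}_{i=1}^n, \{ \phi_i \}_{i=1}^n \right\rangle \]
A pure strategy for player i is a function \(s_i : \Theta_i \to A_i\) that assigns an action \(s_i(\theta_i)\) to each of the player's types \(\theta_i\).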
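Because a pure strategy is a map from types to actions, a player with finitely many types and actions has \(|A_i|^{|\Theta_i|}\) pure strategies. A small sketch that enumerates them, where the helper name `pure_strategies` and the example types and actions are hypothetical:

```python
from itertools import product

def pure_strategies(types, actions):
    """List every map from types to actions, each represented as a dict."""
    return [
        dict(zip(types, choice))
        for choice in product(actions, repeat=len(types))
    ]

# Hypothetical player with two types and two actions.
strategies = pure_strategies(["low", "high"], ["A", "B"])
for s in strategies:
    print(s)
print(len(strategies))
```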
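- Static Bayesian game of incomplete information - mixed strategies
  A mixed strategy for player i is a probability distribution over his pure strategies.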
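- Static Bayesian game of incomplete information - pure-strategy Bayesian Nash equilibrium
  A strategy profile \(s^* = (s_1^*, \cdots, s_n^*)\) is a pure-strategy Bayesian Nash equilibrium if, for every player i, every type \(\theta_i \in \Theta_i\) of that player, and every action \(a_i \in A_i\):
\[
\begin{aligned}
& \sum_{\theta_{-i} \in \Theta_{-i}} \phi_i(\theta_{-i}|\theta_i) v_i(s_i^*(\theta_i), s_{-i}^*(\theta_{-i}); \theta_i) \geq \sum_{\theta_{-i} \in \Theta_{-i}} \phi_i(\theta_{-i}|\theta_i) v_i(a_i, s_{-i}^*(\theta_{-i}); \theta_i) \\
& \text{where} \\
& v_i(a_i, s_{-i}^*(\theta_{-i}); \theta_i) \text{ : player i's payoff, which depends only on his own type } \theta_i
\end{aligned}
\]
In other words: for every player and every one of his types, the action \(s_i^*(\theta_i)\) maximizes the expected payoff (the belief-weighted sum of payoffs), given the other players' equilibrium strategies.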
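The inequality can be checked by brute force in a small finite game. The following is a sketch under stated assumptions, not a method from the book: the two-player game, its types (`x`, `y`, `z`), prior, and payoff rule are invented for illustration, and players are indexed 0 and 1 for convenience.

```python
from fractions import Fraction
from itertools import product

# Hypothetical game: player 0 has types x and y, player 1 has the single type z.
PLAYERS = (0, 1)
ACTIONS = {0: ["A", "B"], 1: ["A", "B"]}
TYPES = {0: ["x", "y"], 1: ["z"]}
PRIOR = {("x", "z"): Fraction(1, 2), ("y", "z"): Fraction(1, 2)}

def payoff(i, actions, own_type):
    """v_i(a; theta_i): depends on the action profile and i's own type only."""
    match = 1 if actions[0] == actions[1] else 0
    if i == 0:
        # Type x wants to match the opponent, type y wants to mismatch.
        return match if own_type == "x" else 1 - match
    # Player 1 likes matching and gets a bonus for playing A.
    return match + (1 if actions[1] == "A" else 0)

def belief(i, own_type):
    """phi_i(theta_-i | theta_i), derived from the joint prior by Bayes' rule."""
    j = 1 - i
    consistent = {th[j]: p for th, p in PRIOR.items() if th[i] == own_type}
    total = sum(consistent.values())  # assumed positive for every type
    return {t: p / total for t, p in consistent.items()}

def expected_payoff(i, own_type, action, other_strategy):
    """One side of the inequality: sum over theta_-i of phi_i * v_i."""
    j = 1 - i
    total = 0
    for other_type, prob in belief(i, own_type).items():
        profile = [None, None]
        profile[i] = action
        profile[j] = other_strategy[other_type]
        total += prob * payoff(i, tuple(profile), own_type)
    return total

def pure_strategies(i):
    """All maps from player i's types to player i's actions."""
    return [dict(zip(TYPES[i], choice))
            for choice in product(ACTIONS[i], repeat=len(TYPES[i]))]

def is_bne(profile):
    """True if no type of any player gains by deviating to another action."""
    for i in PLAYERS:
        other = profile[1 - i]
        for own_type in TYPES[i]:
            on_path = expected_payoff(i, own_type, profile[i][own_type], other)
            if any(expected_payoff(i, own_type, a, other) > on_path
                   for a in ACTIONS[i]):
                return False
    return True

equilibria = [p for p in product(pure_strategies(0), pure_strategies(1))
              if is_bne(p)]
print(equilibria)
```

Enumerating every strategy profile is only practical for very small games, since the number of profiles grows exponentially with the number of types.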

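For this chapter (and for the book as a whole), what matters most is learning how to apply the theory. The book provides very good examples, but they are not covered here.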

Original post: https://www.cnblogs.com/steven-yang/p/8321756.html
