  • Bayesian networks: it looks like what I need is parameter estimation

    Introduction and usage of pgmpy, a Python library for learning Bayesian networks


    pgmpy

    Parameter learning: Given a set of data samples and a DAG that captures the dependencies between the variables, estimate the (conditional) probability distributions of the individual variables.

    Structure learning: Given a set of data samples, estimate a DAG that captures the dependencies between the variables.

    pgmpy.org
    github.com/pgmpy/pgmpy_notebook/blob/master/notebooks

    Code notes

    """
    学习链接 :
    http://pgmpy.org/
    https://github.com/pgmpy/pgmpy_notebook/blob/master/notebooks/9.%20Learning%20Bayesian%20Networks%20from%20Data.ipynb
    """
    # ====================BN模型=========================
    # 贝叶斯模型
    from pgmpy.models import BayesianModel
    # ====================参数学习=========================
    # 参数估计
    from pgmpy.estimators import ParameterEstimator
    # MLE参数估计
    from pgmpy.estimators import MaximumLikelihoodEstimator
    # Bayesian参数估计
    from pgmpy.estimators import BayesianEstimator
    # ====================结构学习=========================
    # ========评分搜索=========================
    # 评分
    from pgmpy.estimators import BdeuScore, K2Score, BicScore
    # 穷举搜索
    from pgmpy.estimators import ExhaustiveSearch
    # 爬山搜索
    from pgmpy.estimators import HillClimbSearch
    # ========  约束  =========================
    from pgmpy.estimators import ConstraintBasedEstimator
    # 独立性
    from pgmpy.independencies import Independencies
    # ========  混合  =========================
    from pgmpy.estimators import MmhcEstimator
    # ==================== 通用库 =========================
    import pandas as pd
    import numpy as np
    

    Parameter Learning

    
    def parameterLearning():
        data = pd.DataFrame(data={'fruit': ["banana", "apple", "banana", "apple", "banana","apple", "banana",
                                            "apple", "apple", "apple", "banana", "banana", "apple", "banana",],
                                  'tasty': ["yes", "no", "yes", "yes", "yes", "yes", "yes",
                                            "yes", "yes", "yes", "yes", "no", "no", "no"],
                                  'size': ["large", "large", "large", "small", "large", "large", "large",
                                            "small", "large", "large", "large", "large", "small", "small"]})
        model = BayesianModel([('fruit', 'tasty'), ('size', 'tasty')])  # fruit -> tasty <- size
    
        print("========================================================")
        pe = ParameterEstimator(model, data)
        print("
    ", pe.state_counts('fruit'))  # unconditional
        print("
    ", pe.state_counts('size'))  # unconditional
        print("
    ", pe.state_counts('tasty'))  # conditional on fruit and size
        print("========================================================")
        mle = MaximumLikelihoodEstimator(model, data)
        print(mle.estimate_cpd('fruit'))  # unconditional
        print(mle.estimate_cpd('tasty'))  # conditional
    
        print("========================================================")
        est = BayesianEstimator(model, data)
        print(est.estimate_cpd('tasty', prior_type='BDeu', equivalent_sample_size=10))
        # Setting equivalent_sample_size to 10 means that, in total, the equivalent
        # of 10 uniform samples is spread over all tasty states and parent
        # configurations (here 10 / (2 * 4) = 1.25 pseudo-counts per cell).
    
        print("========================================================")
        # Calibrate all CPDs of `model` using MLE:
        model.fit(data, estimator=MaximumLikelihoodEstimator)
        print("========================================================")
        # generate data
        data = pd.DataFrame(np.random.randint(low=0, high=2, size=(5000, 4)), columns=['A', 'B', 'C', 'D'])
        model = BayesianModel([('A', 'B'), ('A', 'C'), ('D', 'C'), ('B', 'D')])
        model.fit(data, estimator=BayesianEstimator, prior_type="BDeu") # default equivalent_sample_size=5
        for cpd in model.get_cpds():
            print(cpd)
    
    

    Learning a Bayesian network's CPD parameters from data

    Normally, all we have on hand is data; the parameter values of the CPDs are usually unavailable to us, or costly to obtain. So how do we learn a Bayesian network's parameters and structure from data? Here we first walk through parameter learning, i.e. learning the CPD parameters. The usual approaches are maximum likelihood estimation and Bayesian estimation. Maximum likelihood estimation is demanding about sample size and tends to overfit, especially when the data are unevenly distributed; to get around this, parameter learning is usually done with Bayesian estimation instead…
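
    As a concrete warm-up before using pgmpy: the maximum likelihood estimate of a CPD is nothing more than normalized (conditional) counts. Here is a minimal pandas sketch of that idea, using a hypothetical toy table shaped like the fruit data below:

    import pandas as pd

    # hypothetical toy data, same shape as the fruit example that follows
    df = pd.DataFrame({'fruit': ['banana', 'apple', 'banana', 'apple'],
                       'tasty': ['yes', 'no', 'yes', 'yes']})
    # MLE of P(tasty | fruit): count each (fruit, tasty) pair, then
    # normalize every fruit row so that it sums to 1
    counts = df.groupby(['fruit', 'tasty']).size().unstack(fill_value=0)
    print(counts.div(counts.sum(axis=1), axis=0))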

    First, let's fabricate some data

    import pandas as pd
    data = pd.DataFrame(data={'fruit': ["banana", "apple", "banana", "apple", "banana","apple", "banana", 
                                        "apple", "apple", "apple", "banana", "banana", "apple", "banana",], 
                              'tasty': ["yes", "no", "yes", "yes", "yes", "yes", "yes", 
                                        "yes", "yes", "yes", "yes", "no", "no", "no"], 
                              'size': ["large", "large", "large", "small", "large", "large", "large",
                                        "small", "large", "large", "large", "large", "small", "small"]})
    data
    

     
        fruit   size   tasty
    0   banana  large  yes
    1   apple   large  no
    2   banana  large  yes
    3   apple   small  yes
    4   banana  large  yes
    5   apple   large  yes
    6   banana  large  yes
    7   apple   small  yes
    8   apple   large  yes
    9   apple   large  yes
    10  banana  large  yes
    11  banana  large  no
    12  apple   small  no
    13  banana  small  no

    With the model structure known, build the Bayesian network model

    from pgmpy.models import BayesianModel
    
    model = BayesianModel([('fruit', 'tasty'), ('size', 'tasty')])
    

    Parameter learning is the task of estimating the values of the conditional probability distributions (CPDs), here for the variables fruit, size, and tasty.

    To get a sense of the given data, we can start by counting how often each state of each variable occurs. If a variable depends on parents, the counting is done conditionally on the parents' states, i.e. separately for each parent configuration:

    from pgmpy.estimators import ParameterEstimator
    
    pe = ParameterEstimator(model, data)
    print("
    ", pe.state_counts('fruit'))
    print("
    ", pe.state_counts('tasty'))  # 在fruit和size的条件下,tasty的频数
    

             fruit
    apple       7
    banana      7
    
     fruit apple       banana      
    size  large small  large small
    tasty                         
    no      1.0   1.0    1.0   1.0
    yes     3.0   2.0    5.0   0.0
    

    Maximum likelihood estimation

    from pgmpy.estimators import MaximumLikelihoodEstimator
    
    mle = MaximumLikelihoodEstimator(model, data)
    
    print("
    ", mle.estimate_cpd('fruit'))
    print("
    ", mle.estimate_cpd('tasty'))  # 在fruit和size的条件下,tasty的概率分布
    
    mle.get_parameters()
    

     +---------------+-----+
    | fruit(apple)  | 0.5 |
    +---------------+-----+
    | fruit(banana) | 0.5 |
    +---------------+-----+
    
     +------------+--------------+--------------------+---------------------+---------------+
    | fruit      | fruit(apple) | fruit(apple)       | fruit(banana)       | fruit(banana) |
    +------------+--------------+--------------------+---------------------+---------------+
    | size       | size(large)  | size(small)        | size(large)         | size(small)   |
    +------------+--------------+--------------------+---------------------+---------------+
    | tasty(no)  | 0.25         | 0.3333333333333333 | 0.16666666666666666 | 1.0           |
    +------------+--------------+--------------------+---------------------+---------------+
    | tasty(yes) | 0.75         | 0.6666666666666666 | 0.8333333333333334  | 0.0           |
    +------------+--------------+--------------------+---------------------+---------------+
    
    [<TabularCPD representing P(fruit:2) at 0x24daed0e278>,
     <TabularCPD representing P(size:2) at 0x24daed0e400>,
     <TabularCPD representing P(tasty:2 | fruit:2, size:2) at 0x24daed0e518>]
    

    model.fit(data, estimator=MaximumLikelihoodEstimator)
    
    print(model.get_cpds('fruit'))
    print(model.get_cpds('size'))
    print(model.get_cpds('tasty'))
    

    +---------------+-----+
    | fruit(apple)  | 0.5 |
    +---------------+-----+
    | fruit(banana) | 0.5 |
    +---------------+-----+
    +-------------+----------+
    | size(large) | 0.714286 |
    +-------------+----------+
    | size(small) | 0.285714 |
    +-------------+----------+
    +------------+--------------+--------------------+---------------------+---------------+
    | fruit      | fruit(apple) | fruit(apple)       | fruit(banana)       | fruit(banana) |
    +------------+--------------+--------------------+---------------------+---------------+
    | size       | size(large)  | size(small)        | size(large)         | size(small)   |
    +------------+--------------+--------------------+---------------------+---------------+
    | tasty(no)  | 0.25         | 0.3333333333333333 | 0.16666666666666666 | 1.0           |
    +------------+--------------+--------------------+---------------------+---------------+
    | tasty(yes) | 0.75         | 0.6666666666666666 | 0.8333333333333334  | 0.0           |
    +------------+--------------+--------------------+---------------------+---------------+
    

    Inference with variable elimination

    from pgmpy.inference import VariableElimination
    
    infer = VariableElimination(model)
    
    for i in infer.query(['tasty', 'size', 'fruit']).values():  # print P(tasty), P(size), P(fruit)
        print(i)

    print('Probability that a large banana is tasty:\n',
          infer.query(['tasty'], evidence={'fruit': 1, 'size': 0})['tasty'])
    

    +---------+--------------+
    | tasty   |   phi(tasty) |
    +=========+==============+
    | tasty_0 |       0.3393 |
    +---------+--------------+
    | tasty_1 |       0.6607 |
    +---------+--------------+
    +---------+--------------+
    | fruit   |   phi(fruit) |
    +=========+==============+
    | fruit_0 |       0.5000 |
    +---------+--------------+
    | fruit_1 |       0.5000 |
    +---------+--------------+
    +--------+-------------+
    | size   |   phi(size) |
    +========+=============+
    | size_0 |      0.7143 |
    +--------+-------------+
    | size_1 |      0.2857 |
    +--------+-------------+
    Probability that a large banana is tasty:
     +---------+--------------+
    | tasty   |   phi(tasty) |
    +=========+==============+
    | tasty_0 |       0.1667 |
    +---------+--------------+
    | tasty_1 |       0.8333 |
    +---------+--------------+
    

    Simple as it is, the ML estimator has the problem of overfitting the data. In the CPDs above, the probability that a large banana is tasty is estimated at 0.833, because 5 of the 6 observed large bananas were tasty. But note that the probability that a small banana is tasty is estimated at 0.0, because we observed exactly one small banana and it happened not to be tasty. That hardly allows us to conclude that small bananas aren't tasty! We simply don't have enough observations to rely on the observed frequencies. If the observed data are not representative of the underlying distribution, ML estimates will be far off.

    Insufficient data is a common problem when estimating the parameters of a Bayesian network. Even if the total sample size is very large, the state counts are done conditionally for each parent configuration, which causes immense fragmentation: if a variable has 3 parents that can each take 10 states, state counts are done separately for 10^3 = 1000 parent configurations. This makes MLE very fragile and unstable for learning Bayesian network parameters. One way to mitigate MLE's overfitting is Bayesian parameter estimation.
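
    A quick back-of-the-envelope sketch of that fragmentation, using the hypothetical 3-parents-with-10-states numbers from the paragraph above:

    import numpy as np
    import pandas as pd

    # 3 parent variables with 10 states each -> 10**3 = 1000 parent configurations
    parents = pd.DataFrame(np.random.randint(0, 10, size=(100000, 3)),
                           columns=['P1', 'P2', 'P3'])
    print(10 ** 3)  # 1000 configurations to count states for, separately
    # even with 100000 samples, each configuration receives only ~100 of them
    print(parents.groupby(['P1', 'P2', 'P3']).size().mean())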

    Bayesian parameter estimation

    Bayesian parameter estimation starts with CPDs that already exist and express our beliefs about the variables before the data are observed. Those "priors" are then updated with the state counts from the observed data.

    We can think of the prior as consisting of pseudo state counts, which are added to the actual counts before normalization. Unless one wants to encode specific prior beliefs about the variables' distributions, one usually chooses a uniform prior, i.e. one that considers all states equally likely.
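
    The pseudo-count view can also be made explicit in pgmpy through a Dirichlet prior. A hedged sketch (the 2x4 shape of pseudo_counts, one row per tasty state and one column per fruit/size configuration, is an assumption based on the model above):

    from pgmpy.estimators import BayesianEstimator

    est = BayesianEstimator(model, data)
    # add one pseudo-observation to every state / parent-configuration cell
    print(est.estimate_cpd('tasty', prior_type='dirichlet',
                           pseudo_counts=[[1, 1, 1, 1],
                                          [1, 1, 1, 1]]))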

    A very simple prior is the so-called K2 prior, which simply adds 1 to the count of every single state. A somewhat more sensible choice of prior is BDeu (Bayesian Dirichlet equivalent uniform prior). For BDeu we need to specify an equivalent sample size N; the pseudo-counts then correspond to having observed N uniform samples of each variable (and of each parent configuration).
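
    For comparison, the K2 prior from the paragraph above is just the +1 case and needs no sample-size argument; a small sketch reusing the model and data defined earlier:

    from pgmpy.estimators import BayesianEstimator

    # K2 prior: add 1 pseudo-count to every cell before normalizing
    print(BayesianEstimator(model, data).estimate_cpd('tasty', prior_type='K2'))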

    Aside: the Dirichlet distribution

    A simple example illustrates it. Suppose you have a six-sided die. You throw it 1000 times and get a distribution of face counts p1 = <H1, H2, H3, H4, H5, H6>, where H1 is the number of times face 1 came up, H2 the number of times face 2 came up, and so on for H3 through H6. You throw it another 1000 times and get another distribution p2. After repeating this N times you have N distributions: p1, p2, p3, ..., pn. Now suppose there is a distribution D that describes, for 1000 throws of this die, the probability of obtaining p1; then we can simply understand D as a distribution over the pi. Since each pi is itself a distribution, D is a distribution over distributions. The Dirichlet distribution can be understood as the distribution of multinomial distributions: each of its sample points is a multinomial distribution. Source: https://www.zhihu.com/question/23749913/answer/135084553
    A more accessible explanation: https://www.cnblogs.com/bonelee/p/14329635.html
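
    A tiny numpy sketch of that "distribution over distributions" picture: every draw from a Dirichlet is itself a probability vector over the six faces:

    import numpy as np

    alpha = np.ones(6)                            # uniform concentration over 6 faces
    samples = np.random.dirichlet(alpha, size=3)  # 3 draws, each a distribution
    print(samples)                                # each row is one multinomial parameter vector
    print(samples.sum(axis=1))                    # every row sums to 1
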
    from pgmpy.estimators import BayesianEstimator
    
    esy = BayesianEstimator(model, data)
    
    print(esy.estimate_cpd('tasty', prior_type='BDeu', equivalent_sample_size=10))
    

    +------------+---------------------+--------------------+--------------------+---------------------+
    | fruit      | fruit(apple)        | fruit(apple)       | fruit(banana)      | fruit(banana)       |
    +------------+---------------------+--------------------+--------------------+---------------------+
    | size       | size(large)         | size(small)        | size(large)        | size(small)         |
    +------------+---------------------+--------------------+--------------------+---------------------+
    | tasty(no)  | 0.34615384615384615 | 0.4090909090909091 | 0.2647058823529412 | 0.6428571428571429  |
    +------------+---------------------+--------------------+--------------------+---------------------+
    | tasty(yes) | 0.6538461538461539  | 0.5909090909090909 | 0.7352941176470589 | 0.35714285714285715 |
    +------------+---------------------+--------------------+--------------------+---------------------+
    

    The estimates in the CPD are now much more conservative. In particular, the estimate that a small banana is not tasty is now about 0.64 rather than 1.0. Setting equivalent_sample_size to 10 means that, in total, the equivalent of 10 uniform samples is spread over all tasty states and parent configurations; the printed numbers correspond to 10 / (2 states * 4 parent configurations) = 1.25 pseudo-counts per cell, as the quick check below confirms.
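
    A quick arithmetic check of that claim against the printed CPD (the per-cell split is reverse-engineered from the table above, not taken from pgmpy documentation):

    # observed small bananas: 1 not tasty, 0 tasty
    ess, n_states, n_parent_configs = 10, 2, 4
    pseudo = ess / (n_states * n_parent_configs)  # 1.25 pseudo-counts per cell
    print((1 + pseudo) / (1 + 0 + 2 * pseudo))    # 0.6428..., matches tasty(no) above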

    model2 = BayesianModel([('fruit', 'tasty'), ('size', 'tasty')])
    model2.fit(data, estimator=BayesianEstimator)
    for i in model2.get_cpds():
        print(i)
    infer2 = VariableElimination(model2)    
    print("大,香蕉是否美味的概率分布:
    ", infer2.query(['tasty'], evidence={'fruit': 1, 'size': 0})['tasty'])
    

    +------------+---------------------+---------------------+---------------------+--------------------+
    | fruit      | fruit(apple)        | fruit(apple)        | fruit(banana)       | fruit(banana)      |
    +------------+---------------------+---------------------+---------------------+--------------------+
    | size       | size(large)         | size(small)         | size(large)         | size(small)        |
    +------------+---------------------+---------------------+---------------------+--------------------+
    | tasty(no)  | 0.30952380952380953 | 0.38235294117647056 | 0.22413793103448276 | 0.7222222222222222 |
    +------------+---------------------+---------------------+---------------------+--------------------+
    | tasty(yes) | 0.6904761904761905  | 0.6176470588235294  | 0.7758620689655172  | 0.2777777777777778 |
    +------------+---------------------+---------------------+---------------------+--------------------+
    +---------------+-----+
    | fruit(apple)  | 0.5 |
    +---------------+-----+
    | fruit(banana) | 0.5 |
    +---------------+-----+
    +-------------+----------+
    | size(large) | 0.657895 |
    +-------------+----------+
    | size(small) | 0.342105 |
    +-------------+----------+
    Probability distribution of tasty for a large banana:
     +---------+--------------+
    | tasty   |   phi(tasty) |
    +=========+==============+
    | tasty_0 |       0.2241 |
    +---------+--------------+
    | tasty_1 |       0.7759 |
    +---------+--------------+
    

    That's it for parameter learning; next up: structure learning.


    Structure Learning with Scores

    
    def structuralLearning_Score():
        """
            score-based structure learning
            constraint-based structure learning
        The combination of both techniques allows further improvement:
            hybrid structure learning
        """
        print("===================基于评分=================================")
        # create random data sample with 3 variables, where Z is dependent on X, Y:
        data = pd.DataFrame(np.random.randint(0, 4, size=(5000, 2)), columns=list('XY'))
        data['Z'] = data['X'] + data['Y']
    
        bdeu = BdeuScore(data, equivalent_sample_size=5)
        k2 = K2Score(data)
        bic = BicScore(data)
    
        model1 = BayesianModel([('X', 'Z'), ('Y', 'Z')])  # X -> Z <- Y
        model2 = BayesianModel([('X', 'Z'), ('X', 'Y')])  # Y <- X -> Z
        print("==========基于评分===model1===============")
        print(bdeu.score(model1))
        print(k2.score(model1))
        print(bic.score(model1))
        print("==========基于评分===model2===============")
        print(bdeu.score(model2))
        print(k2.score(model2))
        print(bic.score(model2))
        print("==========基于评分===局部评分==============")
        print(bdeu.local_score('Z', parents=[]))
        print(bdeu.local_score('Z', parents=['X']))
        print(bdeu.local_score('Z', parents=['X', 'Y']))
        print("==========基于评分===穷举搜索算法==============")
        # 穷举搜索(计算困难),启发式搜索
        es = ExhaustiveSearch(data, scoring_method=bic)
        # 获取分数最高的分数
        best_model = es.estimate()
        print(best_model.edges())
        print("
     遍历所有的分数:")
        for score, dag in reversed(es.all_scores()):
            print(score, dag.edges())
        print("==========基于评分===爬山搜索算法==============")
        data = pd.DataFrame(np.random.randint(0, 3, size=(2500, 8)), columns=list('ABCDEFGH'))
        data['A'] += data['B'] + data['C']
        data['H'] = data['G'] - data['A']
        hc = HillClimbSearch(data, scoring_method=BicScore(data))
        best_model = hc.estimate()
        print(best_model.edges())
    

    Structure Learning with Constraints

    
    def structuralLearning_Constraint():
        print("===================基于约束=================================")
        # Identify independencies in the data set using hypothesis tests
        # Construct DAG (pattern) according to identified independencies
        data = pd.DataFrame(np.random.randint(0, 3, size=(2500, 8)),
                            columns=list('ABCDEFGH'))
        data['A'] += data['B'] + data['C']
        data['H'] = data['G'] - data['A']
        data['E'] *= data['F']
        # Independencies in the data can be identified
        #       using chi2 conditional independence tests
        est = ConstraintBasedEstimator(data)
        print("==========基于约束===条件独立测试===============")
        # test_conditional_independence(X, Y, Zs)
        # 判断X,Y在Zs的条件下是否条件独立
        # check if X is independent from Y given a set of variables Zs:
        print(est.test_conditional_independence('B', 'H'))  # dependent  False
        print(est.test_conditional_independence('B', 'E'))  # independent   True
        print(est.test_conditional_independence('B', 'H', ['A']))  # independent   True
        print(est.test_conditional_independence('A', 'G'))  # independent   True
        print(est.test_conditional_independence('A', 'G', ['H']))  # dependent   False
        print("==========基于约束===DAG构建=================")
        """
        1. 构造一个无向骨架——estimate_skeleton()
        2. 利用强迫边进行定向,得到部分有向无环图(PDAG;- skeleton_to_pdag()
        3. 通过以某种方式保守地定向剩余的边,将DAG模式扩展到DAG—pdag_to_dag()
        Step 1.&2. form the so-called PC algorithm.
        PDAGs are DirectedGraphs, that may contain both-way edges, 
            to indicate that the orientation for the edge is not determined.
        """
        skel, separating_sets = est.estimate_skeleton(significance_level=0.01)
        print("Undirected edges: ", skel.edges())
        pdag = est.skeleton_to_pdag(skel, separating_sets)
        print("PDAG edges:       ", pdag.edges())
        model = est.pdag_to_dag(pdag)
        print("DAG edges:        ", model.edges())
        print("==========基于约束===DAG构建===estimate方法=================")
        # he estimate()-method provides a shorthand for the three steps above
        #   and directly returns a BayesianModel
        # 三步并作一步,直接返回一个网络结构
        print(est.estimate(significance_level=0.01).edges())
    
        print("==========基于约束===DAG构建===从independencies中学习======")
        ind = Independencies(['B', 'C'],
                             ['A', ['B', 'C'], 'D'])
        ind = ind.closure()  # required (!) for faithfulness
        model = ConstraintBasedEstimator.estimate_from_independencies("ABCD", ind)
        print(model.edges())
    

    Structure Learning with Hybrid Methods

    def structuralLearning_Hybrid():
        """
        MMHC算法[3]结合了基于约束和基于分数的方法。它有两部分:
            1. 使用基于约束的构造过程MMPC学习无向图骨架
            2. 基于分数的优化(BDeu分数+修改爬山)
        """
        print("===================混合方法=================================")
        # generate experimental data
        data = pd.DataFrame(np.random.randint(0, 3, size=(2500, 8)), columns=list('ABCDEFGH'))
        data['A'] += data['B'] + data['C']
        data['H'] = data['G'] - data['A']
        data['E'] *= data['F']
        # build the undirected skeleton
        mmhc = MmhcEstimator(data)
        skeleton = mmhc.mmpc()
        print("Part 1) Skeleton: ", skeleton.edges())
        # score-based optimization:
        # use hill climb search to orient the edges
        hc = HillClimbSearch(data, scoring_method=BdeuScore(data))
        model = hc.estimate(tabu_length=10, white_list=skeleton.to_directed().edges())
        print("Part 2) Model:    ", model.edges())
        print("=================== Both steps in one =================================")
        # MmhcEstimator.estimate(self, scoring_method=None, tabu_length=10,
        #       significance_level=0.01)
        # mmhc.estimate(scoring_method=BdeuScore(data),tabu_length=10)
    

    main

    if __name__ == "__main__":
        parameterLearning()
        structuralLearning_Score()
        structuralLearning_Constraint()
        structuralLearning_Hybrid()
    


    Sample run output

    Parameter learning

    1 ============ Parameter estimation =================================================
    1 ======== state_counts statistics =======
    
             fruit
    apple       7
    banana      7
    
            size
    large    10
    small     4
    
     fruit apple       banana      
    size  large small  large small
    tasty                         
    no      1.0   1.0    1.0   1.0
    yes     3.0   2.0    5.0   0.0
    1 ======== MLE-estimated CPDs, per variable ===========
    +---------------+-----+
    | fruit(apple)  | 0.5 |
    +---------------+-----+
    | fruit(banana) | 0.5 |
    +---------------+-----+
    +------------+--------------+--------------------+---------------------+---------------+
    | fruit      | fruit(apple) | fruit(apple)       | fruit(banana)       | fruit(banana) |
    +------------+--------------+--------------------+---------------------+---------------+
    | size       | size(large)  | size(small)        | size(large)         | size(small)   |
    +------------+--------------+--------------------+---------------------+---------------+
    | tasty(no)  | 0.25         | 0.3333333333333333 | 0.16666666666666666 | 1.0           |
    +------------+--------------+--------------------+---------------------+---------------+
    | tasty(yes) | 0.75         | 0.6666666666666666 | 0.8333333333333334  | 0.0           |
    +------------+--------------+--------------------+---------------------+---------------+
    1 ======== Bayesian-estimated CPDs, per variable ========
    +------------+---------------------+--------------------+--------------------+---------------------+
    | fruit      | fruit(apple)        | fruit(apple)       | fruit(banana)      | fruit(banana)       |
    +------------+---------------------+--------------------+--------------------+---------------------+
    | size       | size(large)         | size(small)        | size(large)        | size(small)         |
    +------------+---------------------+--------------------+--------------------+---------------------+
    | tasty(no)  | 0.34615384615384615 | 0.4090909090909091 | 0.2647058823529412 | 0.6428571428571429  |
    +------------+---------------------+--------------------+--------------------+---------------------+
    | tasty(yes) | 0.6538461538461539  | 0.5909090909090909 | 0.7352941176470589 | 0.35714285714285715 |
    +------------+---------------------+--------------------+--------------------+---------------------+
    1 ======== fit(): estimate all variables =======
    1 ===================================
    +------+----------+
    | A(0) | 0.506593 |
    +------+----------+
    | A(1) | 0.493407 |
    +------+----------+
    +------+--------------------+---------------------+
    | A    | A(0)               | A(1)                |
    +------+--------------------+---------------------+
    | B(0) | 0.5183395779925064 | 0.48076533711277586 |
    +------+--------------------+---------------------+
    | B(1) | 0.4816604220074936 | 0.5192346628872241  |
    +------+--------------------+---------------------+
    +------+--------------------+--------------------+---------------------+--------------------+
    | A    | A(0)               | A(0)               | A(1)                | A(1)               |
    +------+--------------------+--------------------+---------------------+--------------------+
    | D    | D(0)               | D(1)               | D(0)                | D(1)               |
    +------+--------------------+--------------------+---------------------+--------------------+
    | C(0) | 0.5160626836434867 | 0.5142942227516378 | 0.49917576756645377 | 0.4964179104477612 |
    +------+--------------------+--------------------+---------------------+--------------------+
    | C(1) | 0.4839373163565132 | 0.4857057772483621 | 0.5008242324335462  | 0.5035820895522388 |
    +------+--------------------+--------------------+---------------------+--------------------+
    +------+--------------------+--------------------+
    | B    | B(0)               | B(1)               |
    +------+--------------------+--------------------+
    | D(0) | 0.5029982010793523 | 0.4918114639504693 |
    +------+--------------------+--------------------+
    | D(1) | 0.4970017989206476 | 0.5081885360495306 |
    +------+--------------------+--------------------+
    
    

    Score-based search

    
    
    
    2 =================== Score-based =================================
    2 ========== Score-based === model1 ===============
    -13939.038934816337
    -14329.822136429982
    -14295.079563299281
    2 ========== Score-based === model2 ===============
    -20900.389985754824
    -20927.22925737244
    -20944.436530518695
    2 ========== Score-based === local scores ==============
    -9232.535088735991
    -6990.879293129073
    -57.11895038935745
    2 ========== Score-based === exhaustive search ==============
    [('X', 'Z'), ('Y', 'Z')]
    
    	 All scores:
    -14295.079563299281 [('X', 'Z'), ('Y', 'Z')]
    -14326.77068416731 [('X', 'Y'), ('Z', 'X'), ('Z', 'Y')]
    -14326.770684167312 [('Y', 'X'), ('Z', 'X'), ('Z', 'Y')]
    -14326.770684167312 [('Y', 'Z'), ('Y', 'X'), ('Z', 'X')]
    -14326.770684167312 [('X', 'Z'), ('Y', 'Z'), ('Y', 'X')]
    -14326.770684167312 [('X', 'Y'), ('X', 'Z'), ('Z', 'Y')]
    -14326.770684167312 [('X', 'Y'), ('X', 'Z'), ('Y', 'Z')]
    -16536.707465219723 [('X', 'Y'), ('Z', 'Y')]
    -16537.846854154086 [('Y', 'X'), ('Z', 'X')]
    -18701.669239663883 [('Z', 'X'), ('Z', 'Y')]
    -18701.669239663883 [('Y', 'Z'), ('Z', 'X')]
    -18701.669239663886 [('X', 'Z'), ('Z', 'Y')]
    -20911.606020716295 [('Z', 'Y')]
    -20911.606020716295 [('Y', 'Z')]
    -20912.745409650663 [('Z', 'X')]
    -20912.745409650663 [('X', 'Z')]
    -20943.297141584328 [('Y', 'X'), ('Z', 'Y')]
    -20943.297141584328 [('Y', 'Z'), ('Y', 'X')]
    -20943.297141584328 [('X', 'Y'), ('Y', 'Z')]
    -20944.436530518695 [('X', 'Z'), ('Y', 'X')]
    -20944.436530518695 [('X', 'Y'), ('Z', 'X')]
    -20944.436530518695 [('X', 'Y'), ('X', 'Z')]
    -23122.682190703075 []
    -23154.373311571104 [('Y', 'X')]
    -23154.373311571104 [('X', 'Y')]
    2 ========== Score-based === hill-climb search ==============
    [('A', 'H'), ('A', 'C'), ('A', 'B'), ('C', 'B'), ('G', 'H')]
    
    
    Constraint-based
    3 =================== Constraint-based =================================
    3 ========== Constraint-based === conditional independence tests ===============
    False
    True
    True
    True
    False
    3 ========== Constraint-based === DAG construction =================
    Undirected edges:  [('A', 'B'), ('A', 'C'), ('A', 'H'), ('E', 'F'), ('G', 'H')]
    PDAG edges:        [('A', 'H'), ('B', 'A'), ('C', 'A'), ('E', 'F'), ('F', 'E'), ('G', 'H')]
    DAG edges:         [('A', 'H'), ('B', 'A'), ('C', 'A'), ('F', 'E'), ('G', 'H')]
    3 ========== Constraint-based === DAG construction === estimate() method =================
    [('A', 'H'), ('B', 'A'), ('C', 'A'), ('F', 'E'), ('G', 'H')]
    3 ========== Constraint-based === DAG construction === learning from independencies ======
    [('A', 'D'), ('B', 'D'), ('C', 'D')]
    

    Hybrid

    4 =================== Hybrid method =================================
    Part 1) Skeleton:  [('A', 'H'), ('A', 'C'), ('E', 'F'), ('G', 'H')]
    Part 2) Model:     [('A', 'C'), ('E', 'F'), ('H', 'A'), ('H', 'G')]
    4 =================== Both steps in one =================================
    ...