zoukankan      html  css  js  c++  java
  • Python 调C 示例《采样方法总结:Alias method》

    采样方法总结:Alias method

    采样方法总结:Alias method

    之前做的论文中,有一处关键的计算涉及到对一个高维度的离散分布求数学期望,这个问题本来可以简单地sum over这个概率分布的support解决,然而这个方案被reviewer质疑运行速度太慢,于是又想出一招用alias method做Monte Carlo采样算积分。用了采样以后程序变得更容易并行化,用C写了下效率大增。由于网上已经有相当多介绍alias method采样方法的文章了,其中不少还非常详细,我再写一遍也不过鹦鹉学舌,因此本文只对实现方法作简单记录。

    相关链接:

    一个国外大神写的文章,介绍非常详细:Darts, Dice, and Coins

    一篇讲得简单清晰的中文文章:【数学】时间复杂度O(1)的离散采样算法-- Alias method/别名采样方法

    C++实现:https://oroboro.com/non-uniform-random-numbers/

    各种离散采样方法的实验比较:回应CSDN肖舸《做程序,要“专注”和“客观”》,实验比较各离散采样算法 - Milo Yip - 博客园

    Naive Implementation:

    最早写的这个版本应该是网上所有alias method实现里代码量最少,也最简单暴力的版本了,虽然复杂度并不是最优。

    注意:

    • 为了方便在Python中调用,这里把生成alias table的过程写在cpp中,在Python里用numpy的C++接口调用cpp
    • 因为这个实现过于简单暴力,两种情况下代码会死循环,第一是输入的pdf不合法的情况,即pdf向量中所有元素加起来不等于1.0,第二是浮点数精度不足带来的条件判断不通过问题,这里的代码在我的电脑上运行良好;

    用shell(或用 python 调shell )跑咩?

    // TO COMPILE sampler.cpp into sampler.so: // g++ -std=c++11 sampler.cpp -o sampler.so -shared -fPIC -O2 -D_GLIBCXX_USE_CXX11_ABI=0 extern "C" { /** * @brief Generate an alias table from a discrete distribution * @param out An array of size 2 * length, representing the alias table Note that this array should be 1.0 in each column after the function **/ void gen_alias_table(float* out, int* events, int length) { int more, less, flag; while (true) { flag = 1; for (int i = 0; i < length; i++) { if (out[i] + out[i + length] < 1.0) { less = i; flag = 0; break; } } for (int i = 0; i < length; i++) { if (out[i] + out[i + length] > 1.0) { more = i; flag = 0; break; } } if (flag) break; out[less + length] = 1.0 - out[less]; out[more] = out[more] - 1.0 + out[less]; events[less + length] = more; } } } // extern "C"

    Python部分实现如下:

    # utils.py
    import numpy as np
    import ctypes as ct
    
    class discreteSampler(object):
        def __init__(self, probs):
            '''
            This class can sample from high dimensional discrete distribution using the Alias method
            For better efficiency, the generation of alias table is implemented with C++
            :param probs: a 1 dimensional numpy.ndarray representing the PDF for the discrete distribution
            '''
            length = probs.shape[-1]
            self.probs = np.copy(probs)
            self.alias = np.zeros((2, length), dtype=np.float32)
            self.alias[0, :] = self.probs * length
            self.events = np.ones((2, length), dtype=np.int32)
            self.events[0, :] = np.arange(0, length, dtype=np.int32)
            self.events[1, :] = -1.0
    
            # Load and run the C++ dynamic library
            BASE_DIR = os.path.dirname(os.path.abspath(__file__))
            self.dll = np.ctypeslib.load_library(os.path.join(BASE_DIR, 'sampler.so'), '.')
            self.dll.gen_alias_table(self.alias.ctypes.data_as(ct.c_void_p),
                                     self.events.ctypes.data_as(ct.c_void_p),
                                     ct.c_int(length))
    
        def generate(self):
            idx = random.randint(0, self.probs.shape[-1] - 1)
            if random.uniform(0.0, 1.0) <= self.alias[0, idx]:
                return idx
            else:
                return self.events[1, idx]

    最后用随机数测试下,采样10000次,可视化。

    # Testing the sampling using the alias method implementation
    import numpy as np
    import utils
    import matplotlib.pyplot as plt
    
    probs = np.random.uniform(0.0, 1.0, [10])
    probs /= np.sum(probs)
    idxs = np.arange(0, 10)
    
    sampler = utils.discreteSampler(probs)
    collect = list()
    for it in range(10000):
        collect.append(sampler.generate())
        
    plt.figure()
    plt.subplot(121)
    plt.bar(idxs, probs)
    plt.title('true distribution')
    plt.subplot(122)
    plt.hist(collect, bins=10, normed=True)
    plt.title('sampled distribution')
    plt.show()

    运行结果:

  • 相关阅读:
    hdu 4027 Can you answer these queries?
    hdu 4041 Eliminate Witches!
    hdu 4036 Rolling Hongshu
    pku 2828 Buy Tickets
    hdu 4016 Magic Bitwise And Operation
    pku2886 Who Gets the Most Candies?(线段树+反素数打表)
    hdu 4039 The Social Network
    hdu 4023 Game
    苹果官方指南:Cocoa框架(2)(非原创)
    cocos2d 中 CCNode and CCAction
  • 原文地址:https://www.cnblogs.com/cx2016/p/12801978.html
Copyright © 2011-2022 走看看