  • Using np.random.choice

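    While reading the RL source code from Morvan Python (莫烦Python), I found that the replay memory class, Memory, in his DDPG implementation is written like this: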

    import numpy as np
    
    class Memory(object):
        def __init__(self, capacity, dims):
            self.capacity = capacity
            self.data = np.zeros((capacity, dims))
            self.pointer = 0
    
        def store_transition(self, s, a, r, s_):
            transition = np.hstack((s, a, [r], s_))
            index = self.pointer % self.capacity  # replace the old memory with new memory
            self.data[index, :] = transition
            self.pointer += 1
    
        def sample(self, n):
            assert self.pointer >= self.capacity, 'Memory has not been fulfilled'
            indices = np.random.choice(self.capacity, size=n)
            return self.data[indices, :]

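    Here the sample method asserts pointer >= capacity, which means the Memory must be full before any learning can start.

    I am designing a scheme that initially stores only the better transitions (those with higher reward) in the memory. Waiting until the memory is full before learning seems wasteful: once it is full, the good transitions are quickly overwritten by worse ones, and the good transitions may never even fill the Memory, so the agent cannot learn effectively from the good experience.

    This is where np.random.choice needs a closer look. Here is its docstring: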

    def choice(a, size=None, replace=True, p=None): # real signature unknown; restored from __doc__
        """
        choice(a, size=None, replace=True, p=None)
        
                Generates a random sample from a given 1-D array
        
                        .. versionadded:: 1.7.0
        
                Parameters
                -----------
                a : 1-D array-like or int
                    If an ndarray, a random sample is generated from its elements.
                    If an int, the random sample is generated as if a were np.arange(a)
                size : int or tuple of ints, optional
                    Output shape.  If the given shape is, e.g., ``(m, n, k)``, then
                    ``m * n * k`` samples are drawn.  Default is None, in which case a
                    single value is returned.
                replace : boolean, optional
                    Whether the sample is with or without replacement
                p : 1-D array-like, optional
                    The probabilities associated with each entry in a.
                    If not given the sample assumes a uniform distribution over all
                    entries in a.
        
                Returns
                --------
                samples : single item or ndarray
                    The generated random samples
        """

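    The first parameter is normally an ndarray; if an int is given, np treats it as if it were np.arange(a).

    The main question here is: when a (we pass an int) is smaller than size, how does np draw the samples?

    Test it with code: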
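    To make the two forms of the first argument concrete, here is a small sketch. The variable names (from_int, values, from_array) are only illustrative; the output is random, so none is shown.

```python
import numpy as np

# Passing an int: samples are drawn as if from np.arange(4), i.e. from {0, 1, 2, 3}.
from_int = np.random.choice(4, size=6)

# Passing a 1-D array: samples are drawn from the array's own elements.
values = np.array([10, 20, 30, 40])
from_array = np.random.choice(values, size=6)

print(from_int)
print(from_array)
```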

    import numpy as np
    
    samples = np.random.choice(3, 5)
    print(samples)

    Output:

    [2 1 2 1 1]

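    So values are drawn repeatedly from np.arange(a), which means np.random.choice samples with replacement. This matches the default replace=True in the signature above; I did not read the implementation itself, but the repeats show that it holds at least when a < size.

    I then tested np.random.choice(5, 5), np.random.choice(10, 5), and so on. Running them a few times shows that samples does contain duplicates: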
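    For contrast, the sketch below passes replace=False, the parameter listed in the docstring. With it, asking for more samples than the population holds raises a ValueError instead of repeating values. The variable names are only illustrative.

```python
import numpy as np

# Default (replace=True): size may exceed the population, so repeats are unavoidable here.
with_replacement = np.random.choice(3, size=5)
print(with_replacement)

# replace=False: every draw must be distinct, so size cannot exceed the population.
try:
    np.random.choice(3, size=5, replace=False)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# With size <= population, replace=False returns distinct values.
without_replacement = np.random.choice(10, size=5, replace=False)
print(without_replacement)
```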

    import numpy as np
    
    samples = np.random.choice(10, 5)
    print(samples)
    
    Output:

    [3 4 3 4 5]
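    Since sampling is with replacement, one possible way to learn before the memory is full is to sample only from the rows that have been written so far. The following is a minimal sketch under that assumption, not Morvan's implementation: the class name PartialMemory, the filled variable, and the toy transitions are all hypothetical.

```python
import numpy as np


class PartialMemory(object):
    """Hypothetical variant of Memory that can be sampled before it is full."""

    def __init__(self, capacity, dims):
        self.capacity = capacity
        self.data = np.zeros((capacity, dims))
        self.pointer = 0

    def store_transition(self, s, a, r, s_):
        transition = np.hstack((s, a, [r], s_))
        index = self.pointer % self.capacity  # overwrite the oldest row once full
        self.data[index, :] = transition
        self.pointer += 1

    def sample(self, n):
        # Number of rows that actually hold a stored transition.
        filled = min(self.pointer, self.capacity)
        assert filled > 0, 'Memory is empty'
        # replace=True (the default) allows n > filled; rows then repeat in the batch.
        indices = np.random.choice(filled, size=n)
        return self.data[indices, :]


# Toy usage: 2-dim state, 1-dim action, scalar reward, 2-dim next state.
memory = PartialMemory(capacity=100, dims=2 + 1 + 1 + 2)
for i in range(3):
    memory.store_transition(s=[i, i], a=[0.5], r=1.0, s_=[i + 1, i + 1])

batch = memory.sample(8)  # only 3 rows are stored, so the batch contains repeats
print(batch.shape)
```

    With only a few stored transitions, each batch repeats the same rows many times, so whether this helps learning depends on the task.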
  • Original article: https://www.cnblogs.com/peanutk/p/10786839.html