假设数据规模为n,采样为k,
蓄水池采样算法是针对大数据集或者数据规模不确定的算法:空间为k,时间为n,
- 先选取数据流中的前k个元素,保存在集合A中;
- 从第j(k + 1 <= j <= n)个元素开始,每次先以概率p = k/j选择是否让第j个元素留下。若j被选中,则从A中随机选择一个元素并用该元素j替换它;否则直接淘汰该元素;
- 重复步骤2直到结束,最后集合A中剩下的就是保证随机抽取的k个元素。
具体证明参见:蓄水池采样证明及java代码
以及:
C++代码:知道总体长度产生(0~n-1)的随机数方法:
1 /** 2 * Definition for singly-linked list. 3 * struct ListNode { 4 * int val; 5 * ListNode *next; 6 * ListNode(int x) : val(x), next(NULL) {} 7 * }; 8 */ 9 class Solution { 10 public: 11 /** @param head The linked list's head. 12 Note that the head is guaranteed to be not null, so it contains at least one node. */ 13 Solution(ListNode* head) { 14 len=0; 15 ListNode* cur=head; 16 this->head = head; 17 while(cur){ 18 len++; 19 cur=cur->next; 20 } 21 } 22 23 /** Returns a random node's value. */ 24 int getRandom() { 25 int t = rand() % len;//产生0~(len-1)之间的随机数 26 ListNode* cur=head; 27 while(t){ 28 t--; 29 cur=cur->next; 30 } 31 return cur->val; 32 } 33 private: 34 int len; 35 ListNode* head; 36 }; 37 38 /** 39 * Your Solution object will be instantiated and called as such: 40 * Solution obj = new Solution(head); 41 * int param_1 = obj.getRandom(); 42 */
C++解法二:蓄水池算法
1 /** 2 * Definition for singly-linked list. 3 * struct ListNode { 4 * int val; 5 * ListNode *next; 6 * ListNode(int x) : val(x), next(NULL) {} 7 * }; 8 */ 9 class Solution { 10 public: 11 /** @param head The linked list's head. 12 Note that the head is guaranteed to be not null, so it contains at least one node. */ 13 Solution(ListNode* head) { 14 this->head = head; 15 } 16 17 /** Returns a random node's value. */ 18 int getRandom() { 19 ListNode* cur=head->next; 20 int res=head->val; 21 int j=2; 22 //维护一个长度为1的蓄水池,每次抽取概率1/j; 23 while(cur){ 24 int t=rand()%j; 25 if(t==0) res=cur->val; 26 cur=cur->next; 27 j++; 28 } 29 return res; 30 } 31 private: 32 ListNode* head; 33 }; 34 35 /** 36 * Your Solution object will be instantiated and called as such: 37 * Solution obj = new Solution(head); 38 * int param_1 = obj.getRandom(); 39 */
执行92ms,看看执行28ms的
leetcode答案执行效率28ms
1 /** 2 * Definition for singly-linked list. 3 * struct ListNode { 4 * int val; 5 * ListNode *next; 6 * ListNode(int x) : val(x), next(NULL) {} 7 * }; 8 */ 9 class Solution { 10 public: 11 /** @param head The linked list's head. 12 Note that the head is guaranteed to be not null, so it contains at least one node. */ 13 14 int length = 0; 15 struct ListNode* h; 16 17 Solution(ListNode* head) { 18 h = head; 19 struct ListNode* p = head; 20 while(p) 21 { 22 length++; 23 p = p->next; 24 } 25 } 26 27 /** Returns a random node's value. */ 28 int getRandom() { 29 int n = random()%length; 30 int i; 31 struct ListNode* p = h; 32 for(i=0;i<n;i++) 33 p = p->next; 34 return p->val; 35 } 36 }; 37 38 /** 39 * Your Solution object will be instantiated and called as such: 40 * Solution obj = new Solution(head); 41 * int param_1 = obj.getRandom(); 42 */