  • naive cube implementation in python

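    An implementation of the naive cube algorithm described in this paper. Written in Python, it really does look almost the same as the pseudocode =.=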

    The input looks roughly like this. The columns are, in order:

    index  userid  country  state  city  topic  category  product  sales
    1    400141    3    78    3427    3    59    4967    4670.08
    2    783984    1    34    9    1    5    982    5340.9
    3    4945    1    47    1658    1    7    363    3065.37
    4    468352    2    57    2410    2    37    3688    9561.13
    5    553471    1    25    550    1    13    1476    3596.72
    6    649149    1    9    234    1    12    1456    2126.29
    ...

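    The output format is as follows: for each combination of attrs (identified by position rather than by name) and each combination of their values, output the measure computed over the corresponding group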

    <attr><attr><attr>...|<value><value>...    <measure>
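    To make the key format concrete, here is a minimal sketch that builds the `<attr>...|<value>...` part for the first sample row. The helper name `make_key` is hypothetical and does not appear in the original scripts; it only mirrors the string formatting they use.

```python
# Hypothetical helper (not in the original post): builds the
# "<attr positions>|<values>" key for one row and one region,
# where a region is a list of column positions.
def make_key(row, region):
    attrs = ' '.join(str(i) for i in region)
    values = ' '.join(row[i] for i in region)
    return "%s|%s" % (attrs, values)


# First sample row from the input above, split on whitespace.
row = "1 400141 3 78 3427 3 59 4967 4670.08".split()

print(make_key(row, [2, 3, 5]))  # country, state, topic -> "2 3 5|3 78 3"
print(make_key(row, []))         # empty region (all rows in one group) -> "|"
```

    Positions count from 0, so position 2 is country and position 5 is topic.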

    mapper:

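    #!/usr/bin/env python
    import sys
    from itertools import product


    def seq(start, end):
        # All prefixes of the column range start..end, including the empty one.
        # list() keeps the later `a + b` concatenation working on Python 3.
        return [list(range(start, i)) for i in range(start, end + 2)]


    def read_input(file):
        for line in file:
            yield line.split()


    def main():
        data = read_input(sys.stdin)
        # Each region joins a prefix of columns 2-4 with a prefix of columns 5-7.
        C = [a + b for a, b in product(seq(2, 4), seq(5, 7))]
        for e in data:
            for R in C:
                k = [e[i] for i in R]
                # Key: "<attr positions>|<values>"; value: the userid (column 1).
                print("%s|%s\t%s" % (' '.join([str(i) for i in R]), ' '.join(k), e[1]))

    if __name__ == "__main__":
        main()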
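    The list `C` in the mapper is easier to follow when printed. The sketch below reuses `seq` with `list(range(...))` so that it runs on Python 3. Reading columns 2-4 as a country/state/city hierarchy and columns 5-7 as a topic/category/product hierarchy is an assumption based on the column names, not something the post states.

```python
from itertools import product


def seq(start, end):
    # All prefixes of the column range start..end, including the empty prefix.
    return [list(range(start, i)) for i in range(start, end + 2)]


location = seq(2, 4)  # [[], [2], [2, 3], [2, 3, 4]]
topic = seq(5, 7)     # [[], [5], [5, 6], [5, 6, 7]]

# Every pairing of a location prefix with a topic prefix is one cube region.
regions = [a + b for a, b in product(location, topic)]

print(len(regions))  # 4 * 4 = 16 regions
for r in regions:
    print(r)
```

    Because only prefixes are used, a region such as state without country is never generated, which keeps the count at 16 rather than all 64 subsets of six columns.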

    reducer:

    #!/usr/bin/env python

    from itertools import groupby
    from operator import itemgetter
    import sys


    def read_input(file):
        for line in file:
            yield line.rstrip().split('\t')


    def main():
        data = read_input(sys.stdin)
        # groupby only merges adjacent lines, so the input must be sorted by key.
        for key, group in groupby(data, itemgetter(0)):
            # The measure is the number of distinct userids in the group.
            ids = set(uid for _, uid in group)
            print("%s\t%d" % (key, len(ids)))

    if __name__ == "__main__":
        main()
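    The two scripts can be checked without a cluster by simulating map, sort and reduce in one process. This is only a sketch: the function names `map_rows` and `reduce_pairs` are hypothetical, `sorted()` stands in for whatever sort step the framework would normally perform, and the data is just the six sample rows shown earlier.

```python
from itertools import groupby, product
from operator import itemgetter

# The six sample rows from the post (header omitted).
SAMPLE = """\
1 400141 3 78 3427 3 59 4967 4670.08
2 783984 1 34 9 1 5 982 5340.9
3 4945 1 47 1658 1 7 363 3065.37
4 468352 2 57 2410 2 37 3688 9561.13
5 553471 1 25 550 1 13 1476 3596.72
6 649149 1 9 234 1 12 1456 2126.29
"""


def seq(start, end):
    return [list(range(start, i)) for i in range(start, end + 2)]


REGIONS = [a + b for a, b in product(seq(2, 4), seq(5, 7))]


def map_rows(rows):
    # Same logic as the mapper: one (key, userid) pair per row and region.
    for e in rows:
        for R in REGIONS:
            key = "%s|%s" % (' '.join(str(i) for i in R),
                             ' '.join(e[i] for i in R))
            yield key, e[1]


def reduce_pairs(pairs):
    # sorted() plays the role of the sort between the map and reduce steps.
    result = {}
    for key, group in groupby(sorted(pairs), itemgetter(0)):
        result[key] = len(set(uid for _, uid in group))
    return result


rows = [line.split() for line in SAMPLE.splitlines()]
counts = reduce_pairs(map_rows(rows))
print(counts["|"])    # distinct users over the whole sample
print(counts["2|1"])  # distinct users whose country is 1
```

    On the sample rows the empty region `|` counts all six users, and `2|1` counts the four users whose country value is 1.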

    Picking Python for the course project means I get to play with all kinds of code-shortening tricks, which is great fun...

  • Original article: https://www.cnblogs.com/joyeecheung/p/3667776.html