  • Spark MLlib PrefixSpan demo

    ./bin/spark-submit ~/src_test/prefix_span_test.py
    

    Source code:

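    from pyspark import SparkContext
    from pyspark.mllib.fpm import PrefixSpan

    # Local Spark context; "testing" is the application name.
    sc = SparkContext("local", "testing")
    print(sc)

    # Each sequence is an ordered list of itemsets; each itemset is a list of items.
    data = [
        [["a"], ["a", "b", "c"], ["a", "c"], ["d"], ["c", "f"]],
        [["a", "d"], ["c"], ["b", "c"], ["a", "e"]],
        [["e", "f"], ["a", "b"], ["d", "f"], ["c"], ["b"]],
        [["e"], ["g"], ["a", "f"], ["c"], ["b"], ["c"]],
    ]
    rdd = sc.parallelize(data, 2)

    # minSupport=0.5: keep patterns that occur in at least half of the sequences.
    # maxPatternLength=4: upper bound on the length of a reported pattern.
    model = PrefixSpan.train(rdd, minSupport=0.5, maxPatternLength=4)
    result = sorted(model.freqSequences().collect())
    print("*" * 88)
    print(result)
    print("*" * 88)
    sc.stop()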
    

    Output:

    ****************************************************************************************
    [FreqSequence(sequence=[['a']], freq=4), FreqSequence(sequence=[['a'], ['a']], freq=2), FreqSequence(sequence=[['a'], ['b']], freq=4), FreqSequence(sequence=[['a'], ['b'], ['a']], freq=2), FreqSequence(sequence=[['a'], ['b'], ['c']], freq=2), FreqSequence(sequence=[['a'], ['b', 'c']], freq=2), FreqSequence(sequence=[['a'], ['b', 'c'], ['a']], freq=2), FreqSequence(sequence=[['a'], ['c']], freq=4), FreqSequence(sequence=[['a'], ['c'], ['a']], freq=2), FreqSequence(sequence=[['a'], ['c'], ['b']], freq=3), FreqSequence(sequence=[['a'], ['c'], ['c']], freq=3), FreqSequence(sequence=[['a'], ['d']], freq=2), FreqSequence(sequence=[['a'], ['d'], ['c']], freq=2), FreqSequence(sequence=[['a'], ['f']], freq=2), FreqSequence(sequence=[['b']], freq=4), FreqSequence(sequence=[['b'], ['a']], freq=2), FreqSequence(sequence=[['b'], ['c']], freq=3), FreqSequence(sequence=[['b'], ['d']], freq=2), FreqSequence(sequence=[['b'], ['d'], ['c']], freq=2), FreqSequence(sequence=[['b'], ['f']], freq=2), FreqSequence(sequence=[['b', 'a']], freq=2), FreqSequence(sequence=[['b', 'a'], ['c']], freq=2), FreqSequence(sequence=[['b', 'a'], ['d']], freq=2), FreqSequence(sequence=[['b', 'a'], ['d'], ['c']], freq=2), FreqSequence(sequence=[['b', 'a'], ['f']], freq=2), FreqSequence(sequence=[['b', 'c']], freq=2), FreqSequence(sequence=[['b', 'c'], ['a']], freq=2), FreqSequence(sequence=[['c']], freq=4), FreqSequence(sequence=[['c'], ['a']], freq=2), FreqSequence(sequence=[['c'], ['b']], freq=3), FreqSequence(sequence=[['c'], ['c']], freq=3), FreqSequence(sequence=[['d']], freq=3), FreqSequence(sequence=[['d'], ['b']], freq=2), FreqSequence(sequence=[['d'], ['c']], freq=3), FreqSequence(sequence=[['d'], ['c'], ['b']], freq=2), FreqSequence(sequence=[['e']], freq=3), FreqSequence(sequence=[['e'], ['a']], freq=2), FreqSequence(sequence=[['e'], ['a'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['a'], ['c']], freq=2), FreqSequence(sequence=[['e'], ['a'], ['c'], ['b']], freq=2), 
    FreqSequence(sequence=[['e'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['b'], ['c']], freq=2), FreqSequence(sequence=[['e'], ['c']], freq=2), FreqSequence(sequence=[['e'], ['c'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['f']], freq=2), FreqSequence(sequence=[['e'], ['f'], ['b']], freq=2), FreqSequence(sequence=[['e'], ['f'], ['c']], freq=2), FreqSequence(sequence=[['e'], ['f'], ['c'], ['b']], freq=2), FreqSequence(sequence=[['f']], freq=3), FreqSequence(sequence=[['f'], ['b']], freq=2), FreqSequence(sequence=[['f'], ['b'], ['c']], freq=2), FreqSequence(sequence=[['f'], ['c']], freq=2), FreqSequence(sequence=[['f'], ['c'], ['b']], freq=2)]
    ****************************************************************************************
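    How to read freq: it is the number of input sequences that contain the pattern as a subsequence. The sketch below is plain Python with no Spark and recounts a few of the patterns by hand. The helpers `contains` and `support` are hypothetical names made up for this check, not part of pyspark. The cutoff of ceil(minSupport * number of sequences) is my understanding of how Spark turns minSupport into a count, so treat it as an assumption.

```python
import math

# Same four sequences as in the demo; each sequence is a list of itemsets.
data = [
    [["a"], ["a", "b", "c"], ["a", "c"], ["d"], ["c", "f"]],
    [["a", "d"], ["c"], ["b", "c"], ["a", "e"]],
    [["e", "f"], ["a", "b"], ["d", "f"], ["c"], ["b"]],
    [["e"], ["g"], ["a", "f"], ["c"], ["b"], ["c"]],
]


def contains(sequence, pattern):
    """True if every itemset of `pattern` is a subset of a distinct itemset
    of `sequence`, with the matched itemsets appearing in the same order."""
    pos = 0
    for itemset in pattern:
        needed = set(itemset)
        # Greedily advance to the earliest itemset that covers `needed`.
        while pos < len(sequence) and not needed <= set(sequence[pos]):
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must match strictly later
    return True


def support(sequences, pattern):
    """Number of sequences that contain `pattern`."""
    return sum(contains(s, pattern) for s in sequences)


# Assumed cutoff: minSupport=0.5 over 4 sequences gives a minimum count of 2.
min_count = math.ceil(0.5 * len(data))

for pattern in ([["a"], ["b"]], [["e"], ["a"], ["c"], ["b"]], [["g"]]):
    n = support(data, pattern)
    print(pattern, n, "kept" if n >= min_count else "dropped")
```

    [['a'], ['b']] counts 4 and [['e'], ['a'], ['c'], ['b']] counts 2, matching the freq values in the output above. [['g']] appears in only one sequence, which is below the cutoff, so it is absent from the output.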

  • Original article: https://www.cnblogs.com/bonelee/p/10755622.html