  • Winter Break Study Notes 09

    Spark GraphX Example

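    Suppose we want to build a graph from a few text files, restrict it to the important relationships and users, run PageRank on the resulting subgraph, and finally return the attributes associated with the top users. This can be done as follows.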

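    // Import Spark core and the GraphX API
    import org.apache.spark.SparkContext
    import org.apache.spark.graphx._

    // Connect to the Spark cluster
    val sc = new SparkContext("spark://master.amplab.org", "research")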
    
    // Load my user data and parse into tuples of user id and attribute list
    val users = (sc.textFile("graphx/data/users.txt")
      .map(line => line.split(",")).map( parts => (parts.head.toLong, parts.tail) ))
    
    // Parse the edge data which is already in userId -> userId format
    val followerGraph = GraphLoader.edgeListFile(sc, "graphx/data/followers.txt")
    
    // Attach the user attributes
    val graph = followerGraph.outerJoinVertices(users) {
      case (uid, deg, Some(attrList)) => attrList
      // Some users may not have attributes so we set them as empty
      case (uid, deg, None) => Array.empty[String]
    }
    
    // Restrict the graph to users with usernames and names
    val subgraph = graph.subgraph(vpred = (vid, attr) => attr.size == 2)
    
    // Compute the PageRank
    val pagerankGraph = subgraph.pageRank(0.001)
    
    // Get the attributes of the top pagerank users
    val userInfoWithPageRank = subgraph.outerJoinVertices(pagerankGraph.vertices) {
      case (uid, attrList, Some(pr)) => (pr, attrList.toList)
      case (uid, attrList, None) => (0.0, attrList.toList)
    }
    
    println(userInfoWithPageRank.vertices.top(5)(Ordering.by(_._2._1)).mkString("\n"))
    
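The parse, join, and filter steps in the listing above can be tried without a cluster. The following is only a minimal sketch using plain Scala collections in place of RDDs. The sample lines, the `id,username,name` layout, the vertex ids, and the object name `UserParsingSketch` are all assumptions made for illustration; they are not taken from the original data files.

```scala
object UserParsingSketch {
  // Hypothetical sample lines, assuming an "id,username,name" layout.
  val lines: Seq[String] = Seq(
    "1,alice,Alice Example",
    "2,bob,Bob Example",
    "3,anon"
  )

  // Same parsing as the users RDD: the first field is the id,
  // the remaining fields form the attribute list.
  def parse(line: String): (Long, Array[String]) = {
    val parts = line.split(",")
    (parts.head.toLong, parts.tail)
  }

  val users: Map[Long, Array[String]] = lines.map(parse).toMap

  // Hypothetical vertex ids taken from an edge list. Vertex 4 has no
  // user record, so it gets an empty attribute array, mirroring the
  // None case of outerJoinVertices.
  val vertexIds: Seq[Long] = Seq(1L, 2L, 3L, 4L)
  val joined: Seq[(Long, Array[String])] =
    vertexIds.map(id => (id, users.getOrElse(id, Array.empty[String])))

  // Mirror of the vpred in subgraph: keep vertices with exactly two attributes.
  val kept: Seq[Long] = joined.collect { case (id, attr) if attr.size == 2 => id }

  def main(args: Array[String]): Unit =
    println(kept.mkString(", ")) // prints "1, 2"
}
```

Vertex 3 is dropped because it has only a username, and vertex 4 is dropped because it has no attributes at all. This is the same effect `attr.size == 2` has on the real graph.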

      

  • Original article: https://www.cnblogs.com/wxy2000/p/12291783.html