  • Spark RDD top-K sort: getting the top K elements in sorted order

    You can use either top or takeOrdered with a key argument:

    newRDD.top(2, key=lambda x: x[2])
    

    or

    newRDD.takeOrdered(2, key=lambda x: -x[2])
    

    Note that top returns elements in descending order while takeOrdered returns them in ascending order, so the key function differs between the two calls: takeOrdered needs the key negated to select the largest values.
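
    The relationship between the two calls can be checked locally without a Spark cluster. The following is a minimal plain-Python sketch, not PySpark itself: heapq.nlargest stands in for top and heapq.nsmallest stands in for takeOrdered, and the rows list is hypothetical sample data whose third field (x[2]) is the sort field.

```python
import heapq

# Hypothetical sample rows; index 2 is the field we rank by.
rows = [("a", 1, 10), ("b", 2, 30), ("c", 3, 20), ("d", 4, 40)]

# Stand-in for newRDD.top(2, key=lambda x: x[2]):
# the 2 largest elements by x[2], in descending order.
top_like = heapq.nlargest(2, rows, key=lambda x: x[2])

# Stand-in for newRDD.takeOrdered(2, key=lambda x: -x[2]):
# the 2 smallest elements by the negated key, i.e. the same 2 rows.
take_ordered_like = heapq.nsmallest(2, rows, key=lambda x: -x[2])

print(top_like)
print(take_ordered_like)
```

    Both lists contain the same two rows in the same order, which is why negating the key makes takeOrdered behave like top.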

    For a pair RDD, to sort by key or by value:

    Sort by key and value in ascending and descending order

    val textfile = sc.textFile("file:///home/hdfs/input.txt")
    val words = textfile.flatMap(line => line.split(" "))
    // Build (word, count) pairs once and reuse them for each sort
    val counts = words.map(word => (word, 1)).reduceByKey((a, b) => a + b)

    // Sort by value in descending order (second argument is ascending = false)
    counts.sortBy(_._2, false)
    // Sort by value in ascending order (the default)
    counts.sortBy(_._2)

    // Sort by key in ascending order (the default)
    counts.sortByKey()
    // Sort by key in descending order
    counts.sortByKey(false)
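
    To see what the four orderings produce on a small input, here is a plain-Python sketch of the same word-count-then-sort pipeline. It is an illustration under assumptions, not Spark code: the word_counts helper and the sample lines are hypothetical, Counter plays the role of map plus reduceByKey, and sorted plays the role of sortBy and sortByKey.

```python
from collections import Counter

def word_counts(lines):
    """Count space-separated words, roughly flatMap + map + reduceByKey."""
    counts = Counter()
    for line in lines:
        counts.update(line.split(" "))
    return list(counts.items())

# Hypothetical input standing in for the lines of input.txt.
lines = ["b a c", "a b", "a"]
pairs = word_counts(lines)

# Like sortBy(_._2, false) and sortBy(_._2)
by_value_desc = sorted(pairs, key=lambda kv: kv[1], reverse=True)
by_value_asc = sorted(pairs, key=lambda kv: kv[1])

# Like sortByKey() and sortByKey(false)
by_key_asc = sorted(pairs, key=lambda kv: kv[0])
by_key_desc = sorted(pairs, key=lambda kv: kv[0], reverse=True)

print(by_value_desc)
print(by_key_desc)
```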
  • Original article: https://www.cnblogs.com/bonelee/p/14522640.html