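countByKey and countByValue are both action operators. Each returns its result to the driver as a local Map, so no separate collect is needed before printing. countByKey counts the records for each key; countByValue counts the occurrences of each distinct element, which for a pair RDD is the whole (key, value) tuple.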
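// Run in spark-shell, where `spark` (the SparkSession) is already defined.
spark.sparkContext.setLogLevel("error")
val bd = spark.sparkContext.parallelize(List(("hive",2),("hive",1),("hive",2),("hive",1),("hive",3),("spark",2),("spark",2)))
// Records per key: hive -> 5, spark -> 2 (Map iteration order is not guaranteed)
bd.countByKey().foreach(println(_))
println("--------------------------------")
// Occurrences per distinct (key, value) tuple: (hive,2) -> 2, (hive,1) -> 2, (hive,3) -> 1, (spark,2) -> 2
bd.countByValue().foreach(println(_))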
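To make the difference between the two actions concrete, here is a minimal sketch that reproduces the same counts on a plain Scala List, with no Spark involved. It only illustrates what each action computes, not how Spark implements it; the names data, byKey and byValue are made up for this example.

```scala
// Same records as the RDD above, held in a local List.
val data = List(("hive", 2), ("hive", 1), ("hive", 2), ("hive", 1), ("hive", 3), ("spark", 2), ("spark", 2))

// Like countByKey: ignore the value and count the records that share each key.
val byKey: Map[String, Long] =
  data.groupBy(_._1).map { case (k, records) => k -> records.size.toLong }

// Like countByValue: the whole (key, value) tuple is the element being counted.
val byValue: Map[(String, Int), Long] =
  data.groupBy(identity).map { case (kv, records) => kv -> records.size.toLong }

byKey.foreach(println(_))
println("--------------------------------")
byValue.foreach(println(_))
```

byKey has one entry per key, while byValue has one entry per distinct tuple, so ("hive",2) and ("hive",1) are counted separately.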