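Narrow dependency: NarrowDependency
Parent and child RDD partitions have a one-to-one relationship: each parent partition feeds at most one child partition, e.g. map, filter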
Wide dependency: ShuffleDependency
Essentially a shuffle: data from one parent RDD partition goes to multiple child RDD partitions, e.g. reduceByKey, groupByKey
If a shuffle is involved, the dependency is wide; otherwise it is narrow
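The distinction above can be sketched with a toy model in plain Python. This is not Spark code: an "RDD" here is assumed to be a list of partitions, `map_values` and `group_by_key` are hypothetical helpers, and `key % num_partitions` stands in for a hash partitioner. The `fan_out` dict records which child partitions each parent partition sends data to.

```python
from collections import defaultdict

# Toy model (assumption): an "RDD" is a list of partitions,
# and each partition is a list of (key, value) pairs.
parent = [[(1, "a"), (2, "b")], [(1, "c"), (3, "d")]]

def map_values(partitions, fn):
    """Narrow: child partition i is computed from parent partition i only."""
    return [[(k, fn(v)) for k, v in part] for part in partitions]

def group_by_key(partitions, num_partitions):
    """Wide: records are routed by key, so one parent partition
    can feed several child partitions."""
    buckets = [defaultdict(list) for _ in range(num_partitions)]
    fan_out = defaultdict(set)  # parent partition index -> child partition indices
    for src, part in enumerate(partitions):
        for k, v in part:
            dst = k % num_partitions  # stand-in for a hash partitioner
            buckets[dst][k].append(v)
            fan_out[src].add(dst)
    children = [sorted(b.items()) for b in buckets]
    return children, dict(fan_out)

mapped = map_values(parent, str.upper)
grouped, fan_out = group_by_key(parent, 2)

print(mapped)
print(grouped)
print(fan_out)
```

In this sketch parent partition 0 sends records to both child partitions, which is exactly the one-to-many pattern that marks a wide dependency, while `map_values` never moves a record out of its partition.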
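Narrow-dependency functions: map, filter, union, join (when the parent RDDs are already hash-partitioned with the same partitioner), mapPartitions, mapValues
Wide-dependency functions: groupByKey, join (when the parent RDDs are not hash-partitioned), partitionBy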
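Why join appears in both lists can be illustrated with another plain-Python sketch, again not Spark code. It assumes integer keys, a toy partitioner `key % num_partitions`, and two hypothetical helpers, `partition_by` and `join_copartitioned`. Once both sides are partitioned the same way, matching keys already sit in the same partition index, so the join only pairs partition i with partition i.

```python
def partition_by(records, num_partitions):
    """Route each (key, value) record to partition key % num_partitions.
    This step moves records between partitions, i.e. it is the wide step."""
    parts = [[] for _ in range(num_partitions)]
    for k, v in records:
        parts[k % num_partitions].append((k, v))
    return parts

def join_copartitioned(left, right):
    """Join partition i of left with partition i of right only;
    no record crosses a partition boundary, i.e. a narrow step."""
    out = []
    for lpart, rpart in zip(left, right):
        out.append([(k, (lv, rv))
                    for k, lv in lpart
                    for rk, rv in rpart
                    if k == rk])
    return out

# Both sides use the same partitioner and the same partition count.
left = partition_by([(1, "a"), (2, "b"), (3, "c")], 2)
right = partition_by([(1, "x"), (3, "y"), (4, "z")], 2)
joined = join_copartitioned(left, right)

print(joined)
```

If the two sides were not partitioned the same way, matching keys could sit in different partition indices and the join would first have to redistribute records, which is the wide case.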