
Using Spark Dataset join in Java

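import java.util.Arrays;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import scala.collection.JavaConverters;

// df1, df2, post_one_cat, categories and cat are Dataset<Row> instances loaded elsewhere
Dataset<Row> df1, df2, df3;

// This form works: inner join on the shared column name "post_id"
df3 = df1.join(df2, "post_id")
        .selectExpr("hostname,request_date,post_id,title,author,name as category".split(","));

// When joining on a column expression and both sides have a column with the same name,
// rename one side first; otherwise Spark fails with a duplicate-column error
// (removing the duplicate with drop had no effect; the reason is unclear)
Dataset<Row> acc = df1.withColumnRenamed("post_id", "post_id_acc");
Dataset<Row> post_categories = acc
        .join(post_one_cat, acc.col("post_id_acc").equalTo(post_one_cat.col("post_id")), "left_outer")
        .join(categories, post_one_cat.col("cate_id").equalTo(categories.col("id")), "left_outer")
        .selectExpr("hostname,request_date,post_id_acc as post_id,title,author,name as category".split(","));
// Commented-out variant from the original, renaming columns instead of using selectExpr:
// post_categories = acc.join(post_one_cat, acc.col("post_id_acc").equalTo(post_one_cat.col("post_id")), "left_outer").join(categories, post_one_cat.col("cate_id").equalTo(categories.col("id")), "left_outer").withColumnRenamed("name", "category").withColumnRenamed("post_id_cat", "post_id");

// This form also works: pass the join column names as a Scala Seq so a join type can be given
df3 = df1.join(df2, JavaConverters.asScalaIteratorConverter(Arrays.asList("post_id").iterator()).asScala().toSeq(), "left_outer")
        .join(cat, JavaConverters.asScalaIteratorConverter(Arrays.asList("cate_id").iterator()).asScala().toSeq(), "left_outer")
        .selectExpr("hostname,request_date,post_id,title,author,name as category".split(","));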
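
Each selectExpr call here is given a comma-separated string run through split(","), which yields a String[] that Java accepts wherever a String... varargs parameter is expected. The following is a minimal plain-Java sketch of that idiom with no Spark dependency. The class SelectExprArgs and the methods toExprs and describe are hypothetical names used only for illustration; describe merely stands in for a varargs method such as selectExpr.

```java
import java.util.Arrays;

public class SelectExprArgs {
    // Hypothetical stand-in for a varargs method like Dataset.selectExpr(String... exprs):
    // it only reports what it received.
    public static String describe(String... exprs) {
        return exprs.length + " expressions: " + String.join(" | ", exprs);
    }

    // Split a comma-separated list into individual expressions and trim
    // surrounding whitespace, so "a, b" and "a,b" give the same result.
    public static String[] toExprs(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] exprs = toExprs("hostname, request_date, post_id_acc as post_id, name as category");
        // A String[] can be passed directly where String... is expected.
        System.out.println(describe(exprs));
    }
}
```

Splitting on commas is only safe while no single expression contains a comma; an expression such as concat(a, b) would be cut in two, so in that case the expressions are better listed as separate string arguments.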


Original article: https://www.cnblogs.com/lyy-blog/p/9579026.html