  • Reading and writing CSV and JSON files with PySpark

    from pyspark import SparkContext,SparkConf
    import os
    from pyspark.sql.session import SparkSession
    
    def CreateSparkContex():
    	# Build the SparkContext and the SparkSession from one shared configuration.
    	sparkconf=SparkConf().setAppName("MYPRO").set("spark.ui.showConsoleProgress","false")
    	sc=SparkContext(conf=sparkconf)
    	print("master:"+sc.master)
    	sc.setLogLevel("WARN")
    	Setpath(sc)
    	# getOrCreate() reuses the SparkContext created above.
    	spark = SparkSession.builder.config(conf=sparkconf).getOrCreate()
    	return sc,spark
    
    def Setpath(sc):
    	# Set the global base path: local filesystem in local mode, HDFS otherwise.
    	global Path
    	if sc.master[:5]=="local":
    		Path="file:/C:/spark/sparkworkspace"
    	else:
    		Path="hdfs://test"
    
    
    if __name__=="__main__":
    	print("Here we go!\n")
    	sc,spark=CreateSparkContex()
    	# Path is a URI, so join with "/" to keep forward slashes on every OS.
    	readcsvpath=Path+'/iris.csv'
    	readjspath=Path+'/fd.json'
    	
    	outcsvpath=Path+'/write_iris.csv'
    	outjspath=Path+'/write_js.json'
    	
    	dfcsv=spark.read.csv(readcsvpath,header=True)
    	dfjs=spark.read.json(readjspath)
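    	# Uncomment to write the results. Spark creates each output path as a directory of part files.
    	#dfcsv.write.csv(outcsvpath,header=True)
    	#dfjs.write.json(outjspath)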
    	dfcsv.show(3)
    	dfjs.show(3)
    	# Stopping the session also stops its underlying SparkContext.
    	spark.stop()
    

  • Original article: https://www.cnblogs.com/mahailuo/p/9591623.html