  • Spark 2: Can't write DataFrame to Parquet Hive table: `HiveFileFormat`. It doesn't match the specified format `ParquetFileFormat`


    I'm trying to save a DataFrame to a Hive table.

    This worked in Spark 1.6, but after migrating to 2.2.0 it no longer works.

    Here's the code:

    blocs
      .toDF()
      .repartition($"col1", $"col2", $"col3", $"col4")
      .write
      .format("parquet")
      .mode(saveMode)
      .partitionBy("col1", "col2", "col3", "col4")
      .saveAsTable("db.tbl")
    

    It fails with the following error:

    org.apache.spark.sql.AnalysisException: The format of the existing table project_bsc_dhr.bloc_views is `HiveFileFormat`. It doesn't match the specified format `ParquetFileFormat`.;

    •  
      Have you found a solution? I am facing the same issue. Could you please let me know what the workaround is? – BigD Feb 8 '19 at 11:42
    •  
      Yes, I used insertInto instead of saveAsTable and removed partitionBy. The code: blocs.toDF().repartition($"col1", $"col2", $"col3", $"col4").write.format("parquet").insertInto("db.tbl") – youssef grati Feb 9 '19 at 12:07
    •  
      I am using Spark 2.3.0. Does repartition work on the latest Spark? – BigD Feb 9 '19 at 15:34

    I just tried using .format("hive") with saveAsTable after getting this error, and it worked.
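
    A minimal sketch of that change, applied to the code from the question. The helper name and its parameters are hypothetical, the column names are the placeholders from the question, and the sketch assumes a SparkSession created with Hive support and an existing Hive table whose partition columns match:

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.functions.col

object HiveFormatWrite {
  // Hypothetical helper: `df`, `table`, and the partition column names are
  // placeholders based on the question, not a verified reproduction.
  def writeToExistingHiveTable(df: DataFrame, table: String, mode: SaveMode): Unit = {
    val partitionCols = Seq("col1", "col2", "col3", "col4")
    df.repartition(partitionCols.map(col): _*)
      .write
      .format("hive")                 // was "parquet"; the existing table reports HiveFileFormat
      .mode(mode)
      .partitionBy(partitionCols: _*) // assumed to match the existing table's partitioning
      .saveAsTable(table)
  }
}
```

    With the DataFrame from the question this would be called as HiveFormatWrite.writeToExistingHiveTable(blocs.toDF(), "db.tbl", saveMode). Only the format string differs from the original code.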

    I also would not recommend the insertInto approach suggested by the author: it is not type-safe (as far as that term applies to the SQL API) and it is error-prone, because it ignores column names and resolves columns by position.
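
    The position-based behaviour can be illustrated with a small sketch. The table people_demo and its columns are made up for the example, and it uses a local session with Spark's default (non-Hive) catalog, so it shows the column-matching difference only, not the format error above:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object InsertIntoVsSaveAsTable {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("insertInto-vs-saveAsTable")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical table with two string columns: (first, last).
    Seq(("Ada", "Lovelace")).toDF("first", "last")
      .write.mode(SaveMode.Overwrite).saveAsTable("people_demo")

    // A DataFrame with the same columns in the opposite order.
    val swapped = Seq(("Hopper", "Grace")).toDF("last", "first")

    // insertInto matches columns by position and ignores the names,
    // so the value "Hopper" is written to the `first` column.
    swapped.write.insertInto("people_demo")

    // saveAsTable in append mode matches columns by name,
    // so the value "Grace" is written to the `first` column.
    swapped.write.mode(SaveMode.Append).saveAsTable("people_demo")

    spark.table("people_demo").show()
    spark.stop()
  }
}
```

    Because both columns are strings, the insertInto call raises no error and the swapped values are stored silently, which is the kind of mistake the answer warns about.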

  • Original article: https://www.cnblogs.com/felixzh/p/13501857.html