  • java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

    https://stackoverflow.com/questions/35652665/java-io-ioexception-could-not-locate-executable-null-bin-winutils-exe-in-the-ha

    I'm not able to run a simple spark job in Scala IDE (Maven spark project) installed on Windows 7

    Spark core dependency has been added.

    val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
    val sc = new SparkContext(conf)
    val logData = sc.textFile("File.txt")
    logData.count()
    

    Error:

    16/02/26 18:29:33 INFO SparkContext: Created broadcast 0 from textFile at FrameDemo.scala:13
    16/02/26 18:29:34 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
    java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
        at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
        at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
        at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
        at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
        at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
        at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
        at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
        at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
        at scala.Option.map(Option.scala:145)
        at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
        at com.org.SparkDF.FrameDemo$.main(FrameDemo.scala:14)
        at com.org.SparkDF.FrameDemo.main(FrameDemo.scala)
    

    12 Answers

    Answer (142 votes):

    Here is a good explanation of your problem, with the solution:

    1. Download winutils.exe from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe.
    2. Set up your HADOOP_HOME environment variable at the OS level, or programmatically:

      System.setProperty("hadoop.home.dir", "full path to the folder with winutils");

    3. Enjoy
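
    For instance, a minimal Scala sketch of the question's snippet with this fix applied (the object name follows the question's stack trace; the path C:\winutils is just an example, with winutils.exe assumed to sit in a bin subfolder beneath it):

        import org.apache.spark.{SparkConf, SparkContext}

        object FrameDemo {
          def main(args: Array[String]): Unit = {
            // Set this before any Hadoop class loads: the stack trace shows the
            // lookup running in org.apache.hadoop.util.Shell's static initializer.
            System.setProperty("hadoop.home.dir", "C:\\winutils") // example path; winutils.exe in C:\winutils\bin

            val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
            val sc = new SparkContext(conf)
            val logData = sc.textFile("File.txt")
            println(logData.count())
            sc.stop()
          }
        }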

    • I have to set HADOOP_HOME to the hadoop folder instead of the bin folder. – Stanley Aug 29 '16 at 7:44
    • Also, be sure to download the correct winutils.exe for the version of Hadoop that Spark is compiled against (so, not necessarily the link above). Otherwise, pain awaits :) – NP3 Jun 30 '17 at 12:14
    • System.setProperty("hadoop.home.dir", "C:\\hadoop-2.7.1\\") – Shyam Gupta Oct 14 '17 at 19:00
    • Yes, exactly as @Stanley says: it worked after setting HADOOP_HOME to the hadoop folder instead of the bin folder. – Jazz Apr 9 '19 at 13:09
    • @NP3 and how do you know that version? I am using the latest pyspark. Thanks. – JDPeckham Nov 10 '19 at 19:12
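
    To the version question above: Spark ships the Hadoop client jars on its classpath, so one quick check is to ask Hadoop's own VersionInfo class. A sketch:

        import org.apache.hadoop.util.VersionInfo

        // Prints e.g. 2.7.3 -- download the winutils.exe built for this version.
        println(VersionInfo.getVersion)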
     
    Answer (67 votes):
    1. Download winutils.exe
    2. Create a folder, say C:\winutils\bin
    3. Copy winutils.exe into C:\winutils\bin
    4. Set the environment variable HADOOP_HOME to C:\winutils
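
    A small Scala sketch to verify that layout from the JVM's point of view (folder names follow the steps above; adjust to your setup):

        import java.io.File

        // Hadoop reads the hadoop.home.dir system property or the HADOOP_HOME
        // environment variable, and expects bin\winutils.exe underneath it.
        val home = sys.props.getOrElse("hadoop.home.dir", sys.env.getOrElse("HADOOP_HOME", ""))
        val winutils = new File(home, "bin" + File.separator + "winutils.exe")
        println(s"home = $home, winutils.exe found: ${winutils.exists}")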
    • Also, if you have a cmd line open, restart it for the variables to take effect. – eych Aug 21 '19 at 16:51
    Answer (26 votes):

    Follow this:

    1. Create a bin folder in any directory (to be used in step 3).

    2. Download winutils.exe and place it in the bin directory.

    3. Now add System.setProperty("hadoop.home.dir", "PATH/TO/THE/DIR"); in your code.

    • Thanks a lot, just what I was looking for. – user373201 Feb 27 '17 at 2:59
    • Note that the path pointed to should not include the 'bin' directory. Ex: if winutils.exe is at "D://Hadoop//bin//winutils.exe", then hadoop.home.dir should be "D://Hadoop". – Keshav Pradeep Ramanath May 31 '18 at 10:30
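
    Making that comment concrete: hadoop.home.dir must be the grandparent of winutils.exe, i.e. the folder containing bin. A sketch of deriving it from the executable's path (the path itself is just an example):

        import java.io.File

        // D:\Hadoop\bin\winutils.exe -> hadoop.home.dir = D:\Hadoop
        val winutils = new File("D:\\Hadoop\\bin\\winutils.exe")
        System.setProperty("hadoop.home.dir", winutils.getParentFile.getParentFile.getAbsolutePath)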
     
    Answer (4 votes):

    If you see the below issue:

    ERROR Shell: Failed to locate the winutils binary in the hadoop binary path

    java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

    then do the following steps:

    1. Download winutils.exe from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe.
    2. Keep it under the bin folder of a folder you create, e.g. C:\Hadoop\bin.
    3. In the program, add the following line before creating the SparkContext or SparkConf:
       System.setProperty("hadoop.home.dir", "C:\\Hadoop");
    Answer (4 votes):

    On Windows 10 you should add two different entries:

    (1) Add a new variable HADOOP_HOME with the path to the hadoop folder (i.e. C:\Hadoop) under System Variables.

    (2) Add/append a new entry to the "Path" variable: "C:\Hadoop\bin".

    The above worked for me.

    Answer (4 votes):
    1) Download winutils.exe from https://github.com/steveloughran/winutils
    2) Create a directory in Windows: C:\winutils\bin
    3) Copy winutils.exe into the above bin folder.
    4) Set the property in code (hadoop.home.dir takes a plain filesystem path, not a file:/// URI):
       System.setProperty("hadoop.home.dir", "C:\\winutils");
    5) Create a folder C:\temp and give it 777 permissions.
    6) Add the config property to the Spark session: .config("spark.sql.warehouse.dir", "file:///C:/temp")
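
    Steps 4) to 6) combined, as a sketch of the resulting session setup (paths as in the steps above; the appName is taken from the question, and SparkSession assumes Spark 2.x):

        import org.apache.spark.sql.SparkSession

        System.setProperty("hadoop.home.dir", "C:\\winutils") // winutils.exe in C:\winutils\bin

        val spark = SparkSession.builder()
          .appName("DemoDF")
          .master("local")
          .config("spark.sql.warehouse.dir", "file:///C:/temp")
          .getOrCreate()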
    
    Answer (2 votes):

    I got the same problem while running unit tests. The following workaround got rid of the message:

        import java.io.File;

        // Point hadoop.home.dir at the working directory and create a dummy
        // bin/winutils.exe there so the lookup in Hadoop's Shell class succeeds.
        File workaround = new File(".");
        System.getProperties().put("hadoop.home.dir", workaround.getAbsolutePath());
        new File("./bin").mkdirs();
        new File("./bin/winutils.exe").createNewFile(); // throws IOException; declare or catch it
    

    from: https://issues.cloudera.org/browse/DISTRO-544
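
    For the Scala project in the question, an equivalent sketch (run once before any test creates a SparkContext, e.g. in a beforeAll):

        import java.io.File

        // Create a dummy <cwd>\bin\winutils.exe so the lookup succeeds; note this
        // only silences the error, since the dummy exe cannot actually run.
        val workaround = new File(".")
        System.setProperty("hadoop.home.dir", workaround.getAbsolutePath)
        new File("./bin").mkdirs()
        new File("./bin/winutils.exe").createNewFile()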

    Answer (2 votes):

    You can alternatively download winutils.exe from GitHub:

    https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin

    Replace hadoop-2.7.1 with the version you want and place the file in D:\hadoop\bin.

    If you do not have access rights to the environment variable settings on your machine, simply add the below line to your code:

    System.setProperty("hadoop.home.dir", "D:\\hadoop");
  • Original article: https://www.cnblogs.com/felixzh/p/14024071.html