
Atittit HDFS Hadoop big-data file system: a Java usage summary

     

Contents

1. Operating on the file system
2. HDFS: a remote distributed file service similar to NFS/FTP
3. Starting the HDFS service with start-dfs.cmd
    3.1. Configure core-site.xml
    3.2. Start
    3.3. Code
4. Problem summary
    4.1. start-dfs.cmd reports that Windows cannot find hadoop
    4.2. D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format
    4.3. "file:// has no permission" error
    4.4. java.io.IOException: NameNode is not formatted
    4.5. org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory C:\tmp\hadoop-Administrator\dfs\name is in an inconsistent state: storage directory does not exist or is not accessible.
    4.6. Unsafe link
    4.7. Unknown host / java.net.ConnectException: Connection refused
5. Theory
    5.1. Creating a directory
    5.2. Writing a file
6. References

     

     

Several kinds of operations on the file system

     

1. Operating on the file system
    1. Directory operations: create, delete, update, list
    2. I/O operations on remote files
    3. File upload and download (copying files between the local and remote file systems)

     

Specific operations

1. Obtain the HDFS FileSystem object from the configuration (there are two ways; a sketch of the second follows this list)
  1. Method one: read the configuration files directly.
    Usually used when a Hadoop installation is available locally and can be accessed directly. It is enough for the configuration to name HDFS as the file system to operate on. Any HDFS parameter can also be set on the conf object here, and such settings take priority over the configuration files.
  2. Method two: supply a URI and build the FileSystem from the configuration reached through it.
    Usually used when there is no local Hadoop installation but the cluster is reachable via a URI. You must supply the access path of the Hadoop NameNode, the Hadoop user name, and a Configuration object (the remote Hadoop configuration is then picked up automatically).
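A minimal sketch of method two, assuming the NameNode address hdfs://huabingood01:9000 from this article's core-site.xml and the user name "Administrator" seen in the format log of section 4.2; substitute your own values:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class RemoteFsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // FileSystem.get(URI, Configuration, String) connects to the given
        // NameNode as the given user and picks up the cluster-side settings.
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://huabingood01:9000"), conf, "Administrator");
        System.out.println(fs.getUri());
        fs.close();
    }
}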
2. HDFS: a remote distributed file service similar to NFS/FTP
3. Starting the HDFS service with start-dfs.cmd
    3.1. Configure core-site.xml

     

D:\haddop\hadoop-3.1.1\etc\hadoop\core-site.xml:

     

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://huabingood01:9000</value>
  </property>
</configuration>
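Note that fs.default.name is the deprecated spelling of this key; Hadoop 2 and later prefer fs.defaultFS, which is also what the Java code in section 3.3 sets. An equivalent core-site.xml using the newer key (the address is the one that code uses) would be:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://0.0.0.0:19000</value>
  </property>
</configuration>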

     

3.2. Start
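A minimal startup sequence on Windows, assembled from the commands used elsewhere in this article (the install path is this machine's; adjust to yours):

REM Format the NameNode once before the first start (see section 4.2).
D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format

REM Start the NameNode and DataNode (see section 4.1).
D:\haddop\hadoop-3.1.1\sbin\start-dfs.cmd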

     

3.3. Code

package hdfsHadoopUse;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class hdfsHadoopClass {

    public static void main(String[] args) throws IOException {
        String pathToCreate = "/firstDirS09/secdirS09";
        hdfsHadoopClass app = new hdfsHadoopClass();
        FileSystem fs = app.getHadoopFileSystem();
        app.myCreatePath(fs, pathToCreate);
        System.out.println("--f");
    }

    /**
     * Obtain the HDFS FileSystem object from the configuration.
     * There are two methods:
     *   1. Use a Configuration object to read the local configuration files directly.
     *   2. Mostly for machines without a local Hadoop installation that can reach
     *      the cluster remotely: pass the URI and user name, let the remote
     *      configuration be fetched, and build the FileSystem from it.
     * @return FileSystem
     * @throws IOException
     */
    public FileSystem getHadoopFileSystem() throws IOException {
        // Method one: the configuration files (core-site.xml, hdfs-site.xml)
        // are available locally. The HDFS access URI must be set; any value
        // set on the Configuration object takes the highest priority.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://0.0.0.0:19000");
        conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());

        // Build the FileSystem object from the configuration.
        return FileSystem.get(conf);
    }

    /**
     * Creating a directory here works like `mkdir -p` in the shell: missing
     * parent directories are created as well.
     * As with java.io, operations go through a path object, but this Path
     * belongs to HDFS.
     * @param fs the file system to create the path on
     * @return true if the directory was created
     * @throws IOException
     */
    public boolean myCreatePath(FileSystem fs, String pathToCreate) throws IOException {
        boolean b = false;
        Path path = new Path(pathToCreate);
        try {
            // Even if the path already exists, mkdirs still returns normally.
            b = fs.mkdirs(path);
        } finally {
            fs.close();
        }
        return b;
    }
}

     

4. Problem summary

     

Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: huabingood01

The HDFS service needs to be started:

%HADOOP_PREFIX%\sbin\start-dfs.cmd
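If the exception persists after the service is up, the host name itself may not resolve. One assumed fix is a hosts-file entry mapping huabingood01 to the machine that runs the NameNode (192.168.1.101 is this article's machine, per the format log in section 4.2):

REM append to C:\Windows\System32\drivers\etc\hosts
192.168.1.101   huabingood01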

4.1. start-dfs.cmd reports that Windows cannot find hadoop

    start-dfs.cmd

    start "Apache Hadoop Distribution" hadoop namenode

    start "Apache Hadoop Distribution" hadoop datanode

The NameNode must be formatted first:

D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format

The "hadoop" in start-dfs.cmd refers to the bin\hadoop.cmd command, so that directory must be added to the PATH environment variable.
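For example, for the current cmd session (a sketch using this article's install path):

set PATH=%PATH%;D:\haddop\hadoop-3.1.1\bin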

4.2. D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format

    D:\haddop\hadoop-3.1.1\sbin> D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format

    2018-10-28 07:02:54,801 INFO namenode.NameNode: STARTUP_MSG:

    /************************************************************

    STARTUP_MSG: Starting NameNode

    STARTUP_MSG:   host = hmNotePC/192.168.1.101

    STARTUP_MSG:   args = [-format]

    STARTUP_MSG:   version = 3.1.1

    STARTUP_MSG:   classpath = D:\haddop\hadoop-3.1.1\etc\hadoop;D:\haddop\hadoop-3.1.1\share\had

    STARTUP_MSG:   build = https://github.com/apache/hadoop -r 2b9a8c1d3a2caf1e733d57f346af3ff0d5ba529c; compiled by 'leftnoteasy' on 2018-08-02T04:26Z

    STARTUP_MSG:   java = 1.8.0_31

    ************************************************************/

    2018-10-28 07:02:54,854 INFO namenode.NameNode: createNameNode [-format]

    Formatting using clusterid: CID-ecf4351a-e57c-411b-8ef3-2198981bc44b

    2018-10-28 07:02:56,060 INFO namenode.FSEditLog: Edit logging is async:true

    2018-10-28 07:02:56,090 INFO namenode.FSNamesystem: KeyProvider: null

    2018-10-28 07:02:56,092 INFO namenode.FSNamesystem: fsLock is fair: true

    2018-10-28 07:02:56,100 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false

    2018-10-28 07:02:56,119 INFO namenode.FSNamesystem: fsOwner             = Administrator (auth:SIMPLE)

    2018-10-28 07:02:56,120 INFO namenode.FSNamesystem: supergroup          = supergroup

    2018-10-28 07:02:56,120 INFO namenode.FSNamesystem: isPermissionEnabled = true

    2018-10-28 07:02:56,121 INFO namenode.FSNamesystem: HA Enabled: false

    2018-10-28 07:02:56,203 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling

    2018-10-28 07:02:56,229 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000

    2018-10-28 07:02:56,229 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true

    2018-10-28 07:02:56,238 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000

    2018-10-28 07:02:56,239 INFO blockmanagement.BlockManager: The block deletion will start around 2018 十月 28 07:02:56

    2018-10-28 07:02:56,243 INFO util.GSet: Computing capacity for map BlocksMap

    2018-10-28 07:02:56,243 INFO util.GSet: VM type       = 64-bit

    2018-10-28 07:02:56,248 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB

    2018-10-28 07:02:56,251 INFO util.GSet: capacity      = 2^21 = 2097152 entries

    2018-10-28 07:02:56,269 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false

    2018-10-28 07:02:56,362 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS

    2018-10-28 07:02:56,362 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033

    2018-10-28 07:02:56,363 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0

    2018-10-28 07:02:56,364 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000

    2018-10-28 07:02:56,364 INFO blockmanagement.BlockManager: defaultReplication         = 3

    2018-10-28 07:02:56,365 INFO blockmanagement.BlockManager: maxReplication             = 512

    2018-10-28 07:02:56,366 INFO blockmanagement.BlockManager: minReplication             = 1

    2018-10-28 07:02:56,366 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2

    2018-10-28 07:02:56,367 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms

    2018-10-28 07:02:56,368 INFO blockmanagement.BlockManager: encryptDataTransfer        = false

    2018-10-28 07:02:56,368 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000

    2018-10-28 07:02:56,425 INFO util.GSet: Computing capacity for map INodeMap

    2018-10-28 07:02:56,426 INFO util.GSet: VM type       = 64-bit

    2018-10-28 07:02:56,426 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB

    2018-10-28 07:02:56,427 INFO util.GSet: capacity      = 2^20 = 1048576 entries

    2018-10-28 07:02:56,428 INFO namenode.FSDirectory: ACLs enabled? false

    2018-10-28 07:02:56,429 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true

    2018-10-28 07:02:56,429 INFO namenode.FSDirectory: XAttrs enabled? true

    2018-10-28 07:02:56,430 INFO namenode.NameNode: Caching file names occurring more than 10 times

    2018-10-28 07:02:56,440 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRo

     

    2018-10-28 07:02:56,444 INFO snapshot.SnapshotManager: SkipList is disabled

    2018-10-28 07:02:56,452 INFO util.GSet: Computing capacity for map cachedBlocks

    2018-10-28 07:02:56,452 INFO util.GSet: VM type       = 64-bit

    2018-10-28 07:02:56,453 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB

    2018-10-28 07:02:56,453 INFO util.GSet: capacity      = 2^18 = 262144 entries

    2018-10-28 07:02:56,467 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10

    2018-10-28 07:02:56,468 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10

    2018-10-28 07:02:56,469 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25

    2018-10-28 07:02:56,475 INFO namenode.FSNamesystem: Retry cache on namenode is enabled

    2018-10-28 07:02:56,476 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis

    2018-10-28 07:02:56,481 INFO util.GSet: Computing capacity for map NameNodeRetryCache

    2018-10-28 07:02:56,482 INFO util.GSet: VM type       = 64-bit

    2018-10-28 07:02:56,482 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB

    2018-10-28 07:02:56,483 INFO util.GSet: capacity      = 2^15 = 32768 entries

    2018-10-28 07:02:56,527 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1079199093-192.168.1.101-1540681376517

    2018-10-28 07:02:56,547 INFO common.Storage: Storage directory \tmp\hadoop-Administrator\dfs\name has been successfully formatted.

    2018-10-28 07:02:56,580 INFO namenode.FSImageFormatProtobuf: Saving image file \tmp\hadoop-Administrator\dfs\name\current\fsimage.ckpt_0000000000000000000 us

    2018-10-28 07:02:56,717 INFO namenode.FSImageFormatProtobuf: Image file \tmp\hadoop-Administrator\dfs\name\current\fsimage.ckpt_0000000000000000000 of size 3

    2018-10-28 07:02:56,741 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0

    2018-10-28 07:02:56,756 INFO namenode.NameNode: SHUTDOWN_MSG:

    /************************************************************

    SHUTDOWN_MSG: Shutting down NameNode at hmNotePC/192.168.1.101

    ************************************************************/

4.3. "file:// has no permission" error

The error disappears once core-site.xml points the default file system at hdfs:// instead of the built-in file:// default:

D:\haddop\hadoop-3.1.1\etc\hadoop\core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://huabingood01:9000</value>
  </property>
</configuration>

     

4.4. java.io.IOException: NameNode is not formatted

     


     

If visiting localhost:50070 fails, the NameNode failed to start (note that on Hadoop 3.x the NameNode web UI defaults to port 9870 rather than 50070). Next, check the NameNode startup log.
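The fix, as the message says, is to format the NameNode before starting it, exactly as in section 4.2:

D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format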

     

4.5. org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory C:\tmp\hadoop-Administrator\dfs\name is in an inconsistent state: storage directory does not exist or is not accessible.

        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:376)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(
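One assumed fix is to point the NameNode at a storage directory that actually exists, via hdfs-site.xml; dfs.namenode.name.dir is the standard property, while the path below is a placeholder. Re-run hdfs namenode -format after changing it.

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/D:/haddop/data/dfs/name</value>
  </property>
</configuration>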

     

4.6. Unsafe link

hadoop.dll and winutils.exe may need to be placed in the Windows system directory; a reboot may then be needed.
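The commonly used arrangement on Windows, stated here as an assumption rather than a verified fix, is winutils.exe in %HADOOP_HOME%\bin and hadoop.dll in System32:

copy D:\haddop\hadoop-3.1.1\bin\hadoop.dll C:\Windows\System32\
REM winutils.exe must already sit in D:\haddop\hadoop-3.1.1\bin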

4.7. Unknown host / java.net.ConnectException: Connection refused: no further information

     

D:\haddop\hadoop-3.1.1\etc\hadoop\core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://huabingood01:9000</value>
  </property>
</configuration>

    conf = new Configuration();
    // The file system is the one setting that must be present; other
    // parameters are optional and take the highest priority.
    conf.set("fs.defaultFS", "hdfs://0.0.0.0:19000");
    conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());

     

The URL in the configuration file and the URL set in the code must agree. Here the file says hdfs://huabingood01:9000 while the code says hdfs://0.0.0.0:19000, so the connection is refused.
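A consistent pairing, as a sketch (pick one address and use it in both places):

// core-site.xml: <value>hdfs://0.0.0.0:19000</value>
// must match the address set in code:
conf.set("fs.defaultFS", "hdfs://0.0.0.0:19000");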

     

     

5. Theory

     

    C:\tmp\hadoop-Administrator>tree

Folder PATH listing for volume p1sys

Volume serial number is A87E-7AB4

    C:.

    ├─dfs

    │  ├─data

    │  └─name

    └─nm-local-dir

     

5.1. Creating a directory

     

String pathToCreate = "/firstDirS09/secdirS09";
app.myCreatePath(fs, pathToCreate);

The directory is actually created on the local disk:

D:\firstDirS09\secdirS09

This suggests the FileSystem object fell back to the local file system rather than HDFS; see the check below.
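A quick way to confirm which file system the object actually talks to, as a sketch to drop in right after FileSystem.get in getHadoopFileSystem():

// If this prints file:/// (LocalFileSystem) rather than hdfs://...,
// the configuration fell back to the local file system.
System.out.println(fs.getUri());
System.out.println(fs.getClass().getName());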

     

5.2. Writing a file

     

// Write a file.
FSDataOutputStream out = fs.create(new Path("/file1S09.txt"));
out.writeUTF("attilax bazai");
out.close();

     

     

Again the file lands on the local disk, together with its checksum file:

D:\file1S09.txt

D:\.file1S09.txt.crc
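To verify the contents, a minimal read-back sketch (readUTF pairs with the writeUTF above; run it before fs.close()):

// Read the file back with the fs object from section 3.3.
FSDataInputStream in = fs.open(new Path("/file1S09.txt"));
System.out.println(in.readUTF());   // expect: attilax bazai
in.close();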

     

6. References

使用javaAPI操作hdfs (Using the Java API to operate HDFS) – huabingood – 博客园 (cnblogs.com)

     
