  • [HDFS API Programming] Copying a local file to HDFS, uploading a large local file with progress, and copying an HDFS file back to local

    Continuing from the previous post on learning the HDFS API operations.

    copyFromLocalFile: as the name suggests, copies a file from the local filesystem to HDFS.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;

    import java.net.URI;

    /**
     * Operating on the HDFS file system with the Java API.
     * Key steps:
     * 1) create a Configuration
     * 2) get a FileSystem
     * 3) ... your HDFS API operations.
     */
    public class HDFSApp {

        public static final String HDFS_PATH = "hdfs://hadoop000:8020";
        FileSystem fileSystem = null;
        Configuration configuration = null;

        @Before
        public void setUp() throws Exception {
            System.out.println("setUp-----------");
            configuration = new Configuration();
            configuration.set("dfs.replication", "1");

            /*
             * Build a client object for the specified HDFS cluster.
             * First argument:  the HDFS URI
             * Second argument: client-side configuration parameters
             * Third argument:  the client identity, which is simply the user name
             */
            fileSystem = FileSystem.get(new URI(HDFS_PATH), configuration, "hadoop");
        }

        /*
         * Copy a local file to the HDFS file system.
         */
        @Test
        public void copyFromLocalFile() throws Exception {
            Path src = new Path("/home/hadoop/t.txt");
            Path dst = new Path("/hdfsapi/test/");
            fileSystem.copyFromLocalFile(src, dst);
        }

        @After
        public void tearDown() {
            configuration = null;
            fileSystem = null;
            System.out.println("----------tearDown------");
        }
    }

    How do you use the method? Same advice as always: Ctrl-click whatever you don't understand to jump to its source.

    Ctrl-clicking into the copyFromLocalFile source shows that the method takes two arguments, the Path of the local source file and the destination Path, and returns nothing.

    After running the test class, switch to a terminal and use -ls to check /hdfsapi/test: the directory now contains the t.txt file we just copied in. Test passed.

    [hadoop@hadoop000 ~]$ hadoop fs -ls /hdfsapi/test
    Found 3 items
    -rw-r--r--   3 hadoop supergroup         14 2019-04-19 16:31 /hdfsapi/test/a.txt
    -rw-r--r--   1 hadoop supergroup         28 2019-04-19 16:50 /hdfsapi/test/c.txt
    -rw-r--r--   1 hadoop supergroup       2732 2019-04-20 19:51 /hdfsapi/test/t.txt

    If we need to copy a large file from the local machine, the bigger the file, the longer the wait; a long wait with no feedback at all makes for a terrible user experience.

    So when uploading a large file we can add a progress report. FileSystem provides a create overload that takes a progress callback:

    /**
     * Create an FSDataOutputStream at the indicated Path with write-progress
     * reporting.
     * Files are overwritten by default.
     * @param f the file to create
     * @param progress to report progress
     */
    public FSDataOutputStream create(Path f, Progressable progress)
        throws IOException {
      return create(f, true,
                    getConf().getInt("io.file.buffer.size", 4096),
                    getDefaultReplication(f),
                    getDefaultBlockSize(f), progress);
    }
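    The post jumps straight to the output of the big-file upload test without showing the test method itself. In Hadoop, such a test would obtain a stream from fileSystem.create(path, progressable) and pump the local file through it, with the Progressable callback printing one dot per invocation. Below is a minimal, self-contained sketch of that callback pattern in plain Java; the Progress interface and the copyWithProgress helper are stand-ins invented here for illustration, not Hadoop APIs:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ProgressCopy {

    /** Progress callback, mirroring the role of Hadoop's Progressable. */
    interface Progress {
        void progress();
    }

    /**
     * Copy in to out in 4096-byte chunks (the same default buffer size
     * that create() reads from io.file.buffer.size), invoking the
     * callback once per chunk written. Returns the bytes copied.
     */
    static long copyWithProgress(InputStream in, OutputStream out, Progress p)
            throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(buf, 0, n);
            total += n;
            p.progress(); // the example output above prints one '.' per callback
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000]; // stand-in for a "large" local file
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyWithProgress(new ByteArrayInputStream(data), sink,
                () -> System.out.print("."));
        System.out.println();
        System.out.println("copied " + copied + " bytes");
    }
}
```

    Running main copies 10,000 bytes in three 4096-byte-capped chunks, so it prints three dots and then "copied 10000 bytes"; in the real upload, Hadoop hooks the same kind of callback into the DFS output stream, which is where the long run of dots below comes from.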

    Run the test class and you can watch the progress being printed. Granted, an endless stream of dots is rather abstract, but it still beats seeing nothing at all and wondering whether the process has frozen.

    setUp-----------
    log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    .................................................................... (several thousand dots elided) ........................----------tearDown------

    Process finished with exit code 0

    Open the terminal and -ls again: the upload succeeded.

    [hadoop@hadoop000 software]$ hadoop fs -ls /hdfsapi/test
    Found 4 items
    -rw-r--r--   3 hadoop supergroup         14 2019-04-19 16:31 /hdfsapi/test/a.txt
    -rw-r--r--   1 hadoop supergroup         28 2019-04-19 16:50 /hdfsapi/test/c.txt
    -rw-r--r--   1 hadoop supergroup  181367942 2019-04-20 20:10 /hdfsapi/test/jdk.zip
    -rw-r--r--   1 hadoop supergroup       2732 2019-04-20 19:51 /hdfsapi/test/t.txt

    Since we can upload, the natural next question is: how do we download? Here is the code; it mirrors the upload above, so no lengthy explanation is needed.

        /**
         * Copy an HDFS file to the local filesystem: download.
         * @throws Exception
         */
        @Test
        public void copyToLocalFile() throws Exception {
            Path src = new Path("/hdfsapi/test/t.txt");
            Path dst = new Path("/home/hadoop/app");
            fileSystem.copyToLocalFile(src, dst);
        }
  • Original post: https://www.cnblogs.com/Liuyt-61/p/10742558.html