  • Set replication in Hadoop

    I was trying to load a file into HDFS using the Hadoop API as an experiment.

    I wanted to set the replication factor to the minimum, since this is only an experiment. I first tried this with FileSystem.setReplication():

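    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    Configuration config = new Configuration();
    config.set("fs.defaultFS", "hdfs://192.168.248.166:8020");
    FileSystem dfs2 = FileSystem.get(config);
    // Backslashes in a Java string literal must be escaped.
    Path src2 = new Path("C:\\Users\\abc\\Desktop\\testfile.txt");
    Path dst2 = new Path(dfs2.getWorkingDirectory(), "tempdir");
    dfs2.copyFromLocalFile(src2, dst2);   // written with the default replication
    dfs2.setReplication(dst2, (short) 1); // then lowered on the existing file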

    The replication factor was reported as 1, but the file was still available on 3 datanodes.

    Then I tried it with Configuration.set() instead:

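    Configuration config = new Configuration();
    config.set("fs.defaultFS", "hdfs://192.168.248.166:8020");
    config.set("dfs.replication", "1");   // set replication before any file is written
    FileSystem dfs2 = FileSystem.get(config);
    // Backslashes in a Java string literal must be escaped.
    Path src2 = new Path("C:\\Users\\abc\\Desktop\\testfile.txt");
    Path dst2 = new Path(dfs2.getWorkingDirectory(), "tempdir");
    dfs2.copyFromLocalFile(src2, dst2);   // written with replication 1 from the start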

    This gave the desired outcome (1 replica, available on 1 datanode).

    Why are there two APIs for the same thing? What is the difference between them?

    The difference is that FileSystem's setReplication() changes the replication factor of a file that already exists on HDFS. In your case, you first copy the local file testfile.txt to HDFS using the default replication factor (3), and only then change the replication factor of that file to 1. After this call, it takes a while until the over-replicated blocks are deleted. (source)

    On the other hand, when you call config.set("dfs.replication", "1") before copying the local file, the file is written with that replication factor, so its blocks are stored only once from the start.

    In other words, I believe (but I might be wrong) that both approaches lead to the same final result; with the first one you just have to wait a little until the excess replicas are removed.
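
    To make the distinction concrete, here is a toy model in plain Java. It is not Hadoop code and does not talk to a cluster: the class ToyHdfsFile and all of its methods are hypothetical names invented for this illustration. It assumes only what the answer describes, namely that the replication factor in effect at write time decides how many copies are created, that setReplication() changes the target immediately, and that surplus copies disappear in a later cleanup step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for a single HDFS file. Hypothetical, for illustration only. */
class ToyHdfsFile {
    // The replication factor the file system reports for this file.
    private int targetReplication;
    // The datanodes that currently hold a copy of the file's blocks.
    private final List<String> replicaHosts = new ArrayList<>();

    /** Models writing the file: copies are placed using the replication in effect at write time. */
    ToyHdfsFile(int replicationAtWrite, List<String> dataNodes) {
        if (replicationAtWrite < 1 || replicationAtWrite > dataNodes.size()) {
            throw new IllegalArgumentException("replication must be between 1 and the number of datanodes");
        }
        this.targetReplication = replicationAtWrite;
        this.replicaHosts.addAll(dataNodes.subList(0, replicationAtWrite));
    }

    /** Models setReplication() on an existing file: only the target changes right away. */
    void setReplication(int replication) {
        this.targetReplication = replication;
    }

    /** Models the later cleanup that removes over-replicated copies (raising the target is not modeled). */
    void pruneExcessReplicas() {
        while (replicaHosts.size() > targetReplication) {
            replicaHosts.remove(replicaHosts.size() - 1);
        }
    }

    int getTargetReplication() {
        return targetReplication;
    }

    int liveReplicaCount() {
        return replicaHosts.size();
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3");

        // First approach: write with the default (3), then lower the target to 1.
        ToyHdfsFile first = new ToyHdfsFile(3, nodes);
        first.setReplication(1);
        System.out.println("after setReplication: target=" + first.getTargetReplication()
                + ", live copies=" + first.liveReplicaCount());
        first.pruneExcessReplicas();
        System.out.println("after cleanup:        target=" + first.getTargetReplication()
                + ", live copies=" + first.liveReplicaCount());

        // Second approach: replication is already 1 when the file is written.
        ToyHdfsFile second = new ToyHdfsFile(1, nodes);
        System.out.println("written with 1:       target=" + second.getTargetReplication()
                + ", live copies=" + second.liveReplicaCount());
    }
}
```

    In this sketch the first file briefly has a target of 1 while three copies still exist, which matches what the question observed; the second file never has more than one copy.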

  • Original article: https://www.cnblogs.com/felixzh/p/8252721.html