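  • Ma Shibing Hadoop Lesson 3: Developing for HDFS in Java (repost)

    Ma Shibing Hadoop Lesson 1: Setting up the virtual machines, installing and starting Hadoop

    Ma Shibing Hadoop Lesson 2: Centralized management of the HDFS cluster and Hadoop file operations

    Ma Shibing Hadoop Lesson 3: Developing for HDFS in Java

    Ma Shibing Hadoop Lesson 4: YARN and Map/Reduce configuration, startup, and how they work

    Ma Shibing Hadoop Lesson 5: Developing Map/Reduce in Java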

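    (1) HDFS summary

    Hadoop consists of HDFS + YARN + Map/Reduce.

    HDFS is the data storage module. It is a cluster made up of one NameNode and n DataNodes.

    DataNodes can be added dynamically. Files are split into fixed-size blocks (128 MB by default).

    Each block is stored on 3 DataNodes by default. The redundancy is deliberate: if one DataNode goes down, no data is lost.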
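
    The block and replica arithmetic above can be sketched in a few lines of Java. This is only an illustration of the numbers in the text (128 MB blocks, 3 replicas); the class name BlockMath and the 300 MB file size are made up for the example and are not part of Hadoop.

```java
// A minimal sketch of HDFS block arithmetic, assuming the defaults quoted in the text.
// BlockMath is a hypothetical helper, not a Hadoop class.
final class BlockMath {
    static final long MB = 1024L * 1024L;

    /** Number of blocks a file is split into: ceil(fileSize / blockSize). */
    static long blockCount(long fileSize, long blockSize) {
        if (fileSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and blockSize > 0");
        }
        return (fileSize + blockSize - 1) / blockSize;
    }

    /** Raw bytes stored across the cluster: every byte is kept once per replica. */
    static long rawBytes(long fileSize, int replication) {
        return fileSize * replication;
    }

    public static void main(String[] args) {
        long fileSize = 300 * MB;   // hypothetical file size
        long blockSize = 128 * MB;  // default block size from the text
        int replication = 3;        // default replication from the text

        long blocks = blockCount(fileSize, blockSize);
        System.out.println("blocks: " + blocks);
        System.out.println("block replicas: " + blocks * replication);
        System.out.println("raw MB stored: " + rawBytes(fileSize, replication) / MB);
    }
}
```

    With these assumed inputs the file needs three blocks (two full 128 MB blocks and one 44 MB block), and the last block only occupies its real size on disk, which is why the raw total is simply the file size times the replication factor.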

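    HDFS = NameNode + SecondaryNameNode + JournalNode + DataNode (JournalNodes are only used in high-availability deployments)

    A typical use case for this kind of storage is a cloud drive such as Baidu Cloud.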

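    (2) Changing the default value of hadoop.tmp.dir

    The default value of hadoop.tmp.dir is /tmp/hadoop-${user.name}. Because the contents of /tmp may be deleted when the system reboots, the directory should be moved elsewhere.
    Edit core-site.xml (on every node):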

    [root@master ~]#  vim core-site.xml

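    After changing hadoop.tmp.dir on the NameNode and the DataNodes, the NameNode has to be formatted. Note that formatting erases any existing HDFS metadata. On master, run: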

    [root@master ~]# hdfs namenode -format
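
    To make the problem with the default concrete, the sketch below expands the ${user.name} placeholder in the default template and checks whether the result lives under /tmp. It is plain Java and does not use Hadoop's own Configuration class; the class name TmpDirDefault and the /var/hadoop path in the usage note are assumptions chosen for illustration only.

```java
// A minimal sketch, assuming the default template quoted in the text.
// TmpDirDefault is a hypothetical helper, not part of Hadoop.
final class TmpDirDefault {
    static final String DEFAULT_TEMPLATE = "/tmp/hadoop-${user.name}";

    /** Replaces the ${user.name} placeholder with the given user name. */
    static String expand(String template, String userName) {
        return template.replace("${user.name}", userName);
    }

    /** True if the directory is /tmp itself or lies below it. */
    static boolean isUnderTmp(String dir) {
        return dir.equals("/tmp") || dir.startsWith("/tmp/");
    }

    public static void main(String[] args) {
        String dir = expand(DEFAULT_TEMPLATE, System.getProperty("user.name"));
        System.out.println(dir + " is under /tmp: " + isUnderTmp(dir));
    }
}
```

    For the root user on master the default expands to /tmp/hadoop-root, which is under /tmp; a replacement such as /var/hadoop (a hypothetical path) is not.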

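    (4) Turning off permission checking while testing

    To keep things simple during testing, turn off permission checking by adding the following to hdfs-site.xml on the NameNode: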

    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>

    Restart the NameNode:

    [root@master ~]# hadoop-daemon.sh stop namenode
    [root@master ~]# hadoop-daemon.sh start namenode

    (5) Reading and writing HDFS with the FileSystem class

    package com.hadoop.hdfs;

    import java.io.FileInputStream;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HelloHDFS {

        public static Log log = LogFactory.getLog(HelloHDFS.class);

        public static void main(String[] args) throws Exception {

            // Point the client at the NameNode and lower the replication factor.
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://192.168.56.100:9000");
            conf.set("dfs.replication", "2"); // the default is 3
            FileSystem fileSystem = FileSystem.get(conf);

            // Create a directory, check that it exists, then delete it recursively.
            boolean success = fileSystem.mkdirs(new Path("/yucong"));
            log.info("Directory created: " + success);

            success = fileSystem.exists(new Path("/yucong"));
            log.info("Directory exists: " + success);

            success = fileSystem.delete(new Path("/yucong"), true);
            log.info("Directory deleted: " + success);

            /* Alternative upload using Hadoop's helper
               (requires import org.apache.hadoop.io.IOUtils):
            FSDataOutputStream out = fileSystem.create(new Path("/test.data"), true);
            FileInputStream fis = new FileInputStream("c:/test.txt");
            IOUtils.copyBytes(fis, out, 4096, true); */

            // Upload a local file by copying it in 4 KB chunks.
            FSDataOutputStream out = fileSystem.create(new Path("/test2.data"));
            FileInputStream in = new FileInputStream("c:/test.txt");
            byte[] buf = new byte[4096];
            int len = in.read(buf);
            while (len != -1) {
                out.write(buf, 0, len);
                len = in.read(buf);
            }
            in.close();
            out.close();

            // List the root directory with each entry's permission and replication.
            FileStatus[] statuses = fileSystem.listStatus(new Path("/"));
            log.info(statuses.length);
            for (FileStatus status : statuses) {
                log.info(status.getPath());
                log.info(status.getPermission());
                log.info(status.getReplication());
            }
        }
    }
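
    The upload loop in the listing is ordinary java.io code: FSDataOutputStream is an OutputStream, so the same read/write loop works against any pair of streams. The sketch below isolates that loop and runs it on in-memory streams so it can be tried without a cluster. The class name StreamCopy and the sample text are assumptions for illustration; try-with-resources is used here so that both streams are closed even if the copy fails.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// A minimal sketch of the buffered copy loop; StreamCopy is a hypothetical helper.
final class StreamCopy {

    /** Copies everything from in to out using a buffer of the given size; returns bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int len;
        // read() returns -1 at end of stream, otherwise the number of bytes placed in buf
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
            total += len;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Both streams are closed automatically, even if copy() throws.
        try (InputStream in = new ByteArrayInputStream(data); OutputStream out = sink) {
            long copied = copy(in, out, 4);
            System.out.println("copied " + copied + " bytes");
        }
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

    In the HDFS program, the input would be the FileInputStream and the output the stream returned by fileSystem.create.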

    This is a Maven project. Its pom.xml declares the following dependencies:

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.7.3</version>
        </dependency>
    </dependencies>

    Ma Shibing video course download (Baidu Cloud): http://pan.baidu.com/s/1kVSbxS7

    Original post: http://www.cnblogs.com/yucongblog/p/6650839.html
  • Reposted from: https://www.cnblogs.com/jpfss/p/9034788.html