• Java operations on HDFS

Setting up the development environment

1. Add the dependency

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.1</version>
</dependency>

2. Notes on developing under Windows. (If you need to add the jar files by hand, the HDFS jars are under the share directory of the Hadoop installation.)

Developing Hadoop applications on Linux is recommended, as it avoids compatibility problems. If you build client applications on Windows, you need to set up the following environment (a code-level alternative is sketched after the list):

A. Unpack a Hadoop distribution into some directory on Windows

B. Replace the lib and bin directories in that distribution with versions built from native libraries compiled for your Windows platform

C. Set HADOOP_HOME on Windows to point to the unpacked distribution

D. Add Hadoop's bin directory to the Windows PATH variable
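
If you would rather not change system-wide settings, a commonly used alternative is to point the hadoop.home.dir system property at the unpacked distribution from code, before the first FileSystem call. A minimal sketch; the path D:/hadoop-2.6.1 is a placeholder for wherever you unpacked it:

    // Placeholder path: wherever the Windows-built Hadoop distribution was unpacked.
    // Must run before any Configuration/FileSystem use, because Hadoop looks the
    // property up when its Shell utility class is first loaded.
    static {
        System.setProperty("hadoop.home.dir", "D:/hadoop-2.6.1");
    }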

Obtaining a client object from the API

To operate on HDFS from Java, you first need to obtain a client instance:

Configuration conf = new Configuration();

FileSystem fs = FileSystem.get(conf);

Since our target is HDFS, the fs object we obtain should be an instance of DistributedFileSystem.

How does the get method decide which concrete client class to instantiate?

It decides based on the configured value of the fs.defaultFS parameter in conf.

If our code does not set fs.defaultFS, and no corresponding configuration is provided on the project classpath, the default value in conf comes from core-default.xml inside the Hadoop jar, which is file:///. In that case the object obtained will not be a DistributedFileSystem instance but a client for the local file system.
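
A minimal sketch that makes this behavior visible, reusing the hdfs://hdp-node01:9000 address from the examples below (the printed class names are what a stock Hadoop client returns):

    Configuration conf = new Configuration();
    // fs.defaultFS not set anywhere: the core-default.xml value file:/// applies
    FileSystem local = FileSystem.get(conf);
    System.out.println(local.getClass().getName());  // org.apache.hadoop.fs.LocalFileSystem

    conf.set("fs.defaultFS", "hdfs://hdp-node01:9000");
    FileSystem hdfs = FileSystem.get(conf);
    System.out.println(hdfs.getClass().getName());   // org.apache.hadoop.hdfs.DistributedFileSystem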

Code examples for manipulating data with the HDFS client:

Creating, deleting, renaming, and listing files

import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.junit.Before;
import org.junit.Test;

public class HdfsClient {

    FileSystem fs = null;

    @Before
    public void init() throws Exception {
        // Build a configuration object and set the URI of the HDFS cluster to access,
        // so that FileSystem.get() knows to construct a client for an HDFS file system
        // and which address to connect to.
        // new Configuration() loads hdfs-default.xml from the jar,
        // then hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://hdp-node01:9000");
        /**
         * Parameter precedence:
         * 1. values set in client code
         * 2. user-defined configuration files on the classpath
         * 3. the server-side defaults
         */
        conf.set("dfs.replication", "3");

        // Obtain an HDFS client; given these parameters the instance
        // should be a DistributedFileSystem.
        // fs = FileSystem.get(conf);

        // Obtained this way, conf no longer needs the "fs.defaultFS" parameter,
        // and the client's identity is already set to the "hadoop" user.
        fs = FileSystem.get(new URI("hdfs://hdp-node01:9000"), conf, "hadoop");
    }

    /**
     * Upload a file to HDFS.
     *
     * @throws Exception
     */
    @Test
    public void testAddFileToHdfs() throws Exception {
        // Local path of the file to upload
        Path src = new Path("g:/redis-recommend.zip");
        // Target path on HDFS
        Path dst = new Path("/aaa");
        fs.copyFromLocalFile(src, dst);
        fs.close();
    }

    /**
     * Copy a file from HDFS to the local file system.
     *
     * @throws IOException
     * @throws IllegalArgumentException
     */
    @Test
    public void testDownloadFileToLocal() throws IllegalArgumentException, IOException {
        fs.copyToLocalFile(new Path("/jdk-7u65-linux-i586.tar.gz"), new Path("d:/"));
        fs.close();
    }

    @Test
    public void testMkdirAndDeleteAndRename() throws IllegalArgumentException, IOException {
        // Create a directory
        fs.mkdirs(new Path("/a1/b1/c1"));
        // Delete a directory; for a non-empty directory the second argument must be true
        fs.delete(new Path("/aaa"), true);
        // Rename a file or directory
        fs.rename(new Path("/a1"), new Path("/a2"));
    }

    /**
     * List directory information, showing files only.
     *
     * @throws IOException
     * @throws IllegalArgumentException
     * @throws FileNotFoundException
     */
    @Test
    public void testListFiles() throws FileNotFoundException, IllegalArgumentException, IOException {
        // Food for thought: why does this return an iterator rather than a container
        // such as a List? A directory may hold an enormous number of entries; an
        // iterator lets the client fetch them from the NameNode in batches instead
        // of buffering them all in memory.
        RemoteIterator<LocatedFileStatus> listFiles = fs.listFiles(new Path("/"), true);
        while (listFiles.hasNext()) {
            LocatedFileStatus fileStatus = listFiles.next();
            System.out.println(fileStatus.getPath().getName());
            System.out.println(fileStatus.getBlockSize());
            System.out.println(fileStatus.getPermission());
            System.out.println(fileStatus.getLen());
            BlockLocation[] blockLocations = fileStatus.getBlockLocations();
            for (BlockLocation bl : blockLocations) {
                System.out.println("block-length:" + bl.getLength() + "--" + "block-offset:" + bl.getOffset());
                String[] hosts = bl.getHosts();
                for (String host : hosts) {
                    System.out.println(host);
                }
            }
            System.out.println("-------------- separator line for angelababy --------------");
        }
    }

    /**
     * List information for files and directories alike.
     *
     * @throws IOException
     * @throws IllegalArgumentException
     * @throws FileNotFoundException
     */
    @Test
    public void testListAll() throws FileNotFoundException, IllegalArgumentException, IOException {
        FileStatus[] listStatus = fs.listStatus(new Path("/"));
        for (FileStatus fstatus : listStatus) {
            // Mark each entry as a file (f--) or a directory (d--),
            // recomputing the marker for every entry
            String flag = fstatus.isFile() ? "f--" : "d--";
            System.out.println(flag + "    " + fstatus.getPath().getName());
        }
    }
}
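
The parameter-precedence rule noted in init() is easy to check directly; a minimal sketch (the replication value 2 is arbitrary):

    Configuration conf = new Configuration();        // loads the jar defaults, then any *-site.xml on the classpath
    conf.set("dfs.replication", "2");                // a value set in code wins over both
    System.out.println(conf.get("dfs.replication")); // prints 2 regardless of the file-based settings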

Accessing HDFS through streams

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.junit.Before;
import org.junit.Test;

/**
 * Lower-level operations, compared with the packaged convenience methods above.
 * Upper-layer computation frameworks such as MapReduce and Spark call this kind
 * of low-level API when they fetch data from HDFS.
 */
public class StreamAccess {

    FileSystem fs = null;

    @Before
    public void init() throws Exception {
        Configuration conf = new Configuration();
        fs = FileSystem.get(new URI("hdfs://hdp-node01:9000"), conf, "hadoop");
    }

    @Test
    public void testDownLoadFileToLocal() throws IllegalArgumentException, IOException {
        // First obtain an input stream for the file on HDFS
        FSDataInputStream in = fs.open(new Path("/jdk-7u65-linux-i586.tar.gz"));
        // Then construct an output stream for the local target file
        FileOutputStream out = new FileOutputStream(new File("c:/jdk.tar.gz"));
        // Copy the data from the input stream to the output stream;
        // the final argument closes both streams when the copy finishes
        IOUtils.copyBytes(in, out, 4096, true);
    }

    /**
     * HDFS supports seeking to an arbitrary position in a file and reading a
     * specified number of bytes from there, which is what upper-layer distributed
     * computation frameworks rely on to process data in parallel.
     *
     * @throws IllegalArgumentException
     * @throws IOException
     */
    @Test
    public void testRandomAccess() throws IllegalArgumentException, IOException {
        // Input stream for the file on HDFS
        FSDataInputStream in = fs.open(new Path("/iloveyou.txt"));
        // The starting offset of the stream can be chosen freely
        in.seek(22);
        // Output stream for the local target file
        FileOutputStream out = new FileOutputStream(new File("c:/iloveyou.line.2.txt"));
        // Copy exactly 19 bytes, then close both streams
        IOUtils.copyBytes(in, out, 19L, true);
    }

    /**
     * Print the contents of a file on HDFS.
     *
     * @throws IOException
     * @throws IllegalArgumentException
     */
    @Test
    public void testCat() throws IllegalArgumentException, IOException {
        FSDataInputStream in = fs.open(new Path("/iloveyou.txt"));
        IOUtils.copyBytes(in, System.out, 1024);
    }
}
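
The class above only reads from HDFS through streams. For the opposite direction, a hedged sketch of uploading through a stream: the paths are hypothetical, and the extra imports java.io.FileInputStream and org.apache.hadoop.fs.FSDataOutputStream are assumed.

    @Test
    public void testUploadByStream() throws IOException {
        // Output stream targeting a (hypothetical) file on HDFS;
        // the second argument overwrites the file if it already exists
        FSDataOutputStream out = fs.create(new Path("/stream-upload.data"), true);
        // Input stream over a (hypothetical) local source file
        FileInputStream in = new FileInputStream(new File("c:/local-source.data"));
        // Copy the data, then close both streams
        IOUtils.copyBytes(in, out, 4096, true);
    }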

     

• Original article: https://www.cnblogs.com/wuyl/p/9392930.html