  • Hadoop 2.5.2 Study and Practice Notes (5): Common HDFS Shell Command-Line Operations

    Appendix: HDFS shell guide documentation:

    http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-common/FileSystemShell.html

    After starting HDFS, running the hadoop fs command prints the usage of the common HDFS commands:

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs 
    Usage: hadoop fs [generic options]
        [-appendToFile <localsrc> ... <dst>]
        [-cat [-ignoreCrc] <src> ...]
        [-checksum <src> ...]
        [-chgrp [-R] GROUP PATH...]
        [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
        [-chown [-R] [OWNER][:[GROUP]] PATH...]
        [-copyFromLocal [-f] [-p] <localsrc> ... <dst>]
        [-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-count [-q] <path> ...]
        [-cp [-f] [-p | -p[topax]] <src> ... <dst>]
        [-createSnapshot <snapshotDir> [<snapshotName>]]
        [-deleteSnapshot <snapshotDir> <snapshotName>]
        [-df [-h] [<path> ...]]
        [-du [-s] [-h] <path> ...]
        [-expunge]
        [-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-getfacl [-R] <path>]
        [-getfattr [-R] {-n name | -d} [-e en] <path>]
        [-getmerge [-nl] <src> <localdst>]
        [-help [cmd ...]]
        [-ls [-d] [-h] [-R] [<path> ...]]
        [-mkdir [-p] <path> ...]
        [-moveFromLocal <localsrc> ... <dst>]
        [-moveToLocal <src> <localdst>]
        [-mv <src> ... <dst>]
        [-put [-f] [-p] <localsrc> ... <dst>]
        [-renameSnapshot <snapshotDir> <oldName> <newName>]
        [-rm [-f] [-r|-R] [-skipTrash] <src> ...]
        [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
        [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
        [-setfattr {-n name [-v value] | -x name} <path>]
        [-setrep [-R] [-w] <rep> <path> ...]
        [-stat [format] <path> ...]
        [-tail [-f] <file>]
        [-test -[defsz] <path>]
        [-text [-ignoreCrc] <src> ...]
        [-touchz <path> ...]
        [-usage [cmd ...]]
    
    Generic options supported are
    -conf <configuration file>     specify an application configuration file
    -D <property=value>            use value for given property
    -fs <local|namenode:port>      specify a namenode
    -jt <local|jobtracker:port>    specify a job tracker
    -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
    -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
    -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
    
    The general command line syntax is
    bin/hadoop command [genericOptions] [commandOptions]
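
      For example, the -fs generic option points the shell at a specific namenode (a sketch; the address matches the fs.defaultFS value used later in these notes):

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -fs hdfs://localhost:9000 -ls /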

    > Help commands

    • usage

      Shows the usage of a command. For example, to view the usage of ls:

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -usage ls
    Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [<path> ...]
    • help

      Shows the detailed help for a command. For example, to view the help for ls:

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -help ls
    -ls [-d] [-h] [-R] [<path> ...] :
      List the contents that match the specified file pattern. If path is not
      specified, the contents of /user/<currentUser> will be listed. Directory entries
      are of the form:
          permissions - userId groupId sizeOfDirectory(in bytes)
      modificationDate(yyyy-MM-dd HH:mm) directoryName
      
      and file entries are of the form:
          permissions numberOfReplicas userId groupId sizeOfFile(in bytes)
      modificationDate(yyyy-MM-dd HH:mm) fileName
                                                                                     
      -d  Directories are listed as plain files.                                     
      -h  Formats the sizes of files in a human-readable fashion rather than a number
          of bytes.                                                                  
      -R  Recursively list the contents of directories. 

    > Viewing commands

    • ls

      Lists files or directories. In the example below, hdfs://localhost:9000 is the value configured for fs.defaultFS, so hdfs://localhost:9000/ denotes the root directory of the HDFS file system; when the file system in use is HDFS, it can be abbreviated to /.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls  hdfs://localhost:9000/
    Found 3 items
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 hdfs://localhost:9000/input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-31 07:17 hdfs://localhost:9000/input1.txt
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 hdfs://localhost:9000/output
    
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
    Found 3 items
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-31 07:17 /input1.txt
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output

       Option -R: list the files in subdirectories as well. Example:

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt         --files under the subdirectory are listed too
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup         14 2015-03-31 07:17 /input1.txt
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    • cat

      Displays the contents of a file.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -cat /input1.txt
    hello hadoop!
    hello hadoop!
    • text

      Outputs the given file in text format. The accepted formats are zip, TextRecordInputStream, and Avro. For a plain text file this is equivalent to cat. Example:

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input1.txt
    hello hadoop!
    • tail

      Displays the last 1 KB of a file.

      Option -f: keep printing newly appended content as the file grows.
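
      For example, a sketch reusing the /input1.txt file from the listings below (output omitted):

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -tail /input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -tail -f /input1.txt     --press Ctrl+C to stop following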

    • checksum

      Displays checksum information for a file. Because it must communicate with the datanodes that store each block of the file, running this command over a large number of files can be slow.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -checksum /input.zip
    /input.zip        MD5-of-0MD5-of-0CRC32        00000000000000000000000070bc8f4b72a86921468bf8e8441dce51

    > File and directory commands

    • touchz

      Creates a zero-length file. If a non-empty file with the given name already exists, an error is returned.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
    Found 3 items
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 08:34 /output
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -touchz /input1.zip
    touchz: `/input1.zip': Not a zero-length file        --error because the existing file is non-empty
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -touchz /input.zip
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
    Found 4 items
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip   --created successfully
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 08:34 /output
    • appendToFile

      Appends content to an existing file. Example:

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input1.txt
    hello hadoop!
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -appendToFile ~/Desktop/input1.txt /input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input1.txt  
    hello hadoop!
    hello hadoop!        --file content after the append
    • put

      Uploads files from the local file system to HDFS.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -put ~/Desktop/input1.txt /
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input1.txt     --view the uploaded file's content
    hello hadoop!

      Option -f: overwrite the destination file if it already exists.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -put ~/Desktop/input1.txt /
    put: `/input1.txt': File exists   --error because the file already exists
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -put -f ~/Desktop/input1.txt / 
    [hadoop@localhost hadoop-2.5.2]$     --with -f, no error is reported

      Option -p: preserve the original file's access and modification times, owner and group, and permission attributes.

    [hadoop@localhost hadoop-2.5.2]$ ll ~/input1.txt 
    -rw-r--r--. 1 hadoop hadoops 28 Mar 31 08:59 /home/hadoop/input1.txt   --local file attributes
    [hadoop@localhost hadoop-2.5.2]$ chmod 777 ~/input1.txt    --change local permissions to rwxrwxrwx
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -put ~/input1.txt /
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /input1.txt
    -rw-r--r--   1 hadoop supergroup         28 2015-04-02 05:19 /input1.txt   --attributes after upload without -p
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -put -f -p ~/input1.txt /
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /input1.txt
    -rwxrwxrwx   1 hadoop hadoops         28 2015-03-31 08:59 /input1.txt    --attributes after upload with -p
    • get

      Downloads files from HDFS to the local file system. Unlike put, there is no option to overwrite an existing local file.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -get /input1.txt ~
    [hadoop@localhost hadoop-2.5.2]$ cat ~/input1.txt  --view the downloaded local file
    hello hadoop!
    hello hadoop!
    • getmerge

      Merges the files under the specified HDFS source directory into a single file and downloads it to the local file system; the source files are kept.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input/input1.txt
    hello hadoop!    --contents of input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input/input2.txt
    welcome to the world of hadoop!    --contents of input2.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getmerge /input/ ~/merge.txt
    [hadoop@localhost hadoop-2.5.2]$ cat ~/merge.txt
    hello hadoop!        --contents of the merged local file
    welcome to the world of hadoop!

      Option -nl: add a newline at the end of each file.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getmerge -nl /input/ ~/merge.txt
    [hadoop@localhost hadoop-2.5.2]$ cat ~/merge.txt
    hello hadoop!
             --newline added after input1.txt
    welcome to the world of hadoop!
             --newline added after input2.txt
    [hadoop@localhost hadoop-2.5.2]$ 
    • copyFromLocal

      Uploads files from the local file system to HDFS; identical to put.
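
      For example, a sketch mirroring the put example above (the same -f and -p options apply):

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -copyFromLocal -f ~/Desktop/input1.txt /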

    •  copyToLocal

      Downloads files from HDFS to the local file system; identical to get.
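
      For example, a sketch mirroring the get example above:

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -copyToLocal /input1.txt ~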

    • moveFromLocal

      Same as put, except that the local file is deleted after a successful upload.
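
      For example, a sketch reusing the ~/merge.txt file created in the getmerge example (the error text is the usual coreutils message):

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -moveFromLocal ~/merge.txt /input/
    [hadoop@localhost hadoop-2.5.2]$ ls ~/merge.txt     --the local copy is gone after the upload
    ls: cannot access /home/hadoop/merge.txt: No such file or directory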

    •  moveToLocal

      This command has not been implemented yet.

    •  mv

      Like the Linux mv command: moves or renames files.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
    Found 5 items
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input.zip
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /input1.txt
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 07:10 /text
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -mv /input.zip /input1.zip
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
    Found 5 items
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /input1.txt
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip  --renamed
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 07:10 /text
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -mv /input1.zip /text/
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /input1.txt
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 07:12 /text
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /text/input1.zip   --file moved
    • cp

      Copies files.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /input1.txt
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 07:29 /text
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -cp /input1.txt /input.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup         28 2015-04-02 07:31 /input.txt   --newly copied file
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /input1.txt
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 07:29 /text

      Option -f: overwrite the destination file if it already exists.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -cp /input1.txt /input.txt
    cp: `/input.txt': File exists     --error because the file already exists
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -cp -f /input1.txt /input.txt
    [hadoop@localhost hadoop-2.5.2]$
    • mkdir

      Creates a directory.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -mkdir /text
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
    Found 5 items
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input.zip
    -rw-r--r--   1 hadoop supergroup        210 2015-03-31 07:49 /input1.txt
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-03-31 08:23 /text

      Option -p: if the parent directories do not exist, create them recursively.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -mkdir /text1/text2
    mkdir: `/text1/text2': No such file or directory    --error because the parent directory does not exist
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -mkdir -p /text1/text2
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input.zip
    -rw-r--r--   1 hadoop supergroup        210 2015-03-31 07:49 /input1.txt
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-03-31 08:23 /text
    drwxr-xr-x   - hadoop supergroup          0 2015-03-31 08:26 /text1
    drwxr-xr-x   - hadoop supergroup          0 2015-03-31 08:26 /text1/text2   --created successfully with -p
    • rm

      Deletes files.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -rm /input.zip
    15/03/31 08:02:32 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
    Deleted /input.zip

      Option -r: delete recursively; non-empty directories can be removed.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -rm /text
    rm: `/text': Is a directory    --error when the target is a directory
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -rm -r /text  --with -r, the directory and the files under it are deleted
    15/04/02 08:28:42 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
    Deleted /text
    • rmdir

      Deletes an empty directory.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 08:34 /output
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -rmdir /output
    rmdir: `/output': Directory is not empty     --a non-empty directory cannot be deleted

      Option --ignore-fail-on-non-empty: suppress the error when deletion fails because the directory is not empty.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -rmdir --ignore-fail-on-non-empty /output  
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 08:34 /output    --no error is reported, but the directory is not deleted
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /output/input1.txt
    • setrep

      Changes the replication factor of a file.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -stat %r /input.zip
    1    --original replication factor
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setrep  2 /input.zip
    Replication 2 set: /input.zip
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -stat %r /input.zip
    2     --replication factor after the change

      Option -w: wait until the replication change has completed.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setrep -w 1 /input.zip
    Replication 1 set: /input.zip
    Waiting for /input.zip ... done
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -stat %r /input.zip
    1
    • expunge

      Empties the trash.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -expunge
    15/04/03 01:52:46 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
    • chgrp

      Changes the group of a file.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 08:34 /output                   --original group
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -chgrp test /output
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop test                0 2015-04-02 08:34 /output                     --group after the change (the test group was never created, yet the command succeeds)
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /output/input1.txt    --the group of the file under the directory is unchanged

      Option -R: recursive; if the path is a directory, also change the files and directories under it.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -chgrp -R testgrp /output
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop testgrp             0 2015-04-02 08:34 /output            --the directory and the file under it are both changed
    -rwxrwxrwx   1 hadoop testgrp            28 2015-03-31 08:59 /output/input1.txt
    • chmod

      Changes file permissions. The permission modes are the same as in the Linux chmod shell command.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 08:34 /output    --original permissions
    -rwxrwxrwx   1 hadoop supergroup         28 2015-03-31 08:59 /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -chmod 754 /output
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr--   - hadoop supergroup          0 2015-04-02 08:34 /output      --permissions after the change
    -rwxrwxrwx   1 hadoop supergroup         28 2015-03-31 08:59 /output/input1.txt  --permissions of the file under the directory are unchanged
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -chmod -R 775 /output
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxrwxr-x   - hadoop supergroup          0 2015-04-02 08:34 /output         --the directory and the file under it are both changed
    -rwxrwxr-x   1 hadoop supergroup         28 2015-03-31 08:59 /output/input1.txt
    • chown

      Changes the owner and/or group of a file.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxrwxr-x   - hadoop supergroup          0 2015-04-02 08:34 /output       --original owner and group
    -rwxrwxr-x   1 hadoop supergroup         28 2015-03-31 08:59 /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -chown test /output
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxrwxr-x   - test   supergroup          0 2015-04-02 08:34 /output           --owner after the change (the test user was never created, yet the command succeeds)
    -rwxrwxr-x   1 hadoop supergroup         28 2015-03-31 08:59 /output/input1.txt    --the owner of the file under the directory is unchanged

      Option -R: recursive; if the path is a directory, also change the files and directories under it.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -chown -R testown:testgrp /output
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop  supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop  supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop  supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop  supergroup          0 2015-04-02 08:43 /input.zip
    -rw-r--r--   1 hadoop  supergroup        184 2015-03-31 08:14 /input1.zip
    drwxrwxr-x   - testown testgrp             0 2015-04-02 08:34 /output       --the directory and the file under it are both changed
    -rwxrwxr-x   1 testown testgrp            28 2015-03-31 08:59 /output/input1.txt
    • getfacl

      Displays Access Control Lists (ACLs).

    [hadoop@localhost bin]$ hadoop fs -getfacl /input.zip
    # file: /input.zip
    # owner: hadoop
    # group: supergroup
    user::rw-
    group::r--
    other::r--

      Option -R: display recursively.

    [hadoop@localhost bin]$ hadoop fs -getfacl -R /input
    # file: /input
    # owner: hadoop
    # group: supergroup
    user::rwx
    group::r-x
    other::r-x
    
    # file: /input/input1.txt
    # owner: hadoop
    # group: supergroup
    user::rw-
    group::r--
    other::r--
    
    # file: /input/input2.txt
    # owner: hadoop
    # group: supergroup
    user::rw-
    group::r--
    other::r--
    • setfacl

      Sets Access Control Lists. ACLs are disabled by default, so using this command without enabling them first reports an error.

    [hadoop@localhost bin]$ hadoop fs -setfacl -b /output/input1.txt
    setfacl: The ACL operation has been rejected.  Support for ACLs has been disabled by setting dfs.namenode.acls.enabled to false.

      Enable ACLs by adding the following to hdfs-site.xml:

    [hadoop@localhost hadoop-2.5.2]$ vi etc/hadoop/hdfs-site.xml
    <property>
        <name>dfs.namenode.acls.enabled</name>
        <value>true</value>
    </property>

      Option -m: modify ACL entries.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfacl /output/input1.txt
    # file: /output/input1.txt
    # owner: testown
    # group: testgrp
    user::rwx
    group::rwx
    other::r-x
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setfacl -m user::rw-,user:hadoop:rw-,group::r--,other::r-- /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfacl /output/input1.txt
    # file: /output/input1.txt
    # owner: testown
    # group: testgrp
    user::rw-
    user:hadoop:rw-
    group::r--
    mask::rw-
    other::r--

      Option -x: remove the specified ACL entries.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setfacl -m user::rw-,user:hadoop:rw-,group::r--,other::r-- /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfacl /output/input1.txt
    # file: /output/input1.txt
    # owner: testown
    # group: testgrp
    user::rw-
    user:hadoop:rw-
    group::r--
    mask::rw-
    other::r--
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setfacl -x user:hadoop /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfacl /output/input1.txt
    # file: /output/input1.txt
    # owner: testown
    # group: testgrp
    user::rw-
    group::r--
    mask::r--
    other::r--

      The following options were not tested here.

      Option -b: keep only the base ACL entries (owner, group, other) and remove all the others.

      Option -k: remove the default ACL; a sketch of both invocations follows.
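
      A sketch of how they would be invoked (untested; the flags match the usage output at the top of these notes):

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setfacl -b /output/input1.txt    --drop everything except the base entries
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setfacl -k /output               --remove a directory's default ACL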

    •  setfattr

      Sets the name and value of an extended attribute.

      Option -n: the attribute name.    Option -v: the attribute value.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfattr -d /input.zip
    # file: /input.zip
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setfattr -n user.web -v www.baidu.com /input.zip
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfattr -d /input.zip
    # file: /input.zip
    user.web="www.baidu.com"

      Option -x: remove the extended attribute.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfattr -d /input.zip
    # file: /input.zip
    user.web="www.baidu.com"
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -setfattr -x user.web /input.zip
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfattr -d /input.zip
    # file: /input.zip
    • getfattr

      Displays the names and values of extended attributes.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfattr -d /input.zip
    # file: /input.zip
    user.web="www.baidu.com"
    user.web2="www.google.com"

      Option -n: display the value of the named attribute.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -getfattr -n user.web /input.zip
    # file: /input.zip
    user.web="www.baidu.com"

    > Statistics commands

    • count

      Displays the following for the specified file or directory: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and FILE_NAME, i.e. the number of subdirectories (if the specified path is a directory, the directory itself is counted), the number of files, the number of bytes used, and the file or directory name.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup         28 2015-04-02 07:32 /input.txt
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /input1.txt
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 07:29 /text
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -count /
               4            5                286 /

      Option -q: display quota information (on a shared cluster you can cap what users write to a directory by setting a quota on it, preventing one user from accidentally filling all the space and locking everyone else out). The quota information comprises QUOTA, REMAINING_QUOTA, SPACE_QUOTA, and REMAINING_SPACE_QUOTA: the allowed total number of files and directories under the directory, the remaining number of files and directories, the space quota of the directory, and the remaining space.

      The formulas are:

      QUOTA - (DIR_COUNT + FILE_COUNT) = REMAINING_QUOTA

      SPACE_QUOTA - CONTENT_SIZE = REMAINING_SPACE_QUOTA

      none and inf mean the corresponding quota is not configured.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -count -q /
    9223372036854775807 9223372036854775798            none             inf            4            5                286 /
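
      Plugging the output above into the formulas: 9223372036854775807 - (4 + 5) = 9223372036854775798, which matches the REMAINING_QUOTA column; the space-quota columns show none and inf because no space quota is set on /.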
    • du

      Displays file sizes. If a directory is specified, the size of each file in the directory is shown.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:19 /input
    -rw-r--r--   1 hadoop supergroup         14 2015-03-27 19:19 /input/input1.txt
    -rw-r--r--   1 hadoop supergroup         32 2015-03-27 19:19 /input/input2.txt
    -rw-r--r--   1 hadoop supergroup         28 2015-04-02 07:32 /input.txt
    -rwxrwxrwx   1 hadoop hadoops            28 2015-03-31 08:59 /input1.txt
    -rw-r--r--   1 hadoop supergroup        184 2015-03-31 08:14 /input1.zip
    drwxr-xr-x   - hadoop supergroup          0 2015-03-27 19:16 /output
    drwxr-xr-x   - hadoop supergroup          0 2015-04-02 07:29 /text
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -du /
    46   /input
    28   /input.txt
    28   /input1.txt
    184  /input1.zip
    0    /output
    0    /text

      Option -s: display an aggregate summary instead of per-file information.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -du -s /
    286  /
    • df

      Shows the disk usage of the file system.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -df /
    Filesystem                    Size   Used   Available  Use%
    hdfs://localhost:9000  18713219072  73728  8864460800    0%
    • stat

      Displays statistics about a file.

      Format specifiers: %b - number of blocks the file occupies; %g - group of the file; %n - file name; %o - block size; %r - replication factor; %u - owner of the file; %y - modification time.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -stat %b,%g,%n,%o,%r,%u,%y /input.zip
    0,supergroup,input.zip,134217728,1,hadoop,2015-04-02 15:43:24

    > Snapshot commands

    • createSnapshot

      Creates a snapshot.

      See also the official documentation: http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html

      A snapshot is a point-in-time image of the entire file system or of a single directory. Creating one merely adds a snapshot tag to the directory's inode; no data blocks are copied and read/write performance is unaffected, but the namenode uses some extra memory to hold the metadata of files and directories modified after the snapshot.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /output
    -rwxrwxr-x   1 testown testgrp         28 2015-03-31 08:59 /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -createSnapshot /output s1
    createSnapshot: Directory is not a snapshottable directory: /output   --error when creating a snapshot directly
    [hadoop@localhost hadoop-2.5.2]$ hdfs dfsadmin -allowSnapshot /output  --enable the snapshot feature on the directory
    Allowing snaphot on /output succeeded
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -createSnapshot /output s1  --create the snapshot
    Created snapshot /output/.snapshot/s1
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /output
    -rwxrwxr-x   1 testown testgrp         28 2015-03-31 08:59 /output/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /output/.snapshot/s1
    Found 1 items
    -rwxrwxr-x   1 testown testgrp         28 2015-03-31 08:59 /output/.snapshot/s1/input1.txt  --view the snapshot contents
    • renameSnapshot

      Renames a snapshot.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /output/.snapshot/s1
    Found 1 items
    -rwxrwxr-x   1 testown testgrp         28 2015-03-31 08:59 /output/.snapshot/s1/input1.txt  --original snapshot
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -renameSnapshot /output/ s1 s2
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /output/.snapshot/s2
    Found 1 items
    -rwxrwxr-x   1 testown testgrp         28 2015-03-31 08:59 /output/.snapshot/s2/input1.txt    --renamed snapshot
    • deleteSnapshot

      Deletes a snapshot.

    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /output/.snapshot/s2
    Found 1 items
    -rwxrwxr-x   1 testown testgrp         28 2015-03-31 08:59 /output/.snapshot/s2/input1.txt
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -deleteSnapshot /output/ s2
    [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /output/.snapshot/s2
    ls: `/output/.snapshot/s2': No such file or directory