  • II. HDFS Architecture

    Source: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html

    HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks and these blocks are stored in a set of DataNodes. The NameNode executes file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.

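    namenode: stores the file system metadata (data that describes data, held in memory), such as the file namespace and the block-to-datanode mapping, and manages the datanodes.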

    datanode: a node that stores data blocks. It serves client read and write requests for blocks and reports its block information to the namenode.

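    block: a data block, the unit into which a file is split. The default block size is 128MB and can be changed with dfs.blocksize; each block has a default replication factor of 3, configured with dfs.replication.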
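The block size and replication factor above determine how many blocks a file is split into and how much raw disk it consumes across the cluster. A minimal Python sketch of that arithmetic, assuming the default values quoted above (the function names are hypothetical and not part of any Hadoop API):

```python
# Defaults quoted in the text: dfs.blocksize = 128MB, dfs.replication = 3.
BLOCK_SIZE = 128 * 1024 * 1024
REPLICATION = 3


def block_count(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks a file of file_size bytes is split into."""
    # Integer ceiling division; an empty file yields zero blocks.
    return (file_size + block_size - 1) // block_size


def raw_disk_usage(file_size: int, replication: int = REPLICATION) -> int:
    """Bytes stored across all DataNodes: every byte is kept once per replica."""
    # The last block is not padded, so usage depends on the file size,
    # not on block_count * block_size.
    return file_size * replication


if __name__ == "__main__":
    size = 300 * 1024 * 1024  # a hypothetical 300MB file
    print(block_count(size))                       # 128MB + 128MB + 44MB
    print(raw_disk_usage(size) // (1024 * 1024))   # raw usage in MB
```

With these defaults a 300MB file is split into 3 blocks and occupies 900MB of raw disk.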

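    rack: racks describe the physical arrangement of the storage nodes and are used to optimize storage and computation. To view the rack topology: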

    [root@CentOS ~]# hdfs dfsadmin -printTopology
    Rack: /default-rack
       192.168.169.139:50010 (CentOS)
    

    Why is HDFS considered poor at storing small files?

    Files                          NameNode usage (memory)         DataNode usage (disk)
    One 128MB file                 1 block metadata entry          128MB * replication factor
    10000 files totaling 128MB     10000 block metadata entries    128MB * replication factor

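    The NameNode keeps all metadata in the memory of a single machine, so a large number of small files puts pressure on NameNode memory even though the data on disk is the same size.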
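The comparison above can be made concrete with a rough estimate. This sketch assumes the commonly cited rule of thumb of about 150 bytes of NameNode heap per namespace object (file or block); that figure is an assumption, not something stated in the original post, and real usage varies by version and path length. Unlike the table, it also counts one object per file in addition to the block entries.

```python
# Assumed rule of thumb, not an exact figure: ~150 bytes of heap per object.
BYTES_PER_OBJECT = 150


def namenode_objects(num_files: int, blocks_per_file: int = 1) -> int:
    """Each file contributes one file object plus one object per block."""
    return num_files * (1 + blocks_per_file)


def estimated_heap_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Very rough NameNode heap estimate for the given files."""
    return namenode_objects(num_files, blocks_per_file) * BYTES_PER_OBJECT


if __name__ == "__main__":
    # The same 128MB of data stored as one file versus 10000 small files.
    print(estimated_heap_bytes(1))       # one 128MB file, one block
    print(estimated_heap_bytes(10000))   # 10000 small files, one block each
```

Under this assumption the single file costs about 300 bytes of heap and the 10000 small files about 3,000,000 bytes, while the disk usage on the DataNodes is identical.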

    What is the relationship between the NameNode and the Secondary NameNode?

    The Secondary NameNode assists the NameNode by periodically merging the Edits and Fsimage files, which speeds up NameNode startup.
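Why merging helps: at startup the NameNode loads the Fsimage snapshot and replays the Edits log on top of it, so a long log means a slow start. A checkpoint folds the log into a new snapshot. The following is only a toy Python model of that idea; the tuple-based edit records and the function names are hypothetical and do not reflect the real on-disk formats.

```python
def apply_edit(namespace: set, edit: tuple) -> None:
    """Apply one logged namespace operation to an in-memory set of paths."""
    op = edit[0]
    if op in ("mkdir", "create"):
        namespace.add(edit[1])
    elif op == "delete":
        namespace.discard(edit[1])
    elif op == "rename":
        namespace.discard(edit[1])
        namespace.add(edit[2])
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(fsimage: frozenset, edits: list) -> frozenset:
    """Replay the edit log over the old snapshot and return the new snapshot."""
    namespace = set(fsimage)
    for edit in edits:
        apply_edit(namespace, edit)
    return frozenset(namespace)


if __name__ == "__main__":
    fsimage = frozenset({"/"})
    edits = [
        ("mkdir", "/tt"),
        ("create", "/tt/123.txt"),
        ("rename", "/tt/123.txt", "/tt/222.txt"),
        ("delete", "/tt/222.txt"),
    ]
    new_fsimage = checkpoint(fsimage, edits)
    print(sorted(new_fsimage))
```

After the checkpoint the four logged operations are reflected in the new snapshot (only "/" and "/tt" remain), and the old log no longer needs to be replayed.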

    HDFS Shell

    [root@CentOS ~]# hdfs dfs -help     
    Usage: hadoop fs [generic options]
    	[-appendToFile <localsrc> ... <dst>]
    	[-cat [-ignoreCrc] <src> ...]
    	[-checksum <src> ...]
    	[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
    	[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
    	[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    	[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
    	[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    	[-help [cmd ...]]
    	[-ls [-d] [-h] [-R] [<path> ...]]
    	[-mkdir [-p] <path> ...]                                              # create a directory
    	[-moveFromLocal <localsrc> ... <dst>]
    	[-moveToLocal <src> <localdst>]
    	[-mv <src> ... <dst>]
    	[-put [-f] [-p] [-l] <localsrc> ... <dst>]
    	[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
    	[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
    	[-tail [-f] <file>]
    	[-text [-ignoreCrc] <src> ...]
    	[-touchz <path> ...]
    	[-usage [cmd ...]]
    

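    hdfs dfs -ls / lists the files and directories directly under /.

    hdfs dfs -ls -R / lists all files under /; because of the -R option, ls is applied recursively to every directory and subdirectory.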

    [root@CentOS sysconfig]# hdfs dfs -mkdir -p /tt/test     # create a directory (with parents)
    [root@CentOS ~]# touch 123.txt
    [root@CentOS ~]# vi 123.txt
    [root@CentOS ~]# hdfs dfs -copyFromLocal ~/123.txt /tt   # copy a local file to HDFS
    [root@CentOS ~]# hdfs dfs -cat /tt/test/123.txt          # print the file contents
    雲想衣山花形容
    
    
    [root@CentOS 123123]# hdfs dfs -copyToLocal /tt/test/123.txt /usr/local/222.txt     # copy a file from HDFS to the local file system
    [root@CentOS 123123]# cd ..
    [root@CentOS local]# ls
    123123  222.txt  bin  etc  games  include  lib  lib64  libexec  sbin  share  src
    [root@CentOS local]# hdfs dfs -put 123123 /tt          # upload a local file or directory (e.g. 123123) to an HDFS path (/tt)
    
    [root@CentOS local]# hdfs dfs -ls /tt/                     # list the contents of the directory
    Found 2 items
    -rw-r--r--   1 root supergroup         22 2019-01-03 04:18 /tt/123.txt
    -rw-r--r--   1 root supergroup          0 2019-01-03 04:28 /tt/777.txt
    
    [root@CentOS local]# hdfs dfs -rm -f /tt/123.txt            # delete a file
    19/01/03 03:54:55 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
    Deleted /tt/123.txt
    [root@CentOS local]# hdfs dfs -rm -r /tt/test               # delete a directory recursively
    19/01/03 03:55:58 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
    Deleted /tt/test
    
    [root@CentOS ~]# hdfs dfs -checksum /tt                      # show a file's checksum (fails on a directory)
    checksum: `/tt': Is a directory
    [root@CentOS ~]# hdfs dfs -checksum /tt/123.txt
    /tt/123.txt     MD5-of-0MD5-of-512CRC32C        000002000000000000000000790c2cd6e313015e7896c41d37dce4d5
    
    [root@CentOS local]# hdfs dfs -cp /tt/123.txt /             # copy a file to another location within HDFS
    
    [root@CentOS local]# hdfs dfs -touchz /tt/777.txt           # create an empty (zero-length) file
    
    [root@CentOS local]# hdfs dfs -tail /tt/123.txt             # print the last 1KB of the file to standard output
    雲想衣山花形容
    
    [root@CentOS local]# hdfs dfs -get /tt/777.txt /usr/local   # copy a file or directory from an HDFS path (/tt/777.txt) to a local path (/usr/local)
    [root@CentOS local]# ls
    123123  222.txt  777.txt 
    [root@CentOS local]# hdfs dfs -ls -R  /tt/                 # recursively list the contents of subdirectories
    -rw-r--r--   1 root supergroup         22 2019-01-03 04:18 /tt/123.txt
    -rw-r--r--   1 root supergroup          0 2019-01-03 04:28 /tt/777.txt
    drwxr-xr-x   - root supergroup          0 2019-01-03 04:40 /tt/test
    -rw-r--r--   1 root supergroup         22 2019-01-03 04:40 /tt/test/222.txt
    [root@CentOS local]# hdfs dfs -chmod -R 755 /tt/123.txt
    [root@CentOS local]# hdfs dfs -ls -R  /tt/
    -rwxr-xr-x   1 root supergroup         22 2019-01-03 04:18 /tt/123.txt
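The listing above shows the mode changing from -rw-r--r-- to -rwxr-xr-x after chmod 755. As a small illustration of how the symbolic string in the first column of the ls output maps to the octal mode passed to -chmod, here is a Python sketch. The helper name is hypothetical, and it assumes a plain 10-character string without special bits (such as a sticky "t") or an ACL "+" suffix.

```python
def mode_to_octal(perm: str) -> str:
    """Convert a string such as '-rwxr-xr-x' to its octal form, e.g. '755'."""
    if len(perm) != 10:
        raise ValueError(f"expected a 10-character mode string, got {perm!r}")
    weights = {"r": 4, "w": 2, "x": 1}
    digits = []
    # Skip the leading type character, then read owner, group, other triples.
    for start in (1, 4, 7):
        triple = perm[start:start + 3]
        # '-' and any unrecognized character contribute 0.
        digits.append(str(sum(weights.get(ch, 0) for ch in triple)))
    return "".join(digits)


if __name__ == "__main__":
    print(mode_to_octal("-rw-r--r--"))  # mode before the chmod above
    print(mode_to_octal("-rwxr-xr-x"))  # mode after the chmod above
```

For the two mode strings in the transcript this yields 644 and 755 respectively.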
    

    Further reference: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/FileSystemShell.html#appendToFile

  • Original article: https://www.cnblogs.com/adrien/p/10222602.html