  • Storage Types and Storage Policies

    https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html

    Introduction

    Archival Storage is a solution to decouple growing storage capacity from compute capacity. Nodes with higher-density, less expensive storage and low compute power are becoming available and can be used as cold storage in clusters. Based on policy, data can be moved from hot storage to cold storage. Adding more nodes to the cold storage grows storage capacity independently of the compute capacity in the cluster.

    The frameworks provided by Heterogeneous Storage and Archival Storage generalize the HDFS architecture to include other kinds of storage media, including SSD and memory. Users may choose to store their data in SSD or memory for better performance.

    Storage Types and Storage Policies

    Storage Types: ARCHIVE, DISK, SSD and RAM_DISK

    The first phase of Heterogeneous Storage (HDFS-2832) changed the datanode storage model from a single storage, which may correspond to multiple physical storage media, to a collection of storages, each corresponding to one physical storage medium. It also added the notion of storage types, DISK and SSD, where DISK is the default storage type.

    A new storage type, ARCHIVE, which has high storage density (petabytes of storage) but little compute power, was added to support archival storage.

    Another new storage type, RAM_DISK, was added to support writing single-replica files in memory.

    Storage Policies: Hot, Warm, Cold, All_SSD, One_SSD and Lazy_Persist

    A new concept of storage policies is introduced in order to allow files to be stored in different storage types according to the storage policy.

    We have the following storage policies:

    • Hot - for both storage and compute. The data that is popular and still being used for processing will stay in this policy. When a block is hot, all replicas are stored in DISK.
    • Cold - only for storage with limited compute. The data that is no longer being used, or data that needs to be archived is moved from hot storage to cold storage. When a block is cold, all replicas are stored in ARCHIVE.
    • Warm - partially hot and partially cold. When a block is warm, some of its replicas are stored in DISK and the remaining replicas are stored in ARCHIVE.
    • All_SSD - for storing all replicas in SSD.
    • One_SSD - for storing one of the replicas in SSD. The remaining replicas are stored in DISK.
    • Lazy_Persist - for writing blocks with a single replica in memory. The replica is first written to RAM_DISK and then lazily persisted to DISK.

    More formally, a storage policy consists of the following fields:

    1. Policy ID
    2. Policy name
    3. A list of storage types for block placement
    4. A list of fallback storage types for file creation
    5. A list of fallback storage types for replication

    When there is enough space, block replicas are stored according to the storage type list specified in #3. When some of the storage types in list #3 are running out of space, the fallback storage type lists specified in #4 and #5 are used to replace the out-of-space storage types for file creation and replication, respectively.
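
The five fields and the fallback rule can be sketched as follows. This is a simplified model, not the HDFS implementation: the names `StoragePolicy`, `first_available` and `apply_fallback` are hypothetical, and returning `None` for a replica that has no usable fallback is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class StoragePolicy:
    policy_id: int                    # field 1: Policy ID
    name: str                         # field 2: Policy name
    storage_types: List[str]          # field 3: storage types for block placement
    creation_fallbacks: List[str]     # field 4: fallbacks for file creation
    replication_fallbacks: List[str]  # field 5: fallbacks for replication


def first_available(fallbacks: List[str], unavailable: Set[str]) -> Optional[str]:
    """Return the first fallback type that still has space, or None."""
    for storage_type in fallbacks:
        if storage_type not in unavailable:
            return storage_type
    return None


def apply_fallback(policy: StoragePolicy, wanted: List[str],
                   unavailable: Set[str], is_new_file: bool) -> List[Optional[str]]:
    """Replace out-of-space storage types using list #4 or list #5.

    Assumption: a replica with no usable fallback is reported as None.
    """
    fallbacks = (policy.creation_fallbacks if is_new_file
                 else policy.replication_fallbacks)
    return [t if t not in unavailable else first_available(fallbacks, unavailable)
            for t in wanted]


# Two rows taken from the storage policy table in this document.
HOT = StoragePolicy(7, "Hot", ["DISK"], [], ["ARCHIVE"])
WARM = StoragePolicy(5, "Warm", ["DISK", "ARCHIVE"],
                     ["ARCHIVE", "DISK"], ["ARCHIVE", "DISK"])
```

For example, `apply_fallback(HOT, ["DISK"] * 3, {"DISK"}, is_new_file=False)` substitutes ARCHIVE for every replica, while the same call with `is_new_file=True` finds no substitute because Hot has no creation fallback.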

    The following is a typical storage policy table.

    Policy ID  Policy Name    Block Placement (n replicas)  Fallback storages for creation  Fallback storages for replication
    15         Lazy_Persist   RAM_DISK: 1, DISK: n-1        DISK                            DISK
    12         All_SSD        SSD: n                        DISK                            DISK
    10         One_SSD        SSD: 1, DISK: n-1             SSD, DISK                       SSD, DISK
    7          Hot (default)  DISK: n                       <none>                          ARCHIVE
    5          Warm           DISK: 1, ARCHIVE: n-1         ARCHIVE, DISK                   ARCHIVE, DISK
    2          Cold           ARCHIVE: n                    <none>                          <none>
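
To make the "Block Placement (n replicas)" column concrete, the sketch below expands each row into one storage type per replica. It reads the column literally: an entry such as "SSD: 1, DISK: n-1" is modeled as an ordered list whose last type is repeated for the remaining replicas. The names `PLACEMENT` and `placement` are hypothetical, and the sketch ignores out-of-space fallbacks and any special cases described in the notes on this table.

```python
from typing import Dict, List

# Ordered storage types per policy, copied from the table.
# Assumption: the last type in each list covers all remaining replicas.
PLACEMENT: Dict[str, List[str]] = {
    "Lazy_Persist": ["RAM_DISK", "DISK"],
    "All_SSD": ["SSD"],
    "One_SSD": ["SSD", "DISK"],
    "Hot": ["DISK"],
    "Warm": ["DISK", "ARCHIVE"],
    "Cold": ["ARCHIVE"],
}


def placement(policy_name: str, n: int) -> List[str]:
    """Return the storage type of each of the n replicas, in order."""
    types = PLACEMENT[policy_name]
    # Replica i uses the i-th listed type; past the end, reuse the last one.
    return [types[min(i, len(types) - 1)] for i in range(n)]


if __name__ == "__main__":
    for name in PLACEMENT:
        print(name, placement(name, 3))
```

With three replicas, Warm yields one DISK replica and two ARCHIVE replicas, matching "DISK: 1, ARCHIVE: n-1".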

    Note 1: The Lazy_Persist policy is useful only for single-replica blocks. For blocks with more than one replica, all the replicas will be written to DISK, since writing only one of the replicas to RAM_DISK does not improve overall performance.

    Note 2: For erasure coded files with striping layout, the suitable storage policies are All_SSD, Hot and Cold. If a user sets any other policy on a striped EC file, that policy will not be followed when blocks are created or moved.

    Storage Policy Resolution

    When a file or directory is created, its storage policy is unspecified. The storage policy can be specified using the “storagepolicies -setStoragePolicy” command. The effective storage policy of a file or directory is resolved by the following rules.

    1. If the file or directory is specified with a storage policy, return it.

    2. For an unspecified file or directory, if it is the root directory, return the default storage policy. Otherwise, return its parent’s effective storage policy.

    The effective storage policy can be retrieved by the “storagepolicies -getStoragePolicy” command.
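
The two resolution rules amount to walking up the directory tree until an explicitly set policy is found. The sketch below models that walk over a plain dictionary of explicitly set policies; it does not call HDFS. The function name `effective_policy`, the `explicit` mapping and the example paths are hypothetical, and the default is taken to be Hot as marked in the policy table.

```python
import posixpath
from typing import Dict

DEFAULT_POLICY = "Hot"  # the policy marked "(default)" in the table


def effective_policy(path: str, explicit: Dict[str, str]) -> str:
    """Resolve the effective storage policy of an absolute path.

    `explicit` maps paths to policies that were set directly on them.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    # Normalize so that "/a//b/" and "/a/b" are treated the same.
    path = posixpath.normpath("/" + path.strip("/"))
    while True:
        if path in explicit:           # rule 1: a policy is set on this path
            return explicit[path]
        if path == "/":                # rule 2: unspecified root uses the default
            return DEFAULT_POLICY
        path = posixpath.dirname(path)  # rule 2: otherwise ask the parent


if __name__ == "__main__":
    explicit = {"/archive": "Cold", "/archive/recent": "Warm"}
    for p in ["/archive/2019/a.log", "/archive/recent/b.log", "/tmp/x"]:
        print(p, effective_policy(p, explicit))
```

A file under `/archive/recent` resolves to Warm because the nearest ancestor with an explicit policy wins; a path with no such ancestor falls through to the root and receives the default.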

  • Original source: https://www.cnblogs.com/rsapaper/p/7764463.html