  • Secondary Namenode

    Original article: http://blog.madhukaraphatak.com/secondary-namenode---what-it-really-do/

    Secondary Namenode is one of the most poorly named components in Hadoop. Its name suggests that it is a backup for the Namenode, but in reality it is not. Many Hadoop beginners are confused about what exactly the Secondary Namenode does and why it is present in HDFS. In this blog post I try to explain the role of the Secondary Namenode in HDFS.

    From its name, you may assume that it has something to do with the Namenode, and you are right. So before we dig into the Secondary Namenode, let's see what exactly the Namenode does.

    Namenode

    The Namenode holds the metadata for HDFS, such as namespace information and block information. While in use, all of this information is kept in main memory. It is also stored on disk for persistence.

    [Figure: namenode]

    The above image shows how the Namenode stores information on disk.
    The two files involved are:

    1. fsimage - the snapshot of the filesystem taken when the namenode started
    2. Edit logs - the sequence of changes made to the filesystem after the namenode started
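
    As a rough illustration of how the two files relate, here is a toy Python model. It is not the real HDFS on-disk format: the names `apply_edit` and `load_namespace`, the operation tuples, and the sample paths and block ids are all made up for this sketch. The fsimage is modelled as a dict snapshot, the edit log as a list of operations, and the current namespace is rebuilt by replaying the log over the snapshot.

```python
# Toy model only: the real fsimage and edit log are binary files with far
# more detail. All names and sample values here are hypothetical.

def apply_edit(namespace, edit):
    """Apply one logged operation to a namespace dict (path -> list of blocks)."""
    op, path, *args = edit
    if op == "create":
        namespace[path] = []
    elif op == "add_block":
        namespace[path].append(args[0])
    elif op == "delete":
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def load_namespace(fsimage, edit_log):
    """Start from a copy of the snapshot, then replay every edit in order."""
    namespace = {path: list(blocks) for path, blocks in fsimage.items()}
    for edit in edit_log:
        apply_edit(namespace, edit)
    return namespace


fsimage = {"/data/a.txt": ["blk_1"]}   # snapshot taken at startup
edit_log = [                           # changes made since then
    ("create", "/data/b.txt"),
    ("add_block", "/data/b.txt", "blk_2"),
    ("delete", "/data/a.txt"),
]

current = load_namespace(fsimage, edit_log)
print(current)
```

    Note that the snapshot itself is left untouched; only the replayed copy reflects the later changes.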

    Edit logs are applied to the fsimage only when the namenode restarts, to produce the latest snapshot of the filesystem. But namenode restarts are rare in production clusters, which means edit logs can grow very large on clusters where the namenode runs for a long time. In this situation we run into the following issues:

    1. The edit log becomes very large, which makes it challenging to manage
    2. A namenode restart takes a long time because a lot of changes have to be merged
    3. In case of a crash, we will lose a huge amount of metadata, since the fsimage is very old
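
    To get a feel for the scale, a back-of-the-envelope calculation helps. The edit rate below is an invented figure for illustration, not a measurement from any real cluster, and `edits_to_replay` is just a helper name for this sketch.

```python
# Hypothetical numbers: how many edits pile up if nothing merges them?

def edits_to_replay(edits_per_second, seconds_since_last_merge):
    """Number of logged changes a restart would have to replay."""
    return edits_per_second * seconds_since_last_merge


HOUR = 3600
DAY = 24 * HOUR
rate = 50  # assumed average metadata changes per second

after_one_hour = edits_to_replay(rate, HOUR)
after_thirty_days = edits_to_replay(rate, 30 * DAY)

print(after_one_hour)      # 180000
print(after_thirty_days)   # 129600000
```

    Under these assumed numbers, a namenode that has run for a month without a merge has 720 times more edits to replay than one merged an hour ago.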

    To overcome these issues, we need a mechanism that keeps the edit log at a manageable size and keeps the fsimage up to date, so that the load on the namenode is reduced. It's very similar to a Windows restore point, which lets us take a snapshot of the OS so that if something goes wrong, we can fall back to the last restore point.

    So now we understand the Namenode's functionality and the challenge of keeping the metadata up to date. What does all this have to do with the Secondary Namenode?

    Secondary Namenode

    The Secondary Namenode helps overcome the above issues by taking over from the namenode the responsibility of merging edit logs with the fsimage.

    [Figure: secondarynamenode]

    The above figure shows how the Secondary Namenode works:

    1. It fetches the edit logs from the namenode at regular intervals and applies them to the fsimage
    2. Once it has the new fsimage, it copies it back to the namenode
    3. The namenode uses this fsimage for the next restart, which reduces the startup time
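
    The three steps above can be sketched as a small Python simulation. This is a simplification and not Hadoop code: `ToyNamenode`, `merge`, and `checkpoint` are hypothetical names, the namespace is reduced to a dict of paths, and dropping the already-merged edits from the log is an assumption about how the log is kept small.

```python
# Toy simulation of one checkpoint cycle. All names are hypothetical.

class ToyNamenode:
    def __init__(self):
        self.fsimage = {}    # last snapshot on "disk"
        self.edit_log = []   # changes recorded since that snapshot

    def record(self, op, path):
        self.edit_log.append((op, path))


def merge(fsimage, edits):
    """Apply edits to a copy of the snapshot and return the new snapshot."""
    image = dict(fsimage)
    for op, path in edits:
        if op == "create":
            image[path] = True
        elif op == "delete":
            image.pop(path, None)
    return image


def checkpoint(namenode):
    """One checkpoint, as performed by the helper node in this toy model."""
    # 1. Fetch the current fsimage and the edits accumulated so far.
    fsimage, edits = namenode.fsimage, list(namenode.edit_log)
    # 2. Merge them away from the namenode.
    new_image = merge(fsimage, edits)
    # 3. Copy the new fsimage back and drop the edits it already covers
    #    (assumed simplification of how the log shrinks).
    namenode.fsimage = new_image
    namenode.edit_log = namenode.edit_log[len(edits):]


nn = ToyNamenode()
nn.record("create", "/a")
nn.record("create", "/b")
nn.record("delete", "/a")
checkpoint(nn)
nn.record("create", "/c")   # arrives after the checkpoint

print(nn.fsimage)    # {'/b': True}
print(nn.edit_log)   # [('create', '/c')]
```

    After the checkpoint, a restart would only need to replay the single edit that arrived afterwards rather than the whole history.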

    The Secondary Namenode's whole purpose is to create checkpoints in HDFS. It's just a helper node for the namenode. That's why it is also known as the checkpoint node within the community.

    So we now understand that all the Secondary Namenode does is put a checkpoint in the filesystem, which helps the Namenode function better. It's not a replacement or backup for the Namenode. So from now on, make a habit of calling it a checkpoint node.

  • Source: https://www.cnblogs.com/langren1992/p/6275816.html