  • Mounting multiple disks for Hadoop (repost) -- multiple disks per node

     multiple disks per node Read more at: http://www.queryhome.com/24784/how-to-set-hadoop-tmp-dir-if-i-have-multiple-disks-per-node

    http://blog.sina.com.cn/s/blog_b88e09dd01013rd4.html

    Ubuntu - Hard disk partitioning, formatting, and auto-mount configuration | Hard disk add new partition, format, auto mount in ubuntu

    http://aofengblog.blog.163.com/blog/static/6317021201101502540117/

    http://my.oschina.net/leejun2005/blog/290073

    proper-care-and-feeding-of-drives-in-a-hadoop-cluster-a-conversation-with-stackiqs-dr-bruno

    http://hortonworks.com/blog/proper-care-and-feeding-of-drives-in-a-hadoop-cluster-a-conversation-with-stackiqs-dr-bruno/

    Utilizing-multiple-hard-disks-for-hadoop-HDFS

    http://lucene.472066.n3.nabble.com/Utilizing-multiple-hard-disks-for-hadoop-HDFS-td3553851.html

    =================

    First, Hadoop requires at least two locations for storing its files: mapred.local.dir, where MapReduce stores intermediate files,

    and dfs.data.dir, where HDFS stores the HDFS data (there are other locations as well, such as hadoop.tmp.dir, where Hadoop and its components store temporary data).

    Both of them can span multiple partitions.

    While the two locations can be placed on physically different partitions, Cloudera recommends configuring them across the same set of partitions to maximize disk-level parallelism (this may not be an issue if the number of disks is much larger than the number of cores).
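    As a sketch of that shared layout, assuming two data disks mounted at /data/1 and /data/2 (hypothetical mount points), the two properties could be spread over the same set of disks, each in its own subdirectory:

    ```xml
    <!-- hdfs-site.xml: HDFS block storage on both disks -->
    <property>
      <name>dfs.data.dir</name>
      <value>/data/1/dfs/data,/data/2/dfs/data</value>
    </property>

    <!-- mapred-site.xml: MapReduce intermediate files on the same disks,
         but under a different subdirectory than the HDFS data -->
    <property>
      <name>mapred.local.dir</name>
      <value>/data/1/mapred/local,/data/2/mapred/local</value>
    </property>
    ```

    Keeping the subdirectories distinct matters: as noted further down, reusing the same directory for both can wipe data when MR services restart.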

    ==

    Hadoop multi-disk setup (Ubuntu)
    By default, Hadoop only uses the disk that holds the hadoop directory; extra configuration is needed before HDFS will use all of the disks. The steps are as follows:
    1. Set the permissions on the new disk to 777 (probably looser than necessary) so that Hadoop has read/write access:
       sudo chmod -c 777 /media/diskName
    2. Edit hdfs-site.xml in the conf directory, adding a property between <configuration> and </configuration>:
    <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/dfs/data,/media/diskName/dfs/data</value>
    </property>

    Separate the paths of the different disks with half-width commas, and double-check that the paths are correct (use absolute paths; Hadoop does not expand ~, and /home/hadoop above stands in for the Hadoop user's home directory).

    Finally, restart HDFS.
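    The new disk must already be mounted at /media/diskName before step 1. To make that mount survive reboots (the auto-mount setup one of the links above covers), an /etc/fstab entry along these lines can be used; the UUID and filesystem type below are placeholders for the actual disk:

    ```
    # /etc/fstab -- hypothetical entry auto-mounting the extra data disk
    # <device>                                  <mount point>    <type>  <options>  <dump>  <pass>
    UUID=1234abcd-0000-0000-0000-000000000000   /media/diskName  ext4    defaults   0       2
    ```

    The UUID for a real disk can be found with `blkid`.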
    ============================
    http://girishkathalagiri.blogspot.com/2012/09/adding-disk-to-hadoop-data-nodesrepeat.html
    ==================
    You need to apply comma-separated lists only to
    dfs.data.dir (HDFS) and mapred.local.dir (MR) directly.

    Make sure the subdirectories are different for each, or else you may accidentally wipe away your data when you restart MR services.

    The hadoop.tmp.dir property does not accept multiple paths, and you should avoid using it in production; it is more of a utility property that acts as a default base path for other properties.
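    Before restarting, a quick sanity check that every configured data directory exists and is writable can prevent DataNode startup failures. A minimal sketch, where the comma-separated list and the /tmp paths are hypothetical stand-ins for a real dfs.data.dir value:

    ```shell
    #!/bin/sh
    # Hypothetical dfs.data.dir-style value; substitute the real one.
    DATA_DIRS="/tmp/dfs-check/data1,/tmp/dfs-check/data2"

    # Create the example dirs so the check below has something to verify.
    for d in $(echo "$DATA_DIRS" | tr ',' ' '); do
        mkdir -p "$d"
    done

    # Check each path in the comma-separated list.
    ok=1
    for d in $(echo "$DATA_DIRS" | tr ',' ' '); do
        if [ -d "$d" ] && [ -w "$d" ]; then
            echo "OK: $d"
        else
            echo "MISSING/UNWRITABLE: $d"
            ok=0
        fi
    done
    [ "$ok" -eq 1 ] && echo "all data dirs ready"
    ```

    On a real cluster the mkdir step would be replaced by creating the directories on each mounted disk and chown-ing them to the Hadoop user.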
    stop-dfs.sh
    start-dfs.sh
  • Original post: https://www.cnblogs.com/GrantYu/p/4106910.html