  • Linux operations for Hadoop

    Getting started with Hadoop: common HDFS commands on Linux

    Hadoop: HDFS file operations

    Hadoop fs commands explained

    Official documentation

    sudo su - hdfs: log in as the hdfs account without a password; this account can operate on HDFS files

    logout

    sudo su - root

    hadoop fs -ls /

    rm -rf <directory name>

    sh dvm_auto_hive_ci_test.sh 2017-11-22 2017-11-22 criteo

    hadoop fs -get  /report/dvm_test/script/bashScript

    ls -l: view file permissions

    chmod 777 mm.txt: change file permissions

    cat criteo.log: view file contents
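    The permission commands above can be tried together on a scratch file. This is a minimal sketch, assuming a POSIX shell with mktemp available; the temporary file stands in for mm.txt, and mode 644 is used here only to show a less permissive alternative to 777.

```shell
# Create a scratch file, change its mode, and read the mode back with ls -l.
scratch=$(mktemp)
chmod 644 "$scratch"
# The first ten characters of ls -l output are the file type and permission bits.
perms=$(ls -l "$scratch" | cut -c1-10)
echo "$perms"
rm -f "$scratch"
```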

     sh dvm_auto_hive_criteoTransaction_test.sh -d "2017-11-22" -P "criteoTransaction" --input-folder "/report/dvm_test/naa" --hdfs-script "/report/dvm_test/script/etl"

    hadoop fs -rmdir /tmp/out/report/dvm_test/naa/TransactionCriteo/2017/11: removes the directory only if it is empty

    hadoop jar "/usr/hdp/2.6.2.0-205/hadoop-mapreduce/hadoop-streaming-2.7.3.2.6.2.0-205.jar" -input "/report/dvm_test/naa/TransactionCriteo/2017/11/22" -output "/tmp/out/report/dvm_test/naa/TransactionCriteo/2017/11/22" -mapper "python /report/dvm_test/script/etl/TransactionCriteo_naa_map.py" -reducer NONE
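    The input and output folders in the streaming command above end in a yyyy/mm/dd suffix. The sketch below shows one way such a command could be assembled from a single date argument. The helper name build_streaming_cmd is hypothetical and is not taken from the scripts mentioned in these notes; it only prints the command so it can be inspected before anything is run.

```shell
# Sketch: turn a YYYY-MM-DD date into the yyyy/mm/dd path suffix used by the
# input and output folders, then print (not run) the streaming command.
# build_streaming_cmd is a hypothetical helper name.
build_streaming_cmd() {
  day_path=$(printf '%s' "$1" | tr '-' '/')
  base="/report/dvm_test/naa/TransactionCriteo"
  printf 'hadoop jar "%s" -input "%s/%s" -output "/tmp/out%s/%s" -mapper "%s" -reducer NONE\n' \
    "/usr/hdp/2.6.2.0-205/hadoop-mapreduce/hadoop-streaming-2.7.3.2.6.2.0-205.jar" \
    "$base" "$day_path" "$base" "$day_path" \
    "python /report/dvm_test/script/etl/TransactionCriteo_naa_map.py"
}

build_streaming_cmd 2017-11-22
```

    Printing the command first makes it easy to check the paths; it can then be pasted into the shell or piped to sh once it looks right.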

    truncate table table_name;

    DROP TABLE [IF EXISTS] table_name;

    ALTER TABLE myTable DROP IF EXISTS PARTITION
    (date>='date1' and date<='date2');

    ALTER TABLE myTable DROP IF EXISTS PARTITION
    (date>='date1' && date<='date2');

    ALTER TABLE myTable DROP IF EXISTS PARTITION
    (date between 'date1' and 'date2');

    update partition:

    ALTER TABLE logs PARTITION(year = 2012, month = 12, day = 18) 
    SET LOCATION 'hdfs://user/darcy/logs/2012/12/18';




    drop a partition:
    ALTER TABLE logs DROP IF EXISTS PARTITION(year = 2012, month = 12, day = 18);



    I implemented a workaround for this issue using shell scripts, for instance:
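    # Generate one ALTER TABLE statement per month; each statement lists
    # every (day, hour) partition of that month.
    for y in {2011..2014}
    do
      for m in {01..12}
      do
        # The all-zero dummy partition lets every real one be appended as ", PARTITION (...)".
        echo -n "ALTER TABLE reporting.frontend DROP IF EXISTS PARTITION (year=0000,month=00,day=00,hour=00)"
        for d in {01..31}
        do
          # Hours are assumed to run 00..23.
          for h in {00..23}
          do
            echo -n ", PARTITION (year=$y,month=$m,day=$d,hour=$h)"
          done
        done
        echo ";"
      done
    done > drop_partitions_v1.hql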

    The resulting .hql file can then be executed with the -f option of hive (or beeline), for example: hive -f drop_partitions_v1.hql

    Obviously, the loops must generate exactly the range you want to drop, which may be nontrivial. In the worst case, you will need several such shell scripts to cover the desired range of dates.

    Further, please note that in my case the partitions had four keys (year, month, day, hour). If your dates/partitions are coded as strings (not a good idea in my opinion), you will have to build the target string from the variables y, m, d and h in the shell script and place that string inside the echo command. By the way, the dummy partition (containing only 0s) is there only so that the whole ALTER TABLE command, which has a special syntax, can be written easily with 3-4 loops: every real partition is then appended in the same ", PARTITION (...)" form.
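    For string-coded partitions, building the target string might look like the sketch below. It assumes a single string partition key holding YYYY-MM-DD values and GNU date for the date arithmetic; the key name dt, the function name gen_drop_partitions and the output file name are all hypothetical. Because the loop walks real calendar days, no dummy partition is needed here.

```shell
# Sketch: print one ALTER TABLE statement that drops every daily partition
# in [start, end]. Assumes a string partition key named "dt" (hypothetical)
# holding YYYY-MM-DD values, and GNU date for the date arithmetic.
gen_drop_partitions() (
  # The function body runs in a subshell, so these variables stay local.
  table=$1 day=$2 end=$3
  end_num=$(printf '%s' "$end" | tr -d '-')
  if [ "$(printf '%s' "$day" | tr -d '-')" -gt "$end_num" ]; then
    echo "start date is after end date" >&2
    exit 1
  fi
  sep=""
  printf 'ALTER TABLE %s DROP IF EXISTS' "$table"
  # Compare dates as YYYYMMDD integers so the loop includes the end date.
  while [ "$(printf '%s' "$day" | tr -d '-')" -le "$end_num" ]; do
    printf "%s PARTITION (dt='%s')" "$sep" "$day"
    sep=","
    day=$(date -u -d "$day +1 day" +%F)
  done
  printf ';\n'
)

gen_drop_partitions reporting.frontend 2017-11-22 2017-11-24
```

    Redirecting the output to a file, for example gen_drop_partitions reporting.frontend 2017-11-22 2017-11-24 > drop_by_day.hql, gives a script that can be run with the -f option in the same way as the one above.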

     
  • Original article: https://www.cnblogs.com/panpanwelcome/p/7825836.html