  • Reading a single HDFS file with the Java API

    A single file on HDFS:

    -bash-3.2$ hadoop fs -ls /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category
    Found 1 items
    -rw-r--r--   2 deploy supergroup        520 2014-08-14 17:03 /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category/repeatRecCategory.txt
    File contents:

    -bash-3.2$ hadoop fs -cat /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category/repeatRecCategory.txt | more
    8104
    960985
    5472
    971917
    5320
    971895
    971902
    971922
    958261
    972047
    972050

    Reading a single HDFS file from Java through the FileSystem API

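    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    /**
     * Returns the categories that may be recommended repeatedly, joined with
     * commas. Every category, including the last one, is followed by a comma.
     *
     * @param filePath HDFS path of the category file
     * @return the first tab-separated field of each line, each followed by a comma;
     *         if reading fails, whatever was read before the error (possibly empty)
     */
    public String getRepeatRecCategoryStr(String filePath) {
    	final String DELIMITER = "\t";
    	final String INNER_DELIMITER = ",";
    	
    	// StringBuilder avoids allocating a new String on every appended line.
    	StringBuilder categoryFilterStrs = new StringBuilder();
    	try {
    		// FileSystem.get() returns a cached instance shared with other callers,
    		// so it is deliberately not closed here.
    		FileSystem fs = FileSystem.get(new Configuration());
    		// try-with-resources closes the reader and, through it, the HDFS stream.
    		// Decodes as UTF-8 instead of the platform default (assumes a UTF-8 file).
    		try (BufferedReader br = new BufferedReader(
    				new InputStreamReader(fs.open(new Path(filePath)), StandardCharsets.UTF_8))) {
    			String line;
    			while (null != (line = br.readLine())) {
    				// Keep only the first field; the limit of 2 guarantees strs[0]
    				// exists even when a line consists only of delimiters.
    				String[] strs = line.split(DELIMITER, 2);
    				categoryFilterStrs.append(strs[0]).append(INNER_DELIMITER);
    			}
    		}
    	} catch (IOException e) {
    		e.printStackTrace();
    	}
    	
    	return categoryFilterStrs.toString();
    }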
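
    The per-line logic of the method above (take the first tab-separated field,
    join the fields with commas) can be tried without a cluster by separating it
    from the HDFS I/O. The following is only a sketch under stated assumptions:
    the class name CategoryJoiner and the method joinFirstColumns are
    hypothetical and do not appear in the original code. Unlike the method
    above, this variant skips empty lines and leaves no trailing comma.

```java
import java.io.BufferedReader;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original post: it isolates the
// line-handling logic so it can run against any BufferedReader.
class CategoryJoiner {
    private static final String DELIMITER = "\t";
    private static final String INNER_DELIMITER = ",";

    // Returns the first tab-separated field of every non-empty line,
    // joined with commas and without a trailing comma.
    static String joinFirstColumns(BufferedReader reader) {
        return reader.lines()
                // The limit of 2 guarantees index 0 exists for every line.
                .map(line -> line.split(DELIMITER, 2)[0])
                // Drop empty fields produced by blank lines.
                .filter(field -> !field.isEmpty())
                .collect(Collectors.joining(INNER_DELIMITER));
    }
}
```

    With an HDFS file, the reader would be the same
    new BufferedReader(new InputStreamReader(fs.open(new Path(filePath)), ...))
    used above. For a quick local check, a java.io.StringReader wrapping
    "8104\n960985\n" can stand in for it, and the call returns "8104,960985".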

  • Original article: https://www.cnblogs.com/wzzkaifa/p/6957820.html