  • Big-data Python word count with HDFS distribution: -cacheArchive

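    -cacheArchive also distributes files from HDFS, but what it distributes is a compressed archive, which may contain several levels of directories and multiple files. Hadoop unpacks the archive on the task nodes and exposes it under the symlink name given after "#" in the option.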

    1. Upload The_Man_of_Property.txt to HDFS; the file contents are shown below the command:

    hadoop fs -put The_Man_of_Property.txt  /mapreduce
    Preface
    “The Forsyte Saga” was the title originally destined for that part of it which is called “The Man of Property”; and to adopt it for the collected chronicles of the Forsyte family has indulged the Forsytean tenacity that is in all of us. The word Saga might be objected to on the ground that it connotes the heroic and that there is little heroism in these pages. But it is used with a suitable irony; and, after all, this long tale, though it may deal with folk in frock coats, furbelows, and a gilt-edged period, is not devoid of the essential heat of conflict. Discounting for the gigantic stature and blood-thirstiness of old days, as they have come down to us in fairy-tale and legend, the folk of the old Sagas were Forsytes, assuredly, in their possessive instincts, and as little proof against the inroads of beauty and passion as Swithin, Soames, or even Young Jolyon. And if heroic figures, in days that never were, seem to startle out from their surroundings in fashion unbecoming to a Forsyte of the Victorian era, we may be sure that tribal instinct was even then the prime force, and that “family” and the sense of home and property counted as they do to this day, for all the recent efforts to “talk them out.”
    So many people have written and claimed that their families were the originals of the Forsytes that one has been almost encouraged to believe in the typicality of an imagined species. Manners change and modes evolve, and “Timothy’s on the Bayswater Road” becomes a nest of the unbelievable in all except essentials; we shall not look upon its like again, nor perhaps on such a one as James or Old Jolyon. And yet the figures of Insurance Societies and the utterances of Judges reassure us daily that our earthly paradise is still a rich preserve, where the wild raiders, Beauty and Passion, come stealing in, filching security from beneath our noses. As surely as a dog will bark at a brass band, so will the essential Soames in human nature ever rise up uneasily against the dissolution which hovers round the folds of ownership.
    “Let the dead Past bury its dead” would be a better saying if the Past ever died. The persistence of the Past is one of those tragi-comic blessings which each new age denies, coming cocksure on to the stage to mouth its claim to a perfect novelty.
    But no Age is so new as that! Human Nature, under its changing pretensions and clothes, is and ever will be very much of a Forsyte, and might, after all, be a much worse animal.
    Looking back on the Victorian era, whose ripeness, decline, and ‘fall-of’ is in some sort pictured in “The Forsyte Saga,” we see now that we have but jumped out of a frying-pan into a fire. It would be difficult to substantiate a claim that the case of England was better in 1913 than it was in 1886, when the Forsytes assembled at Old Jolyon’s to celebrate the engagement of June to Philip Bosinney. And in 1920, when again the clan gathered to bless the marriage of Fleur with Michael Mont, the state of England is as surely too molten and bankrupt as in the eighties it was too congealed and low-percented. If these chronicles had been a really scientific study of transition one would have dwelt probably on such factors as the invention of bicycle, motor-car, and flying-machine; the arrival of a cheap Press; the decline of country life and increase of the towns; the birth of the Cinema. Men are, in fact, quite unable to control their own inventions; they at best develop adaptability to the new conditions those inventions create.
    But this long tale is no scientific study of a period; it is rather an intimate incarnation of the disturbance that Beauty effects in the lives of men.
    The figure of Irene, never, as the reader may possibly have observed, present, except through the senses of other characters, is a concretion of disturbing Beauty impinging on a possessive world.
    One has noticed that readers, as they wade on through the salt waters of the Saga, are inclined more and more to pity Soames, and to think that in doing so they are in revolt against the mood of his creator. Far from it! He, too, pities Soames, the tragedy of whose life is the very simple, uncontrollable tragedy of being unlovable, without quite a thick enough skin to be thoroughly unconscious of the fact. Not even Fleur loves Soames as he feels he ought to be loved. But in pitying Soames, readers incline, perhaps, to animus against Irene: After all, they think, he wasn’t a bad fellow, it wasn’t his fault; she ought to have forgiven him, and so on!

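    2. white_list1 and white_list2 serve as whitelists: count how many times each word in the whitelist files appears in The_Man_of_Property.txt (beforehand, pack the two files into white.tar.gz and upload it to HDFS)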

    white_list1 contains:

    suitable
    against
    recent
    across

    white_list2 contains:

    Age
    on

    Pack the files and upload the archive to HDFS:

    tar czvf white.tar.gz white_list1 white_list2
    hadoop fs -put  white.tar.gz  /mapreduce
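
Before uploading, it can help to confirm what the archive actually contains, since the mapper will see exactly these entries after Hadoop unpacks it. The following is a minimal sketch using Python's standard tarfile module as a stand-in for `tar czvf`; the helper names `build_archive`, `list_archive`, and `demo` are hypothetical, and the whitelist files are recreated in a temporary directory purely for illustration.

```python
import os
import tarfile
import tempfile


def build_archive(archive_path, files):
    """Pack the given files into a gzip-compressed tar, storing base names only."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            tar.add(path, arcname=os.path.basename(path))


def list_archive(archive_path):
    """Return the sorted member names stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


def demo():
    # Hypothetical local check: recreate the two whitelist files in a temp dir.
    work = tempfile.mkdtemp()
    contents = {
        "white_list1": ["suitable", "against", "recent", "across"],
        "white_list2": ["Age", "on"],
    }
    paths = []
    for name, words in contents.items():
        path = os.path.join(work, name)
        with open(path, "w") as fh:
            fh.write("\n".join(words) + "\n")
        paths.append(path)
    archive = os.path.join(work, "white.tar.gz")
    build_archive(archive, paths)
    return list_archive(archive)


if __name__ == "__main__":
    print(demo())
```

The listing should show the two whitelist files at the top level of the archive, which is the layout the mapper expects under the symlink.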

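    3. The map code (map.py) is shown below. Approach: (1) walk the directory to collect the paths of all files; (2) read the contents of the white_list files; (3) filter the input words against them.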

    #!/usr/bin/python
    import os
    import sys


    def read_dir_file(file_dir, dir):
        # Recursively collect the path of every regular file under dir.
        for f1 in os.listdir(dir):
            tmp_path = os.path.join(dir, f1)
            if not os.path.isdir(tmp_path):
                file_dir.append(tmp_path)
            else:
                read_dir_file(file_dir, tmp_path)
        return file_dir


    def read_local_file(file_dir):
        # Load one whitelist word per line from every collected file.
        word_set = set()
        for file in file_dir:
            with open(file, 'r') as file_in:
                for line in file_in:
                    word = line.strip()
                    if word:
                        word_set.add(word)
        return word_set


    def mapper_func(dir):
        # dir is the symlink name of the unpacked archive (ABC in run.sh).
        word_set = read_local_file(read_dir_file([], dir))
        for line in sys.stdin:
            # split() already drops surrounding whitespace and empty tokens.
            for word in line.strip().split():
                if word in word_set:
                    # Emit "word<TAB>1"; this form runs on Python 2 and 3.
                    print("%s\t%s" % (word, "1"))


    if __name__ == "__main__":
        # Dispatch on the command line: python map.py mapper_func ABC
        func = getattr(sys.modules[__name__], sys.argv[1])
        func(*sys.argv[2:])
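
The three steps of the mapper (collect file paths, load the whitelist, filter the input) can also be exercised locally without Hadoop. The sketch below is an illustration, not the article's code: it swaps the hand-written recursion for the standard `os.walk`, and the names `load_whitelist`, `filter_words`, and `demo` as well as the temporary directory layout are assumptions made for the example.

```python
import os
import tempfile


def load_whitelist(root):
    """Collect one word per line from every file under root, at any depth."""
    words = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name)) as fh:
                for line in fh:
                    word = line.strip()
                    if word:
                        words.add(word)
    return words


def filter_words(lines, whitelist):
    """Yield (word, 1) for every whitespace-separated token found in whitelist."""
    for line in lines:
        for word in line.split():
            if word in whitelist:
                yield word, 1


def demo():
    # Hypothetical layout: one whitelist at the top level, one in a subdirectory.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "nested"))
    with open(os.path.join(root, "white_list1"), "w") as fh:
        fh.write("suitable\nagainst\n")
    with open(os.path.join(root, "nested", "white_list2"), "w") as fh:
        fh.write("Age\non\n")
    whitelist = load_whitelist(root)
    text = [
        "But no Age is so new as that!",
        "rise up uneasily against the dissolution",
    ]
    return list(filter_words(text, whitelist))


if __name__ == "__main__":
    for word, count in demo():
        print("%s\t%d" % (word, count))
```

Matching is exact and case-sensitive, and punctuation stays attached to tokens, so a token such as "Age," would not match the whitelist entry "Age".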

    4. The reduce code (red.py) is shown below:

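    #!/usr/bin/python
    import sys


    def reducer_func():
        # Input arrives sorted by key, so equal words are adjacent.
        word = None
        total = 0
        for line in sys.stdin:
            ss = line.split()
            if len(ss) != 2:
                continue  # skip blank or malformed lines
            cur_word, cnt = ss[0], int(ss[1])
            if cur_word != word:
                if word is not None:
                    print("%s\t%s" % (word, total))
                word = cur_word
                total = 0
            # Count every record, including the first one of a new word.
            total += cnt
        if word is not None:
            print("%s\t%s" % (word, total))


    if __name__ == "__main__":
        # Dispatch on the command line: python red.py reducer_func
        func = getattr(sys.modules[__name__], sys.argv[1])
        func(*sys.argv[2:])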
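
Because the reducer receives its input sorted by key, the per-word summation can also be expressed with `itertools.groupby`. This is a minimal alternative sketch rather than the article's implementation; `parse` and `reduce_counts` are hypothetical names, and the sample records are made up to show the shape of the input.

```python
from itertools import groupby


def parse(lines):
    """Split each 'word<TAB>count' line into a (word, int) pair."""
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            yield parts[0], int(parts[1])


def reduce_counts(lines):
    """Sum counts per word; the input must already be sorted by word."""
    for word, group in groupby(parse(lines), key=lambda pair: pair[0]):
        yield word, sum(count for _word, count in group)


if __name__ == "__main__":
    # Made-up mapper output, already sorted by key.
    sample = ["Age\t1\n", "against\t1\n", "against\t1\n", "on\t1\n"]
    for word, total in reduce_counts(sample):
        print("%s\t%d" % (word, total))
```

In a streaming job the same generator could be fed `sys.stdin` instead of the sample list.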

    5. The run script run.sh is shown below:

    HADOOP="/usr/local/src/hadoop-1.2.1/bin/hadoop"
    HADOOP_STREAMING="/usr/local/src/hadoop-1.2.1/contrib/streaming/hadoop-streaming-1.2.1.jar"
    INPUT_PATH="/mapreduce/The_Man_of_Property.txt"
    OUTPUT_PATH="/mapreduce/out"
    # Remove any previous output; the job fails if the directory already exists.
    $HADOOP fs -rmr $OUTPUT_PATH
    # The name after '#' (ABC) is the symlink to the unpacked archive;
    # it is passed to mapper_func as the directory to scan.
    $HADOOP jar $HADOOP_STREAMING \
            -input "$INPUT_PATH" \
            -output "$OUTPUT_PATH" \
            -mapper "python map.py mapper_func ABC" \
            -reducer "python red.py reducer_func" \
            -file "./map.py" \
            -file "./red.py" \
            -cacheArchive "hdfs://master:9000/mapreduce/white.tar.gz#ABC"
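
To get a feel for what the job should produce before submitting it, the map, sort, and reduce phases can be imitated in a few lines of plain Python. This is only a rough local simulation under stated assumptions: `simulate_job` is a hypothetical helper, `SAMPLE` holds a few sentences picked from the input text, and nothing here touches Hadoop or HDFS.

```python
from collections import Counter

# Whitelist words taken from white_list1 and white_list2.
WHITELIST = {"suitable", "against", "recent", "across", "Age", "on"}

# A few sentences from The_Man_of_Property.txt, used as sample input.
SAMPLE = [
    "But it is used with a suitable irony;",
    "But no Age is so new as that!",
    "Looking back on the Victorian era,",
    "Not even Fleur loves Soames as he feels he ought to be loved.",
]


def simulate_job(lines, whitelist):
    """Mimic map -> sort -> reduce locally; return sorted (word, count) pairs."""
    # Map: emit (word, 1) for each whitespace-separated token in the whitelist.
    mapped = [(word, 1) for line in lines for word in line.split() if word in whitelist]
    # Shuffle/sort: Hadoop hands mapper output to the reducer sorted by key.
    mapped.sort(key=lambda pair: pair[0])
    # Reduce: sum the counts per word.
    totals = Counter()
    for word, count in mapped:
        totals[word] += count
    return sorted(totals.items())


if __name__ == "__main__":
    for word, count in simulate_job(SAMPLE, WHITELIST):
        print("%s\t%d" % (word, count))
```

Whitelist words that never occur in the input (here "recent", "across", and "against") simply do not appear in the output, which matches the behavior of the streaming job.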

  • Original article: https://www.cnblogs.com/students/p/8711820.html