  • Hadoop: packaging WordCount as a standalone runnable jar

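    Most tutorials recommend one of the following two ways to run the WordCount program from the Hadoop examples:

    1. Copy the generated jar to a node in the Hadoop cluster and run it there: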

    $HADOOP_HOME/bin/hadoop jar xxx.jar xxx.WordCount /input/xxx.txt /output

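    2. Or debug it directly in an IDE (see the article "eclipse/intellij idea 远程调试hadoop 2.6.0", on remote debugging Hadoop 2.6.0 from Eclipse or IntelliJ IDEA).

    In production, however, the more common situation is that there is no IDE, and each application's final jar is deployed on an application server that is not a node in the Hadoop cluster. The jar therefore has to run on its own and still be able to connect to the Hadoop environment. The key points are: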

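    1. Add every dependency that WordCount needs to pom.xml, so that at run time these jars do not depend on the IDE or on a Hadoop installation.

    2. Following the article "maven: 打包可运行的jar包(java application)及依赖项处理" (on packaging a runnable jar with Maven and handling its dependencies), export the dependency jars and let a Maven plugin write the Main-Class entry into MANIFEST.MF automatically.

    3. Copy core-site.xml into the Maven project's resources directory. After packaging, the file ends up on the classpath, where Hadoop's Configuration class looks for it, so at run time WordCount knows which Hadoop cluster to connect to.

    4. At deployment, upload the final WordCount jar and the lib directory of dependency jars to the application server. Keep lib next to the jar, because the Class-Path entries in the manifest are relative to the jar's location.

    The job can then be run directly with a command like:

    java -jar hadoop-helloworld.jar /jimmy/input/README.txt /jimmy/output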
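
    The article does not show core-site.xml itself. As a rough sketch of what the copy under src/main/resources might contain, assuming the cluster's NameNode is reachable at the placeholder address below (the host name and port are illustrative, not values from the article; in practice, copy the real file from the cluster):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- src/main/resources/core-site.xml
     Placeholder values: replace the host and port with the cluster's own. -->
<configuration>
    <property>
        <!-- Default file system URI; paths such as /jimmy/input resolve against it. -->
        <name>fs.defaultFS</name>
        <value>hdfs://namenode.example.com:9000</value>
    </property>
</configuration>
```

    With only this file on the classpath, mapreduce.framework.name keeps its default value of local, so the job would read and write HDFS but execute inside the client JVM. Submitting to YARN would presumably need the cluster's mapred-site.xml and yarn-site.xml on the classpath as well.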

    Finally, the contents of the key files:

    a. pom.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>

        <groupId>cn.cnblogs.yjmyzz</groupId>
        <artifactId>hadoop-helloworld</artifactId>
        <version>1.0</version>

        <!-- Everything WordCount needs at run time, so no Hadoop installation is required on the application server -->
        <dependencies>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-common</artifactId>
                <version>2.6.0</version>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-hdfs</artifactId>
                <version>2.6.0</version>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
                <version>2.6.0</version>
            </dependency>
            <dependency>
                <groupId>commons-cli</groupId>
                <artifactId>commons-cli</artifactId>
                <version>1.2</version>
            </dependency>
        </dependencies>

        <build>
            <!-- Produces target/hadoop-helloworld.jar, without the version suffix -->
            <finalName>${project.artifactId}</finalName>

            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-jar-plugin</artifactId>
                    <configuration>
                        <archive>
                            <manifest>
                                <!-- Written to MANIFEST.MF as Main-Class -->
                                <mainClass>cn.cnblogs.yjmyzz.WordCount</mainClass>
                                <!-- Generate a Class-Path entry listing every dependency under lib/ -->
                                <addClasspath>true</addClasspath>
                                <classpathPrefix>lib/</classpathPrefix>
                            </manifest>
                        </archive>
                    </configuration>
                </plugin>
            </plugins>
        </build>

        <!-- To copy the dependency jars into target/lib, run:
             mvn dependency:copy-dependencies -DoutputDirectory=target/lib -->

    </project>
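
    The pom.xml above leaves copying the dependency jars to a command that is run by hand (the comment near the end of the file). One possible alternative, which is not part of the original pom, is to bind the copy-dependencies goal of maven-dependency-plugin to the package phase, so that a plain mvn package also fills target/lib. A minimal sketch, to be placed inside the existing <plugins> element:

```xml
<!-- Sketch only: not in the original pom.xml. Goes inside <build><plugins>. -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-dependency-plugin</artifactId>
    <executions>
        <execution>
            <id>copy-dependencies</id>
            <!-- Run together with the jar packaging -->
            <phase>package</phase>
            <goals>
                <goal>copy-dependencies</goal>
            </goals>
            <configuration>
                <!-- Matches the lib/ prefix that maven-jar-plugin writes into Class-Path -->
                <outputDirectory>${project.build.directory}/lib</outputDirectory>
                <!-- Skip test-only jars; compile and runtime dependencies are copied -->
                <includeScope>runtime</includeScope>
            </configuration>
        </execution>
    </executions>
</plugin>
```

    After the build, target/hadoop-helloworld.jar and target/lib can be uploaded together, which keeps the relative lib/ paths in the manifest valid.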

    b. Contents of META-INF/MANIFEST.MF

    Manifest-Version: 1.0
    Built-By: jimmy
    Build-Jdk: 1.7.0_09
    Class-Path: lib/hadoop-common-2.6.0.jar lib/hadoop-annotations-2.6.0.j
     ar lib/guava-11.0.2.jar lib/commons-math3-3.1.1.jar lib/xmlenc-0.52.j
     ar lib/commons-httpclient-3.1.jar lib/commons-codec-1.4.jar lib/commo
     ns-io-2.4.jar lib/commons-net-3.1.jar lib/commons-collections-3.2.1.j
     ar lib/servlet-api-2.5.jar lib/jetty-6.1.26.jar lib/jetty-util-6.1.26
     .jar lib/jersey-core-1.9.jar lib/jersey-json-1.9.jar lib/jettison-1.1
     .jar lib/jaxb-impl-2.2.3-1.jar lib/jaxb-api-2.2.2.jar lib/stax-api-1.
     0-2.jar lib/activation-1.1.jar lib/jackson-jaxrs-1.8.3.jar lib/jackso
     n-xc-1.8.3.jar lib/jersey-server-1.9.jar lib/asm-3.1.jar lib/jasper-c
     ompiler-5.5.23.jar lib/jasper-runtime-5.5.23.jar lib/jsp-api-2.1.jar 
     lib/commons-el-1.0.jar lib/commons-logging-1.1.3.jar lib/log4j-1.2.17
     .jar lib/jets3t-0.9.0.jar lib/httpclient-4.1.2.jar lib/httpcore-4.1.2
     .jar lib/java-xmlbuilder-0.4.jar lib/commons-lang-2.6.jar lib/commons
     -configuration-1.6.jar lib/commons-digester-1.8.jar lib/commons-beanu
     tils-1.7.0.jar lib/commons-beanutils-core-1.8.0.jar lib/slf4j-api-1.7
     .5.jar lib/slf4j-log4j12-1.7.5.jar lib/jackson-core-asl-1.9.13.jar li
     b/jackson-mapper-asl-1.9.13.jar lib/avro-1.7.4.jar lib/paranamer-2.3.
     jar lib/snappy-java-1.0.4.1.jar lib/protobuf-java-2.5.0.jar lib/gson-
     2.2.4.jar lib/hadoop-auth-2.6.0.jar lib/apacheds-kerberos-codec-2.0.0
     -M15.jar lib/apacheds-i18n-2.0.0-M15.jar lib/api-asn1-api-1.0.0-M20.j
     ar lib/api-util-1.0.0-M20.jar lib/curator-framework-2.6.0.jar lib/jsc
     h-0.1.42.jar lib/curator-client-2.6.0.jar lib/curator-recipes-2.6.0.j
     ar lib/jsr305-1.3.9.jar lib/htrace-core-3.0.4.jar lib/zookeeper-3.4.6
     .jar lib/commons-compress-1.4.1.jar lib/xz-1.0.jar lib/hadoop-hdfs-2.
     6.0.jar lib/commons-daemon-1.0.13.jar lib/netty-3.6.2.Final.jar lib/x
     ercesImpl-2.9.1.jar lib/xml-apis-1.3.04.jar lib/hadoop-mapreduce-clie
     nt-jobclient-2.6.0.jar lib/hadoop-mapreduce-client-common-2.6.0.jar l
     ib/hadoop-yarn-common-2.6.0.jar lib/hadoop-yarn-api-2.6.0.jar lib/jer
     sey-client-1.9.jar lib/jersey-guice-1.9.jar lib/hadoop-yarn-client-2.
     6.0.jar lib/hadoop-mapreduce-client-core-2.6.0.jar lib/hadoop-yarn-se
     rver-common-2.6.0.jar lib/hadoop-mapreduce-client-shuffle-2.6.0.jar l
     ib/hadoop-yarn-server-nodemanager-2.6.0.jar lib/leveldbjni-all-1.8.ja
     r lib/guice-servlet-3.0.jar lib/guice-3.0.jar lib/javax.inject-1.jar 
     lib/aopalliance-1.0.jar lib/commons-cli-1.2.jar
    Created-By: Apache Maven 3.2.3
    Main-Class: cn.cnblogs.yjmyzz.WordCount
    Archiver-Version: Plexus Archiver

    [Screenshot of the run]

  • Original article: https://www.cnblogs.com/yjmyzz/p/4519130.html