  • How many projects are gearing up to compete with Hadoop?

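    Which projects can take on Hadoop, currently the hottest of them all? Below are the projects that, like Hadoop, implement the MapReduce distributed processing model: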

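    1. Sector: implements its own GFS-like file system and processing library, and has been used to process TB-scale astronomical data. See http://sector.sourceforge.net/
    Its self-reported comparison with Hadoop is as follows: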

    Storage Unit
        Hadoop: Blocks. Better granularity, better disk usage; may reduce performance due to block lookup and movement; may waste disk space for small files.
        Sector: Files. Good performance for lookup and wide area data transfer. Robust (no permanent metadata required). Requires users' knowledge to split files; may waste disk space when disks are near full.

    Data replication
        Hadoop: Real time. Emphasizes data reliability, but slow.
        Sector: Periodically. Favors fast IO with less reliability (but still provides long term replicas).

    Programming Model
        Hadoop: MapReduce
        Sector: Stream processing paradigm and MapReduce

    Programming Language
        Hadoop: System written by Java. Native programming language is Java, but support any executables with Hadoop Streaming.
        Sector: System written by C++. Native programming language is C++, but any program can be called by Sphere for data processing.

    Data Transfer and Message Passing
        Hadoop: TCP. Inefficient over wide area; sometimes requires parameters tuning.
        Sector: UDP/UDT. High performance, firewall friendly, more secure, and tuning-free.
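
The comparison above notes that Hadoop Streaming accepts any executable. As a rough sketch of what such an executable might look like, here is a word-count script in Python 3 that follows the usual Streaming convention: read lines on stdin, write tab-separated key/value lines to stdout, and rely on the framework to sort the mapper output by key before the reducer sees it. The function names `map_lines` and `reduce_lines` and the `map`/`reduce` command-line argument are made up for this illustration and are not part of Hadoop.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one "word<TAB>1" line per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(sorted_lines):
    """Reducer: sum the counts of consecutive lines that share a key.

    Assumes the input is already sorted by key, which is what the
    framework is expected to do between the map and reduce stages.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


# Hypothetical convention: the stage ("map" or "reduce") is the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for out_line in step(sys.stdin):
        print(out_line)
```

The same pipeline can be tried locally without Hadoop by piping the mapper output through `sort` into the reducer stage, which stands in for the shuffle step.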

    2. disco: the core is written in Erlang, and the external interface is Python.

    An M/R program written in Python:
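    # Python 2 syntax (iteritems, print statement), as in the original example.
    from disco.core import Disco, result_iterator

    # Map: emit a (word, 1) pair for every word in the input entry e.
    def fun_map(e, params):
        return [(w, 1) for w in e.split()]

    # Reduce: sum the counts per word, then write each total to out.
    def fun_reduce(iter, out, params):
        s = {}
        for w, f in iter:
            s[w] = s.get(w, 0) + int(f)
        for w, f in s.iteritems():
            out.add(w, f)

    # Submit the job to the Disco master and wait for it to finish.
    results = Disco("disco://localhost").new_job(
        name = "wordcount",
        input = ["http://discoproject.org/chekhov.txt"],
        map = fun_map,
        reduce = fun_reduce).wait()

    # Print every word together with its frequency.
    for word, frequency in result_iterator(results):
        print word, frequency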
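
To see what the map and reduce functions above compute without a running Disco master, here is a small single-process sketch in Python 3. It is only an approximation: `run_local` is a hypothetical driver, not part of Disco, and the reduce step returns a dict instead of writing to Disco's `out` object.

```python
from collections import defaultdict


def fun_map(line, params=None):
    """Map phase: emit a (word, 1) pair for every whitespace-separated word."""
    return [(w, 1) for w in line.split()]


def fun_reduce(pairs, params=None):
    """Reduce phase: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += int(count)
    return dict(totals)


def run_local(lines):
    """Hypothetical stand-in for the Disco master: map every line, then reduce."""
    intermediate = []
    for line in lines:
        intermediate.extend(fun_map(line))
    return fun_reduce(intermediate)


if __name__ == "__main__":
    sample = ["the cat sat", "the cat ran"]
    for word, frequency in sorted(run_local(sample).items()):
        print(word, frequency)
```

On a real cluster the map calls run in parallel on different nodes and the intermediate pairs are moved between them. This sketch keeps everything in one list, so it shows the data flow only.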

    3. skynet: a MapReduce implementation in Ruby.


    As for GFS-like systems, there is the KosMos File System (KFS, written in C++, which can replace HDFS in Hadoop), while Hypertable aims to become an alternative to HBase.
  • Original article: https://www.cnblogs.com/wycg1984/p/1722423.html