  • How many projects are lining up to compete with Hadoop?

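    Which projects can go head-to-head with Hadoop, currently the most popular of them all? The following projects implement the MapReduce distributed processing model, just as Hadoop does: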

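    1. Sector: implements its own GFS-like file system and processing library, and has been used to process terabyte-scale astronomical data. See http://sector.sourceforge.net/
    By its own account, it compares with Hadoop as follows: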

    Storage Unit
        Hadoop: Blocks. Better granularity, better disk usage; may reduce performance due to block lookup and movement; may waste disk space for small files.
        Sector: Files. Good performance for lookup and wide area data transfer. Robust (no permanent metadata required). Requires users' knowledge to split files; may waste disk space when disks are near full.
    Data Replication
        Hadoop: Real time. Emphasizes data reliability, but slow.
        Sector: Periodic. Favors fast IO with less reliability (but still provides long-term replicas).
    Programming Model
        Hadoop: MapReduce.
        Sector: Stream processing paradigm and MapReduce.
    Programming Language
        Hadoop: System written in Java. Native programming language is Java, but supports any executable through Hadoop Streaming.
        Sector: System written in C++. Native programming language is C++, but any program can be called by Sphere for data processing.
    Data Transfer and Message Passing
        Hadoop: TCP. Inefficient over wide area; sometimes requires parameter tuning.
        Sector: UDP/UDT. High performance, firewall friendly, more secure, and tuning-free.

    2. disco: the core is written in Erlang, and the external interface is Python.

    An M/R program written in Python:
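    # Word count job, written in Python 2 syntax (print statement, iteritems).
    from disco.core import Disco, result_iterator

    def fun_map(e, params):
        # Emit a (word, 1) pair for every word in the input entry.
        return [(w, 1) for w in e.split()]

    def fun_reduce(iter, out, params):
        # Sum the counts per word, then write each total to the output.
        s = {}
        for w, f in iter:
            s[w] = s.get(w, 0) + int(f)
        for w, f in s.iteritems():
            out.add(w, f)

    # Submit the job to the local Disco master and wait for it to finish.
    results = Disco("disco://localhost").new_job(
        name = "wordcount",
        input = ["http://discoproject.org/chekhov.txt"],
        map = fun_map,
        reduce = fun_reduce).wait()

    for word, frequency in result_iterator(results):
        print word, frequency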
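
To see what the job above computes without a Disco cluster, the same word count can be sketched in plain Python 3. This is only a local illustration under stated assumptions, not Disco's implementation: the helper names shuffle and word_count and the sample input are hypothetical, and the grouping that the original fun_reduce does inside its own dictionary is split out here into a separate step.

```python
from collections import defaultdict


def fun_map(line):
    """Map step: emit a (word, 1) pair for every whitespace-separated word."""
    return [(w, 1) for w in line.split()]


def shuffle(pairs):
    """Group values by key (hypothetical stand-in for a framework's grouping phase)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def fun_reduce(groups):
    """Reduce step: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Run map, grouping, and reduce in sequence in a single process."""
    pairs = [pair for line in lines for pair in fun_map(line)]
    return fun_reduce(shuffle(pairs))


if __name__ == "__main__":
    # Hypothetical sample input; the Disco job reads chekhov.txt instead.
    sample = ["the quick brown fox", "the lazy dog", "the fox"]
    for word, frequency in sorted(word_count(sample).items()):
        print(word, frequency)
```

In a real MapReduce system the map and reduce calls run on different machines and the grouping happens while data moves between them; here everything runs in one process, so only the logic carries over.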

    3. skynet: a MapReduce implementation in Ruby.


    As for GFS-like systems, there is the KosMos File System (KFS, written in C++, which can replace HDFS in Hadoop), while Hypertable aims to become an alternative to HBase.
  • Original article: https://www.cnblogs.com/wycg1984/p/1722423.html