zoukankan      html  css  js  c++  java
  • Linux安装ElasticSearch与MongoDB分布式集群环境下数据同步

    ElasticSearch有一个叫做river的插件式模块,可以将外部数据源中的数据导入elasticsearch并在上面建立索引。River在集群上是单例模式的,它被自动分配到一个节点上,当这个节点挂掉后,river会被自动分配到另外的一个节点上。目前支持的数据源包括:Wikipedia, MongoDB, CouchDB, RabbitMQ, RSS, Sofa, JDBC, FileSystem,Dropbox等。River有一些指定的规范,依照这些规范可以开发适合于自己的应用数据的插件。

    1、安装MongoDB的River:https://github.com/richardwilly98/elasticsearch-river-mongodbhttp://blog.csdn.net/huwei2003/article/details/40407555

    git clone https://github.com/richardwilly98/elasticsearch-river-mongodb.git
    cd elasticsearch-river-mongodb
    
    git tag
    git checkout elasticsearch-river-mongodb-2.0.5
    mvn clean mvn compile mvn package

    vim install-local.sh --把plugin --remove 改成 plugin remove;注释掉最后一行sudo命令,因为没有plugin命令的url参数了。
    ./install-local.sh 
    cd target/releases/

    cd /usr/share/elasticsearch/plugins/
    
    mkdir mongodb
    
    cd mongodb
    
    mv /data/elasticsearch-river-mongodb/target/releases/elasticsearch-river-mongodb-2.0.5.zip ./
    
    cd ../head/
    
    cp plugin-descriptor.properties ../mongodb/
    
    cd -
    
    vim plugin-descriptor.properties     --修改description,name的信息为mongo

    mkdir _site cp
    -r /data/elasticsearch-river-mongodb/src/site/* /usr/share/elasticsearch/plugins/mongodb/_site/ --复制mongo网站目录到plugins/mongodb/_site

    cd _site service elasticsearch restart

    2、浏览器输入:http://192.168.7.131:9200/_plugin/mongodb/

    3、配置Mongo&ElasticSearch数据互通:https://github.com/richardwilly98/elasticsearch-river-mongodb/wikihttp://blog.csdn.net/huwei2003/article/details/40407555

    创建river并索引的标准模版如下:

    $ curl -XPUT "localhost:9200/_river/${es.river.name}/_meta" -d '
    {
      "type": "mongodb",
      "mongodb": { 
        "servers":
        [
          { "host": ${mongo.instance1.host}, "port": ${mongo.instance1.port} },
          { "host": ${mongo.instance2.host}, "port": ${mongo.instance2.port} }
        ],
        "options": { 
          "secondary_read_preference" : true, 
          "drop_collection": ${mongo.drop.collection}, 
          "exclude_fields": ${mongo.exclude.fields},
          "include_fields": ${mongo.include.fields},
          "include_collection": ${mongo.include.collection},
          "import_all_collections": ${mongo.import.all.collections},
          "initial_timestamp": {
            "script_type": ${mongo.initial.timestamp.script.type},
            "script": ${mongo.initial.timestamp.script}
          },
          "skip_initial_import" : ${mongo.skip.initial.import},
          "store_statistics" : ${mongo.store.statistics},
        },
        "credentials":
        [
          { "db": "local", "user": ${mongo.local.user}, "password": ${mongo.local.password} },
          { "db": "admin", "user": ${mongo.db.user}, "password": ${mongo.db.password} }
        ],
        "db": ${mongo.db.name}, 
        "collection": ${mongo.collection.name}, 
        "gridfs": ${mongo.is.gridfs.collection},
        "filter": ${mongo.filter}
      }, 
      "index": { 
        "name": ${es.index.name}, 
        "throttle_size": ${es.throttle.size},
        "bulk_size": ${es.bulk.size},
        "type": ${es.type.name}
        "bulk": {
          "actions": ${es.bulk.actions},
          "size": ${es.bulk.size},
          "concurrent_requests": ${es.bulk.concurrent.requests},
          "flush_interval": ${es.bulk.flush.interval}
        }
      }
    }'

    一些配置项的解释如下,具体可以查看github的wiki:

    • db为同步的数据库名,
    • host mongodb的ip地址(默认为localhost)
    • port mongodb的端口
    • collection 要同步的表名
    • fields 要同步的字段名(用逗号隔开,默认全部)
    • gridfs 是否是gridfs文件(如果collection是gridfs的话就设置成true)
    • local_db_user local数据库的用户名(没有的话不用写)
    • local_db_password local数据库的密码(没有的话不用写)
    • db_user 要同步的数据库的密码(没有的话不用写)
    • db_password 要同步的数据库的密码(没有的话不用写)
    • index.name 要建立的索引名,最好是小写(应该是必须小写,并且索引名称必须唯一,之前不能存在,建议使用“数据库名+集合名+index”方式命名)
    • index:type collection名,即该索引对应的数据集合名
    • type 类型  后面是 mongodb 因为用的是 mongodb 数据库
    • bulk_size 批量添加的最大数
    • bulk_timeout 批量添加的超时时间

     例如添加mongodb数据到elasticsearch里面:

    curl -XPUT "http://ip地址:端口号/river_dbname/collectionname/_meta" -d'
    {
        "type":"mongodb",
        "mongodb":{
            "servers":[
                {"host":"mongodb的ip地址","port":mongodb的访问端口},
                {"host":"mongodb的ip地址","port":mongodb的访问端口},
                {"host":"mongodb的ip地址","port":mongodb的访问端口}
            ],
            "db":"dbname",   (实际数据库名称)
            "collection":"collectionname",   (实际集合名称)
            "gridfs":false,
            "credentials":[
              { "db": "admin", "user": "user", "password": "user" }
            ],
            "options":{
                "secondary_read_preference":true   (使用副本集)
            }
        },
        "index":{
            "name":"collectionname_index",  (全部小写字母)
            "type":"collectionname"   (实际集合名称)
        }
    }'

    添加成功,如下图: 

    使用head插件观察结果:

    也可以通过get方式获取

    GET /river_dbname/collectionname/_meta

    转:

    http://blog.csdn.net/huwei2003/article/details/40407555

    http://www.yuansir-web.com/2015/03/06/elasticsearch%E4%B8%8Emongodb-%E6%95%B0%E6%8D%AE%E5%90%8C%E6%AD%A5%E5%8F%8A%E5%88%86%E5%B8%83%E5%BC%8F%E9%9B%86%E7%BE%A4%E6%90%AD%E5%BB%BA-%EF%BC%88%E4%B8%80%EF%BC%89/

  • 相关阅读:
    hdu5360 Hiking(水题)
    hdu5348 MZL's endless loop(欧拉回路)
    hdu5351 MZL's Border(规律题,java)
    hdu5347 MZL's chemistry(打表)
    hdu5344 MZL's xor(水题)
    hdu5338 ZZX and Permutations(贪心、线段树)
    hdu 5325 Crazy Bobo (树形dp)
    hdu5323 Solve this interesting problem(爆搜)
    hdu5322 Hope(dp)
    Lightoj1009 Back to Underworld(带权并查集)
  • 原文地址:https://www.cnblogs.com/qiyebao/p/5260517.html
Copyright © 2011-2022 走看看