Flume更新比较慢,而elasticsearch更新非常快所以当涉及更换elasticsearch版本时会出现不兼容问题。
apache-flume-1.6.0+elasticsearch1.5.1是可以完美结合的,这里将elasticsearch版本升级到6.3.2。
低版本elasticsearch和高版本elasticsearch连接方式完全不一样所以需要重写Sink。
下载源码flume-ng-sinksflume-ng-elasticsearch-sinkElasticSearchSink.java,查看人家的源码。
我直接起个项目重写了
POM
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>6.4.3</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch.client/transport -->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>6.4.3</version>
</dependency>
<!-- <dependency> <groupId>io.netty</groupId> <artifactId>netty-all</artifactId>
<version>4.1.32.Final</version> </dependency> -->
<!-- https://mvnrepository.com/artifact/org.apache.flume.flume-ng-sinks/flume-ng-elasticsearch-sink -->
<dependency>
<groupId>org.apache.flume.flume-ng-sinks</groupId>
<artifactId>flume-ng-elasticsearch-sink</artifactId>
<version>1.6.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.8.5</version>
</dependency>
</dependencies>
重写的Sink类
package com.jachs.sink.elasticsearch; import org.apache.flume.Channel; import org.apache.flume.Context; import org.apache.flume.Event; import org.apache.flume.EventDeliveryException; import org.apache.flume.Transaction; import org.apache.flume.conf.Configurable; import org.apache.flume.sink.AbstractSink; import org.elasticsearch.action.index.IndexRequestBuilder; import org.elasticsearch.action.index.IndexResponse; import org.elasticsearch.client.transport.TransportClient; import org.elasticsearch.common.settings.Settings; import org.elasticsearch.common.transport.TransportAddress; import org.elasticsearch.transport.client.PreBuiltTransportClient; import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLUSTER_NAME; import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME; import java.net.InetAddress; import java.net.UnknownHostException; import java.util.HashMap; import java.util.Map; import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.HOSTNAMES; public class ElasticSearchSink extends AbstractSink implements Configurable { private String hostNames; private String indexName; private String clusterName; static TransportClient client; public void configure(Context context) { hostNames = context.getString(HOSTNAMES); indexName = context.getString(INDEX_NAME); clusterName = context.getString(CLUSTER_NAME); } @Override public void start() { Settings settings = Settings.builder().put("cluster.name", clusterName).build(); try { client = new PreBuiltTransportClient(settings).addTransportAddress(new TransportAddress( InetAddress.getByName(hostNames.split(":")[0]), Integer.parseInt(hostNames.split(":")[1]))); } catch (UnknownHostException e) { e.printStackTrace(); } } @Override public void stop() { super.stop(); } public Status process() throws EventDeliveryException { Status status = null; Channel ch = getChannel(); Transaction txn = ch.getTransaction(); txn.begin(); try { Event event = ch.take(); Map<String, String> head = event.getHeaders(); Map<String, Object> map = new HashMap<String, Object>(); for (String key : head.keySet()) { map.put("topic", key); map.put("timestamp", head.get(key)); map.put("data", new String(event.getBody())); } IndexRequestBuilder create = client.prepareIndex(indexName, "text").setSource(map); IndexResponse response = create.execute().actionGet(); txn.commit(); status = Status.READY; } catch (Throwable t) { txn.rollback(); status = Status.BACKOFF; if (t instanceof Error) { throw (Error) t; } } finally { txn.close(); } return status; } }
mvn install -DskipTests
打包,然后将Flume下的flume-ng-kafka-sink.jar替换掉。
修改Flume配置文件将下面修改为自己的类位置
a1.sinks.elasticsearch.type=com.jachs.sink.elasticsearch.ElasticSearchSink
我这里使用的是FileBeat-kafka-flume-elasticsearch,所以是从kafka取数到elasticsearch,根据自己sources修改自己连接。然后将kafka和elasticsearch的jar包Copy到Flume下注意版本冲突保持JAR版本正确不要冲突。
官方参考
http://flume.apache.org/releases/content/1.9.0/FlumeDeveloperGuide.html#sink
http://flume.apache.org/releases/content/1.6.0/apidocs/index.html
Channel对象是管道,可以创建Transaction事务,采用回调方式将sources数据放进了Data,启动个Even事件,然后根据自己逻辑代码动态设置状态码最后返回状态码。