The following walks through an example configuration showing how to connect Flink to HDFS.
1. Add the HDFS dependencies
Add the following dependencies to pom.xml:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-hadoop-compatibility_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>
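The `${flink.version}` and `${hadoop.version}` placeholders must be defined in the `<properties>` section of the pom. A minimal sketch, assuming illustrative version numbers (pick the versions that match your cluster):

```xml
<properties>
    <!-- Illustrative versions only; align with your Flink and Hadoop installs. -->
    <flink.version>1.7.2</flink.version>
    <hadoop.version>2.7.7</hadoop.version>
</properties>
```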
2. Configure HDFS
Copy hdfs-site.xml and core-site.xml into the src/main/resources directory so they are on the classpath at runtime.
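The key setting these files carry is `fs.defaultFS` in core-site.xml, which tells the HDFS client which NameNode to contact. As a self-contained sketch of the property format these files use (the host and port here match the example below, but are otherwise illustrative), here is how such a `<name>`/`<value>` pair can be looked up:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CoreSiteCheck {
    // Minimal core-site.xml content; the host/port value is illustrative.
    static final String CORE_SITE =
          "<configuration>"
        + "  <property>"
        + "    <name>fs.defaultFS</name>"
        + "    <value>hdfs://flinkhadoop:9000</value>"
        + "  </property>"
        + "</configuration>";

    // Look up a property value by name, in the Hadoop configuration-file style.
    static String lookup(String xml, String key) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            if (name.equals(key)) {
                return p.getElementsByTagName("value").item(0).getTextContent().trim();
            }
        }
        return null; // property not present
    }

    public static void main(String[] args) throws Exception {
        System.out.println(lookup(CORE_SITE, "fs.defaultFS"));
    }
}
```

At runtime, Hadoop's `Configuration` class performs this lookup for every file it finds on the classpath, which is why the two XML files must live under src/main/resources.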
3. Read a file from HDFS
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSource<String> text = env.readTextFile("hdfs://flinkhadoop:9000/user/wuhulala/input/core-site.xml");
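The path passed to `readTextFile` is a full HDFS URI: the scheme, NameNode host, and RPC port in front of the file path must match the `fs.defaultFS` value in core-site.xml. A quick sketch of how that URI decomposes (hostname and port taken from the example above):

```java
import java.net.URI;

public class HdfsPathParts {
    public static void main(String[] args) {
        // The same URI used in the readTextFile example above.
        URI uri = URI.create("hdfs://flinkhadoop:9000/user/wuhulala/input/core-site.xml");

        System.out.println(uri.getScheme()); // hdfs        -> selects the HDFS FileSystem implementation
        System.out.println(uri.getHost());   // flinkhadoop -> NameNode hostname
        System.out.println(uri.getPort());   // 9000        -> NameNode RPC port
        System.out.println(uri.getPath());   // /user/wuhulala/input/core-site.xml -> path inside HDFS
    }
}
```

If the host or port here disagrees with `fs.defaultFS`, the client cannot reach the NameNode and the read fails before any Flink processing starts.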
TIP
- Disable HDFS permission checking; if you leave it enabled, you must also copy the authentication credentials into the resources directory.
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>