Atittit HDFS Hadoop big-data file system: a summary of using the Java API
Contents
4.1. Starting the HDFS service: Windows reports that hadoop cannot be found
4.2. D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format
4.4. java.io.IOException: NameNode is not formatted
4.7. Unknown host / java.net.ConnectException: Connection refused: no further information
Several kinds of file system operations
- Folder operations: create, delete, update, query
- I/O operations on remote files
- File upload and download (copying between local and remote files); see the sketch after the Java class below
Specific operation commands
- Obtain the HDFS FileSystem object from the configuration (there are three ways in total; two of them are covered here)
- Method 1: read the configuration files directly
This method is normally used when a Hadoop installation is available locally and can be accessed directly. In that case you only need to specify HDFS as the file system to operate on in the configuration file. Any HDFS parameter can also be set on the conf object here, and values set on conf take precedence over the configuration files.
- Method 2: specify a URI, then obtain the configuration and create the file system object
This method is normally used when there is no local Hadoop installation but the cluster is reachable via a URI. In that case you must supply the access path of the Hadoop NameNode, the Hadoop user name, and the configuration information (the remote Hadoop configuration files are then read automatically).
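A minimal sketch of method 2, reusing the NameNode URI hdfs://huabingood01:9000 from the config below and a hypothetical user name "root" (note that FileSystem.get(URI, Configuration, user) also throws InterruptedException, hence the broad throws Exception):
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class RemoteHdfsDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Contact the remote NameNode directly as the given user;
        // the remote cluster's configuration is picked up automatically.
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://huabingood01:9000"), conf, "root");
        System.out.println(fs.getUri());
        fs.close();
    }
}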
- Method 1: read the configuration files directly
D:\haddop\hadoop-3.1.1\etc\hadoop\core-site.xml:
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://huabingood01:9000</value>
    </property>
</configuration>
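Note: fs.default.name is the deprecated alias of fs.defaultFS; Hadoop still accepts both. If core-site.xml is not on the classpath, it can also be loaded explicitly; a minimal sketch under that assumption, using Configuration.addResource with the install path above (same imports as the class below):
Configuration conf = new Configuration();
// Load the site file from an explicit location instead of the classpath.
conf.addResource(new Path("D:/haddop/hadoop-3.1.1/etc/hadoop/core-site.xml"));
FileSystem fs = FileSystem.get(conf);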
package hdfsHadoopUse;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class hdfsHadoopClass {

    public static void main(String[] args) throws IOException {
        String pathToCreate = "/firstDirS09/secdirS09";
        hdfsHadoopClass hdfsHadoopClass = new hdfsHadoopClass();
        FileSystem fs = hdfsHadoopClass.getHadoopFileSystem();
        hdfsHadoopClass.myCreatePath(fs, pathToCreate);
        System.out.println("--f");
    }

    /**
     * Obtain an HDFS FileSystem object from the configuration.
     * There are two methods:
     * 1. Use conf to read the configuration files locally and create the FileSystem.
     * 2. Mostly for when there is no local Hadoop installation but remote access
     *    is possible: use a given URI and user name to read the remote
     *    configuration, then create the FileSystem.
     * @return FileSystem
     * @throws IOException
     */
    public FileSystem getHadoopFileSystem() throws IOException {
        FileSystem fs = null;
        Configuration conf = null;
        // Method 1: the configuration files (core-site.xml, hdfs-site.xml)
        // are available locally; read them and create the FileSystem.
        // The HDFS access URI must be specified.
        conf = new Configuration();
        // The file system URI is mandatory; any other parameter may also be
        // set here, and values set on conf take the highest precedence.
        conf.set("fs.defaultFS", "hdfs://0.0.0.0:19000");
        conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
        // Create the FileSystem from the configuration.
        fs = FileSystem.get(conf);
        return fs;
    }

    /**
     * Creating a directory here works like mkdir -p in the shell: parent
     * directories that do not yet exist are created as well.
     * As with Java IO, operations work on Path objects, but this Path is
     * the HDFS one (org.apache.hadoop.fs.Path).
     * @param fs
     * @return
     * @throws IOException
     */
    public boolean myCreatePath(FileSystem fs, String pathToCreate) throws IOException {
        boolean b = false;
        // String pathToCreate = "/hyw/test/huabingood/hyw";
        Path path = new Path(pathToCreate);
        try {
            // Even if the path already exists, mkdirs still succeeds.
            b = fs.mkdirs(path);
        } finally {
            fs.close();
        }
        return b;
    }
}
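The remaining operations from the list at the top (query, rename, delete, upload, download) follow the same pattern. A minimal sketch with hypothetical paths; it reuses getHadoopFileSystem() from the class above and additionally needs the import org.apache.hadoop.fs.FileStatus:
public void demoOtherOps() throws IOException {
    FileSystem fs = getHadoopFileSystem();
    try {
        // Query: list the children of the root directory.
        for (FileStatus st : fs.listStatus(new Path("/"))) {
            System.out.println(st.getPath() + " dir=" + st.isDirectory());
        }
        // Update: rename (move) a directory.
        fs.rename(new Path("/firstDirS09"), new Path("/firstDirS09Renamed"));
        // Delete: true means recursive, like rm -r.
        fs.delete(new Path("/firstDirS09Renamed"), true);
        // Upload: copy a local file into the file system.
        fs.copyFromLocalFile(new Path("D:/localIn.txt"), new Path("/uploadedS09.txt"));
        // Download: copy a remote file back to the local disk.
        fs.copyToLocalFile(new Path("/uploadedS09.txt"), new Path("D:/localOut.txt"));
    } finally {
        fs.close();
    }
}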
Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: huabingood01
The HDFS service needs to be started:
%HADOOP_PREFIX%\sbin\start-dfs.cmd
start-dfs.cmd contains:
start "Apache Hadoop Distribution" hadoop namenode
start "Apache Hadoop Distribution" hadoop datanode
The NameNode must be formatted first:
D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format
Here "hadoop" refers to the bin\hadoop.cmd command; add D:\haddop\hadoop-3.1.1\bin to the PATH environment variable so that it can be found.
D:\haddop\hadoop-3.1.1\sbin> D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format
2018-10-28 07:02:54,801 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hmNotePC/192.168.1.101
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.1.1
STARTUP_MSG: classpath = D:\haddop\hadoop-3.1.1\etc\hadoop;D:\haddop\hadoop-3.1.1\share\had
STARTUP_MSG: build = https://github.com/apache/hadoop -r 2b9a8c1d3a2caf1e733d57f346af3ff0d5ba529c; compiled by 'leftnoteasy' on 2018-08-02T04:26Z
STARTUP_MSG: java = 1.8.0_31
************************************************************/
2018-10-28 07:02:54,854 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-ecf4351a-e57c-411b-8ef3-2198981bc44b
2018-10-28 07:02:56,060 INFO namenode.FSEditLog: Edit logging is async:true
2018-10-28 07:02:56,090 INFO namenode.FSNamesystem: KeyProvider: null
2018-10-28 07:02:56,092 INFO namenode.FSNamesystem: fsLock is fair: true
2018-10-28 07:02:56,100 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2018-10-28 07:02:56,119 INFO namenode.FSNamesystem: fsOwner = Administrator (auth:SIMPLE)
2018-10-28 07:02:56,120 INFO namenode.FSNamesystem: supergroup = supergroup
2018-10-28 07:02:56,120 INFO namenode.FSNamesystem: isPermissionEnabled = true
2018-10-28 07:02:56,121 INFO namenode.FSNamesystem: HA Enabled: false
2018-10-28 07:02:56,203 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2018-10-28 07:02:56,229 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2018-10-28 07:02:56,229 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2018-10-28 07:02:56,238 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2018-10-28 07:02:56,239 INFO blockmanagement.BlockManager: The block deletion will start around 2018 十月 28 07:02:56
2018-10-28 07:02:56,243 INFO util.GSet: Computing capacity for map BlocksMap
2018-10-28 07:02:56,243 INFO util.GSet: VM type = 64-bit
2018-10-28 07:02:56,248 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
2018-10-28 07:02:56,251 INFO util.GSet: capacity = 2^21 = 2097152 entries
2018-10-28 07:02:56,269 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2018-10-28 07:02:56,362 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
2018-10-28 07:02:56,362 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2018-10-28 07:02:56,363 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2018-10-28 07:02:56,364 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2018-10-28 07:02:56,364 INFO blockmanagement.BlockManager: defaultReplication = 3
2018-10-28 07:02:56,365 INFO blockmanagement.BlockManager: maxReplication = 512
2018-10-28 07:02:56,366 INFO blockmanagement.BlockManager: minReplication = 1
2018-10-28 07:02:56,366 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
2018-10-28 07:02:56,367 INFO blockmanagement.BlockManager: redundancyRecheckInterval = 3000ms
2018-10-28 07:02:56,368 INFO blockmanagement.BlockManager: encryptDataTransfer = false
2018-10-28 07:02:56,368 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2018-10-28 07:02:56,425 INFO util.GSet: Computing capacity for map INodeMap
2018-10-28 07:02:56,426 INFO util.GSet: VM type = 64-bit
2018-10-28 07:02:56,426 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
2018-10-28 07:02:56,427 INFO util.GSet: capacity = 2^20 = 1048576 entries
2018-10-28 07:02:56,428 INFO namenode.FSDirectory: ACLs enabled? false
2018-10-28 07:02:56,429 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2018-10-28 07:02:56,429 INFO namenode.FSDirectory: XAttrs enabled? true
2018-10-28 07:02:56,430 INFO namenode.NameNode: Caching file names occurring more than 10 times
2018-10-28 07:02:56,440 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRo
2018-10-28 07:02:56,444 INFO snapshot.SnapshotManager: SkipList is disabled
2018-10-28 07:02:56,452 INFO util.GSet: Computing capacity for map cachedBlocks
2018-10-28 07:02:56,452 INFO util.GSet: VM type = 64-bit
2018-10-28 07:02:56,453 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
2018-10-28 07:02:56,453 INFO util.GSet: capacity = 2^18 = 262144 entries
2018-10-28 07:02:56,467 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2018-10-28 07:02:56,468 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2018-10-28 07:02:56,469 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2018-10-28 07:02:56,475 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2018-10-28 07:02:56,476 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2018-10-28 07:02:56,481 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2018-10-28 07:02:56,482 INFO util.GSet: VM type = 64-bit
2018-10-28 07:02:56,482 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
2018-10-28 07:02:56,483 INFO util.GSet: capacity = 2^15 = 32768 entries
2018-10-28 07:02:56,527 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1079199093-192.168.1.101-1540681376517
2018-10-28 07:02:56,547 INFO common.Storage: Storage directory \tmp\hadoop-Administrator\dfs\name has been successfully formatted.
2018-10-28 07:02:56,580 INFO namenode.FSImageFormatProtobuf: Saving image file \tmp\hadoop-Administrator\dfs\name\current\fsimage.ckpt_0000000000000000000 us
2018-10-28 07:02:56,717 INFO namenode.FSImageFormatProtobuf: Image file \tmp\hadoop-Administrator\dfs\name\current\fsimage.ckpt_0000000000000000000 of size 3
2018-10-28 07:02:56,741 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2018-10-28 07:02:56,756 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hmNotePC/192.168.1.101
************************************************************/
If accessing the NameNode web UI at localhost:9870 fails (the port was 50070 in Hadoop 2.x), the NameNode failed to start.
Check the NameNode startup log:
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:376)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(
Fix: place hadoop.dll and winutils.exe into the Windows system directory (e.g. C:\Windows\System32); a reboot may be needed.
D:\haddop\hadoop-3.1.1\etc\hadoop\core-site.xml:
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://huabingood01:9000</value>
    </property>
</configuration>
conf = new Configuration();
// The file system URI is mandatory; other parameters are optional and take the highest precedence.
conf.set("fs.defaultFS", "hdfs://0.0.0.0:19000");
conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
The URI in the config file and the one set in the code must match.
C:\tmp\hadoop-Administrator>tree
Folder PATH listing for volume p1sys
Volume serial number is A87E-7AB4
C:.
├─dfs
│  ├─data
│  └─name
└─nm-local-dir
String pathToCreate = "/firstDirS09/secdirS09";
hdfsHadoopClass.myCreatePath(fs, pathToCreate);
The directory actually created ends up on the local drive:
D:\firstDirS09\secdirS09
// Write a file
FSDataOutputStream out = fs.create(new Path("/file1S09.txt"));
out.writeUTF("attilax bazai");
out.close();
D:\file1S09.txt
D:\.file1S09.txt.crc
The .crc file is the checksum file that the local (checksum) file system writes alongside the data file.
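To read the file back, fs.open returns an FSDataInputStream (import org.apache.hadoop.fs.FSDataInputStream); a minimal sketch, assuming the same fs object and file as above:
// FSDataInputStream extends DataInputStream, so readUTF()
// matches the writeUTF() call used when writing the file.
FSDataInputStream in = fs.open(new Path("/file1S09.txt"));
System.out.println(in.readUTF());
in.close();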
Reference: 使用javaAPI操作hdfs - huabingood - 博客园 (cnblogs.com)