Foreword
HDFS uses a distributed architecture to provide scalable, high-throughput, highly reliable data storage for upper-layer applications and users. Within the Hadoop ecosystem, HDFS sits at the very bottom and is its most irreplaceable piece of infrastructure. From the hadoop-0.10.1 release in 2008 to today's hadoop-3.0.0-beta1, HDFS has been evolving for nearly ten years, and its architecture and feature set have changed enormously. In particular, the HDFS 3.0.0 line adds erasure coding as a fault-tolerance mechanism. Compared with traditional replication, erasure coding can cut storage overhead by roughly half at the same level of availability — the first major change to HDFS's data-reliability mechanism in its ten-year history (replication had previously been the only option). HDFS 3.0.0 adds other features as well: for example, Namenode HA now supports three Namenodes and can tolerate two Namenode failures, whereas HDFS 2.x could tolerate only one.
This article records, step by step, my experience using HDFS in hadoop-3.0.0-beta1, including building a distributed HDFS 3.0 cluster from scratch, dynamically adding HDFS nodes, using HDFS 3.0's erasure-coding fault tolerance, and more. If you spot anything wrong, please email 艾叔 at aishuc@126.com. Thank you!
4.8 Viewing HDFS System Status
The following command displays the current HDFS system's capacity, space usage, and node information (each node's hostname and whether its status is normal), among other details.
Example
[user@nn1 hadoop-3.0.0-beta1]$ hdfs dfsadmin -report
Configured Capacity: 18746441728 (17.46 GB)
Present Capacity: 16170074112 (15.06 GB)
DFS Remaining: 16169504768 (15.06 GB)
DFS Used: 569344 (556 KB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 32
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 192.168.182.11:9866 (nn1)
Hostname: nn1
Decommission Status : Normal
Configured Capacity: 18746441728 (17.46 GB)
DFS Used: 569344 (556 KB)
Non DFS Used: 2576367616 (2.40 GB)
DFS Remaining: 16169504768 (15.06 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Nov 29 16:40:22 EST 2017
Last Block Report: Wed Nov 29 16:38:07 EST 2017
Notes:
Configured Capacity: 18746441728 (17.46 GB) is the total capacity of the partition that holds the datanode directory; in our case, the capacity of /dev/mapper/centos-root.
Present Capacity: 16170074112 (15.06 GB) is the capacity actually available to HDFS, i.e. Configured Capacity minus the space consumed by non-HDFS data (Non DFS Used).
DFS Remaining: 16169504768 (15.06 GB) is the capacity HDFS currently has free.
DFS Used: 569344 (556 KB) is the capacity HDFS has already used.
Across the cluster, Present Capacity is the sum of the usable capacity of all datanode directories, and the figures always satisfy:
Present Capacity = DFS Remaining + DFS Used
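The identity above can be checked mechanically. Here is a small shell sketch that extracts the three byte counts from a pasted copy of the report and verifies the arithmetic (the report text is hard-coded from the example output above; nothing is fetched from a live cluster):

```shell
# Byte counts copied from the dfsadmin -report output above.
report='Present Capacity: 16170074112
DFS Remaining: 16169504768
DFS Used: 569344'

present=$(echo "$report"   | awk -F': ' '/Present Capacity/ {print $2}')
remaining=$(echo "$report" | awk -F': ' '/DFS Remaining/    {print $2}')
used=$(echo "$report"      | awk -F': ' '/DFS Used/         {print $2}')

# Present Capacity = DFS Remaining + DFS Used
echo "$((remaining + used)) = $present"    # prints 16170074112 = 16170074112
```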
4.9 Viewing the Contents of a Text File
View the contents of the /profile file in HDFS:
[user@nn1 hadoop-3.0.0-beta1]$ hdfs dfs -cat /profile
4.10 Viewing Command Help
Running -help followed by a command name displays that command's help text.
Example
For example, to view help for the tail command:
[user@nn1 hadoop-3.0.0-beta1]$ hdfs dfs -help tail
-tail [-f] <file>
Show the last 1KB of the file.
-f Shows appended data as the file grows.
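The "last 1KB" behavior matches the familiar coreutils tail -c. A local sketch of the same semantics (the scratch file /tmp/tail-demo.txt is invented here for illustration; on a real cluster you would run hdfs dfs -tail against an HDFS path):

```shell
# Create a 2000-byte scratch file, then take its last 1024 bytes --
# the same slice hdfs dfs -tail would print for an HDFS file.
printf 'x%.0s' $(seq 1 2000) > /tmp/tail-demo.txt
tail -c 1024 /tmp/tail-demo.txt | wc -c    # prints 1024
```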
To view help for the get command:
[user@nn1 hadoop-3.0.0-beta1]$ hdfs dfs -help get
-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
Copy files that match the file pattern <src> to the local name.
When copying multiple files, the destination must be a directory. Passing -f
overwrites the destination if it already exists and -p preserves access and
modification times, ownership and the mode.
To view help for the df command:
[user@nn1 hadoop-3.0.0-beta1]$ hdfs dfs -help df
-df [-h] [<path> ...]
Shows the capacity, free and used space of the filesystem. If the filesystem has
multiple partitions, and no path to a particular partition is specified, then
the status of the root partitions will be shown.
-h Formats the sizes of files in a human-readable fashion rather than a number
of bytes.
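hdfs dfs -df is modeled on the Unix df command, and its -h flag plays the same role. For comparison, the local equivalent (shown here against the root filesystem; the numbers will of course vary by machine):

```shell
# POSIX-format df for the root filesystem; hdfs dfs -df -h /
# produces an analogous capacity / used / available summary for HDFS.
df -P / | head -n 2
```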
4.11 Finding Files or Directories in HDFS
Example
[user@nn1 hadoop-3.0.0-beta1]$ hdfs dfs -find / -name "*.xml"
Output:
/hadoop/capacity-scheduler.xml
/hadoop/core-site.xml
/hadoop/hadoop-policy.xml
/hadoop/hdfs-site.xml
/hadoop/httpfs-site.xml
/hadoop/kms-acls.xml
/hadoop/kms-site.xml
/hadoop/mapred-site.xml
/hadoop/yarn-site.xml
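The -name option accepts the same shell-style glob patterns as POSIX find. A local sketch with throwaway files (the /tmp/find-demo directory and file names below are invented for illustration):

```shell
# Set up a scratch directory with one .xml file and one .txt file.
mkdir -p /tmp/find-demo/hadoop
touch /tmp/find-demo/hadoop/core-site.xml /tmp/find-demo/hadoop/readme.txt

# Only the path matching the glob is printed, as with hdfs dfs -find.
find /tmp/find-demo -name '*.xml'    # prints /tmp/find-demo/hadoop/core-site.xml
```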