How do you retrieve the replication factor of files in HDFS?
Solution 1
Try the command
hadoop fs -stat %r /path/to/file
It should print the replication factor.
Solution 2
You can run the following command to get the replication factor:
hadoop fs -ls /user/xxxx
The second column of the output shows the replication factor for each file; for a folder it shows - instead.
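If you want just the replication factor in a script, the second column can be extracted with awk. This is a minimal sketch, assuming the usual -ls layout (permissions, replication, owner, group, size, date, time, path) and paths without spaces. The function name list_replication and the sample listing are made up for illustration; on a real cluster you would pipe hadoop fs -ls /user/xxxx into the function instead.

```shell
# Print "<replication> <path>" for every regular file in `hadoop fs -ls` output.
# Reads the listing on stdin; directories (permissions starting with "d") and
# the "Found N items" header are skipped.
list_replication() {
  awk '$1 ~ /^-/ { print $2, $NF }'
}

# Illustrative listing with hypothetical owner, sizes, and names.
sample_listing='Found 2 items
drwxr-xr-x   - hdfs supergroup          0 2022-06-06 10:00 /user/xxxx/dir
-rw-r--r--   5 hdfs supergroup       1024 2022-06-06 10:05 /user/xxxx/file.txt'

printf '%s\n' "$sample_listing" | list_replication
# prints: 5 /user/xxxx/file.txt
```

Real usage would look like hadoop fs -ls /user/xxxx | list_replication.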
Solution 3
Apart from Alexey Shestakov's answer, which works perfectly and does exactly what you ask, other ways, mostly found here, include:
hadoop dfs -ls /parent/path
which shows the replication factors of all the contents of /parent/path in the second column. (hadoop dfs is deprecated in recent Hadoop releases; hdfs dfs -ls is the equivalent.)
Through Java, you can get this information by using:
FileStatus.getReplication()
You can also see the replication factors of files by using:
hadoop fsck /filename -files -blocks -racks
Finally, from the web UI of the namenode, I believe that this information is also available (didn't check that).
Solution 4
We can use the following commands to check the replication of a file.
hdfs dfs -ls /user/cloudera/input.txt
or
hdfs dfs -stat %r /user/cloudera/input.txt
Solution 5
If you need to check the replication factor of an HDFS directory:
hdfs fsck /tmp/data
shows the average replication factor of the /tmp/data HDFS folder.
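The fsck summary is fairly long, so it can help to pull out only the average. A rough sketch, assuming the summary contains a line labelled "Average block replication:" followed by the value, as recent Hadoop releases print it. The function name avg_replication and the sample summary values are hypothetical.

```shell
# Extract the value of the "Average block replication" line from fsck output.
# Reads the fsck report on stdin and strips spaces and tabs around the value.
avg_replication() {
  awk -F':' '/Average block replication/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Illustrative excerpt of an fsck summary (hypothetical values).
sample_summary=' Total files:   4
 Default replication factor:    3
 Average block replication:     3.0'

printf '%s\n' "$sample_summary" | avg_replication
# prints: 3.0
```

On a cluster this would be run as hdfs fsck /tmp/data | avg_replication.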
brain storm
Updated on June 06, 2022
I have set the replication factor for my file as follows:
hadoop fs -D dfs.replication=5 -copyFromLocal file.txt /user/xxxx
When a NameNode restarts, it makes sure under-replicated blocks are replicated. Hence the replication info for the file is stored (possibly in the NameNode). How can I get that information?