How does the HDFS mv command work?


Solution 1

When a user calls hdfs dfs -mv, HDFS guarantees atomicity of the rename operation. When this command is run, the client makes an RPC call to the NameNode. The NameNode implementation of this RPC holds a lock while modifying the inode tree, and only releases that lock after the rename has completed, either successfully or unsuccessfully. (It may fail due to things like permission or quota violations.)
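
As a rough sketch of the client side, the shell's -mv maps onto a single rename() call in the public Hadoop FileSystem API, which in turn becomes one rename RPC to the NameNode. The paths below are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RenameSketch {
        public static void main(String[] args) throws Exception {
            // Connect to the default file system (HDFS, per the cluster config).
            FileSystem fs = FileSystem.get(new Configuration());

            // One rename() call is one rename RPC; the NameNode applies the
            // metadata change atomically under its namespace lock, returning
            // true on success and false if the rename could not be applied.
            boolean renamed = fs.rename(new Path("/testData"),
                                        new Path("/testDataRenamed"));
            System.out.println("renamed: " + renamed);
            fs.close();
        }
    }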

Since the implementation executes entirely within the NameNode and only manipulates file system metadata, there is no actual data movement involved. There is in fact no interaction with DataNodes during an hdfs dfs -mv command. All of a file's blocks remain the same and the block list associated with the inode remains the same. The NameNode simply takes that file's inode from one place and moves it to another place in the file system tree. There is no possibility of corrupting block data.

Since the NameNode provides a guaranteed atomic implementation of rename, there is also no chance of metadata corruption. It's not possible to end up in a "half-completed" state, with the file existing in both places, or even worse, getting deleted completely.

There is a subtle variation on the above answer. Most of the time, when running HDFS shell commands, it's typical to interact with HDFS as the backing file system. However, this is not the only possible file system implementation. The Apache Hadoop distro ships with alternative file system plugins for S3, Azure Storage, and OpenStack Swift, and numerous vendors have created their own file system plugins. Whether these alternative file systems provide atomic rename semantics is an implementation detail of each plugin. The S3 and Swift plugins implement rename as copy-then-delete, so they definitely don't provide an atomicity guarantee. The Azure Storage plugin does provide some optional support for atomic rename by making use of Azure Storage blob leases, but it's not the default behavior.
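
To make the distinction concrete, here is a minimal sketch: the FileSystem API dispatches on the URI scheme, so the same rename() call can land on completely different implementations with different semantics. The URIs are hypothetical placeholders, and the S3 line assumes the relevant plugin jar and credentials are configured:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class SchemeDispatch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // The URI scheme selects the FileSystem implementation class;
            // rename() semantics depend entirely on that implementation.
            FileSystem hdfs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
            FileSystem s3 = FileSystem.get(URI.create("s3a://my-bucket/"), conf);

            // hdfs.rename(src, dst): one atomic metadata RPC on the NameNode.
            // s3.rename(src, dst):   copy-then-delete under the hood, not atomic.
            hdfs.close();
            s3.close();
        }
    }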

Also, as a consequence of this, it's not possible to run hdfs dfs -mv across different file systems. You'd have to use the copy commands for that, and then it would involve full data copies. Here is what happens when you try to rename across file systems. The example attempts to run hdfs dfs -mv with a source file in my HDFS installation and a destination on the local file system. The command is rejected.

> hdfs dfs -mv hdfs:///testData file:///tmp/testData
mv: `hdfs:///testData': Does not match target filesystem
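
If you do need to move data across file systems, the move degrades into a full copy followed by a delete of the source. Here is a minimal sketch using the FileUtil.copy helper, with paths mirroring the rejected command above; this is illustrative, not necessarily what the shell's copy commands do internally:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class CrossFsMove {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path src = new Path("hdfs:///testData");
            Path dst = new Path("file:///tmp/testData");

            // Each path resolves to its own FileSystem instance.
            FileSystem srcFs = src.getFileSystem(conf);
            FileSystem dstFs = dst.getFileSystem(conf);

            // There is no cross-file-system rename, so "move" means a full
            // data copy plus deletion of the source (deleteSource = true).
            // Unlike an HDFS rename, this is NOT atomic: a failure midway
            // can leave the file present in both places or partially copied.
            FileUtil.copy(srcFs, src, dstFs, dst, true, conf);
        }
    }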

The last part of your question asks whether or not it's possible to corrupt data while copying. Hadoop performs checksum verification while reading a file, so clients are not expected to see corrupted data. DistCp can also perform checksum comparisons between source and destination as a post-processing step.

Solution 2

mv (move) is just a metadata operation. There is no data movement, unlike with cp (copy).

You can easily test this. I will explain with an example.

  1. I have a file /tmp/1.txt.

    I run the following command:

    hdfs fsck /tmp/1.txt -files -blocks -locations 
    

    I get the following output:

    /tmp/1.txt 5 bytes, 1 block(s):  OK
    0. BP-1788638071-172.23.206.41-1439815305280:blk_1073747956_7133 len=5 repl=1 [DatanodeInfoWithStorage[192.168.56.1:50010,DS-cf19d920-d98b-4877-9ca7-c919df1a869a,DISK]]
    
  2. I move (mv) the file /tmp/1.txt to /tmp/1_renamed.txt, which is under the same directory, /tmp.

    I run the following command:

    hdfs fsck /tmp/1_renamed.txt -files -blocks -locations 
    

    I get the following output:

    /tmp/1_renamed.txt 5 bytes, 1 block(s):  OK
    0. BP-1788638071-172.23.206.41-1439815305280:blk_1073747956_7133 len=5 repl=1 [DatanodeInfoWithStorage[192.168.56.1:50010,DS-cf19d920-d98b-4877-9ca7-c919df1a869a,DISK]]
    
  3. I move (mv) the file /tmp/1_renamed.txt to /tmp1/1.txt, which is under a different directory, /tmp1.

    I run the following command:

    hdfs fsck /tmp1/1.txt -files -blocks -locations 
    

    I get the following output:

    /tmp1/1.txt 5 bytes, 1 block(s):  OK
    0. BP-1788638071-172.23.206.41-1439815305280:blk_1073747956_7133 len=5 repl=1 [DatanodeInfoWithStorage[192.168.56.1:50010,DS-cf19d920-d98b-4877-9ca7-c919df1a869a,DISK]]
    

You can see that the block report is identical in all three checks:

0. BP-1788638071-172.23.206.41-1439815305280:blk_1073747956_7133 len=5 repl=1 [DatanodeInfoWithStorage[192.168.56.1:50010,DS-cf19d920-d98b-4877-9ca7-c919df1a869a,DISK]]

This confirms that mv just renames the file in the NameNode's metadata. The other answer, by Chris Nauroth, clearly explains how the mv operation is executed.
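
For completeness, here is a hedged sketch of the same experiment done programmatically against the FileSystem API: print a file's block locations, rename it, and print them again. The paths follow the example above, and the output should show identical block offsets and hosts before and after:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlocksUnchanged {
        // Print the file's block locations, similar to what fsck reports.
        static void printBlocks(FileSystem fs, Path p) throws Exception {
            FileStatus st = fs.getFileStatus(p);
            for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
                System.out.println(p + " -> " + b);
            }
        }

        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path before = new Path("/tmp/1.txt");
            Path after = new Path("/tmp/1_renamed.txt");

            printBlocks(fs, before);  // block list before the rename
            fs.rename(before, after); // metadata-only operation on the NameNode
            printBlocks(fs, after);   // same offsets and hosts after the rename
            fs.close();
        }
    }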

Data corruption: it is possible for data to get corrupted while copying with cp or distcp, but in both cases you can check for corruption.

  1. cp command

    hadoop fs -checksum can be used to check the checksum of a file.

    I copied the file /tmp/1GB/part-m-00000 into another directory, as /tmp1/part-m-00000. Then I executed the following command:

    hadoop fs -checksum /tmp/1GB/part-m-00000 /tmp1/part-m-00000
    
    /tmp/1GB/part-m-00000   MD5-of-262144MD5-of-512CRC32    0000020000000000000400008f15c32887229c0495a23547e2f0a29a
    /tmp1/part-m-00000      MD5-of-262144MD5-of-512CRC32    0000020000000000000400008f15c32887229c0495a23547e2f0a29a
    

    You can see that the checksums of the original and the copied file match. So, after copying files, you can run the hadoop fs -checksum command to check whether the two checksums match (a programmatic version of this check is sketched after this list).

  2. distcp command

    By default, distcp compares the checksums of the source and destination files after the copy operation completes. If the checksums don't match, distcp marks that copy as FAILED. You can disable the checksum comparison by calling distcp with the -skipcrccheck option.
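
As a supplement, the manual hadoop fs -checksum comparison from step 1 can also be done programmatically. A minimal sketch, assuming both files live on HDFS (so the checksums use the same algorithm and are directly comparable) and reusing the paths from the cp example:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileChecksum;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ChecksumCompare {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            // getFileChecksum() may return null on file systems that do not
            // expose checksums; on HDFS it returns an MD5-of-CRCs checksum.
            FileChecksum src = fs.getFileChecksum(new Path("/tmp/1GB/part-m-00000"));
            FileChecksum dst = fs.getFileChecksum(new Path("/tmp1/part-m-00000"));

            // Checksums are only comparable when both files were written with
            // the same block size and checksum settings (true for a plain cp).
            System.out.println(src != null && src.equals(dst)
                    ? "checksums match" : "checksums differ or unavailable");
            fs.close();
        }
    }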


Comments

  • Manjinder Aulakh, almost 2 years ago

    I would like to know how the mv command in HDFS works.

    1. Is it just a symbolic change without any actual data movement?

      • If the moveTo directory exists (maybe on a different partition)
      • If moveTo is a new directory
    2. Is it possible to corrupt data while moving large files within Hadoop? If so, is cp or distcp a safer option?

  • ak0817, almost 4 years ago
    Very well explained. Thanks Chris!