cpio VS tar and cp

Solution 1

This is an extremely generic overview:

CPIO does a better job of duplicating a file system, including taking backups. It preserves things like hard links, FIFOs, and other features that ordinary files don't have. Most implementations of CPIO do everything TAR does, including reading and writing .tar files. CPIO usually takes the list of files to archive from standard input, which makes it very easy to pipe in a list from something else (like find).
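As a rough sketch of the "list on standard input" workflow, the script below builds a throwaway directory, archives it with copy-out mode, and restores it with copy-in mode. Every path is hypothetical, and `-H newc` is my own choice of archive format rather than something the answer specifies; the flags are those of GNU cpio, so other implementations may differ.

```shell
#!/bin/sh
# Sketch: archive a find-generated file list with cpio (copy-out), then restore it (copy-in).
# All paths are hypothetical and live in a throwaway temp directory.
set -eu
command -v cpio >/dev/null 2>&1 || { echo "cpio is not installed" >&2; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/restore"
echo "hello" > "$work/src/a.txt"
echo "world" > "$work/src/sub/b.txt"
ln "$work/src/a.txt" "$work/src/a-hardlink.txt"   # a hard link, to go into the archive too

# Copy-out: find prints one path per line, cpio -o reads that list on stdin
# and writes the archive to stdout. -H newc is an assumed format choice.
(cd "$work/src" && find . -depth -print | cpio -o -H newc > "$work/backup.cpio") 2>/dev/null

# Copy-in: cpio -i reads the archive on stdin; -d creates directories as needed.
(cd "$work/restore" && cpio -i -d < "$work/backup.cpio") 2>/dev/null

# -t lists the archive's contents without extracting anything.
listing=$(cpio -t < "$work/backup.cpio" 2>/dev/null)
echo "$listing"
```

Because the list is just text on a pipe, any filter (grep, sort, a script) can sit between find and cpio. File names containing newlines would need the null-terminated variants (`find -print0` with `cpio -0`) where available.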

CPIO pass-through mode is very useful if you have a very long list of files you want to copy from directory A to directory B. (For example, you could use find to locate all files on your system that have changed in the last 2 years.)
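A minimal sketch of that pass-through case, assuming GNU-style cpio flags. The directories A and B, the file names, and the 730-day cutoff are all made up for illustration.

```shell
#!/bin/sh
# Sketch: copy only recently modified files from A to B with cpio pass-through mode.
# Directory names, file names, and the cutoff are hypothetical.
set -eu
command -v cpio >/dev/null 2>&1 || { echo "cpio is not installed" >&2; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/A/docs" "$work/B"
echo "recent" > "$work/A/docs/new.txt"
echo "stale"  > "$work/A/docs/old.txt"
touch -t 200001010000 "$work/A/docs/old.txt"   # backdate old.txt to the year 2000

# find selects files modified within roughly the last two years (730 days).
# cpio -p copies each listed file into the destination directory;
# -d creates intermediate directories, -m preserves modification times.
(cd "$work/A" && find . -type f -mtime -730 -print | cpio -p -d -m "$work/B") 2>/dev/null

ls -R "$work/B"
```

The destination must be an absolute path (or relative to the directory the pipeline runs in), since the file list itself is relative to A.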

TAR does a better job of simply dumping all your standard files to or from a tape (or archive file). It's a bit simpler to use for most common tasks. It meets most people's simple backup demands easily, and most of its popularity comes from this fact.
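For comparison, the common tar round trip looks roughly like this. The directory and file names are placeholders; the flags (`-c`, `-t`, `-x`, `-f`, `-C`) are the ones shared by GNU tar and bsdtar.

```shell
#!/bin/sh
# Sketch: create, list, and extract a tar archive. All names are hypothetical.
set -eu

work=$(mktemp -d)
mkdir -p "$work/project/src" "$work/restore"
echo "int main(void) { return 0; }" > "$work/project/src/main.c"

# -c create, -f archive file, -C switch to this directory before adding "project".
tar -cf "$work/project.tar" -C "$work" project

# -t lists the archive's contents.
tar -tf "$work/project.tar"

# -x extracts; -C chooses where the files land.
tar -xf "$work/project.tar" -C "$work/restore"
```

No separate find step is needed because tar walks the named directory itself.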

And now for the fine print. There are several different versions and implementations of both CPIO and TAR. Each one has different features, and some have different command-line options. There are things each can do that the other cannot; if you find yourself limited by one, try the other. Everyone has a favorite, and 99% of the time either will accomplish the task.

Solution 2

I understand from the comments and other background that cpio is less ubiquitous now and inconsistent between versions. But cpio has one advantage I recently found invaluable when dealing with a large number of corrupt tar archives. It does not stop at the first error in a tar file but attempts to skip bad data and extract as much as possible. For example,

tar xf ./sample.corrupt.tar

will print

tar: Skipping to next header
tar: Exiting with failure status due to previous errors

after the first encountered error, whereas

cpio -F ./sample.corrupt.tar -i -v

will print the extracted files and for each error will print:

cpio: invalid header: checksum error
cpio: warning: skipped 6 bytes of junk

cpio: invalid header: checksum error
cpio: warning: skipped 2 bytes of junk

etc...

The tar format expects each archive header to be aligned on a 512-byte boundary. If corruption misaligns the headers, cpio makes a best effort to skip the damaged data and extract as much as possible.
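The 512-byte layout can be observed on a healthy archive. This is only an illustration with a made-up one-byte file; it assumes a tar that writes a ustar-compatible format by default (GNU tar and bsdtar both do), where the string "ustar" sits at byte offset 257 of each header block.

```shell
#!/bin/sh
# Sketch: show that a tar archive is built from 512-byte blocks.
# The file name is hypothetical; a ustar-compatible default format is assumed.
set -eu

work=$(mktemp -d)
printf 'x' > "$work/one-byte.txt"
tar -cf "$work/tiny.tar" -C "$work" one-byte.txt

# Even a one-byte payload is padded out to whole 512-byte blocks.
size=$(wc -c < "$work/tiny.tar")
echo "archive size: $size bytes, remainder mod 512: $((size % 512))"

# The first header block starts at offset 0; its magic field is at offset 257.
magic=$(dd if="$work/tiny.tar" bs=1 skip=257 count=5 2>/dev/null)
echo "magic at offset 257: $magic"
```

A reader that insists on finding the next header at an exact multiple of 512 loses its place once bytes are inserted or dropped, which is the situation described above.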

Solution 3

On Red Hat AE 3, I found that cpio had a 2 GB size limit on an output stream, whereas tar did not have this limitation.

Other systems might have different limitations.

Author: user1698102

Updated on September 18, 2022

Comments

  • user1698102, almost 2 years

    I just learned that cpio has three modes: copy-out, copy-in and pass-through.

    I was wondering what are the advantages and disadvantages of cpio under copy-out and copy-in modes over tar. When is it better to use cpio and when to use tar?

    Similar question for cpio under pass-through mode versus cp.

    Thanks and regards!

  • Adam Katz, almost 8 years
    (I originally posted this answer to the SO question. Upon seeing this version, I decided to copy it here.)
  • kkm, over 3 years
    2GB may be the size limit of the older cpio archive format, known as the "binary" format. Many distros come with a cpio(1) that supports multiple formats but uses the binary one by default ~~for compatibility with those System III archives grandpa saved on their PDP-11~~. For GNU cpio, the flag is -H, and my latest and greatest 2011 edition of its manual lists as many as 8 formats, of which a whopping 3 are not called "old" or "obsolete." BTW, I've seen bin-formatted cpio files with the 16-bit words in the header stored in either byte order. That's where the compatibility with that gramps' archive ends…
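A hedged sketch of choosing the archive format with -H, as the last comment suggests. The format names `newc` and `ustar` are ones GNU cpio accepts; the file names are invented, and whether a given format lifts the 2 GB limit on a particular system is not something this example verifies.

```shell
#!/bin/sh
# Sketch: select the cpio archive format explicitly with -H.
# File names are hypothetical; format names follow GNU cpio.
set -eu
command -v cpio >/dev/null 2>&1 || { echo "cpio is not installed" >&2; exit 0; }

work=$(mktemp -d)
mkdir "$work/src"
echo "data" > "$work/src/file.txt"

# Write the same file list once in the "newc" format and once as a ustar (tar) archive.
(cd "$work/src" && find . -type f -print | cpio -o -H newc  > "$work/new.cpio")  2>/dev/null
(cd "$work/src" && find . -type f -print | cpio -o -H ustar > "$work/as-tar.tar") 2>/dev/null

# A newc archive starts with the ASCII magic number 070701.
magic=$(head -c 6 "$work/new.cpio")
echo "newc magic: $magic"

# The ustar output can be read by tar itself.
tar -tf "$work/as-tar.tar"
```

Naming the format explicitly also avoids depending on whatever default a particular cpio build happens to use.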