Tar a directory, but don't store full absolute paths in the archive


Solution 1

tar -cjf site1.tar.bz2 -C /var/www/site1 .

In the above example, tar changes to the directory /var/www/site1 before archiving, because the option -C /var/www/site1 was given. The trailing . then refers to that directory, so member names are stored relative to it.

From man tar:

OTHER OPTIONS

  -C, --directory DIR
       change to directory DIR
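
To see the effect without touching a real web root, here is a minimal sketch that builds a throwaway tree with mktemp. The directory and file names under it are placeholders standing in for /var/www/site1, and compression is left out so that only tar itself is needed (add -j back for bzip2).

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout standing in for /var/www/site1.
work=$(mktemp -d)
mkdir -p "$work/www/site1/images"
touch "$work/www/site1/index.html" "$work/www/site1/images/img1.png"

# -C switches directory first; the trailing "." is resolved in that directory.
tar -cf "$work/site1.tar" -C "$work/www/site1" .

# List the member names; they are relative (./...), with no www/site1 prefix.
listing=$(tar -tf "$work/site1.tar")
printf '%s\n' "$listing"
```

The archive is written outside the directory being archived, so tar does not try to include the archive in itself.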

Solution 2

The option -C works; just for clarification I'll post 2 examples:

  1. Creating a tarball without the full path: the full path is /home/testuser/workspace/project/application.war, and what we want in the archive is just project/application.war, so:

    tar -cvf output_filename.tar  -C /home/testuser/workspace project
    

    Note: there is a space between workspace and project; tar changes to workspace first, so the stored paths start with just project.

  2. Extracting a tarball into a different target path (the default is ., i.e. the current directory):

    tar -xvf output_filename.tar -C /home/deploy/
    

    tar extracts the tarball into the given path and preserves the paths stored at creation time; in our example the file application.war will be extracted to /home/deploy/project/application.war.

    /home/deploy: given on extract
    project: given on creation of tarball

Note: if you want to place the created tarball in a target directory, just add the target path before the tarball name, e.g.:

tar -cvf /path/to/place/output_filename.tar  -C /home/testuser/workspace project
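
The create and extract steps above can be tried end to end with a small sketch. It assumes a scratch directory from mktemp; workspace, project and deploy are placeholder directories created inside it, not the real /home paths.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout mirroring the example paths.
work=$(mktemp -d)
mkdir -p "$work/workspace/project" "$work/deploy"
echo demo > "$work/workspace/project/application.war"

# Create: paths are stored relative to workspace, so they start with project/.
tar -cf "$work/output_filename.tar" -C "$work/workspace" project

# Extract: -C selects the destination directory, which must already exist.
tar -xf "$work/output_filename.tar" -C "$work/deploy"

extracted="$work/deploy/project/application.war"
ls -l "$extracted"
```

On extraction, -C only picks the destination; the project/ part of the path comes from what was stored at creation time.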

Solution 3

The -C option does not seem to work consistently on all platforms (OSes), at least up to tar v2.8.3. According to the documentation, -C changes to the given directory before archiving, but in my tests on Mac and Ubuntu the generated tar.gz file still contained the absolute path prefix:

tar -czf target_path/file.tar.gz -C source_path source_dir

Therefore, a consistent and robust solution is to cd into source_path (the parent directory of source_dir) and run

tar -czf target_path/file.tar.gz source_dir

in your script. This removes the absolute path prefix from the directory structure of the generated tar.gz file.
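
If the rest of the script must not be affected by the cd, the same approach can be wrapped in a subshell. This is only a sketch under assumptions: the tree is a scratch directory from mktemp, source_path, source_dir and target_path are placeholders, and the archive path is made absolute so that it does not depend on the directory change.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout; replace with real paths in a backup script.
work=$(mktemp -d)
mkdir -p "$work/source_path/source_dir" "$work/target_path"
touch "$work/source_path/source_dir/file.txt"

start_dir=$PWD
out="$work/target_path/file.tar.gz"   # absolute, so unaffected by the cd below

# The parentheses run cd and tar in a subshell; the caller's cwd is untouched.
(cd "$work/source_path" && tar -czf "$out" source_dir)

end_dir=$PWD
listing=$(tar -tzf "$out")
printf '%s\n' "$listing"
```

Because the cd happens inside the parentheses, later steps of the script still run from the directory they started in.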

Solution 4

The following command will create a root directory "." and put all the files from the specified directory into it.

tar -cjf site1.tar.bz2 -C /var/www/site1 .

If you want to put all files in the root of the tar file, @chinthaka is right. Just cd into the directory and do:

tar -czf target_path/file.tar.gz *

This puts all the files in the cwd at the root of the tar file. Note that * does not match hidden dotfiles by default.
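
The difference between . and * can be checked with a small sketch. It assumes a scratch directory from mktemp and a shell with default globbing (no dotglob); index.html and .htaccess are placeholder file names.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory with one regular and one hidden file.
work=$(mktemp -d)
mkdir -p "$work/site1"
touch "$work/site1/index.html" "$work/site1/.htaccess"

# "*" is expanded by the shell and, by default, skips names starting with a dot.
(cd "$work/site1" && tar -cf "$work/star.tar" *)

# "." is passed to tar unchanged, so tar walks the whole directory itself.
(cd "$work/site1" && tar -cf "$work/dot.tar" .)

star_listing=$(tar -tf "$work/star.tar")
dot_listing=$(tar -tf "$work/dot.tar")
printf 'with *:\n%s\n\nwith .:\n%s\n' "$star_listing" "$dot_listing"
```

The * variant stores names without a ./ prefix but leaves out hidden files; the . variant includes them at the cost of the ./ prefix.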

Solution 5

One minor detail:

tar -cjf site1.tar.bz2 -C /var/www/site1 .

adds the files as

tar -tf site1.tar.bz2
./style.css
./index.html
./page2.html
./page3.html
./images/img1.png
./images/img2.png
./subdir/index.html

If you really want

tar -tf site1.tar.bz2
style.css
index.html
page2.html
page3.html
images/img1.png
images/img2.png
subdir/index.html

You should either cd into the directory first or run

tar -cjf site1.tar.bz2 -C /var/www/site1 $(ls /var/www/site1)
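
The $(ls ...) form splits file names on whitespace and skips hidden files. A possible alternative, adapted from the find suggestion in the comments, is sketched below. It assumes GNU find (for -printf) and GNU tar (for --null with -T -), so it may not work with bsdtar on macOS; the scratch tree and file names are placeholders.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch tree with a hidden file and a name containing a space.
work=$(mktemp -d)
dir="$work/site1"
mkdir -p "$dir/images"
touch "$dir/index.html" "$dir/my page.html" "$dir/.htaccess" "$dir/images/img1.png"

# GNU find: %P prints each top-level entry relative to $dir, NUL-terminated.
# GNU tar: --null -T - reads that NUL-separated list from stdin, relative to -C.
find "$dir" -mindepth 1 -maxdepth 1 -printf '%P\0' |
  tar -cf "$work/site1.tar" -C "$dir" --null -T -

listing=$(tar -tf "$work/site1.tar")
printf '%s\n' "$listing"
```

Only the top-level entries are listed by find; tar recurses into directories such as images on its own, so sub-directories are preserved.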

Author: QuentinC

Updated on July 08, 2022

Comments

  • QuentinC
    QuentinC almost 2 years

    I have the following command as part of a backup shell script:

    tar -cjf site1.bz2 /var/www/site1/
    

    When I list the contents of the archive, I get:

    tar -tf site1.bz2
    var/www/site1/style.css
    var/www/site1/index.html
    var/www/site1/page2.html
    var/www/site1/page3.html
    var/www/site1/images/img1.png
    var/www/site1/images/img2.png
    var/www/site1/subdir/index.html
    

    But I would like to remove the /var/www/site1 prefix from the directory and file names within the archive, in order to simplify extraction and avoid a useless constant directory structure. You never know: I might need to extract the backed-up websites somewhere web data isn't stored under /var/www.

    For the example above, I would like to have:

    tar -tf site1.bz2
    style.css
    index.html
    page2.html
    page3.html
    images/img1.png
    images/img2.png
    subdir/index.html
    

    That way, when I extract, the files land in the current directory and I don't need to move them afterwards, and the sub-directory structure is preserved.

    There are already many questions about tar and backups on Stack Overflow and elsewhere on the web, but most of them ask about dropping the entire sub-directory structure (flattening), or about adding or removing the initial / in the names (I don't know exactly what that changes when extracting), and nothing more.

    After reading some of the solutions found here and there, as well as the manual, I tried:

    tar -cjf site1.bz2 -C . /var/www/site1/
    tar -cjf site1.bz2 -C / /var/www/site1/
    tar -cjf site1.bz2 -C /var/www/site1/ /var/www/site1/
    tar -cjf site1.bz2 --strip-components=3 /var/www/site1/
    

    But none of them worked the way I want. Some do nothing, some others don't archive sub-directories anymore.

    This is inside a backup shell script launched by cron, so I don't really know which user runs it or what the path and the current directory are. Absolute paths are therefore required for everything, and I would prefer not to change the current directory, to avoid breaking something further down in the script (it doesn't only back up websites, but also databases, and then sends everything over FTP, etc.).

    How to achieve this?

    Have I just misunderstood how the option -C works?

  • Freedom_Ben
    Freedom_Ben almost 10 years
    Don't miss the dot at the end, that's important ;-)
  • Andy Lorenz
    Andy Lorenz over 9 years
    how about if you also want to select the files to backup based on a wildcard? -C /var/www/site1 *.dat doesn't work :(
  • Lars Brinkhoff
    Lars Brinkhoff over 9 years
    (d=$PWD && cd /var/www/site1 && tar -cjf $d/site1.tar.bz2 *.dat)
  • Siva
    Siva about 9 years
    how to add wildcard for file selection in the last example?
  • Lars Brinkhoff
    Lars Brinkhoff almost 9 years
    The dot tells tar to archive everything in the current directory. And -C sets the current directory.
  • jorfus
    jorfus over 8 years
    This works great. I find it useful to preserve the directory name (just not the full path), so I did the following: tar -czvf site1.tar.gz -C /var/www/ site1 (Note the space, I'm still using the -C, to cd to the parent dir, and specifying the dir to tar instead of dot)
  • Christian Long
    Christian Long almost 8 years
    The trailing dot refers to the current directory after tar has changed it, not the directory you are in when you run the command. So, if you're in /home/user1 and you do tar -cf mine.tar -C /var/www/site1 . it will tar up /var/www/site1, not /home/user1.
  • thutt
    thutt about 7 years
    Do not use a * instead of the . in the end. This does not work as it means something different ;-)
  • Xen2050
    Xen2050 about 7 years
    Using the * doesn't save any "hidden" .files or .folders. (fyi, using -C together with * fails, the shell expands the current dir, not the -C dir)
  • Gert van den Berg
    Gert van den Berg over 6 years
    The problem with wildcards is that the shell expands them to the matching filenames and that tar doesn't expand them if they are quoted...
  • Mika571
    Mika571 over 6 years
    I get a leading dot in the path of the tar e.g. ./folders how can this be removed?
  • Alex
    Alex over 5 years
    or tar cvjf name.tar.bz2 -C "$p/../" $(basename $p) to pack $p directory
  • MrCalvin
    MrCalvin over 5 years
    Why does this not work: tar -cjf site1.tar.bz2 -C /var/www/site1 /var/www/site1 ?
  • EL_DON
    EL_DON about 5 years
    Use of the -C option DID remove absolute path prefixes inside the generated tar.gz file on fedora 29. Is your answer specific to some system?
  • EL_DON
    EL_DON about 5 years
    Why are you calling it "point"? It's just ., which is the current directory. In the context of the tar.gz's structure, that's just the base/root/top level, right?
  • Admin
    Admin about 5 years
    See the snapshot image for details. In my opinion, my way is more correct to use.
  • Chinthaka Senanayaka
    Chinthaka Senanayaka about 5 years
    @EL_DON: I did not test -C option on Fedora, but ideally tar application software should work consistently on every platform unless it is a bug in tar application. -C option, I tested on Mac 10.8 and Mac 10.13 and Ubuntu (version I cannot remember). But as of tar v2.8.3, the command has been changed to tar -cf target_path/file.tar.gz source_dir and still if you add -C option it will not remove absolute path prefix inside generated tar.gz file.
  • EL_DON
    EL_DON about 5 years
    I tested again on a centOS system. After creating all the paths in the example and running the command (with -cvf added after tar), I find the resulting tar.gz file does not have absolute paths inside of it, which is consistent with several other answers. If you think tar is broken or outdated on both of the systems I've used for testing, please link to some documentation that would support your answer. I think the -C option changes directory before executing (as in other answers). When I omit it, tar tries to add junk from ./, including paths from starting from ./.
  • Chinthaka Senanayaka
    Chinthaka Senanayaka about 5 years
    I used this doc: linux.die.net/man/1/tar Yes, the doc says -C would do the path change, but on my Mac 10.13 it is not working. this can be an inconsistent behavior of tar app. That means this is a bug. If you are writing a shell script to run on all unix platforms then better be safe with running code that will work on all OSes.
  • EL_DON
    EL_DON about 5 years
    Your answer doesn't say that there may be a bug and the more robust solution for cross-platform compatibility is to cd first. Your answer says the tool works in the opposite way of how the docs say it works and how it works on my system, so it's a wrong answer. You could easily fix it.
  • Chinthaka Senanayaka
    Chinthaka Senanayaka about 5 years
    @EL_DON: Fixed the answer for platform consistent tar command, thanks
  • sdc
    sdc about 4 years
    I tried this on Ubuntu 18.04 and no luck. I'm not sure what I am missing. My stdout is displaying it correctly when I package it, but when I untar it, it still has the full path
  • dragon788
    dragon788 over 3 years
    If you use ls -A you get hidden files too, WITHOUT trying to traverse the .. and . entries, which is a common source of confusion when doing a tar or rsync where it tries to resolve symlinks.
  • thom_nic
    thom_nic over 3 years
    @Mika571 you can use find to specify the files without the leading ./ like: find "$DIR" -type f -printf '%P\0' |tar -cjvf "$OUTFILE" -C "$DIR" --null -T - (replace $DIR and $OUTFILE as appropriate.) See: stackoverflow.com/a/2596736/213983
  • fuweichin
    fuweichin over 2 years
    To pack all files in /var/www/site1, based on @Lars Brinkhoff's tar -cjf site1.tar.bz2 -C /var/www/site1 ., just replace . with $(ls -1 /var/www/site1 | tr "\n" " "), that is tar -cjf site1.tar.bz2 -C /var/www/site1 $(ls -1 /var/www/site1 | tr "\n" " ").
  • pzkpfw
    pzkpfw about 2 years
    won't this create what's commonly referred to as a "tar bomb"? I would much prefer tar -cjf site1.tar.bz2 -C /var/www site1 since that would only create the site1 folder on extraction, and not just dump all files in your current folder.
  • Cililing
    Cililing about 2 years
    -C option still not working on MacOS Big Sur 11.6.2 (bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.6)
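
On the wildcard question raised in the comments: a pattern such as *.dat is expanded by the shell in the directory the command is run from, before tar ever sees -C. Following Lars Brinkhoff's subshell suggestion, a minimal sketch could look like this. It assumes a scratch directory from mktemp; a.dat, b.dat and notes.txt are placeholder files.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout: the .dat files live in site1, not in the cwd.
work=$(mktemp -d)
mkdir -p "$work/site1" "$work/elsewhere"
touch "$work/site1/a.dat" "$work/site1/b.dat" "$work/site1/notes.txt"

cd "$work/elsewhere"   # a directory that contains no .dat files

# Inside the subshell the cwd is site1, so the shell expands *.dat there.
# "--" keeps file names that begin with "-" from being read as options.
(cd "$work/site1" && tar -cf "$work/dat.tar" -- *.dat)

listing=$(tar -tf "$work/dat.tar" | sort)
printf '%s\n' "$listing"
```

The archive path is absolute here, for the same reason as in the $d/site1.tar.bz2 form from the comments: a relative path would be resolved after the cd.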