Remove files but exclude all files in a list
Solution 1
The rm command is commented out so that you can check and verify that the script works as needed. Then just uncomment that line.
The check directory section ensures that you don't accidentally run the script from the wrong directory and clobber the wrong files.
You can remove the echo "Deleting" line to run silently.
#!/bin/bash
cd /home/me/myfolder2tocleanup/
# Exit with an error status if the directory isn't found.
if (($?>0)); then
    echo "Can't find work dir... exiting"
    exit 1
fi
for i in *; do
    # -x matches whole lines only, -F treats the name as a fixed string.
    if ! grep -qxFe "$i" filelist.txt; then
        echo "Deleting: $i"
        # The next line is commented out. Test it, then uncomment it to remove the files.
        # rm "$i"
    fi
done
Solution 2
This Python script can do this:
#!/usr/bin/env python3
import os

no_remove = set()
with open('./dont-delete.txt') as f:
    for line in f:
        no_remove.add(line.strip())

for f in os.listdir('.'):
    if f not in no_remove:
        print('unlink:' + f)
        #os.unlink(f)
The important part is to uncomment the os.unlink() call once you have checked the printed names.
NOTE: add this script and dont-delete.txt to your dont-delete.txt so that they are both on the list, and keep them in the same directory.
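The note above relies on adding the script and the list to dont-delete.txt by hand. As a minimal sketch, the same protection could be done in code. The names names_to_unlink, demo and cleanup.py are hypothetical and not part of the original answer. The sketch only reports names and never deletes anything.

```python
import os
import tempfile


def names_to_unlink(directory, list_name="dont-delete.txt", script_name=None):
    """Return the names in `directory` that are not protected.

    Protected names are the lines of the keep list, the keep list itself
    and, if given, the script's own file name (hypothetical parameter).
    """
    with open(os.path.join(directory, list_name)) as f:
        no_remove = {line.strip() for line in f if line.strip()}
    no_remove.add(list_name)
    if script_name is not None:
        no_remove.add(script_name)
    return sorted(name for name in os.listdir(directory) if name not in no_remove)


def demo():
    """Build a throwaway directory and return what would be unlinked there."""
    with tempfile.TemporaryDirectory() as d:
        for name in ("important.txt", "also-waste.txt", "cleanup.py"):
            open(os.path.join(d, name), "w").close()
        with open(os.path.join(d, "dont-delete.txt"), "w") as f:
            f.write("important.txt\n")
        return names_to_unlink(d, script_name="cleanup.py")


if __name__ == "__main__":
    print(demo())  # dry run only; nothing is deleted
```

In the throwaway directory built by demo(), only also-waste.txt is reported, because the list, the script name and important.txt are all protected.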
Solution 3
Here's a one-liner:
comm -2 -3 <(ls) <(sort dont_delete) | xargs -p rm
- ls prints all files in the current directory (in sorted order).
- sort dont_delete prints all the files we don't want to delete, in sorted order.
- The <() operator (process substitution) makes the output of a command readable as if it were a file.
- The comm command compares two pre-sorted files and prints the lines on which they differ.
- The -2 -3 flags cause comm to print only the lines contained in the first file but not the second, which is the list of files that are safe to delete. comm prints no header line, so this output can be used as it is.
- Now we have a list of files to delete on standard output. We pipe this output to xargs, which turns the output stream into a list of arguments for rm. The -p option forces xargs to ask for confirmation before executing.
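To make the comm -2 -3 step concrete, here is a rough Python sketch of the same idea: one pass over two sorted lists that keeps the lines found only in the first. The function names only_in_first and demo are made up for illustration, and the sketch assumes each name occurs at most once per list (true for a directory listing). It compares strings by code point, whereas comm uses the locale's collation order.

```python
def only_in_first(first, second):
    """Return items of sorted list `first` that are absent from sorted list `second`."""
    result = []
    j = 0
    for item in first:
        # Skip entries of `second` that sort before the current item.
        while j < len(second) and second[j] < item:
            j += 1
        # Keep the item unless `second` has the same entry at this position.
        if j == len(second) or second[j] != item:
            result.append(item)
    return result


def demo():
    """Mimic `comm -2 -3 <(ls) <(sort dont_delete)` on sample names."""
    present = sorted([
        "dontdeletethisfile.txt", "important.txt",
        "this-can-be-deleted.txt", "also-waste.txt", "never-used-it.txt",
    ])
    keep = sorted(["dontdeletethisfile.txt", "important.txt"])
    return only_in_first(present, keep)


if __name__ == "__main__":
    for name in demo():
        print(name)
```

Both inputs must be sorted with the same ordering, which is why the shell version sorts the keep list before handing it to comm.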
Solution 4
Unless the output of ls /home/me/myfolder2tocleanup/ exceeds the maximum shell argument limit ARG_MAX, which is around 2 MB on Ubuntu, I would suggest the following.
A one-line command that will do the job is as follows:
- Copy the dont-delete.txt file to the directory containing the files to be deleted, like so:
    cp dont-delete.txt /home/me/myfolder2tocleanup/
- cd to the directory containing the files to be deleted, like so:
    cd /home/me/myfolder2tocleanup/
- Do a dry run to test the command, making it print the names of the files it would delete without actually deleting them, like so:
    ls -p | grep -v / | sed 's/\<dont-delete.txt\>//g' | sort | comm -3 - <(sort dont-delete.txt) | xargs echo | tr " " "\n"
- If you are satisfied with the output, delete the files by running the command like so:
    ls -p | grep -v / | sed 's/\<dont-delete.txt\>//g' | sort | comm -3 - <(sort dont-delete.txt) | xargs rm
Explanation:
- ls -p lists all the files and directories in the current directory; the -p option appends a / to directory names.
- grep -v / excludes directories by removing all items containing a / in their names.
- sed 's/\<dont-delete.txt\>//g' excludes the dont-delete.txt file, so that it does not get deleted in the process.
- sort sorts the remaining output of ls, just to make sure.
- comm -3 - <(sort dont-delete.txt) sorts the dont-delete.txt file, compares it to the sorted output of ls and excludes filenames that exist in both.
- xargs rm removes all the remaining filenames in the processed output of ls. This means that all the items in the current directory are removed except for directories, files listed in the dont-delete.txt file and the dont-delete.txt file itself.
In the dry-run part:
- xargs echo prints the files that would be removed.
- tr " " "\n" translates spaces into newlines for easier readability.
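Notice:
In some cases, parsing the output of ls is better avoided, for example when file names contain spaces or newlines, which xargs would split into separate arguments.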
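One way to sidestep parsing ls is to ask the operating system for directory entries directly. The following is a minimal sketch, not the answer's own method: regular_files_not_listed and demo are hypothetical names, and the sketch only reports names without deleting anything. It skips directories, as ls -p | grep -v / does, and treats a name containing spaces as a single entry.

```python
import os
import tempfile


def regular_files_not_listed(directory, keep):
    """Return names of regular files in `directory` that are not in the set `keep`."""
    return sorted(
        entry.name
        for entry in os.scandir(directory)
        # Directories (and symlinks) are skipped, like `ls -p | grep -v /`.
        if entry.is_file(follow_symlinks=False) and entry.name not in keep
    )


def demo():
    """Show that a file name with a space stays in one piece."""
    with tempfile.TemporaryDirectory() as d:
        for name in ("keep me.txt", "old report.txt", "dont-delete.txt"):
            open(os.path.join(d, name), "w").close()
        os.mkdir(os.path.join(d, "subdir"))
        keep = {"keep me.txt", "dont-delete.txt"}
        return regular_files_not_listed(d, keep)


if __name__ == "__main__":
    print(demo())  # dry run only; nothing is deleted
```

With a whitespace-splitting pipeline, "old report.txt" would reach rm as two arguments, old and report.txt. Here it is handled as one file name.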
Solution 5
FWIW it looks like you can do this natively in zsh, using the (+cmd) glob qualifier.
To illustrate, let's start with some files
% ls
bar baz bazfoo keepfiles.txt foo kazoo
and a whitelist file
% cat keepfiles.txt
foo
kazoo
bar
First, read the whitelist into an array:
% keepfiles=( "${(f)$(< keepfiles.txt)}" )
or perhaps better
% zmodload zsh/mapfile
% keepfiles=( ${(f)mapfile[./keepfiles.txt]} )
(the equivalent of bash's mapfile builtin, or its synonym readarray). Now we can check whether a key (filename) exists in the array using ${keepfiles[(I)filename]}, which returns 0 if no match is found:
% print ${keepfiles[(I)foo]}
1
% print ${keepfiles[(I)baz]}
0
%
We can use this to make a function that returns true if there are no matches for $REPLY in the array:
% nokeep() { (( ${keepfiles[(I)$REPLY]} == 0 )); }
Finally, we use this function as a qualifier in our command:
% ls *(+nokeep)
baz bazfoo keepfiles.txt
or, in your case
% rm -- *(+nokeep)
(You'll likely want to add the name of the whitelist file itself to the whitelist.)
stefan83
Updated on September 18, 2022
Comments
- stefan83 (over 1 year ago):
I need to clean up a folder periodically. I get a file list that names the files which are allowed to stay. Now I have to delete all files that are not in this list.
Example:
dont-delete.txt:
dontdeletethisfile.txt
reallyimportantfile.txt
neverdeletethis.txt
important.txt
My folder to clean up contains this, for example:
ls /home/me/myfolder2tocleanup/:
dontdeletethisfile.txt reallyimportantfile.txt neverdeletethis.txt important.txt this-can-be-deleted.txt also-waste.txt never-used-it.txt
So these files should be deleted:
this-can-be-deleted.txt also-waste.txt never-used-it.txt
I am looking for a way to build a delete command with an option to exclude the files listed in a file.
- mook765 (over 7 years ago): Is this homework?
- Gujarat Santana (over 7 years ago): I hope you're not his teacher. lol
- Sergiy Kolodyazhnyy (over 7 years ago): @gujarat We're not a free homework service, so the comment is justified. As for the question itself, it may be useful to others, so it's open so far.
- Gujarat Santana (over 7 years ago): @Serg I totally agree with you.
- David Foerster (over 7 years ago): I changed your code to use a set instead of a list for O(1) instead of O(n) look-up in the second part.
- stefan83 (over 7 years ago): Thanks for your help. I'm normally a Windows guy, but Python seems to be cool =)
- stefan83 (over 7 years ago): Thx for your help, now I have my solution!
- David Foerster (over 7 years ago): @stefan83: Python runs just as well on Windows.
- David Foerster (over 7 years ago): I edited your code to avoid the useless use of ls and the useless capturing of the output of grep if all you want to know is whether there was a match or not. I also used fixed-string patterns to avoid escaping issues.
- Apologician (over 7 years ago): @DavidFoerster Thanks for the contribution. However, when you changed the while loop to a for loop, you inadvertently changed the iteration key from i to f in the declaration, which broke the code. I fixed it.
- David Foerster (over 7 years ago): Oops, force of habit. I tend to abbreviate shell variable names for file names as f. ;-P (…and +1 for your answer which I forgot earlier.)
- Jacques MALAPRADE (almost 6 years ago): I tried this with a text file of the file names separated by a newline. It ended up deleting all the files in the directory.
- nyxz (almost 6 years ago): I guess your "keep list" was wrong.
- nyxz (almost 6 years ago): I've added example usage.
- Negar (almost 5 years ago): @gardenhead, I tried your code, but it removes all files in the directory and keeps only the first and the last file in the dont-delete list. Do you have any idea about this problem? Thanks in advance.
- Tex (over 2 years ago): This is better than the accepted answer: if the keep list is of length M and you have N files to filter, this solution is O(M lg M + N).
- PesKchan (over 2 years ago): I tried this. Instead of files I have folders and sub-folders with files inside. It didn't work. Why is that?