How to find folders over 1GB and execute another command in the Linux terminal?

Solution 1

A common problem when dealing with file/directory names is when they contain whitespace. *nix filepaths can even contain \n newlines. To get around all whitespace issues, you need to work with a null delimiter \x00.
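The effect of a null delimiter can be seen in a small sketch. The helper names `list_dirs_null` and `count_null_records` are hypothetical and only illustrate the idea; they assume a `find` that supports `-mindepth` and `-print0` (GNU and BSD find both do).

```shell
# list_dirs_null DIR: print every sub-directory of DIR, each path ending in \0.
list_dirs_null() {
    find "$1" -mindepth 1 -type d -print0
}

# count_null_records: count the \0-terminated records arriving on stdin.
count_null_records() {
    tr -cd '\0' | wc -c | tr -d ' '
}
```

A directory whose name contains a newline is counted once by `list_dirs_null DIR | count_null_records`, whereas a newline-delimited listing piped to `wc -l` would count it twice. Tools such as `sort -z` and `xargs -0` consume the same null-terminated stream.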

#!/bin/bash
#
# Parameter 1 ("$1"):  Remove sub-directories from this directory
# Parameter 2 ("$2"):  Remove sub-directories larger than this many bytes 
#
# Example: to remove sub-directories bigger than 1 GB (2**30 bytes) from your HOME directory
#   
#    script "$HOME"  $((2**30))     
#        
dir="$1"; shopt -s extglob; dir="${dir%%+(/)}"  # remove trailing / from directory path
[[ -d "$dir" ]] || { echo "\$1: directory NOT found: $1"; exit 1; }

size=$2  # size in bytes
[[ -z $2 || -n ${2//[0-9]} ]] && { echo "\$2: size-threshold must be numeric: $2"; exit 2; }

du -0b "$dir" |                        # output with \x00 as end-of-path
 sort -zrn  |                          # sort dirs, largest first
  awk -vRS="\x00" -vORS="\x00" -v"size=$size" -v"dir=$dir" '{
     if( $1<=size ) next               # filter by size; skip small dirs
     match( $0, "\x09" )               # find du TAB-delimiter
     path = substr( $0, RSTART+1 )     # get directory path
     if( path ~ "^"dir"/*$" ) next     # filter base dir; do not kill it!
     for( p in seen )                  # skip dirs inside a dir already printed
        if( index( path, p "/" ) == 1 ) next
     seen[path] = 1; print path }' |
   xargs -0 -I{} echo rm -vr -- {}     # remove the `echo` to run live!!!!
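The "parent directories only" step can be looked at in isolation. The following is a minimal sketch, assuming one path per line (so no newlines in names) and input already sorted largest first, so that a parent always arrives before its children. `prune_nested` is a hypothetical helper name, not part of the script above.

```shell
# prune_nested: read one path per line (largest directory first) and print
# only the paths that do not lie inside a path that was already printed.
prune_nested() {
    awk '{
        for (p in kept)
            if (index($0, p "/") == 1) next   # inside an already-kept dir
        kept[$0] = 1
        print
    }'
}
```

Comparing against `p "/"` instead of a bare prefix keeps a sibling such as `/t/ab` from being mistaken for a child of `/t/a`.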

Solution 2

Caution: this deletes all files and directories larger than 1GB in the given path.

du -s -t1000000000 /some/path/* | cut -f2- | xargs -d '\n' rm -rf --
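Before deleting anything, it may help to list what would be selected. The sketch below assumes GNU du (for `-b` and `--threshold`) and names without newlines; `list_big_children` is a hypothetical helper name.

```shell
# list_big_children DIR BYTES: print the entries directly under DIR whose
# total apparent size is at least BYTES, one per line. Nothing is deleted.
list_big_children() {
    du -sb --threshold="$2" -- "$1"/* | cut -f2-
}
```

For example, `list_big_children /some/path 1000000000` prints the candidates. Once the list looks right, it could be piped to `xargs -d '\n' rm -rf --`.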

Solution 3

To find folders larger than 10G:

du -h /mnt/backup/ | awk -F'\t' '$1 ~ /^[0-9.]+G$/ && $1+0 > 10.0' | sort -nr

You can change the 10.0 to any number and /mnt/backup to any path. The command prints the matching folders with their sizes as du -h reports them (for example 12G). Sizes that du reports in T are not matched by this pattern.
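An alternative that avoids parsing human-readable units is to compare byte counts directly. This is a sketch under the assumption that the input comes from GNU `du -b` (size, TAB, path per line); `filter_over_bytes` is a hypothetical helper name.

```shell
# filter_over_bytes MIN: read "bytes<TAB>path" lines (as printed by du -b)
# on stdin and print those whose size exceeds MIN bytes.
filter_over_bytes() {
    awk -F'\t' -v min="$1" '($1 + 0) > (min + 0)'
}

# Example (not run here): directories under /mnt/backup above 10 GiB.
#   du -b /mnt/backup | filter_over_bytes "$(numfmt --from=iec 10G)" | sort -nr
```

Because the comparison is numeric, the same filter works for any threshold, and paths containing spaces or the letter G pass through unchanged.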

Solution 4

What you're asking for is a terrible idea, mostly because of how sizes nest: if a folder foo contains more than 1GB, every parent folder of foo also contains more than 1GB (because it contains foo).

Thus, if you scan /home/myuser/myfolder/ for large folders and /home/myuser/myfolder/bar/quz/baz/foo is one, then /home/myuser/myfolder/bar/quz/baz, /home/myuser/myfolder/bar/quz/, /home/myuser/myfolder/bar/, and /home/myuser/myfolder/ will all be marked for deletion.

You can get around this with the -S option to du, which reports each directory's size without the sizes of its sub-directories.

This gives a result (THAT I DO NOT RECOMMEND RUNNING)

du -Sb "$DIR" | grep '^[0-9]\{10\}' | cut -f 2- | xargs -d "\n" rm -rf --

This will fail on directories whose names contain newline characters. Fixing it to not have that flaw is left as an exercise to the reader.
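One possible way to handle newlines is to have du terminate each record with a null byte and split the size from the path in bash. This is a sketch, assuming GNU du (for `-0`, `-S`, `-b`) and bash (for `read -d ''`); `list_big_own` is a hypothetical helper name, and it only lists candidates.

```shell
# list_big_own DIR BYTES: print, \0-terminated, every directory under DIR whose
# own files (sub-directories excluded, via du -S) total more than BYTES bytes.
list_big_own() {
    local rec size path
    du -0Sb -- "$1" |
    while IFS= read -r -d '' rec; do
        size=${rec%%$'\t'*}     # text before the first TAB: size in bytes
        path=${rec#*$'\t'}      # text after the first TAB: path, newlines intact
        if [ "$size" -gt "$2" ]; then
            printf '%s\0' "$path"
        fi
    done
}
```

The output can be reviewed with `list_big_own "$DIR" 1000000000 | xargs -0 -n1 echo` before anything is passed to `rm`. Note that this sketch does not exclude `$DIR` itself if its own files exceed the threshold.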

If you want another size, make up a regex to match it. du -b returns sizes in bytes, so work from there. HINT: 365,000,000 bytes (365MB) or more would be '^\([0-9]\{10\}\|[4-9][0-9]\{8\}\|3[7-9][0-9]\{7\}\|36[5-9][0-9]\{6\}\)'.
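Size patterns like this are easy to get subtly wrong, so it can be worth checking the boundary values. The sketch below uses extended regex syntax (`grep -E`) and writes the last alternative as `36[5-9]` so that exactly 365,000,000 bytes is accepted; `matches_size_regex` is a hypothetical helper name.

```shell
# matches_size_regex N: succeed if N (a byte count) matches a pattern meant
# to accept 365,000,000 bytes or more.
matches_size_regex() {
    printf '%s\n' "$1" |
        grep -Eq '^([0-9]{10}|[4-9][0-9]{8}|3[7-9][0-9]{7}|36[5-9][0-9]{6})'
}
```

Calling it with 365000000 and 364999999 shows whether the lower boundary sits where intended.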

Author: xmux

Updated on September 18, 2022

Comments

  • xmux
    xmux over 1 year

    I want to find the folders whose sizes are over 1GB and then delete them.

    I found some commands like

    find /some/path -type d -size +1G -exec ls {} \;
    

    or

    du -h /some/path | grep ^[0-9.]*G
    

    or (over 600M)

    du -h /some/path/ | grep ^[6-9][0-9][0-9][0-9.]*M | sort
    

    But these commands are not really helping me: the find command does not find any folders, although there are folders over 1GB, because Linux reports the folders themselves as only a few KB. Is there any command to achieve that?

  • xmux
    xmux almost 12 years
    can u give me an example script for this situation? thanks
  • darnir
    darnir almost 12 years
    See if someone can help you with the sed script. I have just started learning sed. Will try and cook up a script if I can.
  • xmux
    xmux almost 12 years
    where should i write the path in this script?
  • xmux
    xmux almost 12 years
    yes i know i want to delete all the folders because if they are over 1GB i dont need them any more..
  • Peter.O
    Peter.O almost 12 years
    If you save the script as rm-dirs-gt, then you can just call it from the command line as: rm-dirs-gt "$HOME" (or whatever path you choose) ... The size threshold, which must be specified in bytes, is currently hard-coded as $((2**30)), which is 1GB ... It would be quite simple to have it as a second parameter ... I'll add the size-threshold parameter to my answer.
  • xmux
    xmux almost 12 years
    do u know how can i exclude the main folder from remove?
  • xmux
    xmux almost 12 years
    i think i found out, but the problem is that /path/ is getting removed also: du -h /some/path/ | grep ^[0-9.]*G | cut -f 2- | xargs echo rm -rf
  • xmux
    xmux almost 12 years
    i got du illegal option -- 0 and awk: invalid -v option error from this script
  • Peter.O
    Peter.O almost 12 years
    Which version of du and awk are you using? Mine is du (GNU coreutils) 7.4, and GNU Awk 3.1.6. With awk the variables can also be passed as args, but the -v method is the preferred way.
  • xmux
    xmux almost 12 years
    sorry, my mistake, the code works! can you exclude the main folder from removing?
  • Peter.O
    Peter.O almost 12 years
    It should already be excluded... and that's why the echo before the rm is very important while testing... Are you actually seeing the main directory being shown as a command for removal? ... This is the line which should take care of that: if( path ~ "^"dir"/*$" ) next # filter base dir; do not kill it! ... together with the initial stripping of / from the directory parameter $1
  • xmux
    xmux almost 12 years
    ohh the old script was ok! now with this new script where should i write the directory path?
  • xmux
    xmux almost 12 years
    you are right i dont see the main directory.. the old script works great thanks!
  • xmux
    xmux almost 12 years
    i changed this to script "/some/path/" $((2**30)) and i got this: /some/path/: Is a directory Terminated
  • Peter.O
    Peter.O almost 12 years
    script is just a generic placeholder for whatever name you choose to save the script as: just use the name you saved it as (instead of "script") ... Note that you must make scripts executable by running the command chmod +x my-script ... (where my-script is whatever name you decide to save it as) ... A good, commonly used directory to put your scripts in is a subdirectory called bin in your home directory ... Also, you can add that bin directory to your $PATH, so you can run the script via its name only; otherwise you need to specify its path.
  • xmux
    xmux almost 12 years
    is there any way remove the main folder from selection? for example /home/myuser/myfolder/ will be not removed with that command
  • zebediah49
    zebediah49 almost 12 years
    "You can get around this with the -S option to du."-- though that will still get flagged if that directory itself has too many large files. Solution I guess would be du -Sb $DIR/*, so you're not ever running it on the $DIR in question.
  • Burgi
    Burgi over 6 years
    This appears to be very similar to other answers. Can you go into a little more detail on what is different with your answer?