How to make bash glob a string variable?

38,167

Solution 1

You can force another round of evaluation with eval, but that's not actually necessary. (And eval starts having serious problems the moment your file names contain special characters like $.) The problem isn't with globbing, but with the tilde expansion.

Globbing happens after variable expansion, if the variable is unquoted, as here(*):

$ x="/tm*" ; echo $x
/tmp

Another thing that happens for unquoted expansions is word splitting, which will be an issue if the patterns in question contain characters in IFS, usually whitespace. To prevent this issue, word splitting needs to be disabled by setting IFS to the empty string.

So, in the same vein, this is similar to what you did, and works:

$ IFS=
$ mkdir -p ~/public/foo/ ; touch ~/public/foo/x.launch
$ i="$HOME/public/*"; j="*.launch"; k="$i/$j"
$ echo $k
/home/foo/public/foo/x.launch

But with the tilde it doesn't:

$ i="~/public/*"; j="*.launch"; k="$i/$j"
$ echo $k
~/public/*/*.launch

This is clearly documented for Bash:

The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, ...

Tilde expansion happens before variable expansion so tildes inside variables are not expanded. The easy workaround is to use $HOME or the full path instead.
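
The difference is easy to reproduce in isolation. The following is a minimal sketch, not part of the original answer: it points HOME at a temporary directory (the name demo_home is made up for the demo) and assumes the current directory has no entry literally named ~.

```shell
#!/usr/bin/env bash
# Sketch: a tilde stored in a quoted string stays literal; $HOME does not.
demo_home=$(mktemp -d)     # hypothetical stand-in for a real home directory
HOME=$demo_home            # overridden only for this demo
mkdir -p "$HOME/public/foo"
touch "$HOME/public/foo/x.launch"

IFS=                       # disable word splitting, keep globbing

tilde_pat="~/public/*/*.launch"      # quoted: the ~ is just a character
home_pat="$HOME/public/*/*.launch"   # $HOME is expanded at assignment time

# Unquoted on purpose so that the shell globs the expanded value.
tilde_result=$(printf '%s\n' $tilde_pat)
home_result=$(printf '%s\n' $home_pat)

printf 'tilde: %s\n' "$tilde_result"   # pattern comes back unchanged
printf 'home:  %s\n' "$home_result"    # path of x.launch

rm -rf "$demo_home"
```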

(* expanding globs from variables is usually not what you want)


Another thing:

When you loop over the patterns, as here:

exclude="foo *bar"
for j in $exclude ; do
    ...

note that as $exclude is unquoted, it's both split, and also globbed at this point. So if the current directory contains something matching the pattern, it's expanded to that:

$ IFS=
$ i="$HOME/public/foo"
$ exclude="*.launch"
$ touch $i/real.launch
$ for j in $exclude ; do           # glob, no match
    printf "%s\n" "$i"/$j ; done
/home/foo/public/foo/real.launch

$ touch ./hello.launch
$ for j in $exclude ; do           # glob, matches in current dir!
    printf "%s\n" "$i"/$j ; done
/home/foo/public/foo/hello.launch  # not the expected result

To work around this, use an array variable instead of a split string:

$ IFS=
$ exclude=("*.launch")
$ exclude+=("*.not this")
$ for j in "${exclude[@]}" ; do printf "%s\n" "$i"/$j ; done
/home/foo/public/foo/real.launch
/home/foo/public/foo/some file.not this

Though note that if the patterns don't match anything, they'll by default be left as-is. So if the directory is empty, .../*.launch would be printed etc.
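
One way to drop unmatched patterns is Bash's nullglob option. The sketch below is only an illustration under assumptions: it works in a temporary directory, and the matches array and the *.missing pattern are invented for the demo.

```shell
#!/usr/bin/env bash
# Sketch: array of patterns + nullglob, so unmatched patterns vanish.
work=$(mktemp -d)                        # temporary stand-in directory
touch "$work/real.launch" "$work/some file.not this"

exclude=("*.launch" "*.not this" "*.missing")   # last one matches nothing
matches=()

saved_ifs=$IFS
IFS=                     # patterns may contain spaces: no word splitting
shopt -s nullglob        # unmatched globs expand to nothing
for pat in "${exclude[@]}"; do
    for f in "$work"/$pat; do            # $pat unquoted on purpose: glob it
        matches+=("$f")
    done
done
shopt -u nullglob
IFS=$saved_ifs

printf '%s\n' "${matches[@]}"
rm -rf "$work"
```

Note that nullglob only affects words that contain glob characters; a plain name such as Thumbs.db is still passed through whether or not it exists.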


Something similar could be done with find -path, if you don't mind what directory level the targeted files should be. E.g. to find any path ending in /e2e/*.js:

$ dirs="$HOME/public $HOME/private"
$ pattern="*/e2e/*.js"
$ find $dirs -path "$pattern"
/home/foo/public/one/two/three/e2e/asdf.js

We have to use $HOME instead of ~ for the same reason as before, and $dirs needs to be unquoted on the find command line so it gets split, but $pattern should be quoted so it isn't accidentally expanded by the shell.

(I think you could play with -maxdepth on GNU find to limit how deep the search goes, if you care, but that's a bit of a different issue.)
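
If the directory names might contain whitespace, an array avoids relying on word splitting for the directory list. A small sketch under that assumption, using a temporary tree instead of the real home directory (root and found are names made up here):

```shell
#!/usr/bin/env bash
# Sketch: find -path with the start directories kept in an array.
root=$(mktemp -d)                        # temporary stand-in for $HOME
mkdir -p "$root/public/one/e2e" "$root/private/two/src"
touch "$root/public/one/e2e/asdf.js" "$root/private/two/src/main.js"

dirs=("$root/public" "$root/private")    # one array element per directory
pattern='*/e2e/*.js'                     # quoted: find matches it, not the shell

found=$(find "${dirs[@]}" -path "$pattern")
printf '%s\n' "$found"

rm -rf "$root"
```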

Solution 2

In many cases you can store the result in an array instead of a string and let the globbing happen when you define it, then use the array later. In your case, for example:

k=(~/code/public/*/*.launch)
for i in "${k[@]}"; do

Or, in the later example, you'll need to eval some of the strings:

dirs=(~/code/private/* ~/code/public/*)
for i in "${dirs[@]}"; do
    for j in $exclude; do
        eval "for k in $i/$j; do tmutil addexclusion \"\$k\"; done"
    done
done
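
If the eval is a concern, the nested loop can probably be written without it. The sketch below makes several assumptions: a temporary directory stands in for ~/code, the pattern list is split with globbing switched off (set -f), and matching paths are only collected in a hypothetical hits array instead of being passed to tmutil.

```shell
#!/usr/bin/env bash
# Sketch: eval-free nested loop over directories and exclude patterns.
root=$(mktemp -d)                        # temporary stand-in for ~/code
mkdir -p "$root/public/proj a/e2e" "$root/public/proj a/build"
touch "$root/public/proj a/a.launch" "$root/public/proj a/e2e/t.js"

exclude='*.launch build e2e/*.js tmp'    # whitespace-separated patterns
dirs=("$root"/public/*)                  # globbed here, at definition time

set -f                                   # split $exclude without globbing it
patterns=($exclude)
set +f

hits=()
for dir in "${dirs[@]}"; do
    for pat in "${patterns[@]}"; do
        for path in "$dir"/$pat; do      # only the pattern part is globbed
            [ -e "$path" ] || continue   # skip unmatched globs, missing names
            hits+=("$path")              # e.g. tmutil addexclusion "$path"
        done
    done
done

printf '%s\n' "${hits[@]}"
rm -rf "$root"
```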

Solution 3

Old post but I stumbled in here so thought this may help others :)

There is a glob-like function in Bash: compgen -G. It prints one match per line, so readarray (Bash 4 or later) is needed to collect the results into an array.

Try this:

i="$HOME/code/public/*"   # use $HOME: a quoted ~ is not expanded
j='*.launch'
k=$i/$j # $k="$HOME/code/public/*/*.launch"
readarray -t files < <(compgen -G "$k") # $k is globbed here
for i in "${files[@]}"
do
    echo "Found -$i-"
done
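
readarray is not available in Bash 3.2 (the version mentioned in the question), so a read loop may be needed there. A minimal sketch, assuming a temporary directory in place of the real one and file names without newlines:

```shell
#!/usr/bin/env bash
# Sketch: collect compgen -G output without readarray.
work=$(mktemp -d)                        # temporary stand-in directory
mkdir -p "$work/p1" "$work/p2"
touch "$work/p1/a.launch" "$work/p2/b.launch"

pattern="$work/*/*.launch"               # built as a string, not yet globbed

files=()
while IFS= read -r line; do              # one match per line
    files+=("$line")
done < <(compgen -G "$pattern")          # the pattern is globbed here

for f in "${files[@]}"; do
    echo "Found -$f-"
done

rm -rf "$work"
```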

Solution 4

@ilkkachu's answer solved the main globbing issue. Full credit to him.

V1

However, because exclude contains entries both with and without wildcards (*), and some of them may not exist at all, extra checking is needed after globbing $i/$j. I am sharing my findings here.

#!/bin/bash
exclude="
*.launch
.DS_Store
.classpath
.sass-cache
.settings
Thumbs.db
bower_components
build
connect.lock
coverage
dist
e2e/*.js
e2e/*.map
libpeerconnection.log
node_modules
npm-debug.log
testem.log
tmp
typings
"

dirs="
$HOME/code/private/*
$HOME/code/public/*
"

# loop $dirs
for i in $dirs; do
    for j in $exclude ; do
        for k in "$i"/$j; do
            echo -e "$k"
            if [ -f "$k" ] || [ -d "$k" ] ; then
                # Only execute command if dir/file exist
                echo -e "\t^^^ Above file/dir exist! ^^^"
            fi
        done
    done
done

Output Explanation

Following is the partial output to explain the situation.

/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/a.launch
    ^^^ Above file/dir exist! ^^^
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/b.launch
    ^^^ Above file/dir exist! ^^^
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/.DS_Store
    ^^^ Above file/dir exist! ^^^

The above are self-explanatory.

/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/.classpath
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/.sass-cache
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/.settings
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/Thumbs.db
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/bower_components
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/build
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/connect.lock
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/coverage
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/dist

The above show up because the exclude entry ($j) has no wildcard, so $i/$j is just a plain string concatenation. It is printed even though the file/dir does not exist.

/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/e2e/*.js
/Volumes/HD2/JS/code/public/simple-api-example-ng2-express/e2e/*.map

The above show up because the exclude entry ($j) contains a wildcard but matches no file or directory, so the globbing of $i/$j just returns the original string.
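
The three cases above (a match, a plain name that does not exist, a wildcard without a match) can be folded into a single existence check. A minimal sketch, where expand_existing is a hypothetical helper and the project directory is a temporary stand-in:

```shell
#!/usr/bin/env bash
# Sketch: expand a pattern and keep only paths that really exist.
expand_existing() {      # hypothetical helper, not from the original script
    local IFS=           # no word splitting inside this function
    local path
    for path in $1; do   # unquoted on purpose: pathname expansion
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

proj=$(mktemp -d)                        # temporary stand-in project
touch "$proj/a.launch" "$proj/b.launch"

matched=$(expand_existing "$proj/*.launch")     # wildcard with matches
literal=$(expand_existing "$proj/.classpath")   # plain name, missing
unmatched=$(expand_existing "$proj/e2e/*.js")   # wildcard, no match

printf 'matched:\n%s\n' "$matched"
rm -rf "$proj"
```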

V2

V2 uses single quotes, eval and shopt -s nullglob to get a clean result. No final file/dir check is required.

#!/bin/bash
exclude='
*.launch
.sass-cache
Thumbs.db
bower_components
build
connect.lock
coverage
dist
e2e/*.js
e2e/*.map
libpeerconnection.log
node_modules
npm-debug.log
testem.log
tmp
typings
'

dirs='
$HOME/code/private/*
$HOME/code/public/*
'

for i in $dirs; do
    for j in $exclude ; do
        shopt -s nullglob
        eval "k=$i/$j"
        for l in $k; do
            echo "$l"
        done
        shopt -u nullglob
    done
done

Solution 5

With zsh:

exclude='
*.launch
.classpath
.sass-cache
Thumbs.db
...
'

dirs=(
~/code/private/*
~/code/public/*
)

for f ($^dirs/${^${=~exclude}}(N)) {
  echo $f
}

${^array}string expands to $array[1]string $array[2]string.... $=var performs word splitting on the variable (something other shells do by default!), and $~var performs globbing on the variable (something other shells also do by default, usually when you don't want them to; in other shells you would have had to quote $f above).

(N) is a glob qualifier that turns on nullglob for each of the globs resulting from that $^array1/$^array2 expansion. That makes the globs expand to nothing when they don't match. It also happens to turn a non-glob like ~/code/private/foo/Thumbs.db into one, which means that if that particular file doesn't exist, it's not included.

John Siu
Author by

John Siu

Updated on September 18, 2022

Comments

  • John Siu
    John Siu over 1 year

    System Info

    OS: OS X

    bash: GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin16)

    Background

    I want Time Machine to exclude a set of directories and files from all my git/nodejs projects. My project directories are in ~/code/private/ and ~/code/public/ so I'm trying to use bash looping to do the tmutil.

    Issue

    Short Version

    If I have a calculated string variable k, how do I make it glob in or right before a for-loop:

    i='~/code/public/*'
    j='*.launch'
    k=$i/$j # $k='~/code/public/*/*.launch'
    
    for i in $k # I need $k to glob here
    do
        echo $i
    done
    

    In the long version below, you will see k=$i/$j. So I cannot hardcode the string in the for loop.

    Long Version

    #!/bin/bash
    exclude='
    *.launch
    .classpath
    .sass-cache
    Thumbs.db
    bower_components
    build
    connect.lock
    coverage
    dist
    e2e/*.js
    e2e/*.map
    libpeerconnection.log
    node_modules
    npm-debug.log
    testem.log
    tmp
    typings
    '
    
    dirs='
    ~/code/private/*
    ~/code/public/*
    '
    
    for i in $dirs
    do
        for j in $exclude
        do
            k=$i/$j # It is correct up to this line
    
            for l in $k # I need it glob here
            do
                echo $l
            #   Command I want to execute
            #   tmutil addexclusion $l
            done
        done
    done
    

    Output

    They are not globbed. Not what I want.

    ~/code/private/*/*.launch                                                                                   
    ~/code/private/*/.DS_Store                                                                                  
    ~/code/private/*/.classpath                                                                                 
    ~/code/private/*/.sass-cache                                                                                
    ~/code/private/*/.settings                                                                                  
    ~/code/private/*/Thumbs.db                                                                                  
    ~/code/private/*/bower_components                                                                           
    ~/code/private/*/build                                                                                      
    ~/code/private/*/connect.lock                                                                               
    ~/code/private/*/coverage                                                                                   
    ~/code/private/*/dist                                                                                       
    ~/code/private/*/e2e/*.js                                                                                   
    ~/code/private/*/e2e/*.map                                                                                  
    ~/code/private/*/libpeerconnection.log                                                                      
    ~/code/private/*/node_modules                                                                               
    ~/code/private/*/npm-debug.log                                                                              
    ~/code/private/*/testem.log                                                                                 
    ~/code/private/*/tmp                                                                                        
    ~/code/private/*/typings                                                                                    
    ~/code/public/*/*.launch                                                                                    
    ~/code/public/*/.DS_Store                                                                                   
    ~/code/public/*/.classpath                                                                                  
    ~/code/public/*/.sass-cache                                                                                 
    ~/code/public/*/.settings                                                                                   
    ~/code/public/*/Thumbs.db                                                                                   
    ~/code/public/*/bower_components                                                                            
    ~/code/public/*/build                                                                                       
    ~/code/public/*/connect.lock                                                                                
    ~/code/public/*/coverage                                                                                    
    ~/code/public/*/dist                                                                                        
    ~/code/public/*/e2e/*.js                                                                                    
    ~/code/public/*/e2e/*.map                                                                                   
    ~/code/public/*/libpeerconnection.log                                                                       
    ~/code/public/*/node_modules                                                                                
    ~/code/public/*/npm-debug.log                                                                               
    ~/code/public/*/testem.log                                                                                  
    ~/code/public/*/tmp                                                                                         
    ~/code/public/*/typings
    
    • Thomas N
      Thomas N over 7 years
      Single quotes stop shell interpolation in Bash, so you might try double-quoting your variable.
    • John Siu
      John Siu over 7 years
      @ThomasN no, that does not work. k is a calculated string, and I need it to stay that way until the loop. Please check my long version.
    • John Siu
      John Siu over 7 years
      @ThomasN I updated the short version to make it clearer.
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    Note how $exclude contains wildcards, you'd need to disable globbing before using the split+glob operator on it and restore it for the $i/$j and not use eval but use "$i"/$j
  • John Siu
    John Siu over 7 years
    Are you the one who answered with find? I actually am exploring that route too, as the for-loop is getting complicated. But I am having difficulty with the '-path'.
  • John Siu
    John Siu over 7 years
    Credit to you as your info about tilde '~' is more direct to the main issue. I will post the final script and explanation in another answer. But full credit to you :D
  • John Siu
    John Siu over 7 years
    Both you and ilkkachu gave good answers. However, his answer identified the issue, so credit goes to him.
  • ilkkachu
    ilkkachu over 7 years
    @JohnSiu, yeah, using find was what first came to mind. It might be usable too, depending on the exact need. (or better too, for some uses.)
  • ilkkachu
    ilkkachu over 7 years
    @JohnSiu, updated again, to explicitly mention an issue I think Stéphane was pointing out in the comments to another answer
  • John Siu
    John Siu over 7 years
    Stéphane's comment is not very relevant in this case. You can check out my final script at the bottom. However, I will probably try using find, as that should eliminate the need to check whether the file/dir exists.
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    i=~/public/* will work. It's just that ~ is not expanded inside double quotes.
  • John Siu
    John Siu over 7 years
    This is really nice. I tested it and it works. However, it seems zsh is more sensitive to newlines when using single quotes. The way exclude is enclosed affects the output.
  • John Siu
    John Siu over 7 years
    I came up with a v2 for bash, which is cleaner, but still not as compact as your zsh script, lol
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    One problem is that in for j in $exclude, the globs in $exclude could get expanded at the time of that $exclude expansion (and calling eval on that is asking for trouble). You'd want globbing enabled for for i in $dir, and for l in $k, but not for for j in $exclude. You'd want a set -f before the latter and a set +f for the other. More generally, you'd want to tune your split+glob operator before using it. In any case, you don't want split+glob for echo $l, so $l should be quoted there.
  • John Siu
    John Siu over 7 years
    @StéphaneChazelas are you referring to v1 or v2? For v2, both exclude and dirs are in single quotes ('), so there is no globbing until eval.
  • Stéphane Chazelas
    Stéphane Chazelas over 7 years
    Globbing takes place upon unquoted variable expansion in list contexts, that (leaving a variable unquoted) is what we sometimes call the split+glob operator. There's no globbing in assignments to scalar variables. foo=* and foo='*' is the same. But echo $foo and echo "$foo" are not (in shells like bash, it's been fixed in shells like zsh, fish or rc, see also the link above). Here you do want to use that operator, but in some places only the split part, and in others only the glob part.
  • John Siu
    John Siu over 7 years
    @StéphaneChazelas Thanks for the info!!! It took me some time but I understand the concern now. This is very valuable!! Thank you!!!
  • ilkkachu
    ilkkachu about 6 years
    @kevinarpe, I think arrays are basically meant for just that, and yes, "${array[@]}" (with the quotes!) is documented (see here and here) to expand to the elements as distinct words without splitting them further.
  • Sixtyfive
    Sixtyfive over 4 years
    x="/tm*" might work, but x="/tm{p,q}" doesn't. Is there a way to make that work, too?
  • ilkkachu
    ilkkachu over 4 years
    @sixtyfive, brace expansion happens before variable expansion, so no. But in that particular case (with single-character variation and if you're looking for existing files) you can use x="/tm[pq]".
  • Sixtyfive
    Sixtyfive over 4 years
    Thanks, ilkkachu, I didn't even know about []. Would make a good addition to the answer, too! :)
  • ilkkachu
    ilkkachu over 4 years
    @sixtyfive, well, [abc] is a standard part of glob patterns, like ?, I don't think it's necessary to go cover all of them here.
  • lbt
    lbt over 3 years
    There is a kinda-glob() function in bash... it's called compgen -G I added an answer below