Bash script error with strings with paths that have spaces and wildcards

Solution 1

Inside quotes, the * will not expand to a list of files. To use such a wildcard successfully, it must be outside of quotes.

Even if the wildcard did expand, the expression "${FILES}" would result in a single string, not a list of files.
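
A small experiment makes both points concrete. This is only a sketch: it builds a throwaway directory with mktemp instead of touching /home/john/my directory, and the names demo_dir, a.txt and "b c.txt" are made up for illustration.

```bash
#!/bin/bash
# Throwaway directory whose name contains a space (hypothetical names).
demo_dir="$(mktemp -d)/my directory"
mkdir -p "$demo_dir"
touch "$demo_dir/a.txt" "$demo_dir/b c.txt"

# Quoted: the * stays literal, so the loop runs once with the pattern itself.
quoted_count=0
for f in "$demo_dir/*.txt"; do
  quoted_count=$((quoted_count + 1))
  quoted_last=$f
done

# Directory quoted, wildcard unquoted: the glob expands to one word per file.
glob_count=0
for f in "$demo_dir"/*.txt; do
  glob_count=$((glob_count + 1))
done

printf 'quoted: %d iteration(s), last value: %s\n' "$quoted_count" "$quoted_last"
printf 'unquoted wildcard: %d iteration(s)\n' "$glob_count"

rm -rf "${demo_dir%/*}"
```

The quoted loop sees a single word that still ends in the literal *.txt, while the second loop sees one word per matching file.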

One approach that works:

#!/bin/bash
DIR="/home/john/my directory"
for f in "$DIR"/*.txt
do
  echo "${f}"
done

In the above, file names with spaces or other difficult characters will be handled correctly.

A more advanced approach could use bash arrays:

#!/bin/bash
FILES=("/home/john/my directory/"*.txt)
for f in "${FILES[@]}"
do
  echo "${f}"
done

In this case, FILES is an array of file names. The parens surrounding the definition make it an array. Note that the * is outside of quotes. The construct "${FILES[@]}" is a special case: it will expand to a list of strings where each string is one of the file names. File names with spaces or other difficult characters will be handled correctly.
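
To see what the special case buys you, here is a rough comparison of "${FILES[@]}" with "${FILES[*]}". It is a sketch that assumes a temporary directory with two made-up files; demo_dir and the file names are not from the question.

```bash
#!/bin/bash
# Hypothetical sample files in a temporary directory with a space in its name.
demo_dir="$(mktemp -d)/my directory"
mkdir -p "$demo_dir"
touch "$demo_dir/one.txt" "$demo_dir/two words.txt"

FILES=("$demo_dir/"*.txt)

# "${FILES[@]}" yields one word per element; "${FILES[*]}" joins them into one.
at_words=0
for f in "${FILES[@]}"; do at_words=$((at_words + 1)); done
star_words=0
for f in "${FILES[*]}"; do star_words=$((star_words + 1)); done

echo "elements in array: ${#FILES[@]}"
echo "words from [@]:    $at_words"
echo "words from [*]:    $star_words"

rm -rf "${demo_dir%/*}"
```

${#FILES[@]} is also a convenient way to check how many files matched before looping.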

Solution 2

While using arrays as shown by John1024 makes a lot more sense here, you can also use the split+glob operator (that is, leave a scalar variable unquoted).

Since you only want the glob part of that operator, you need to disable the split part:

#! /bin/sh -
# that also works in any sh, so you don't even need to have or use bash

file_pattern="/home/john/my directory/*.txt"
# all uppercase variables should be reserved for environment variables

IFS='' # disable splitting

for f in $file_pattern # here we're not quoting the variable so
                       # we're invoking the split+glob operator.
do
  printf '%s\n' "$f" # avoid the non-reliable, non-portable "echo"
done
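
Setting IFS at the top level changes it for the rest of the script. One possible way to confine the change is to wrap the loop in a function whose body is a subshell. The sketch below assumes a helper named list_matches and a temporary sample directory, both invented for illustration; mktemp is not part of POSIX but is widely available.

```bash
#!/bin/sh -
# Hypothetical helper: prints every file matching the pattern in $1,
# one per line. The body is wrapped in ( ), so it runs in a subshell and
# the IFS change is discarded when the function returns.
list_matches() (
  IFS=''
  for f in $1; do               # unquoted on purpose: glob only, no splitting
    [ -e "$f" ] || continue     # skip the literal pattern when nothing matches
    printf '%s\n' "$f"
  done
)

demo_dir="$(mktemp -d)/my directory"
mkdir -p "$demo_dir"
: > "$demo_dir/a.txt"
: > "$demo_dir/b c.txt"

matches=$(list_matches "$demo_dir/*.txt")
printf '%s\n' "$matches"

rm -rf "${demo_dir%/*}"
```

After the call, IFS in the calling shell still has its original value, so later unquoted expansions behave as usual.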

Solution 3

One-line solution (to run in a terminal):
/usr/bin/find "./" -maxdepth 1 -not -type d -iname "*.txt" -print0 | while IFS= read -r -d $'\0' f ; do { echo "${f}"; }; done; unset f;

For the OP's case, change "./" to "/home/john/my directory/".

To use in a script file:

#!/bin/bash
/usr/bin/find "./" -maxdepth 1 -not -type d -iname "*.txt" -print0 | while IFS= read -r -d $'\0' f ; do {
    echo "${f}";
    # your other commands/codes, etc
};
done;
unset f;

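The same can be achieved this way, which is recommended because the while loop then runs in the current shell instead of a pipeline subshell, so variables set in the loop survive it: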

#!/bin/bash
while IFS= read -r -d $'\0' f ; do {
    echo "${f}";
    # your other commands/codes, etc
};
done < <(/usr/bin/find "./" -maxdepth 1 -not -type d -iname "*.txt" -print0);
unset f;

Brief description:

"./" : the directory to search; here, the current directory. Specify any directory path.
-not -type d : -not negates the test that follows it, and -type d matches directories, so together they skip directories. Use f instead of d to skip regular files, or l instead of d to skip symlinks.
-maxdepth 1 : limits the search to entries directly inside the given directory (one level). To also search its immediate subdirectories, set maxdepth to 2. If -maxdepth is not used, find searches recursively through all subdirectories.
-iname "*.txt" : matches file names while ignoring upper/lower case in the name and extension. -name does the same but is case-sensitive. -lname is different: it matches the target of a symlink against the pattern.
-print0 : prints the path name of each file found to standard output, followed by an ASCII NUL character (character code 0), which read in the while loop later uses as the delimiter.
IFS= : an empty IFS stops read from trimming leading and trailing whitespace, which matters when a file name begins or ends with a space. The NUL separator produced by -print0 is handled by the -d option of read, not by IFS.
read -r -d $'\0' f : -d $'\0' makes read stop at each NUL character (in bash, $'\0' is equivalent to ''). -r is used in case a file name contains a backslash. From the bash manual:
   read [-ers] [-a aname] [-d delim] [-i text] [-n nchars] [-N nchars] [-p prompt] [-t timeout] [-u fd] [name ...] [...]
    -r     Backslash does not act as an escape character. The backslash is considered to be part of the line. In particular, a backslash-newline pair may not be used as a line continuation.
done < <(...) : process substitution feeds the output of find into the read of the while loop. More info: https://www.gnu.org/software/bash/manual/html_node/Process-Substitution.html , https://wiki.bash-hackers.org/syntax/expansion/proc_subst , https://tldp.org/LDP/abs/html/abs-guide.html#PROCESS-SUB , https://tldp.org/LDP/abs/html/abs-guide.html#PSUBP
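
The practical difference between the pipeline form and the process-substitution form can be sketched as follows. The example assumes a temporary directory with made-up entries, and it assumes bash's lastpipe option is off, which is the default. The names demo_dir, piped and found are illustrative only.

```bash
#!/bin/bash
# Hypothetical sample tree: two .txt files (one with an upper-case extension)
# and a directory whose name also ends in .txt.
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.txt" "$demo_dir/B C.TXT"
mkdir "$demo_dir/sub.txt"

# Pipeline: the loop body runs in a subshell, so the counter is lost afterwards.
piped=0
find "$demo_dir" -maxdepth 1 -not -type d -iname "*.txt" -print0 |
  while IFS= read -r -d '' f; do piped=$((piped + 1)); done

# Process substitution: the loop runs in the current shell, so results persist.
found=()
while IFS= read -r -d '' f; do
  found+=("$f")
done < <(find "$demo_dir" -maxdepth 1 -not -type d -iname "*.txt" -print0)

echo "counter after pipeline: $piped"
echo "files collected via process substitution: ${#found[@]}"

rm -rf "$demo_dir"
```

Collecting the results in an array, as in the second loop, lets the rest of the script reuse the list without running find again.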


The other answer, by @John1024, shows a great bash-only solution that does not use find, an external utility.
find is effective and fast; I prefer it when there are very many files to handle.

In @John1024's solution, the loop prints the pattern itself when no file in the directory matches, so the [ ! -e "${f}" ] line below is used to skip that case.
Here is a one-line version to use directly in a terminal:
DIR="/home/john/my directory/" ; for f in "$DIR"*.txt ; do { [ ! -e "${f}" ] && continue; echo "${f}"; }; done; unset DIR;

Here is a script:

#!/bin/bash
DIR="/home/john/my directory/";
for f in "$DIR"*.txt ; do {
    [ ! -e "${f}" ] && continue;
    echo "${f}";
    # your codes/commands, etc
};
done;
unset DIR;

Note: if the directory in DIR already ends with a "/" slash, the pattern does not need another "/".
Alternatively, do the opposite: leave the trailing "/" off DIR and put it in the pattern instead: "$DIR"/*.txt
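
If the value of DIR comes from elsewhere and may or may not carry the trailing slash, one option is to normalize it with the ${var%/} expansion. A minimal sketch, using two made-up variables for the two spellings:

```bash
#!/bin/bash
# Both spellings of the (hypothetical) directory normalize to the same prefix.
with_slash="/home/john/my directory/"
without_slash="/home/john/my directory"

# ${var%/} removes one trailing slash if present and leaves the value alone otherwise.
pattern_a="${with_slash%/}/"
pattern_b="${without_slash%/}/"

printf '%s\n' "$pattern_a" "$pattern_b"
```

In a loop this becomes for f in "${DIR%/}"/*.txt, which produces exactly one slash either way.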


The extra [ ! -e "${f}" ] check can be avoided if the shell option (aka "shopt") below is enabled:
shopt -s nullglob
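
The effect of nullglob is easy to observe on a directory with no matching files. This is a sketch with a made-up temporary directory (empty_dir); it counts loop iterations with the option off and then on.

```bash
#!/bin/bash
empty_dir="$(mktemp -d)"        # hypothetical directory with no .txt files

shopt -u nullglob               # start from bash's default
without=0
for f in "$empty_dir/"*.txt; do without=$((without + 1)); done

shopt -s nullglob
with=0
for f in "$empty_dir/"*.txt; do with=$((with + 1)); done
shopt -u nullglob               # back to the default for the rest of the script

echo "iterations without nullglob: $without"
echo "iterations with nullglob:    $with"

rmdir "$empty_dir"
```

Without the option the loop runs once on the unexpanded pattern; with it the loop body is skipped entirely.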

If a script changes a shopt state and that change leaks into other bash code (for example when the script is sourced, or when the code lives in a shared function), it can cause unexpected problems there.

To keep behavior consistent, a script should save the state of each shell option before changing it and, once its main work is done, set the option back to its previous value.

The scripts below use backtick (aka grave accent, aka backquote) `...` command substitution. Note that backticks run the command in a subshell exactly as $(...) does, so neither form is faster; they differ in how backslashes and nesting are handled inside them. $(...) is easier to nest and is generally preferred in new scripts, while backticks are the older form. If you prefer $(...), use that; everyone is free to choose.

Here is the script above again, this time restoring the previous shopt setting before it ends:

#!/bin/bash
DIR="/home/john/my directory/";

ub="/usr/bin";

# shopt = shell option(s).
# Assume nullglob was previously enabled ("on"/1):
p_nullglob=1;
# "shopt -s" without names lists the enabled options, so if nullglob
# is NOT in that list, record the previous state as 0:
[ "`shopt -s | ${ub}/grep nullglob`" == "" ] && p_nullglob=0;
# Enable the "nullglob" shell option:
shopt -s nullglob;

# previous code, but without the extra checking [ ! -e "${f}" ]... line:
for f in "$DIR"*.txt ; do {
    echo "${f}";
    # your codes/commands, etc
};
done;

# nullglob is now enabled because we turned it on above,
# so disable it again only if it was disabled before:
[ "$p_nullglob" -eq "0" ] && shopt -u nullglob;

unset DIR;
unset p_nullglob ub;

The output of shopt -p shoptName (for example: shopt -p dotglob) is
either this: shopt -u shoptName (u = unset/disabled/off/0)
or this: shopt -s shoptName (s = set/enabled/on/1)
The letter "s" or "u" is always at position 7, because string positions in bash start at 0 (the first character of a string is at position 0).
We can extract this "u" or "s" and store it in a variable, so that we can use it to restore the previous state.
Saving and restoring the shopt state this way also avoids the external tool grep.
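
Bash also provides shopt -q, which prints nothing and reports an option's state through its exit status. A minimal sketch of the same save/restore idea built on it; the variable names was_set and now_set are made up here.

```bash
#!/bin/bash
# Remember whether nullglob was already on; shopt -q is silent and
# returns 0 when the option is set, non-zero otherwise.
if shopt -q nullglob; then was_set=1; else was_set=0; fi

shopt -s nullglob
# ... globbing code that relies on nullglob goes here ...

# Turn it back off only if it was off to begin with.
[ "$was_set" -eq 1 ] || shopt -u nullglob

if shopt -q nullglob; then now_set=1; else now_set=0; fi
echo "nullglob before: $was_set, after restore: $now_set"
```

This avoids both grep and the substring arithmetic, at the cost of one if statement per option.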

To match a "txt" file whose name begins with ".", that is, a hidden "txt" file, the "dotglob" shopt must be enabled.

So this time, "dotglob" is also enabled below, to show HIDDEN "txt" files:

#!/bin/bash
DIR="/home/john/my directory/";

p_nullglob="u";
pSS="`shopt -p nullglob`";
[ "${pSS:7:1}" == "s" ] && p_nullglob="s";

p_dotglob="u";
pSS="`shopt -p dotglob`";
[ "${pSS:7:1}" == "s" ] && p_dotglob="s";

shopt -s nullglob dotglob;

for f in "$DIR"*.txt ; do {
    echo "${f}";
    # your codes/commands, etc
};
done;

[ "$p_nullglob" == "u" ] && shopt -u nullglob;
[ "$p_dotglob" == "u" ] && shopt -u dotglob;

unset DIR;
unset p_nullglob p_dotglob pSS;

There is a simpler way to save and restore a shopt option's value.
Isaac posted how to save and restore the state of env/shopt variables and options.

Saving the shopt state of "nullglob":
  p_nullglob="`shopt -p nullglob`";
  ... # your primary-function code/commands, etc.
Restoring the previous shopt state of "nullglob" before exiting the script:
  eval "$p_nullglob" ;

Multiple shopt states can be saved this way:
  p_multipleShopt="`shopt -p nullglob dotglob`";
and the restore process is the same as before:
  eval "$p_multipleShopt" ;

ALL shopt states can be saved this way:
  p_allShopt="`shopt -p`";
and the restore process is the same as before:
  eval "$p_allShopt" ;

So here is another bash based solution:

#!/bin/bash
DIR="/home/john/my directory/";

p_allShopt="`shopt -p`";
shopt -s nullglob dotglob;

for f in "$DIR"*.txt ; do {
    echo "${f}";
    # your codes/commands, etc
};
done;

eval "$p_allShopt" ;

unset DIR p_allShopt;

Using eval is safe above because the variable "$p_allShopt" does not hold user-provided or unsanitized data; it holds the output of the bash builtin shopt.
If you still want to avoid eval, use the solution below:

#!/bin/bash
DIR="/home/john/my directory/";

p_allShopt="`shopt -p`";
shopt -s nullglob dotglob;

for f in "$DIR"*.txt ; do {
    echo "${f}";
    # your codes/commands, etc
};
done;

# each saved line looks like: shopt -u nullglob
while read -r -a oneLine ; do {
    "${oneLine[@]}" ;
};
done < <(echo "$p_allShopt") ;

unset DIR p_allShopt oneLine;
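
When the loop only needs to produce output, another option is to skip saving and restoring altogether and change the options inside a subshell. The sketch below assumes a temporary directory with two made-up files and uses command substitution as the subshell; demo_dir and listing are illustrative names.

```bash
#!/bin/bash
demo_dir="$(mktemp -d)/my directory"     # hypothetical sample location
mkdir -p "$demo_dir"
touch "$demo_dir/.hidden.txt" "$demo_dir/plain.txt"

# Command substitution runs its body in a subshell, so nullglob and
# dotglob are only in effect inside the $( ... ).
listing=$(
  shopt -s nullglob dotglob
  for f in "$demo_dir/"*.txt; do
    printf '%s\n' "${f##*/}"
  done
)

printf '%s\n' "$listing"
shopt -p nullglob dotglob    # out here both options keep their earlier values

rm -rf "${demo_dir%/*}"
```

The trade-off is that variables assigned inside the subshell do not survive it, so this fits loops that print or write files rather than loops that build up state.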

A few other notable, related shopt options that may be useful:

  • nocaseglob : If set, Bash matches filenames in a case-insensitive fashion when performing filename expansion.
  • nocasematch : If set, Bash matches patterns in a case-insensitive fashion when performing matching while executing case or [[ conditional commands, when performing pattern substitution word expansions, or when filtering possible completions as part of programmable completion.
  • dotglob : If set, Bash includes filenames beginning with a ‘.’ in the results of filename expansion. The filenames ‘.’ and ‘..’ must always be matched explicitly, even if dotglob is set.
  • nullglob : If set, Bash allows filename patterns which match no files to expand to a null string, rather than themselves.
  • extglob : If set, the extended pattern matching features described above (see Pattern Matching) are enabled.
  • globstar : If set, the pattern ‘**’ used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a ‘/’, only directories and subdirectories match.
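
As one illustration from this list, nocaseglob can stand in for find's -iname when matching extensions. A small sketch, assuming a temporary directory with two made-up files:

```bash
#!/bin/bash
demo_dir="$(mktemp -d)"                  # hypothetical sample files
touch "$demo_dir/lower.txt" "$demo_dir/UPPER.TXT"

default_matches=("$demo_dir/"*.txt)      # case-sensitive by default

shopt -s nocaseglob
nocase_matches=("$demo_dir/"*.txt)       # now *.txt also matches UPPER.TXT
shopt -u nocaseglob

echo "default:    ${#default_matches[@]} match(es)"
echo "nocaseglob: ${#nocase_matches[@]} match(es)"

rm -rf "$demo_dir"
```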

Solution 4

What you can do is leave only the wildcard characters outside of the quotes.
Something like:
for a in "files with spaces"*".txt"
do
  printf '%s\n' "$a"   # replace with your own processing
done
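File names matched by the wildcard are not split again, even when they contain spaces, so each match reaches the loop as a single word. A file-per-line approach (for example, piping ls into bash's read) is therefore not needed here, and it would break on names that contain newlines.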
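
A sketch of this quoting pattern with made-up file names shows what the loop receives. One of the files also has spaces in the part matched by the wildcard. demo_dir and seen are names chosen for the example.

```bash
#!/bin/bash
demo_dir="$(mktemp -d)"                  # hypothetical sample files
touch "$demo_dir/files with spaces 1.txt" \
      "$demo_dir/files with spaces and more.txt" \
      "$demo_dir/unrelated.txt"

seen=()
for a in "$demo_dir/files with spaces"*".txt"; do
  seen+=("${a##*/}")                     # keep just the file name
done

printf '[%s]\n' "${seen[@]}"

rm -rf "$demo_dir"
```

Each bracketed line of output is one loop iteration, so a name that was split at its spaces would show up as several short entries.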

Author: John

Updated on September 18, 2022

Comments

  • John
    John almost 2 years

    I am having trouble getting the basics of Bash scripting down. Here's what I have so far:

    #!/bin/bash
    FILES="/home/john/my directory/*.txt"
    
    for f in "${FILES}"
    do
      echo "${f}"
    done
    

    All I want to do is list all the .txt files in a for loop so I can do stuff with them. But the space in the my directory and the asterisk in *.txt just aren't playing nicely. I tried using it with and without double quotes, with and without curly braces on variable names and still can't print all the .txt files.

    This is a very basic thing, but I'm still struggling because I'm tired and can't think straight.

    What am I doing wrong?

    I've been able to successfully apply the script above if my FILES don't have a space or an asterisk... I had to experiment with or without the use of double quotes and braces to get it to work. But the moment I have both spaces and an asterisk, it messes everything up.

  • pospi
    pospi almost 8 years
    It's worth noting that if you're passing paths like this around through functions, you need to ensure you quote the variable on its own rather than concatenating it as part of a larger string:
    for f in "$DIR"/*.txt = fine
    for f in "$DIR/*.txt" = breaks
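
    A minimal sketch of that advice in function form. The helper name count_txt and the temporary sample directory are assumptions made for illustration, not part of the original question.

```bash
#!/bin/bash
# Hypothetical helper: counts the .txt files directly inside the directory in $1.
count_txt() {
  local dir=$1 n=0 f
  for f in "$dir"/*.txt; do      # variable quoted, wildcard left outside the quotes
    [ -e "$f" ] && n=$((n + 1))  # ignore the literal pattern when nothing matches
  done
  printf '%s\n' "$n"
}

demo_dir="$(mktemp -d)/my directory"
mkdir -p "$demo_dir"
touch "$demo_dir/a.txt" "$demo_dir/b.txt"

total=$(count_txt "$demo_dir")
echo "txt files: $total"

rm -rf "${demo_dir%/*}"
```

    The caller quotes the argument as well ("$demo_dir"), so the path arrives in the function as a single word despite the space.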