When is double-quoting necessary?


First, separate zsh from the rest. It's not a matter of old vs modern shells: zsh behaves differently. The zsh designers decided to make it incompatible with traditional shells (Bourne, ksh, bash), but easier to use.

Second, it is far easier to use double quotes all the time than to remember when they are needed. They are needed most of the time, so you'll need to learn when they aren't needed, not when they are needed.

In a nutshell, double quotes are necessary wherever a list of words or a pattern is expected. They are optional in contexts where a raw string is expected by the parser.

What happens without quotes

Note that without double quotes, two things happen.

  1. First, the result of the expansion (the value of the variable for a parameter substitution like ${foo}, or the output of the command for a command substitution like $(foo)) is split into words wherever it contains whitespace.
    More precisely, the result of the expansion is split at each character that appears in the value of the IFS variable (separator character). If a sequence of separator characters contains whitespace (space, tab or newline), the whitespace counts as a single character; leading, trailing or repeated non-whitespace separators lead to empty fields. For example, with IFS=" :", :one::two : three: :four  produces empty fields before one, between one and two, and (a single one) between three and four.
  2. Each field that results from splitting is interpreted as a glob (a wildcard pattern) if it contains one of the characters \[*?. If that pattern matches one or more file names, the pattern is replaced by the list of matching file names.

An unquoted variable expansion $foo is colloquially known as the “split+glob operator”, in contrast with "$foo" which just takes the value of the variable foo. The same goes for command substitution: "$(foo)" is a command substitution, $(foo) is a command substitution followed by split+glob.
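
The two steps can be observed directly. The following is a minimal sketch, assuming a POSIX-style sh with the default IFS and a mktemp that supports -d (widely available, but not required by POSIX); count_args and the scratch directory are hypothetical names introduced only for this illustration.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Work in an empty scratch directory so the glob results are predictable.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
touch a.txt b.txt

value='x *.txt'

# Unquoted: split at the space into "x" and "*.txt", then "*.txt" is
# expanded as a glob to a.txt and b.txt.
unquoted_count=$(count_args $value)

# Quoted: the value is passed as a single word, untouched.
quoted_count=$(count_args "$value")

echo "unquoted: $unquoted_count, quoted: $quoted_count"

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

With these two files present, the unquoted expansion yields three words and the quoted one yields a single word.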

Where you can omit the double quotes

Here are all the cases I can think of in a Bourne-style shell where you can write a variable or command substitution without double quotes, and the value is interpreted literally.

  • On the right-hand side of an assignment.

    var=$stuff
    a_single_star=*
    

    Note that you do need the double quotes after export in shells where it is an ordinary builtin rather than a keyword. That is the case in dash, zsh (in sh emulation), yash and posh; bash and ksh both treat export specially, so the quotes are optional there.

    export VAR="$stuff"
    
  • In a case statement.

    case $var in …
    

    Note that you do need double quotes in a case pattern. Word splitting doesn't happen in a case pattern, but an unquoted variable is interpreted as a pattern whereas a quoted variable is interpreted as a literal string.

    a_star='a*'
    case $var in
      "$a_star") echo "'$var' is the two characters a, *";;
       $a_star) echo "'$var' begins with a";;
    esac
    
  • Within double brackets. Double brackets are shell special syntax.

    [[ -e $filename ]]
    

    Except that you do need double quotes where a pattern or regular expression is expected and you want the value to be matched literally: on the right-hand side of = or == or != or =~.

    a_star='a*'
    if [[ $var == "$a_star" ]]; then echo "'$var' is the two characters a, *"
    elif [[ $var == $a_star ]]; then echo "'$var' begins with a"
    fi
    

    You do need double quotes as usual within single brackets [ … ] because they are ordinary shell syntax (it's a command that happens to be called [). See Single or double brackets

  • In a redirection in non-interactive POSIX shells (not bash, nor ksh88).

    echo "hello world" >$filename
    

    Some shells, when interactive, do treat the value of the variable as a wildcard pattern. POSIX prohibits that behaviour in non-interactive shells, but a few shells still do it there, including bash (except in POSIX mode) and ksh88 (including when found as the supposedly POSIX sh of some commercial Unices like Solaris). Bash also attempts splitting, and the redirection fails unless that split+glob results in exactly one word. It is therefore better to quote targets of redirections in a sh script, in case you want to convert it to a bash script some day, run it on a system where sh is non-compliant on that point, or source it from interactive shells.

  • Inside an arithmetic expression. In fact, you need to leave the quotes out in order for a variable to be parsed as an arithmetic expression.

    expr=2*2
    echo "$(($expr))"
    

    However, you do need the quotes around the arithmetic expansion, as its result is subject to word splitting in most shells as POSIX requires (!?).

  • In an associative array subscript.

    typeset -A a
    i='foo bar*qux'
    a[foo\ bar\*qux]=hello
    echo "${a[$i]}"
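
A few of the cases above can be checked in one short script. This is only a sketch, restricted to constructs available in a plain POSIX sh (so no double brackets or associative arrays); the variable names stuff, word, matched and result are made up for the illustration.

```shell
#!/bin/sh
stuff='a b   *'

# Assignment: no splitting or globbing happens on the right-hand side,
# so var receives the value unchanged.
var=$stuff
assign_ok=no
[ "$var" = "$stuff" ] && assign_ok=yes

# case: the word after "case" is safe unquoted, but a pattern is only
# treated as a literal string when it is quoted.
a_star='a*'
word='abc'
matched=none
case $word in
  "$a_star") matched=literal ;;
  $a_star)   matched=pattern ;;
esac

# Arithmetic: the variable is left unquoted inside $(( )), and the
# expansion as a whole is quoted.
expr='2*2'
result="$(($expr))"

echo "assign_ok=$assign_ok matched=$matched result=$result"
```

Here abc is not the literal string a*, so only the unquoted pattern branch matches.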
    

An unquoted variable and command substitution can be useful in some rare circumstances:

  • When the variable value or command output consists of a list of glob patterns and you want to expand these patterns to the list of matching files.
  • When you know that the value doesn't contain any wildcard character, that $IFS was not modified and you want to split it at whitespace characters.
  • When you want to split a value at a certain character: disable globbing with set -f, set IFS to the separator character (or leave it alone to split at whitespace), then do the expansion.
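
The last case, splitting at a chosen character, might look like the sketch below. It assumes a POSIX-style sh; split_fields and the sample list are hypothetical, and the function body runs in a subshell so that set -f and the IFS change do not leak into the rest of the script.

```shell
#!/bin/sh
# split_fields is a hypothetical helper: it splits its argument at colons
# and prints one field per line. The ( ... ) body is a subshell, so the
# settings below are discarded when the function returns.
split_fields() (
  set -f          # disable globbing, so a field like * stays literal
  IFS=:           # split at colons only, not at whitespace
  for field in $1; do   # deliberately unquoted: this is the split
    printf '%s\n' "$field"
  done
)

list='/usr/bin:/opt/my tools/bin:*'
fields=$(split_fields "$list")
printf '%s\n' "$fields"
```

The field containing a space stays in one piece, and the final * is printed as is rather than being replaced by file names.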

Zsh

In zsh, you can omit the double quotes most of the time, with a few exceptions.

  • $var never expands to multiple words. However, it expands to the empty list (as opposed to a list containing a single, empty word) if the value of var is the empty string. Contrast:

    var=
    print -l $var foo        # prints just foo
    print -l "$var" foo      # prints an empty line, then foo
    

    Similarly, "${array[@]}" expands to all the elements of the array, while $array only expands to the non-empty elements.

  • The @ parameter expansion flag sometimes requires double quotes around the whole substitution: "${(@)foo}".

  • Command substitution undergoes field splitting if unquoted: echo $(echo 'a'; echo '*') prints a * (with a single space) whereas echo "$(echo 'a'; echo '*')" prints the unmodified two-line string. Use "$(somecommand)" to get the output of the command in a single word, sans final newlines. Use "${$(somecommand; echo _)%?}" to get the exact output of the command including final newlines. Use "${(@f)$(somecommand)}" to get an array of lines from the command's output.
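
The nested "${$(somecommand; echo _)%?}" form is specific to zsh. The same sentinel idea can be sketched in two steps for other Bourne-style shells; this is a portable variant rather than the zsh one-liner, and somecommand here is a hypothetical stand-in whose output ends in two newlines.

```shell
#!/bin/sh
# somecommand is a hypothetical stand-in whose output ends in two newlines.
somecommand() { printf 'line\n\n'; }

# Plain command substitution strips every trailing newline.
stripped=$(somecommand)

# Append a sentinel so the newlines are no longer trailing. The
# substitution removes only the newline after the underscore, and the
# parameter expansion then removes the underscore itself.
exact=$(somecommand; echo _)
exact=${exact%_}

echo "stripped: ${#stripped} chars, exact: ${#exact} chars"
```

The stripped value holds only the four characters of line, while the exact value also keeps both newlines.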

Author: kjo
Updated on September 18, 2022

Comments

  • kjo
    kjo over 1 year

    The old advice used to be to double-quote any expression involving a $VARIABLE, at least if one wanted it to be interpreted by the shell as one single item, otherwise, any spaces in the content of $VARIABLE would throw off the shell.

    I understand, however, that in more recent versions of shells, double-quoting is no longer always needed (at least for the purpose described above). For instance, in bash:

    % FOO='bar baz'
    % [ $FOO = 'bar baz' ] && echo OK
    bash: [: too many arguments
    % [[ $FOO = 'bar baz' ]] && echo OK
    OK
    % touch 'bar baz'
    % ls $FOO
    ls: cannot access bar: No such file or directory
    ls: cannot access baz: No such file or directory
    

    In zsh, on the other hand, the same three commands succeed. Therefore, based on this experiment, it seems that, in bash, one can omit the double quotes inside [[ ... ]], but not inside [ ... ] nor in command-line arguments, whereas, in zsh, the double quotes may be omitted in all these cases.

    But inferring general rules from anecdotal examples like the above is a chancy proposition. It would be nice to see a summary of when double-quoting is necessary. I'm primarily interested in zsh, bash, and /bin/sh.

    • sunnysideup
      sunnysideup about 11 years
      Your observed behaviour in zsh depends on the settings and is influenced by the SH_WORD_SPLIT option.
    • Stéphane Chazelas
      Stéphane Chazelas over 9 years
    • Charles Duffy
      Charles Duffy almost 8 years
      As an aside -- all-caps variable names are used by variables with meaning to the operating system and shell; the POSIX specification explicitly advises using lower-case names for application defined variables. (While the specification quoted is specifically focusing on environment variables, environment variables and shell variables share a namespace: Attempting to create a shell variable with a name already used by an environment variable overwrites the latter). See pubs.opengroup.org/onlinepubs/009695399/basedefs/…, fourth paragraph.
  • Cyker
    Cyker over 7 years
    In fact, you need to leave the quotes out in order for a variable to be parsed as an arithmetic expression. Why am I able to make your example work with quotes: echo "$(("$expr"))"
  • Cyker
    Cyker over 7 years
    This is what man bash says: The expression is treated as if it were within double quotes, but a double quote inside the parentheses is not treated specially.
  • Cyker
    Cyker over 7 years
    Also, for anyone who is interested, the formal names of split+glob are word splitting and pathname expansion.
  • Charles Duffy
    Charles Duffy about 7 years
    FYI -- over on StackOverflow, I've had someone pull the "optional when a raw string is expected" language in this answer out to defend not quoting an argument to echo. It might be worth trying to make the language even more explicit ("when a raw string is expected by the parser", perhaps?)
  • Aravind Srinivas
    Aravind Srinivas over 6 years
    Great answer. There's one other place you don't need quotes, which is when using + in variable substitution, e.g. ${VARNAME+replacement}, although you do need to quote the replacement if it has whitespace or contains any variables, e.g. ${VARNAME:+--opt="$VARNAME"}.
  • Friartek
    Friartek over 6 years
    Sorry, ignore my previous comment, it was added before complete and edited within time limit. 1) "${$(somecommand; echo _)%?}" technically outputs the exact output of the command, but it also adds an additional newline to the end. "${"${$(somecommand; echo _)%?}"%$'\n'}" appears to correct this. 2) It isn't stated what the expected output of "${(@f)$(somecommand)}", to get an array, should be. The result is that all trailing newlines are removed. If you require an array with trailing newlines, "${(@f)"${$(somecommand; echo _)%?}"%$'\n'}" should give you the desired result.
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 6 years
    @Friartek No: "${$(somecommand; echo _)%?}" is the exact output of the command, including a final newline if there is one (and there usually is). If you want the output of the command without trailing newlines, it's just "$(somecommand)".
  • Friartek
    Friartek over 6 years
    @Gilles Hi. Sorry, not what I'm seeing. This is how I understand what is going on. $() will strip all trailing blank lines from a command within the parentheses. By adding ;echo _ to the end of the output of "somecommand", this tacks on an additional line with an underscore so any trailing blank lines from "somecommand" will be protected from $(). %? then deletes the underscore, leaving an empty line which wasn't there before. This is especially noticeable when there are no trailing blank lines from "somecommand". Am I missing or misinterpreting something here?
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 6 years
    @Friartek You keep mentioning an “empty line which wasn't there before”, but I don't understand why you think there is such a thing. Adding ;echo _ adds an underscore and a line break. The command substitution removes this line break. ${…%?} removes the underscore. You're left with exactly the output of the command, including its trailing line breaks if any.
  • Friartek
    Friartek over 6 years
    @Gilles Being ill and on antibiotics is no excuse for dumb mistakes on my part. Let myself get bit by the evil quote gods while working with arrays, then applied those results back to your original code. ${…%?} does work as you said and is a simple way around $(...) stripping trailing "empty lines". Still having a problem with the term "line break", took it to mean something quite different. My use case now is array=(${(@f)"${$(somecommand; echo _)%?}"}). I apologize to you and all here for the noise. I will delete some of my previous comments so others won't make my mistake.
  • Anderson Fernandes Silva
    Anderson Fernandes Silva over 3 years
    "Except that you do need double quotes where a pattern or regular expression is expected: on the right-hand side of = or == or != or =~." Perhaps I'm misunderstanding this, but AFAIK, a RHS regular expression (i.e. after =~) should always be unquoted (since Bash 3.2 at least).
  • All The Rage
    All The Rage over 3 years
    I set a variable like: o='OneDrive - MyCompany'. Then I use it: cd "$o"/Docu. Then I press TAB to do word completion on the folder name. The result is cd $o/Documents. I hit enter and it fails because it stole my double quotes. I want to use "$o" as a very fast shortcut for that directory name. I don't want to type "$o/Docu and then press TAB and then ", which works. When I start typing I don't want to think about whether I might press TAB to expand something later in the path. I just want to follow the rule to double quote variables. That doesn't work here. Any suggestions?
  • All The Rage
    All The Rage over 3 years
    I really want to be able to put backslashes before the spaces when defining the variable and have them be respected after expanding it. Is there any way I can do that? It would solve this problem so nicely.
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 3 years
    @user7392 I don't know. You should ask a new question.
  • RichieHH
    RichieHH about 3 years
    Super answer. Just reinforcing what a mess it is and to... Quote unless you have a specific reason not to.