Passing generated empty strings as command line arguments


Solution 1

In

./input $(cmd)

because $(cmd) is unquoted, it acts as a split+glob operator. The shell takes the output of cmd, removes all trailing newline characters, splits the result according to the value of the $IFS special parameter, and then performs filename generation on the resulting words (for instance, turning *.txt into the list of non-hidden txt files in the current directory). zsh skips that filename generation step, while ksh additionally performs brace expansion (turning a{b,c} into ab and ac, for instance).

The default value of $IFS contains the SPC, TAB and NL characters (also NUL in zsh, other shells either remove the NULs or choke on them). Those (not NUL) also happen to be IFS-whitespace characters¹, which are treated specially when it comes to IFS-splitting.

If the output of cmd is " a b\nc \n", that split+glob operator will generate three arguments to ./input: "a", "b" and "c". With IFS-whitespace characters, it's impossible for split+glob to generate an empty argument, because any sequence of one or more IFS-whitespace characters is treated as a single delimiter. To generate an empty argument, you need to choose a separator that is not an IFS-whitespace character. Any non-whitespace character will do (though it's best to avoid multi-byte characters, which are not supported by all shells here).
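The whitespace case can be checked with a minimal sketch. count_args is a hypothetical helper introduced only for this illustration, cmd is a stand-in that prints the sample output, and $IFS is assumed to still have its default value:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  printf '%d\n' "$#"
}

# Stand-in for cmd: leading, repeated and trailing whitespace around a, b and c.
cmd() {
  printf ' a  b\nc \n'
}

set -o noglob         # keep filename generation out of the picture
count_args $(cmd)     # unquoted: split on default IFS, prints 3
count_args "$(cmd)"   # quoted: no splitting, prints 1
set +o noglob
```

However much whitespace surrounds the words, the unquoted form never yields an empty argument.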

So for instance if you do:

IFS=:          # split on ":" which is not an IFS-whitespace character
set -o noglob  # disable globbing (also brace expansion in ksh)
./input $(cmd)

and cmd outputs a::b\n, then that split+glob operator produces the arguments "a", "" and "b" (the " characters are not part of the values; they are only used here to delimit them).

With a:b:\n, the result is either "a" and "b", or "a", "b" and "", depending on the shell. You can make it consistent across all shells with:

./input $(cmd)""

(which also means that for an empty output of cmd (or an output consisting only of newline characters), ./input will receive one empty argument as opposed to no argument at all).
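The effect of the appended "" can be checked with a small sketch. count_args is a hypothetical helper, and the variable out stands in for the output of cmd with its trailing newline already removed (an unquoted variable goes through the same split+glob step as an unquoted $(cmd)). The subshell only keeps the IFS and noglob changes local:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  printf '%d\n' "$#"
}

(
  IFS=:              # split on a non-whitespace character
  set -o noglob      # no filename generation on the resulting words

  out='a:b:'         # sample output with a trailing delimiter
  count_args $out    # 2 or 3, depending on the shell
  count_args $out""  # 3: "a", "b" and ""

  out=''             # empty output
  count_args $out    # 0: no argument at all
  count_args $out""  # 1: a single empty argument
)
```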

Example:

cmd() {
  printf 'a b:: c\n'
}
input() {
  printf 'I got %d arguments:\n' "$#"
  [ "$#" -eq 0 ] || printf ' - <%s>\n' "$@"
}
IFS=:
set -o noglob
input $(cmd)

gives:

I got 3 arguments:
 - <a b>
 - <>
 - < c>

Also note that when you do:

./input ""

the " characters are part of the shell syntax: they are quoting operators, and they are not passed to input.


¹ IFS-whitespace characters are, per POSIX, the characters classified as [:space:] in the locale that also happen to be in $IFS. In ksh88 (on which the POSIX specification is based) and in most shells, though, that's still limited to SPC, TAB and NL. The only POSIX-compliant shell in that regard I found was yash. ksh93 and bash (since 5.0) also include other whitespace (such as CR, FF, VT...), but only the single-byte ones (beware that on some systems, like Solaris, that includes the non-breaking space, which is a single byte in some locales).

Solution 2

You could generate the whole command line programmatically, and either copy-paste it, or run it through eval, e.g.:

$ perl -e 'printf "./args.sh %s\n", q/"" / x 10' 
./args.sh "" "" "" "" "" "" "" "" "" "" 

$ eval "$(perl -e 'printf "./args.sh %s\n", q/"" / x 100')"
$#: 100
$1: ><

(q/"" / is one of Perl's ways of quoting a string; x 100 makes a hundred copies of it and concatenates them.)

eval processes its argument(s) as shell commands, running all quote processing and expansions. This means that if any of the input comes from untrusted sources, you'll need to be careful in generating the evaluated code to prevent vulnerabilities.

If you want the number of empty arguments to be variable, that should be doable without issues (at least I can't come up with a way the second operand to Perl's x could be misused, as it folds the operand to an integer):

$ n=33
$ eval "$(perl -e 'printf "./args.sh %s\n", q/"" / x $ARGV[0]' "$n")"
$#: 33
$1: ><
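If eval feels too risky, one possible alternative (not part of the approach above) is to build the argument list in the shell's positional parameters. This is only a sketch: show_args is a hypothetical stand-in for ./args.sh, empty_args is a hypothetical helper, and its first argument is assumed to be a non-negative integer:

```shell
# show_args is a hypothetical stand-in for ./args.sh: it reports the
# argument count and the first argument between > and <.
show_args() {
  printf '$#: %d\n' "$#"
  [ "$#" -eq 0 ] || printf '$1: >%s<\n' "$1"
}

# empty_args N CMD...: run CMD with N empty arguments appended, without eval.
# Note: n and i are ordinary global variables, as POSIX sh has no "local".
empty_args() {
  n=$1
  shift
  i=0
  while [ "$i" -lt "$n" ]; do
    set -- "$@" ""   # append one empty argument
    i=$((i + 1))
  done
  "$@"
}

empty_args 33 show_args
```

Because nothing is piped into the command, it inherits the caller's standard input, which may matter if the program also reads from stdin.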

Solution 3

But what do you actually want to pass: empty quotes or empty strings? Both are valid arguments, and this simple bash script can help illustrate the difference:

#!/bin/bash

printf "Argument count: %d.\n" "$#"

It just prints the number of arguments passed to it. I'll call it s for brevity.

$ ./s a
Argument count: 1.
$ ./s a b
Argument count: 2.
$ ./s a b ""
Argument count: 3.
$ ./s a b "" ""
Argument count: 4.
$ ./s a b "" "" \"\"
Argument count: 5.

As you can see, the empty strings are just empty strings: the quotes are removed at parsing time, and what remains is still a valid argument that the shell feeds to the command. A literal "" (written \"\" above) can be passed as well, but it's not an empty string: it contains two characters.

Under the hood, for C, strings are NUL (\0) terminated and no quotes are needed to represent them.
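To see the difference between an empty string and a pair of literal quote characters, here is a small sketch; show_lengths is a hypothetical helper, not part of the script above:

```shell
# show_lengths is a hypothetical helper: it prints the length of each argument.
show_lengths() {
  for arg in "$@"; do
    printf '%d\n' "${#arg}"
  done
}

# Empty string, then two ways of passing two literal quote characters.
show_lengths "" \"\" '""'   # prints 0, 2 and 2, one per line
```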


Author: Alex Hoppus

Updated on September 18, 2022

Comments

  • Alex Hoppus, almost 2 years

    I do the following:

    $ ./input ""
    

    Output:

    argc 2
    argv[1][0] 0
    

    But if I want to pass (several) empty quotes programmatically:

    $ python -c 'print "\"\""'
    ""
    
    ./input $(python -c 'print "\"\""')
    

    gives:

    argc 2
    argv[1][0] 22 - (22 hex value for ")
    

    So how can I generate something like:

    $ ./input "" "" "" ""
    

    and get the same result as in the first example?

    • Admin, over 5 years
      You should post your C code too.
    • Alex Hoppus, over 5 years
      Why do you need it? It is clear from the context that it prints argc and argv[1][0] as a hex value.
    • ilkkachu, over 5 years
      Since the problem here seems to be how the shell translates the programmatically generated output into the actual arguments, this seems on-topic to me. (Asking how to generate some output with Python would be SO stuff, but I suppose that wasn't the issue.)
  • Alex Hoppus, over 5 years
    Mmm... Let's put it a different way: I want to pass 100 empty quotes. In a C program, an empty quote passed as a parameter is a string that contains only 0x00. I can do that easily for one parameter, as I said: ./input "". But if I need 100 NULL parameters, it is hard to type ./input "" "" "" "" "" .... in the console, so I want to generate this command.
  • Admin, over 5 years
    @AlexHoppus Then you need 100 NULs and xargs: $ perl -e'print"\0"x100' | xargs -0 ./s.
  • Alex Hoppus, over 5 years
    Okay, I have already done this via xargs, but the problem is that this ./input program also reads from stdin (it reads some values using scanf()). If I do python -c 'print "\x00 "*65+"\ \\\n\r"+"\x00 "*34' | xargs ./input, I don't know how to pass this input to my app (./input) when it is launched via xargs.