Right syntax for awk usage in combination with other command inside xargs sh -c


Solution 1

The second single quote terminates the first single-quoted string 'echo {}; awk '. Then {print $1} is unquoted, and then there is another single-quoted string ' {} | uniq'. This should be clear in any editor with syntax highlighting; it's also clear if you look at the syntax highlighting in your question.
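To see the tokenization without relying on syntax highlighting, here is a small sketch that hands the broken string to printf, which prints each argument it receives on its own line. The angle brackets and the `args` variable are only illustrative; `set --` clears the positional parameters so that the unquoted `$1` expands to nothing, as it would in a typical interactive shell.

```shell
# Clear positional parameters so the unquoted $1 below expands to nothing.
set --

# printf repeats its format for every argument, so each shell word
# ends up on its own line, wrapped in angle brackets.
args=$(printf '<%s>\n' 'echo {}; awk '{print $1}' {} | uniq')
printf '%s\n' "$args"
# prints:
# <echo {}; awk {print>
# <} {} | uniq>
```

The string is split into two words, and the first one ends at `{print`, which matches the awk error in the question.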

Here the simplest approach would be to avoid nested quoting altogether. Pass the awk script as an argument to sh.

xargs -I {} sh -c 'echo "$1"; awk "$0" "$1" | uniq' '{print $1}' {}

(I also replaced the {} inside the script with the corresponding argument to sh. Never use {} inside a script: xargs substitutes the file name into the script text, so it would be parsed as shell syntax, not as a file name, and would fail catastrophically on any file name containing shell special characters.)
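A minimal sketch of this argument-passing approach, run against a throwaway file whose name contains a space. The temporary directory, the file name `my file.txt`, its contents, and the `out` variable are all made up for the demonstration. The awk program arrives in the inner shell as `$0` and the file name as `$1`.

```shell
# Hypothetical sample data in a temporary directory.
tmp=$(mktemp -d)
printf 'x 1\nx 2\ny 3\n' > "$tmp/my file.txt"

# sh -c SCRIPT ARG0 ARG1: the awk program becomes $0, the file name $1.
# The file name is never spliced into the script text itself.
out=$(printf '%s\n' "$tmp/my file.txt" |
  xargs -I {} sh -c 'echo "$1"; awk "$0" "$1" | uniq' '{print $1}' {})
printf '%s\n' "$out"

rm -rf "$tmp"
```

This should print the file's path followed by the distinct first-column values `x` and `y`.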

To effectively include a single quote in a single-quoted literal, use '\'' (formally this ends the single-quoted literal, then adds a single quote which is interpreted literally due to the preceding backslash, then starts another single-quoted literal; but the effect is the same).

xargs -I {} sh -c 'echo {}; awk '\''{print $1}'\'' {} | uniq'
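As a quick check of what the inner shell actually receives, the following sketch stores the same quoted string in a variable (the name `quoted` is only illustrative) and prints it.

```shell
# Each '\'' closes the literal, adds an escaped quote, and reopens the literal.
quoted='echo {}; awk '\''{print $1}'\'' {} | uniq'
printf '%s\n' "$quoted"
# prints: echo {}; awk '{print $1}' {} | uniq
```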

Alternatively, use single quotes at one level and double quotes at the other level, but it gets trickier.
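One reason the mixed approach gets trickier, shown as a small sketch: inside double quotes the outer shell still expands `$1`, so the dollar sign has to be backslash-escaped. The variable names are illustrative, and `set --` is there only to make the outer `$1` empty for the demonstration.

```shell
# Make sure the outer shell's $1 is empty for this demonstration.
set --

# Double quotes outside, single quotes inside.
escaped="awk '{print \$1}'"    # backslash keeps the $ for awk
unescaped="awk '{print $1}'"   # outer shell expands $1 before awk sees it

printf '%s\n' "$escaped" "$unescaped"
# prints:
# awk '{print $1}'
# awk '{print }'
```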

(I assume that your nonsensical commands such as ls * are just an extremely simplified example.)

Solution 2

You don't need xargs at all.

As I read elsewhere on this site (sorry, can't recall just where) from a top user:

Yes, xargs is a cool toy. No, you don't need to use it.

This:

ls * | xargs -I {} sh -c 'echo {}; awk '{print $1}' {} | uniq'

Can be fully replaced with this:

for f in *; do echo "$f"; awk '{print $1}' "$f" | uniq; done

This gives you a significant security improvement over your previous version, to say nothing of readability and actual functionality. (Of course, the first version doesn't work at all, because it attempts to nest single quotes, which is impossible.)

Even if you fix the quoting in your version, however, you are laying yourself wide open. By stuffing an arbitrary filename into the shell command passed to -c, you are effectively running eval on that filename, so numerous exploits are possible simply by crafting specific filenames. For example, after touch ';rm -rf "$HOME" #', running your command would remove your home directory.
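The effect can be reproduced harmlessly. In this sketch the "file name" is just a string fed to xargs, and the payload is an innocuous echo; the names `name`, `unsafe`, and `safe` are made up for the demonstration.

```shell
# A harmless stand-in for a maliciously crafted file name.
name='x; echo INJECTED #'

# {} is pasted into the script text, so the payload runs as a command.
unsafe=$(printf '%s\n' "$name" | xargs -I {} sh -c 'echo {}')

# The name is passed as an argument ($1) and stays plain data.
safe=$(printf '%s\n' "$name" | xargs -I {} sh -c 'echo "$1"' sh {})

printf 'unsafe:\n%s\nsafe:\n%s\n' "$unsafe" "$safe"
# prints:
# unsafe:
# x
# INJECTED
# safe:
# x; echo INJECTED #
```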


For fully guaranteed handling of odd filenames, including filenames that could be interpreted as awk option flags, use the following:

for f in *; do printf '%s\n' "$f"; awk '{print $1}' < "$f" | uniq; done
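A related case that the redirection also covers, and which is an addition here rather than something stated above: awk treats an operand of the form var=value as a variable assignment instead of a file name. The sketch below uses a made-up file called `a=b` in a temporary directory to show the difference; stdin is pointed at /dev/null in the first call so that awk does not wait for input.

```shell
# Hypothetical sample file whose name looks like an awk assignment.
old=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'k 1\nk 2\n' > 'a=b'

# As an operand, a=b is an assignment, so awk has no file to read
# and falls back to stdin (empty here).
as_operand=$(awk '{print $1}' 'a=b' < /dev/null | uniq)

# With redirection the shell opens the file; awk never parses the name.
via_redirect=$(awk '{print $1}' < 'a=b' | uniq)

printf 'operand: [%s]\nredirect: [%s]\n' "$as_operand" "$via_redirect"
# prints:
# operand: []
# redirect: [k]

cd "$old" || exit 1
rm -rf "$tmp"
```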

Author: Ekaterina

Updated on September 18, 2022

Comments

  • Ekaterina
    Ekaterina over 1 year

    How to make this command work:

    ls * | xargs -I {} sh -c 'echo {}; awk '{print $1}' {} | uniq'
    

    It should do the simple thing: print for each file in the folder its name and uniq values in the first column

    It does not work because the $ symbol is recognized as an end of the string symbol, and there should be something to do with quotes, I guess.

    The error message:

    awk: cmd. line:1: {print
    awk: cmd. line:1:       ^ unexpected newline or end of string
    
    • Ekaterina
      Ekaterina almost 8 years
      I explained what the line is meant to do: print, for each file in the folder, its name and the uniq values in the first column. for file in *;do echo $file ; awk '{print $1}' $file | uniq ;done - will do the same, but there should be a way to do it with xargs, and I don't like that I'm missing it
  • Wildcard
    Wildcard almost 8 years
    +1, but he doesn't really need to use xargs at all. He can just perform the for loop directly on the glob expansion (which is what I suggest doing).
  • Alessio
    Alessio almost 8 years
    true. using xargs is only necessary if the list of files to be processed is generated by some other process (find, for example). updating. also used -- in case filename args start with -.
  • Alessio
    Alessio almost 8 years
    nesting of single-quotes is NOT impossible. It's just difficult to do correctly and ugly & difficult to read afterwards.
  • Wildcard
    Wildcard almost 8 years
    @cas, there is no way to escape a single quote within single quotes. There is no way to escape anything within single quotes. You have to break out of single quotes to include a literal single quote in an argument, which is not the general meaning of "nesting." Command substitution with $(...) has nesting; single quotes simply do not. The fact that you can work around the impossibility of embedding single quotes within single quotes doesn't really make that workaround "nesting," IMO.
  • don_crissti
    don_crissti almost 8 years
    @Wildcard - obviously the for loop is the better option but not sure why anyone would suggest it - OP already knows about the for loop - see the comments on the question... This is a XY question where OP knows about X but just wants to do it with Y.
  • Alessio
    Alessio almost 8 years
    I'm assuming that the echo and awk are just stand-ins for some more difficult task....and this provides a useful example of how to avoid over-complicating things when using xargs or find. You can use ridiculous quoting shenanigans if you really want to, but it's simpler to just write a trivial throwaway script. I've made the same point when people get into quoting difficulties with ssh - just write a script, scp it, and then run it.