Command substitution in for loop not working

Solution 1

A few things wrong in your code:

Using unquoted command substitution ($(...)) without setting $IFS

Leaving expansions unquoted invokes the split+glob operator. The default is to split on space, tab and newline. Here, you only want to split on newline, so you need to set IFS to that; otherwise the loop will not work properly if filenames contain space or tab characters.

Using unquoted command substitution without set -f.

Leaving expansions unquoted is the split+glob operator. Here you don't want globbing, that is the expansion of wildcards such as scala* into the list of matching files. When you do not want the shell to do globbing, you have to disable it with set -f
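The two points above can be sketched together as follows. This is a minimal illustration, not the asker's actual script: the scratch directory, the file names and the output variable (which stands in for the text a command substitution would return) are all assumptions made for the demo.

```shell
# Work in a throwaway directory so the demo cannot touch real files.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch 'a b' scala scala.bat

# Hypothetical command output: one name with a space, one with a wildcard.
output='a b
scala*'

old_ifs=$IFS
IFS='
'          # split on newline only
set -f     # disable globbing for the unquoted expansion below

count=0
for f in $output; do
  count=$((count + 1))
  last=$f
  printf '<%s>\n' "$f"
done

set +f         # restore globbing
IFS=$old_ifs   # restore the previous field separators
```

With both settings in place, the name containing a space stays in one piece and scala* is passed through literally instead of being expanded.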

ls aliased to ls -F

The issue above is aggravated by the fact that you have ls aliased to ls -F, which appends / to directories and * to executable files. So, typically, because scala is executable, ls -F outputs scala*, and as a globbing pattern, that is expanded to all the filenames that start with scala, which explains why it seems like egrep -v is not filtering files out.
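A small sketch of that effect, assuming a scratch directory with hypothetical files named like the ones in the question. The variable line stands in for one line of ls -F output; no alias is involved in the demo itself.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
touch scala scala.bat scalac scalac.bat

line='scala*'   # what ls -F would print for an executable file named scala

set -- $line    # unquoted: the shell splits the value and expands the glob
unquoted_count=$#

set -- "$line"  # quoted: the value is kept as one literal word
quoted_count=$#

printf 'unquoted: %s words, quoted: %s word\n' "$unquoted_count" "$quoted_count"
```

The unquoted form yields every name starting with scala, including the .bat files that grep had already removed.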

Assuming filenames don't contain newline characters

Newline is as valid a character as any in a filename, so parsing the output of ls typically doesn't work. For instance, the output of ls in a directory that contains files a and b is the same as in a directory that contains one file called a\nb.

Above, egrep filters the lines of the filenames, not the filenames themselves.
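To see the ambiguity, here is a sketch under the assumption that ls writes names unmodified when its output is a pipe (typical for GNU and BSD ls, but not guaranteed everywhere). The scratch directory and file name are made up for the demo.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1

# Create ONE file whose name contains a newline.
touch 'a
b'

# Counting lines of ls output sees two "names"...
lines=$(ls | wc -l | tr -d ' ')

# ...while a glob sees the single file that is really there.
set -- *
files=$#

printf 'lines from ls: %s, files from glob: %s\n' "$lines" "$files"
```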

Using egrep instead of grep -E

egrep is deprecated. grep -E is the standard equivalent.

Not escaping the . regex operator.

Above, you used egrep to enable extended regular expressions, but you don't use any of the operators specific to extended REs. The only RE operator you're using is ., which matches any character, and it looks like that's not what you intended. So you might as well have used grep -F here, or grep -v '\.bat'.

Not anchoring the regexp on end-of-line

egrep .bat will match any line that contains any character followed by bat, that is, any line that contains bat anywhere but in first position. It should have been grep -v '\.bat$'.
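The difference between the loose and the anchored pattern can be checked on a few sample lines. The names below are hypothetical and chosen only to trigger the false positives.

```shell
names='fsc
fsc.bat
combat
x.bat.txt'

# Unescaped, unanchored: drops every line with "any character + bat".
loose=$(printf '%s\n' "$names" | grep -v .bat)

# Escaped dot, anchored at end of line: drops only names ending in .bat.
strict=$(printf '%s\n' "$names" | grep -v '\.bat$')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

The loose pattern also throws away combat and x.bat.txt, while the strict one removes only fsc.bat.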

Leaving $f unquoted

Leaving an expansion unquoted is the split+glob operator. There, you want neither, so $f should be quoted ("$f").

Using echo

echo expands the ANSI C escape sequences in its arguments and/or treats strings like -n or -e specially depending on the echo implementation (and/or the environment).

Use printf instead.
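As a brief illustration with made-up values: printf with a %s format prints its argument verbatim, whatever it looks like, whereas the result of echo with the same values depends on the implementation.

```shell
f1='-n'      # could be mistaken for an option by some echo implementations
f2='a\tb'    # contains a backslash sequence some echo implementations expand

out1=$(printf '%s\n' "$f1")
out2=$(printf '%s\n' "$f2")

printf '%s\n' "$out1" "$out2"
```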

So a better solution:

for f in *; do
  case $f in
    (*.bat);;
    (*) printf '%s\n' "$f"
  esac
done

Though if there's no non-hidden file in the current directory, that will still output *. You can work around that in zsh by changing * to *(N) or in bash by running shopt -s nullglob.
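If neither of those shell-specific options is available, a portable workaround (a different technique from the two named above, shown here only as a sketch) is to skip any loop value that does not exist as a file. The empty scratch directory is an assumption made to trigger the no-match case.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1   # empty directory: * matches nothing

count=0
for f in *; do
  # With no match, the pattern is left as the literal string "*";
  # skip it unless a file (or dangling symlink) by that name exists.
  [ -e "$f" ] || [ -L "$f" ] || continue
  case $f in
    (*.bat) ;;
    (*) count=$((count + 1)); printf '%s\n' "$f" ;;
  esac
done

printf 'files printed: %s\n' "$count"
```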

Solution 2

You should not use ls to parse files this way.

First enable extglob with shopt -s extglob (globstar is not needed here), then

for f in !(*.bat); do
  printf '%s\n' "$f"
done

Using find

find . -type f ! -name '*.bat' 

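Use the negation operator (!) for safer handling of files. Note that, unlike the glob above, find also descends into subdirectories and includes hidden files.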
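A sketch of the find variant on a hypothetical tree, showing that files in subdirectories are reported as well. The directory layout is invented for the demo, and the sort is only there to make the result order predictable.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
mkdir sub
touch scala scala.bat sub/fsc sub/fsc.bat

# Regular files whose name does not end in .bat, at any depth.
found=$(find . -type f ! -name '*.bat' | LC_ALL=C sort)

printf '%s\n' "$found"
```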

Solution 3

There's nothing inherently wrong with your for loop:

$ for f in $(ls | egrep -v .bat); do echo $f; done

Here's my output for the above command:

$ for f in $(ls | egrep -v .bat); do echo $f; done
fsc
scala
scalac
scaladoc
scalap

Possible fix

One suggestion would be to quote the argument to egrep, like so:

$ for f in $(ls | egrep -v '.bat'); do echo $f; done

Also take a look at the Bash Pitfalls wiki for other potential issues when scripting things in Bash such as this. In particular this sounds like the most plausible reason why you're encountering this problem. Check to see if any of your files being returned contain spaces.

Debugging

Another thing to try is to enable verbose debugging of your bash command, prior to running it so you can see what's going on behind the scenes.

This enables debugging:

$ set -x

Then run your command:

$ for f in $(ls | egrep -v .bat); do echo $f; done
++ ls --color=auto
++ egrep --color=auto -v .bat
+ for f in '$(ls | egrep -v .bat)'
+ echo fsc
fsc
+ for f in '$(ls | egrep -v .bat)'
+ echo scala
scala
+ for f in '$(ls | egrep -v .bat)'
+ echo scalac
scalac
+ for f in '$(ls | egrep -v .bat)'
+ echo scaladoc
scaladoc
+ for f in '$(ls | egrep -v .bat)'
+ echo scalap
scalap

Then disable it:

$ set +x

Issue with Cygwin?

Given that the examples work fine on a number of native Linux systems, the problem most likely has something to do with Cygwin and/or its particular versions of bash and egrep. I'd pay particular attention to the field separator in Bash, $IFS, to see if there is an issue between the line separators on Windows (0x0d,0x0a) vs. Unix (0x0a).

I do not have an installation of Cygwin so I have no method for proving this hypothesis.
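One way to check that hypothesis, sketched here without access to Cygwin either: dump $IFS byte by byte (an unquoted echo $IFS shows nothing useful, since the value is itself split away), and strip carriage returns from command output before splitting it. The CRLF sample text is fabricated for the demo.

```shell
# Show the exact characters in IFS; od -c prints tab as \t and newline as \n.
ifs_dump=$(printf '%s' "$IFS" | od -An -c)
printf 'IFS contains:%s\n' "$ifs_dump"

# Hypothetical command output with Windows line endings (CR LF).
crlf_output=$(printf 'fsc\r\nscala\r\n')

# Remove the carriage returns so only Unix newlines remain.
clean=$(printf '%s\n' "$crlf_output" | tr -d '\r')
printf '%s\n' "$clean"
```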

Author by

mike

Updated on September 18, 2022

Comments

  • mike
    mike almost 2 years

    I want to keep all files not ending with .bat

    I tried

    for f in $(ls | egrep -v .bat); do echo $f; done
    

    and

    for f in $(eval ls | egrep -v .bat); do echo $f; done
    

    But both approaches yield the same result, as they print everything. Whereas ls | egrep -v .bat and eval ls | egrep -v .bat work per se, if used apart from the for loop.

    EDIT

    It's interesting to see that if I leave out the -v flag, the loop does what it should and lists all files ending with .bat.


    Feel free to edit the question title, as I was not sure what the problem is.

    I'm using GNU bash, version 4.1.10(4)-release (i686-pc-cygwin).


    EXAMPLE

    $ ls -l | egrep -v ".bat"
    total 60K
    -rwx------+ 1 SYSTEM SYSTEM 5.3K Jun  6 20:31 fsc*
    -rwx------+ 1 SYSTEM SYSTEM 5.3K Jun  6 20:31 scala*
    -rwx------+ 1 SYSTEM SYSTEM 5.3K Jun  6 20:31 scalac*
    -rwx------+ 1 SYSTEM SYSTEM 5.3K Jun  6 20:31 scaladoc*
    -rwx------+ 1 SYSTEM SYSTEM 5.3K Jun  6 20:31 scalap*
    

    Command is working, but not in the for loop.

    $ for f in $(ls | egrep -v .bat); do echo $f; done
    fsc
    fsc.bat
    scala
    scala.bat
    scalac
    scalac.bat
    scaladoc
    scaladoc.bat
    scalap
    scalap.bat
    scalac
    scalac.bat
    scaladoc
    scaladoc.bat
    scalap
    scalap.bat
    

    DEBUG

    $ set -x

    mike@pc /cygdrive/c/Program Files (x86)/scala/bin
    $ for f in $(ls | egrep -v .bat); do echo $f; done
    ++ ls -hF --color=tty
    ++ egrep --color=auto -v .bat
    + for f in '$(ls | egrep -v .bat)'
    + echo fsc
    fsc
    + for f in '$(ls | egrep -v .bat)'
    + echo fsc.bat
    fsc.bat
    // and so on
    
    • Mat
      Mat almost 11 years
      I strongly suggest you look at things in the following question: unix.stackexchange.com/questions/47151/…; except all the solutions based on ls.
    • Marek Zakrzewski
      Marek Zakrzewski almost 11 years
      @Mat that doesn't explain how you should do it within a for loop.
    • mike
      mike almost 11 years
      @MBR Why did it work for you? Did you also use bash? Can anyone explain that?
    • MBR
      MBR almost 11 years
      @mike I am also using GNU bash (though a more recent version, 4.2.25). No idea why it works in my case and not in yours... Anyway, val0x00ff answer seems to be satisfying.
    • mike
      mike almost 11 years
      Hmm, I'm still interested in why exactly the fail occurred.
    • slm
      slm almost 11 years
      This works for me too. Why did you drop the double quotes wrapping ".bat" in the for loop?
    • mike
      mike almost 11 years
      Just by chance, it did not change anything though.
    • Raul Santelices
      Raul Santelices almost 6 years
      Applications such as the antivirus have been suggested as possible causes of this behavior in Cygwin. See stackoverflow.com/questions/48927435/…
  • mike
    mike almost 11 years
    I get bash: !: event not found . Why shouldn't I use ls in that way?
  • Marek Zakrzewski
    Marek Zakrzewski almost 11 years
    because you have not enabled extglob See above.
  • mike
    mike almost 11 years
    Thx! And for completeness: one should not use ls in a for loop the way I did it, because some file names could contain newline characters.
  • mike
    mike almost 11 years
    It was not about the printing. I wanted to dynamically create symlinks to the scala files. I guess that's also possible with find and exec, but looping was easier for me ;D
  • mike
    mike almost 11 years
    I added debug data to the question. The single quotes did not change anything. The spaces are in the path name, but I'm doing a local operation here. The file names are spaces-free, I don't think that is the problem.
  • slm
    slm almost 11 years
    @val0x00ff - on technical merits I 100% agree with you, but when doing one-off scripts, I've done this exact thing and never had any issues. As long as you understand the implications as the operator it doesn't matter. Just my $0.02. If a script will only ever see the light of day on a single system, the different implementations of ls don't matter either.
  • slm
    slm almost 11 years
    @mike - what does $IFS look like? echo $IFS.
  • mike
    mike almost 11 years
    @slm ...edited this comment. It's not empty, it's a \n.
  • Stéphane Chazelas
    Stéphane Chazelas almost 11 years
    You're missing the fact that globbing is performed upon command substitution since it was not disabled and that ls was aliased to ls -F.
  • Stéphane Chazelas
    Stéphane Chazelas almost 11 years
    globstar is not needed here.
  • slm
    slm almost 11 years
    @StephaneChazelas - I don't doubt you're right, but I've never grasped the subtlety of the ins/outs of what's going on here. If I swap in an ls -F to my example it still works. I don't understand your comment about the globbing being performed upon command substitution. I suspect others also miss these details, which is why we seem to see this family of questions so frequently here. A canonical answer is definitely needed on this topic (you've pretty much started one with your A to this Q).
  • mike
    mike almost 11 years
    Okay, quite a few things I should improve. But to get it straight. The loop did not fail per se. The grep did work, but because of the * added by ls -F the variable f in $f got expanded. So either stop the expansion or remove the F flag from the ls command. The opposite command, i. e. without the v flag in grep, did work, because there was nothing to expand a file name matching <filename>.bat* to.
  • Stéphane Chazelas
    Stéphane Chazelas almost 11 years
    @mike, no, the wildcards (as added by ls -F) got expanded upon the command substitution ($(...)). If there had been a file called scala*, that wildcard would also have been expanded upon the expansion of $f (and you'd have seen all the filenames starting with scala separated by spaces on one line)
  • mike
    mike almost 11 years
    Oh, of course. That makes total sense considering the debug output.