Regular expression problem(s) in Bash: [^negate] doesn't seem to work

Solution 1

Are you sure what you want is happening? When you run ls /directory | grep '[^term]' you are grepping for lines that contain at least one character other than t, e, r, or m. This means that if a file has any other character in its name, it will still appear in the output. Take the following directory for instance:

$ ls
alpha  brave  bravo  charlie  delta

Now if I run ls |grep '^[brav]' I get the following:

$ ls |grep '^[brav]'
alpha
brave
bravo

As you can see, not only did I get brave and bravo, I also got alpha, because the character class [brav] matches any single character from that list, and alpha begins with a.

Conversely, if I run ls |grep '[^brav]' I will get every file whose name contains at least one character that is not b, r, a, or v.

$ ls |grep '[^brav]'
alpha
brave
bravo
charlie
delta

Notice that it included the entire directory listing, because every file name has at least one character that is not in the character class.
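One way to see that a character class only ever matches a single character is to let grep print just the matched parts with -o. This is a minimal sketch, assuming a grep that supports -o (GNU and BSD grep do); the names function is a stand-in for ls in the example directory and is not part of the original answer.

```shell
# Stand-in for the example directory listing (printf replaces ls here).
names() { printf '%s\n' alpha brave bravo charlie delta; }

# -o prints each match on its own line; every match is exactly one character.
# In "brave" only the "e" is outside the class, yet that is enough to select the line.
names | grep 'brave' | grep -o '[^brav]'

# -c counts the lines that contain at least one character outside the class.
names | grep -c '[^brav]'
```

The first command prints e and the second prints 5, which is why the whole listing comes back.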

So, as Kanvuanza said, to exclude lines that contain the string "term", as opposed to the individual characters t, e, r, m, you should use grep -v.

For instance:

$ ls |grep -v 'brav'
alpha
charlie
delta

Also, if you don't want the files that have any of the characters in the class, use grep -v '[term]'. That will filter out every file whose name contains at least one of those characters. (Kanvuanza's answer)

For instance:

$ ls |grep -v '[brav]'

As you can see there were no files listed because all the files in this directory included at least one letter from that class.
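For comparison, the same filter can be written without -v by requiring the whole line to consist of characters outside the class. This is a sketch, not part of the original answers; the names function stands in for ls, and golf and kilo are hypothetical file names added so that something survives the filter.

```shell
# Example listing plus two hypothetical names (golf, kilo) that avoid b, r, a and v.
names() { printf '%s\n' alpha brave bravo charlie delta golf kilo; }

# Inverted match: drop every line that contains b, r, a or v.
names | grep -v '[brav]'

# Regex only: from start (^) to end ($), the line may hold nothing but characters outside the class.
names | grep '^[^brav]*$'
```

Both commands print golf and kilo. The anchors are what turn "some character is outside the class" into "every character is outside the class".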

Addendum:

I wanted to add that with PCRE (grep -P in GNU grep) it is possible to do the negation in the regex itself. To do this you would use something known as a negative look-ahead: (?!<regex>).

So, using the example above, you can do something like this to get the results you want without using grep -v.

$ ls | grep -P '^(?!brav)'
alpha
charlie
delta

To deconstruct that regex: ^ matches the start of a line, and the negative look-ahead (?!brav) then succeeds only if the text that immediately follows is not brav. Only alpha, charlie, and delta do not start with brav, so those are the only ones that are printed.
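Because the look-ahead is checked only at the start of the line, a name that contains brav further in would still be printed. A sketch of the difference, assuming GNU grep built with PCRE support; the names function stands in for ls and unbrave is a hypothetical file name added for illustration.

```shell
# Example listing plus a hypothetical name (unbrave) that contains "brav" but does not start with it.
names() { printf '%s\n' alpha brave bravo charlie delta unbrave; }

# Look-ahead directly after ^: rejects only names that START with "brav", so unbrave is kept.
names | grep -P '^(?!brav)'

# With .* inside the look-ahead, "brav" anywhere in the line causes a rejection, like grep -v 'brav'.
names | grep -P '^(?!.*brav)'
```

The first command prints alpha, charlie, delta and unbrave; the second prints only alpha, charlie and delta.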

Solution 2

I guess that the grep -v flag does what you want. From the man page:

-v, --invert-match
    Invert the sense of matching, to select non-matching lines.

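You can use ls /directory | grep -v '[term]' to print only the non-matching lines, that is, the names that contain none of t, e, r, or m. Quote the pattern so that the shell does not expand [term] as a glob before grep sees it.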
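Why the quotes matter can be checked with echo: if the current directory happens to contain a file named t, e, r, or m, an unquoted [term] is replaced by that file name before grep runs. A minimal sketch in a throwaway directory, assuming mktemp -d is available; the file names t and term.txt are made up for the demonstration.

```shell
# Throwaway directory containing a file literally named "t" (hypothetical example files).
dir=$(mktemp -d)
touch "$dir/t" "$dir/term.txt"

# Unquoted: the shell expands [term] as a glob; the only single-character match here is "t".
unquoted=$(cd "$dir" && echo grep -v [term])

# Quoted: the pattern is passed through unchanged.
quoted=$(cd "$dir" && echo grep -v '[term]')

printf '%s\n' "$unquoted" "$quoted"
rm -r "$dir"
```

This prints grep -v t followed by grep -v [term], so in such a directory the unquoted command would silently filter on the letter t alone.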

Author by

erch

Updated on September 18, 2022

Comments

  • erch
    erch over 1 year

    When I execute ls /directory | grep '[^term]' in Bash I get a regular listing, as if the grep command is ignored somehow. I tried the same thing with egrep, I tried to use it with double and single quotes, but with no better results. When I try ls /directory | grep '^[term]' I get all entries beginning with term - as expected.

    I have tried out this command in an online editor, where I can test my regex and it worked as it should. But not in Bash. So it works in a simulation, but not in real life.

    I work on Crunchbang Linux 10. I hope this is enough information and am looking forward to every hint, because failing to execute on such a basic level and wasting hours of time is really frustrating!

    • Bernhard
      Bernhard about 11 years
      I am confused because of the negate in the title. Do you want to grep lines starting with term? Or do you want to grep for lines not containing term at all?
    • erch
      erch about 11 years
      @Bernhard: I want a listing without the term in the square brackets. It doesn't have to be 'term' exactly! As far as I understood it, [^abc] means that anything containing a, b or c or any combination of it should not be in the listing.
  • erch
    erch about 11 years
    I am aware of this option, but am I wrong in assuming that [^xyz] is the opposite of [xyz] and should work in any case? I also want to avoid editing any settings anywhere on such a basic level. Using an inverting option and/or editing settings sure is a nice way around, but as far as I understood it, this should work without, out of the box.
  • erch
    erch about 11 years
    This means if a file has other letters in its name it will still appear in the output of ls. This answers quite a few questions! :) So the best way for the moment seems to be the -v option. Thanks for your support! This question really ruined my afternoon, where your answer brightens my evening!
  • erch
    erch about 11 years
    Grep support for negation is horrible! And these are the hints that are the real icing on the cake. I have the same issues with egrep and I'm far away from using [at least for me seemingly] more advanced commands at the moment. Can you suggest a command that provides better results and less headache?
  • vonbrand
    vonbrand about 11 years
    @cellar.dweller, grep's handling of character classes is just fine. It just means something quite different than what you (mis)understand. [abc] means one of a, b, or c; [^abc] means anything but the above. It is one character.
  • tink
    tink about 11 years
    @cellar.dweller: I think your biggest issue is a misunderstanding of regex, specifically character classes within regex.
  • vonbrand
    vonbrand about 11 years
    @Kanvuanza, the correct syntax is e.g. [^[:digit:]], and my (limited) tests show it works for all the standard classes.
  • erch
    erch about 11 years
    @vonbrand I just tried ls /dir | grep '[^[:digit:]]' and got a regular listing which contains also files with numbers in it. Same goes for egrep.
  • vonbrand
    vonbrand about 11 years
    @cellar.dweller, you are asking for all files whose names include one non-digit. The only files which won't match are those with names only digits.
  • erch
    erch about 11 years
    After trying out quite a few combinations I finally seem to have understood it, and also where my problem in understanding it was. It is day two for me in learning about Regular Expressions and I am grateful for your patience. Thank you very much.
  • Pedro Lacerda
    Pedro Lacerda about 11 years
    @vonbrand, now I understand it. Thank you too.
  • Abhishek Kashyap
    Abhishek Kashyap over 5 years
    +1 for negative look-ahead regex.
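vonbrand's point in the comments about [^[:digit:]] can be checked the same way. A small sketch with made-up names standing in for a directory listing; none of these names come from the original thread.

```shell
# Hypothetical file names: two made of digits only, one mixed, one without digits.
names() { printf '%s\n' 2023 notes file1 42; }

# At least one non-digit somewhere in the name: selects notes and file1.
names | grep '[^[:digit:]]'

# Inverted: names with no non-digit, i.e. digits only: selects 2023 and 42.
names | grep -v '[^[:digit:]]'

# Names containing no digit at all: selects notes.
names | grep -v '[[:digit:]]'
```

The first command is the one from the comments: file1 is still listed because its letters are non-digits, and only the all-digit names drop out.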