Grep and ignoring leading whitespace


Solution 1

Just use awk. Using grep seems redundant to me, since awk can already match a regular expression, and its default field splitting ignores leading whitespace, so the hex value is always $3:

awk '$0~/\s*\#define\s*\[.*\]\s*.*/ {print $3}' *.h

Going through the expression in more detail:

$0 ~ /regexp/                   # look for the regular expression in the record
      \s*                       # whitespace, any number of times (\s is a GNU awk extension)
         \#define               # literal string; the backslash before '#' is optional
                 \s*            # same as above
                    \[.*\]      # the name, enclosed in literal square brackets
                          \s*   # same as above
                             .* # any character, any number of times; this is
                                # your hex code and you can refine the regex here
{ print $3 }                    # print the third field if the record matches
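
As a hedged variation on the expression above, the sketch below anchors the match at the start of the line and uses the bracket expression `[ \t]` instead of `\s`, on the assumption that the indentation consists only of spaces and tabs. The file `sample.h` and its contents are made up for illustration.

```shell
# Hypothetical sample header with flush, space-indented and tab-indented defines.
tmp=$(mktemp -d)
printf '   #define [name] 0x0001\n#define [other] 0x0002\nint x;\n\t#define [tabbed] 0x0003\n' > "$tmp/sample.h"

# ^[ \t]*   optional indentation (spaces or tabs only, by assumption)
# #define   literal text; '#' needs no escaping inside an awk regex
# \[.*\]    the name in literal square brackets
# awk's default field splitting skips leading blanks, so the value is $3
# whether or not the line is indented.
values=$(awk '/^[ \t]*#define[ \t]+\[.*\]/ { print $3 }' "$tmp/sample.h")
printf '%s\n' "$values"

rm -rf "$tmp"
```

The `int x;` line is skipped because it does not match, and all three defines yield their third field regardless of indentation.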

To run this recursively, take a directory tree such as:

mkdir -p a/b/c
echo "   #define [name] 0x0001" > a/a.h
echo "   #define [name] 0x0002" > a/b/b.h
echo "   #define [name] 0x0003" > a/b/c/c.h
tree
.
└── a
    ├── a.h
    └── b
        ├── b.h
        └── c
            └── c.h

3 directories, 3 files

Since awk needs to be given a list of files to operate on, you could use find:

find . -type f -name "*.h" \
  -exec awk '$0~/\s*\#define\s*\[.*\]\s*.*/ {print $3}' {} \;
0x0002
0x0003
0x0001
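
A possible refinement, sketched under the same assumptions about the file layout: ending `-exec` with `{} +` instead of `{} \;` lets find pass many files to a single awk invocation, and awk's built-in `FILENAME` variable shows which header each value came from. The temporary directory and the portable `[ \t]` pattern are illustrative choices, not part of the original answer.

```shell
# Recreate the example tree in a throwaway directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b/c"
echo "   #define [name] 0x0001" > "$tmp/a/a.h"
echo "   #define [name] 0x0002" > "$tmp/a/b/b.h"
echo "   #define [name] 0x0003" > "$tmp/a/b/c/c.h"

# '{} +' batches the file names into one awk call; FILENAME is the file
# awk is currently reading. find's order is unspecified, so sort the result.
found=$(cd "$tmp" && find . -type f -name '*.h' \
  -exec awk '/^[ \t]*#define[ \t]+\[.*\]/ { print FILENAME ": " $3 }' {} + | LC_ALL=C sort)
printf '%s\n' "$found"

rm -rf "$tmp"
```

Sorting is only there to make the output order predictable; drop it if the order does not matter.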

Solution 2

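Use grep -o to print only the matching part of the line.

Leave the leading \s* out of the pattern, because you don't want the whitespace in the output, and let the pattern run on past the name (for example with .*) so that the match still includes the value.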
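
A minimal sketch of this approach, assuming GNU grep (for `-r`, `-h`, `-o` and `--include`) and two made-up header files. The pattern starts at `#define` and runs to the end of the line, so the indentation is dropped but the value is kept.

```shell
# Throwaway tree with one indented and one flush define.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b"
echo "   #define [name] 0x0001" > "$tmp/a/a.h"
echo "#define [name] 0x0002"    > "$tmp/a/b/b.h"

# -r recurse, -h omit the "file:" prefix, -o print only the matched text.
# The match begins at "#define", so leading whitespace is never printed.
matches=$(grep -rho --include='*.h' '#define[[:space:]].*' "$tmp" | LC_ALL=C sort)
printf '%s\n' "$matches"

# Every match now starts with "#define", so the value is reliably field 3.
values=$(printf '%s\n' "$matches" | awk '{ print $3 }')
printf '%s\n' "$values"

rm -rf "$tmp"
```

Dropping `-h` brings the file name back as a prefix; because the match starts at `#define`, the prefix then always fuses with it into a single field and the value stays in `$3`.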


Author: Gammerx

Updated on September 18, 2022

Comments

  • Gammerx
    Gammerx over 1 year

    I've been working on a bash script that searches for the names of defines, grabs their hex values, and puts them in a list. Once I have the list of names, I search for "#define [name]" using -w to ensure an exact match, and then use awk '{ print $3 }' to grab the hex value.

    This works if the line in the header file looks like

    a.h:#define [name] 0x0001
    

    But it does NOT work if it is similar to

    a.h:    #define [name] 0x0001
    

    How can I get around this? I have tried this

    grep -nrw "\s*#define[[:space:]]*$p" . --include=*.h | awk '{ print $3 }'
    

    I thought the \s* would ignore the leading whitespace before #define but it doesn't. Am I doing something wrong?

    • Matej Vrzala M4
      Matej Vrzala M4 almost 9 years
      I think you need to use extended grep or the -E flag to use \s
    • Gammerx
      Gammerx almost 9 years
      I am, but I believe the issue comes from leading white space causing #define to be shifted from $1 to $2
    • jimmij
      jimmij almost 9 years
      You should describe the problem better; it is hard to understand what you want to achieve.
  • Gammerx
    Gammerx almost 9 years
    Problem is that with -o, now I don't get the hex value since it only grabs #define and $p
  • Gammerx
    Gammerx almost 9 years
    can this handle recursive as well? (checking *.h from subdirs)
  • Gammerx
    Gammerx almost 9 years
    Very well done, this is awesome. Based on some suggestions and testing, I have found that grep -Ro '#define\s*\(.*\)' *.h works. Getting the entire line with -o was tricky, but if you tell the pattern to include everything that comes after #define, you can just use awk to get field 3. Before, because grep prefixes each match with the header file name, a space before #define would shift the fields and mess it up.
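
The field shift described in this thread can be reproduced without any header files. The sketch below feeds awk two lines shaped like the grep output quoted in the question; the `$NF` variant at the end is one possible workaround and assumes the hex value is always the last token on the line.

```shell
# With no indentation, the "a.h:" prefix fuses with "#define" into field 1.
flush=$(printf '%s\n' 'a.h:#define [name] 0x0001' | awk '{ print $3 }')

# With indentation, "a.h:" becomes a field of its own and everything
# shifts right by one, so $3 is now the name rather than the value.
indented=$(printf '%s\n' 'a.h:    #define [name] 0x0001' | awk '{ print $3 }')

# Assumption: the value is the last field, so $NF ignores the shift.
last=$(printf '%s\n' 'a.h:    #define [name] 0x0001' | awk '{ print $NF }')

echo "flush:    $flush"
echo "indented: $indented"
echo "last:     $last"
```

So the `\s*` in the grep pattern was never the problem: the line matched either way, and it was the extra field in front of `#define` that made `$3` point at the wrong token.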