Matching a number sequence in grep
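The {4} interval is extended regular expression (ERE) syntax. By default grep uses basic regular expressions, where an unescaped brace is a literal character, so it will not treat {4} as a repetition count unless you specify the -E option: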
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
Try:
grep -E '[0-9]{4}'
Example:
$ echo abcd1234abcd | grep -o -E '[0-9]{4}'
1234
You can also use the [:digit:] character class to avoid problems with locales where the order of the symbols could be different:
$ echo abcd1234abcd | grep -o -E '[[:digit:]]{4}'
1234
If for any reason you don't want to use extended regular expressions, you can spell out the repetition instead:
grep -o '[0-9][0-9][0-9][0-9]'
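Basic regular expressions also support the interval, provided the braces are escaped as \{4\}. A minimal sketch, assuming a grep that supports -o (as the examples above do); the variable names are illustrative only:

```shell
# Sample input: the line quoted in the question.
line="<span class='t-d'>1994-Oct-28</span>"

# BRE interval: the braces are backslash-escaped, so -E is not needed.
year_bre=$(printf '%s\n' "$line" | grep -o '[0-9]\{4\}')

# ERE interval: same match, unescaped braces plus -E.
year_ere=$(printf '%s\n' "$line" | grep -o -E '[0-9]{4}')

echo "$year_bre $year_ere"
```

Both commands should print the same four-digit match, because "28" is too short to satisfy the interval.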
Yonathan Klijnsma
Updated on September 18, 2022
So I'm trying to match a year number sequence with grep and this should be easy. I'm just a bit stumped that my simplistic regex isn't working.
What I'm doing is running a tool which archives some files but it needs to check for the date of the file to put it in the correct directory. I already have properly formatted input which comes to me as:
<span class='t-d'>1994-Oct-28</span>
This is just one example. When I have this, I want to grab just the 1994 part of it and use it to continue archiving to the correct year. I was assuming something like this would be sufficient:
grep -o '[0-9]{4}'
But this doesn't seem to match anything. When I try something like:
grep -o '[0-9]'
it matches all the separate numbers, so 1 9 9 4 2 and 8.
So my syntax is wrong, but as far as I know this should match a digit from 0 to 9 four times, with {} specifying the repetition count either as a range or as an exact number. If someone could help me with this simple syntax it would be highly appreciated.
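For the archiving step described above, one possible sketch uses the -E form of the pattern and keeps only the first match. The archive_root directory name is a hypothetical placeholder, not something from the original tool:

```shell
# Input line in the format shown in the question.
line="<span class='t-d'>1994-Oct-28</span>"

# Take the first run of four digits as the year; head guards against
# additional four-digit matches later on the same line.
year=$(printf '%s\n' "$line" | grep -o -E '[0-9]{4}' | head -n 1)

# archive_root is a hypothetical destination chosen for this sketch.
archive_root="archive"
target_dir="$archive_root/$year"

echo "$target_dir"
```

The resulting path can then be handed to whatever command moves the file into place.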