grep equivalent of the kwrite regex [A-Z][A-Z]+
Solution 1
You're using the right syntax in your first example; the problem is that + is only considered special when using "extended" regular expressions. From the man page of the GNU implementation of grep:
Basic vs Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
(\?, \+, and \| are non-standard GNU extensions though.)
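To make the consequence concrete, here is a small sketch of what an unescaped + does in each mode. The two input lines are made up for illustration, and LC_ALL=C is an assumption added so that [A-Z] covers only ASCII capitals regardless of locale.

```shell
# Two hypothetical input lines: one contains a literal "+", one does not.
input=$(printf '%s\n' 'AB+ literal plus' 'ABC no plus')

# Basic regular expressions: the unescaped + is an ordinary character,
# so only the line that really has "+" after two capitals is selected.
literal_hit=$(printf '%s\n' "$input" | LC_ALL=C grep '^[A-Z][A-Z]+')

# Extended regular expressions: + means "one or more", so both lines match.
ere_hits=$(printf '%s\n' "$input" | LC_ALL=C grep -Ec '^[A-Z][A-Z]+')

echo "$literal_hit"   # AB+ literal plus
echo "$ere_hits"      # 2
```

This is why the pattern from the question printed nothing: none of the lines contained a literal + after two capitals.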
So, you either need to escape the + (assuming GNU grep or compatible):
$ grep "^[A-Z][A-Z]\+" filename
Or use the standard \{1,\} equivalent of GNU's \+:
$ grep '^[A-Z][A-Z]\{1,\}' filename
or, in this case, even:
$ grep '^[A-Z]\{2,\}' filename
Or turn on extended regular expressions, by passing grep the -E flag or just running egrep (egrep is the command that introduced those extended regular expressions in the late 70s):
$ grep -E "^[A-Z][A-Z]+" filename
$ egrep "^[A-Z][A-Z]+" filename
In any case, all those would be functionally equivalent to:
$ grep '^[A-Z][A-Z]' filename
So you don't even need the + operator.
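A quick way to check that claim is to run the variants against the same file and compare what they select. The sketch below uses a made-up file name (sample.txt) and made-up contents; LC_ALL=C is an added assumption so that [A-Z] means ASCII capitals only. It uses only the portable forms, not GNU's \+.

```shell
# Hypothetical sample file; name and contents are for illustration only.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'CAPITALSFOLLOWING anewline.' \
  'CAPI TALSFOLL owing ANEW line.' \
  'Only one capital.' \
  'no capitals here.' > "$sample"

# Basic regex with the standard interval, extended regex with +,
# and the shortened pattern with no repetition operator at all.
bre_interval=$(LC_ALL=C grep '^[A-Z][A-Z]\{1,\}' "$sample")
ere_plus=$(LC_ALL=C grep -E '^[A-Z][A-Z]+' "$sample")
two_only=$(LC_ALL=C grep '^[A-Z][A-Z]' "$sample")

if [ "$bre_interval" = "$ere_plus" ] && [ "$ere_plus" = "$two_only" ]; then
  echo "same lines selected"
fi
printf '%s\n' "$two_only"

rm -r "$tmpdir"
```

The equivalence holds because grep prints whole lines: once two leading capitals are found, whatever follows cannot change whether the line is selected.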
In your other example you tried:
$ grep "^[A-Z][A-Z]*" filename
* works in basic regular expressions, but it matches 0 or more times, not 1 or more. The solution in your answer works because it says "match a capital, then another capital, then 0 or more capitals". The method in the question says "match a capital, then 1 or more capitals", which is the same. You can also use {min,max} to specify exactly how many you want, and if you leave out max it allows any number (this also requires extended regular expressions):
$ egrep "^[A-Z]{2,}" filename
(As a history note, egrep didn't support {min,max} initially (and still doesn't in Solaris 11 /bin/egrep for instance). \{min,max\} support was added to grep before {min,max} was added to egrep (which in the case of egrep did break backward compatibility).)
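The difference between * and the "two or more" forms discussed above can be seen by counting matching lines. This is a sketch on a made-up three-line file (the name sample.txt is hypothetical), with LC_ALL=C added as an assumption to keep [A-Z] limited to ASCII capitals.

```shell
# Hypothetical sample file; name and contents are for illustration only.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'CAPITALSFOLLOWING anewline.' \
  'Only one capital.' \
  'no capitals here.' > "$sample"

# One capital, then zero or more: a single leading capital is enough.
star_count=$(LC_ALL=C grep -c '^[A-Z][A-Z]*' "$sample")

# Two capitals, then zero or more: at least two leading capitals.
fixed_count=$(LC_ALL=C grep -c '^[A-Z][A-Z][A-Z]*' "$sample")

# Extended-regex interval: two or more capitals.
interval_count=$(LC_ALL=C grep -Ec '^[A-Z]{2,}' "$sample")

echo "$star_count $fixed_count $interval_count"   # 2 1 1

rm -r "$tmpdir"
```

The first pattern also counts "Only one capital.", which is the unwanted behaviour described in the question.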
Solution 2
You just need to add an extra [A-Z]. So, it's
me@ROOROO:~/$ grep "^[A-Z][A-Z][A-Z]*" filename
Matthew
Updated on September 18, 2022
Comments
-
Matthew over 1 year
So, it took me ages, but I finally learned to think in terms of regular expressions, thanks to using them in kwrite. But I still don't know how to translate that knowledge to grep. I love my grep, when I know what I'm doing with it, but the manual has always given me a headache.
I'd like to match stuff like the following lines:
CAPITALSFOLLOWING anewline.
CAPI TALSFOLL owing ANEW line.
That is, lines that begin with two or more capital letters. But I can't figure out how.
In kwrite, I would match these lines using:
\n[A-Z][A-Z]+
But grep... hmm. I have a feeling it's something like:
me@ROOROO:~/$ grep "^[A-Z]something" filename
but
me@ROOROO:~/$ grep "^[A-Z][A-Z]+" filename
doesn't work (returns an empty file). A Google search for the term 'grep match one or more occurrence' led me to believe that
me@ROOROO:~/$ grep "^[A-Z][A-Z]*" filename
was the right syntax. But, alas, that doesn't do the trick.
-
Gilles 'SO- stop being evil' about 12 years
In the old days, each tool had its own regexp syntax. By default, grep uses its traditional syntax; use grep -E to have a more habitual syntax where a backslash followed by a non-alphanumeric character is never special.