Bash scripting grep -E "[a-z,A-Z,0-9\.\-]{2,}" /etc/hostname

Here's what your regular expression means, from left to right:

[

start of a character set (or character class). It matches one character from the set.

a-z,A-Z,0-9

inside a character set means match any one character a-z or A-Z or 0-9. The commas are not separators: each one is a literal comma, so the set also matches a comma. Leave them out unless you are trying to literally match a comma.

\.\-

. is a special character that matches any character, but inside a character set it has no special meaning and doesn't have to be escaped. The - doesn't have to be escaped either: as the first or last character in a set it matches a literal -, and it only takes on special meaning between two other characters in a set. With -E a backslash inside a set is not an escape but a literal character, so as written the set also matches a backslash.

]

end of the character set. As written, the set matches any one character a-z or A-Z or 0-9, or a comma, a backslash, . or -.

{2,}

is a quantifier. It means that the preceding expression (here, the character set) must match 2 or more times in a row.
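To see what the quantifier changes, here is a minimal sketch that runs the cleaned-up set against a made-up sample line (the sample text and variable names are assumptions for illustration, not part of the original script). The -o flag prints only the matched parts, one per line.

```shell
# Made-up sample: a one-character token, a three-character token, a hostname-like token.
sample='a b-c my.host'

# {2,} requires at least two characters from the set in a row.
two_or_more=$(printf '%s\n' "$sample" | grep -oE '[a-zA-Z0-9.-]{2,}')

# {1,} (the same as +) also accepts the single-character token.
one_or_more=$(printf '%s\n' "$sample" | grep -oE '[a-zA-Z0-9.-]{1,}')

printf 'with {2,}:\n%s\n' "$two_or_more"
printf 'with {1,}:\n%s\n' "$one_or_more"
```

Without -o, grep prints the whole line as soon as any part of it matches, so changing the quantifier mostly changes which part of the line is highlighted, not whether the line is printed.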

So, assuming commas and backslashes are not meant to match, the command can be cut down to this:

grep -E "[a-zA-Z0-9.-]{2,}" /etc/hostname
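The shorter pattern is not strictly equivalent to the original, because the original set also accepts a comma and a backslash. A rough sketch of the difference, using made-up test strings and a hypothetical helper called count_matches; the ^ and $ anchors are added here only so that the whole test string has to match:

```shell
original='^[a-z,A-Z,0-9\.\-]{2,}$'
simplified='^[a-zA-Z0-9.-]{2,}$'

# count_matches PATTERN TEXT (hypothetical helper): print how many lines of TEXT match PATTERN.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_matches() {
    printf '%s\n' "$2" | grep -cE "$1" || true
}

comma_original=$(count_matches "$original" 'a,b')
comma_simplified=$(count_matches "$simplified" 'a,b')
backslash_original=$(count_matches "$original" 'a\b')
backslash_simplified=$(count_matches "$simplified" 'a\b')
host_simplified=$(count_matches "$simplified" 'host-01.example')

echo "comma:     original=$comma_original simplified=$comma_simplified"
echo "backslash: original=$backslash_original simplified=$backslash_simplified"
echo "hostname:  simplified=$host_simplified"
```

For hostname-like input the two patterns behave the same; they differ only on strings that contain a comma or a backslash.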

When used with the -P flag, grep interprets the pattern as a Perl-compatible regular expression (this is a GNU grep feature and needs a build with PCRE support). Perl regular expressions are nearly identical to Python regex. It's a more powerful mode than -E in my opinion. In Perl mode \d stands for a digit, so your command becomes:

grep -P "[a-zA-Z\d.-]{2,}" /etc/hostname
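As a quick check that the two modes agree on ordinary input, here is a small sketch with a made-up hostname. It assumes nothing about your grep beyond -E; the -P part is probed first, because some grep builds do not support it.

```shell
# Made-up sample hostname.
sample='host-01.example'

# Probe for -P support before relying on it.
if printf '1\n' | grep -qP '\d' 2>/dev/null; then
    ere_match=$(printf '%s\n' "$sample" | grep -oE '[a-zA-Z0-9.-]{2,}')
    pcre_match=$(printf '%s\n' "$sample" | grep -oP '[a-zA-Z\d.-]{2,}')
    printf 'ERE:  %s\nPCRE: %s\n' "$ere_match" "$pcre_match"
else
    pcre_match='unavailable'
    echo 'this grep does not support -P'
fi
```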
Author by

Can Buyukburc

Updated on September 18, 2022

Comments

  • Can Buyukburc
    Can Buyukburc over 1 year

    I have been working on a script and trying to understand it. Here is a piece I could not understand:

    grep -E "[a-z,A-Z,0-9\.\-]{2,}" /etc/hostname

    I do understand that it tries to get data from /etc/hostname, and that anything that starts with a-z or A-Z or numbers can match.

    But the part starting with:

    \.\-]{2,}

    is what I could not work out. Can anyone explain what it means?

    • Can Buyukburc
      Can Buyukburc about 6 years
      I also experimented and realised that when I play with {2,} the coloring of the grep output changes. For example, {1,} makes the first part of the domain name red, while the dot and the rest stay the normal color. When I removed the "/./" part, only the dot stays the normal color and the rest becomes red. Red is the color the terminal uses for grep matches.
    • Hauke Laging
      Hauke Laging about 6 years
      "Anything that starts with a-z or A-Z or numbers can be." No. If you want to mark the beginning of a string then you need ^. Without that the pattern can be anywhere in the string.
    • roaima
      roaima about 6 years
      The sub-pattern will also match a comma
  • Can Buyukburc
    Can Buyukburc about 6 years
    When I use {1,} it colors the first part but prints the whole line. When I use {,1} it again colors all of it. When I use {,} it again colors all of it.
  • smw
    smw about 6 years
    +1 ... and neither is dash, provided it is either the first or last character in the bracket range. Also probably worth noting that the commas are literal as well and their repetition is unnecessary (in case the OP believes they are acting as range separators).
  • Can Buyukburc
    Can Buyukburc about 6 years
    What should I look for so that I can understand the mentality or rule for the part {2,}?
  • nagamani
    nagamani about 6 years
    The "mentality" of {2,} is that whoever wrote the line of code you're asking about wanted to match [a-zA-Z0-9.-] two or more times. In other words, he didn't want single-character matches. If you want to learn about the "mentality or rule" of regular expressions in general, pythex.org and the relevant Wiki pages are as good a start as any.
  • Benjamin W.
    Benjamin W. over 5 years
    You could also use [[:alnum:],.-] (if the comma is desired).
  • Kusalananda
    Kusalananda over 5 years
    The set also matches commas and backslashes.
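Two of the points raised in the comments, anchoring with ^ and using [[:alnum:]], can be sketched together as follows. The sample line and variable names are made up for illustration, and the $ anchor is an addition here so that the entire line has to consist of set characters.

```shell
# [[:alnum:]] stands for letters and digits in the current locale.
unanchored='[[:alnum:].-]{2,}'
anchored='^[[:alnum:].-]{2,}$'

# Made-up line with characters outside the set around a hostname-like token.
line='!! my-host !!'

# grep -c prints the number of matching lines; "|| true" covers the zero-match exit status.
unanchored_count=$(printf '%s\n' "$line" | grep -cE "$unanchored" || true)
anchored_count=$(printf '%s\n' "$line" | grep -cE "$anchored" || true)

echo "unanchored: $unanchored_count matching line(s)"
echo "anchored:   $anchored_count matching line(s)"
```

The unanchored pattern accepts the line because my-host appears somewhere in it, while the anchored one rejects it.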