Match regex in ksh

Solution 1

Ksh has regular expressions, but not in the usual regex syntax (at least not in the ksh88 version shipped with Solaris 10).

if [[ $var = *@(foo|bar)*([0-9]) ]]; then …

In the manual, look under “conditional expressions” for what's inside the brackets and under “file name generation” for the pattern syntax.
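To make the mapping from the regex `(foo|bar)[0-9]*$` to the ksh pattern explicit, here is a minimal sketch. The function name `ends_with_foo_or_bar_digits` is hypothetical and only there for illustration, and the first line is an assumption for readers who try this in bash rather than ksh, where the `@(...)` and `*(...)` operators belong to the `extglob` option.

```shell
# Assumption: the guard only matters under bash, which needs extglob
# for @(...) and *(...); under ksh it does nothing.
if [ -n "${BASH_VERSION:-}" ]; then shopt -s extglob; fi

# Hypothetical helper: succeeds when $1 matches the regex (foo|bar)[0-9]*$
ends_with_foo_or_bar_digits() {
  # *            any leading text (a pattern must match the whole string,
  #              so this plays the role of an unanchored start)
  # @(foo|bar)   exactly one of foo or bar
  # *([0-9])     zero or more digits, running to the end of the string
  [[ $1 = *@(foo|bar)*([0-9]) ]]
}

for s in foo123 abc_foo123z 'bar##3'; do
  if ends_with_foo_or_bar_digits "$s"; then
    echo "$s: match"
  else
    echo "$s: no match"
  fi
done
```

This prints "foo123: match", "abc_foo123z: no match" and "bar##3: no match". Because the pattern has no trailing `*`, nothing other than digits may follow foo or bar, which is what `$` expresses in the regex.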

Solution 2

Using case with glob patterns might work for you. The composite pattern *(pattern-list) means "Matches zero or more occurrences of the given patterns" and @(pattern-list) means "Matches exactly one of the given patterns."

matcher() {
  typeset var="$1"
  case "$var" in
    *@(foo|bar)*([0-9])) print "$var matched" ;;
    *) print "$var did not match" ;;
  esac
}

for var in foo bar baz foo123 abc_foo132 abc_foo123z bar1 1bar1 1bar1a; do 
  matcher "$var"
done

Outputs:

foo matched
bar matched
baz did not match
foo123 matched
abc_foo132 matched
abc_foo123z did not match
bar1 matched
1bar1 matched
1bar1a did not match

Solution 3

Why not use egrep(1)? It gives you everything a regex user could wish for:

if echo "$var" | egrep -s '(foo|bar)[0-9]*$'    # -s means "silent"
then
    ...

Additional note for Solaris: you may want to check the man page for egrep. There is another egrep version, located at /usr/xpg4/bin/egrep, that supports more options and behaves differently when it comes to advanced regex features.
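Since the meaning of `-s` depends on which grep is installed (GNU grep, for instance, uses it only to suppress error messages about unreadable files, not matched lines), a variant that discards the output explicitly may be more predictable. This is a sketch, not a drop-in replacement: the function name `matches_ere` is hypothetical, and it assumes the first grep in PATH accepts `-E` (on Solaris 10 that is likely /usr/xpg4/bin/grep rather than /usr/bin/grep).

```shell
# Hypothetical helper: succeeds when $1 matches the extended regex in $2.
# Assumption: a grep that accepts -E is first in PATH.
matches_ere() {
  # printf sidesteps echo's varying treatment of backslashes and
  # leading dashes; the redirection silences any grep implementation.
  printf '%s\n' "$1" | grep -E -e "$2" >/dev/null 2>&1
}

var=foo123
if matches_ere "$var" '(foo|bar)[0-9]*$'; then
  echo "variable matched regex"
fi
```

Note that grep works line by line, so a value containing a newline counts as matched if any one of its lines matches.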

Solution 4

I did something like this, using sed. I don't know how good it is, but at least it worked ^^

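# Basic regular expressions have no alternation operator, so each
# alternative gets its own address; -n prints only matching lines.
if [ -n "$(echo "$var" | sed -n -e '/foo[0-9]*$/p' -e '/bar[0-9]*$/p')" ]; then
    print "variable matched regex"
fi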

Author: rahmu

Updated on September 18, 2022

Comments

  • rahmu
    rahmu over 1 year

    I am looking to do something like this in KSH:

    if (( $var = (foo|bar)[0-9]*$ )); then
        print "variable matched regex"
    fi
    

    Is it possible at all?

    For the record I'm using Ksh Version M-11/16/88i on a Solaris 10 machine.

    • Admin
      Admin over 12 years
      Do you realize the regular expression [foo|bar] means "match a single character from the set (a,b,f,o,r,|)"? If you mean "match 'foo' or 'bar'" you want (foo|bar)
    • Admin
      Admin over 12 years
      True, didn't notice that. I will update accordingly.
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 12 years
    You need to put double quotes around variable substitutions, as always (otherwise you'll get wrong results or errors on some inputs containing wildcard characters or whitespace). Even with the right quoting, this assumes that the input doesn't contain a newline. Furthermore, your method is highly convoluted; grep is more natural, but there's a way that's built into ksh.
  • Tim Kennedy
    Tim Kennedy over 12 years
    Sometimes Solaris will have GNU egrep installed as well, as gegrep, either from the Companion CD or from SunFreeWare, and usually in /opt/sfw/bin or /usr/local/bin.
  • ktf
    ktf over 12 years
    @Tim: true, but I'd never rely on that. From my experience with customer production systems you'd better stick to what the base OS provides since what we would consider cool and helpful is often not allowed in production. +1 anyway ;-)
  • Tim Kennedy
    Tim Kennedy over 12 years
    100% agreed. I try to always use scripts/programs/perl modules that are installed by default. Keeps things portable.
  • chepner
    chepner over 8 years
    This is a pattern match, not a regular expression match (which uses the =~ operator).
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 8 years
    @chepner It doesn't use regex syntax, but it is a regular expression. Ksh patterns include all regular expression operators.
  • Ali
    Ali over 2 years
    What does the *@ in the regular expression stand for?
  • Gilles 'SO- stop being evil'
    Gilles 'SO- stop being evil' over 2 years
    @Ali In the ksh regular expression syntax (which is not the usual regex syntax), * stands for any sequence of characters and @(foo|bar) matches either foo or bar.
  • user1683793
    user1683793 over 2 years
    In this case, the *([0-9]) says to match zero or more occurrences of what is in the parens, NOT any sequence of characters. Observe that the pattern will not match var="bar##3", whereas it will match when bar is followed by zero or more digits, as in var="bar321".