RegExp testing with ash shell (BusyBox)


Solution 1

Three of the tools you have available can match regular expressions: grep, sed and awk. The examples below all assume that $in contains na-examplename-01.

  1. grep

    $ printf "%s\n" "$in" | ./grep -E '^[a-z]{2,3}-[a-z]+[0-9]*-[0-9]+$'
    na-examplename-01
    
  2. sed

    $ printf "%s\n" "$in" | ./sed -n '/^[a-z]\{2,3\}-[a-z]\+[0-9]*-[0-9]\+$/p'
    na-examplename-01
    
  3. awk

    $ printf "%s\n" "$in" | ./awk '/^[a-z]{2,3}-[a-z]+[0-9]*-[0-9]+$/'
    na-examplename-01
    

Note that these commands test each line of $in separately rather than the content of $in as a whole. For instance, they would match the second and third lines of a $in defined as

in='whatever
xx-a-1
yy-b-2'

As Stéphane pointed out in his answer, it's a good idea to prepend these commands with LC_ALL=C to ensure that your locale does not confuse the character ranges.
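
To act on the result in a script instead of printing the matching line, and to treat a multi-line value as a non-match, something like the following might work. It is only a sketch: is_valid_name is a made-up helper name, and it assumes your grep accepts the -q option alongside -E.

```shell
#!/bin/sh
# is_valid_name is a hypothetical helper name, not part of the answer above.
# It returns 0 when the whole of $1 matches the pattern and 1 otherwise.
is_valid_name() {
  case $1 in
    *'
'*) return 1 ;;  # an embedded newline means more than one line: reject
  esac
  # -q suppresses output, so only the exit status is used.
  printf '%s\n' "$1" | LC_ALL=C grep -qE '^[a-z]{2,3}-[a-z]+[0-9]*-[0-9]+$'
}

in='na-examplename-01'
if is_valid_name "$in"; then
  echo "it matches"
else
  echo "it does not match" >&2
fi
```

The case statement runs before grep because grep on its own would accept any single matching line inside a multi-line value.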

Solution 2

awk sounds like a good candidate:

input='whatever
even spaces
and newlines
xxx-blah12-0' # should not match

input='na-examplename-01' # should match

if
  LC_ALL=C awk '
    BEGIN{
      exit(!(ARGV[1] ~ /^[a-z]{2,3}-[a-z]+[0-9]*-[0-9]+$/))
    }' "$input"
then
  echo it matches
else
  echo >&2 it does not match
fi
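
The same check can be wrapped in a function so it is reusable. This is a sketch rather than part of the original answer: matches_name is a made-up name, and the interval {2,3} is spelled out as [a-z][a-z][a-z]? on the assumption that some awk implementations (older mawk, for example) do not support interval expressions. With BusyBox awk the {2,3} form above should work as written.

```shell
#!/bin/sh
# matches_name is a hypothetical wrapper name.
matches_name() {
  LC_ALL=C awk '
    BEGIN {
      # The match yields 1 or 0; negate it so that a match exits with status 0.
      exit(!(ARGV[1] ~ /^[a-z][a-z][a-z]?-[a-z]+[0-9]*-[0-9]+$/))
    }' "$1"
}

# Sample values chosen for illustration only.
for candidate in 'na-examplename-01' 'xxx-blah12-0' 'toolong-name-1'; do
  if matches_name "$candidate"; then
    echo "$candidate: matches"
  else
    echo "$candidate: does not match"
  fi
done
```

Because the value is compared as a single string in the BEGIN block and never read as input, ^ and $ anchor to the start and end of the whole value, so embedded newlines cause a non-match.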

Solution 3

You could use grep in extended regex mode like this:

echo na-examplename-01 | grep -E '^[a-z]{2,3}-[a-z]+[0-9]*-[0-9]+$'

Use an interval expression to make the pattern easier to read: [a-z][a-z]|[a-z][a-z][a-z] becomes [a-z]{2,3}.

Likewise, [a-z]+ is the same as [a-z][a-z]*.

For the grep syntax, take a look at https://www.gnu.org/software/findutils/manual/html_node/find_html/grep-regular-expression-syntax.html
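
As a rough way to convince yourself that these spellings are equivalent, the sketch below runs the interval form, the fully expanded form, and the unanchored form with -x (which makes grep match whole lines only) against a few inputs. The helper name check and the sample inputs are made up for illustration, and it assumes your grep supports -q and -x.

```shell
#!/bin/sh
# Three spellings of the same check.
interval='^[a-z]{2,3}-[a-z]+[0-9]*-[0-9]+$'
expanded='^([a-z][a-z]|[a-z][a-z][a-z])-[a-z][a-z]*[0-9]*-[0-9][0-9]*$'
unanchored='[a-z]{2,3}-[a-z]+[0-9]*-[0-9]+'   # used with -x, which anchors the whole line

# check is a hypothetical helper: $1 = grep options, $2 = pattern, $3 = input line.
check() {
  if printf '%s\n' "$3" | LC_ALL=C grep -q "$1" "$2"; then
    echo yes
  else
    echo no
  fi
}

# Print one row per input with the verdict of each spelling.
for s in na-examplename-01 abc-x9-7 a-name-1 abcd-name-1 na-name-; do
  printf '%-20s %s %s %s\n' "$s" \
    "$(check -E "$interval" "$s")" \
    "$(check -E "$expanded" "$s")" \
    "$(check -xE "$unanchored" "$s")"
done
```

Each row should show the same verdict in all three columns; a disagreement would point to a mistake in one of the patterns.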

Author by

rosi97

Updated on September 18, 2022

Comments

  • rosi97
    rosi97 almost 2 years

    I need to do a RegExp pattern test on a certain bit of user input. This is the pattern I need to test the value against.

    ^([a-z]{2,3})\-([a-z][a-z]*[0-9]*)\-(\d+)$
    

    An example match would be: na-examplename-01

    The shell I have available is BusyBox a.k.a ash, so I don't have full bash functionality.

    What are my options for RegExp pattern tests when using BusyBox?

    Note: I cannot use expr, as it is not available in my install.

    I have the following functions available:

    arp, ash, awk, basename, bash, bunzip2, bzcat, bzip2, cat, chmod,
    chown, chvt, clear, cp, crond, crontab, cryptpw, cut, date, dd,
    deallocvt, df, dirname, dmesg, dnsdomainname, dos2unix, du, egrep,
    eject, env, fbset, fgconsole, fgrep, find, findfs, flock, free, fstrim,
    ftpget, ftpput, fuser, getopt, grep, groups, gunzip, gzip, head,
    hostname, httpd, hwclock, id, ifconfig, ifdown, ifplugd, ifup, install,
    ionice, iostat, ip, kill, killall, killall5, less, ln, loadkmap,
    logger, login, ls, lsof, md5sum, mkdir, mkdosfs, mkfifo, mkfs.vfat,
    mknod, mkpasswd, mkswap, mktemp, more, mount, mountpoint, mpstat, mv,
    nbd-client, nc, netstat, nice, nohup, nslookup, ntpd, od, pgrep, pidof,
    ping, ping6, pmap, printenv, ps, pstree, pwd, pwdx, rdate, readlink,
    realpath, renice, reset, rm, rmdir, route, sed, seq, setconsole,
    setserial, sh, sleep, smemcap, sort, stat, su, switch_root, sync,
    sysctl, tail, tar, tee, telnet, time, top, touch, tr, traceroute,
    traceroute6, true, ttysize, umount, uname, uniq, unix2dos, unxz,
    uptime, usleep, vconfig, vi, watch, wc, wget, which, whoami, whois,
    xargs, xz, xzcat, zcat
    
    • Stéphane Chazelas
      Stéphane Chazelas about 9 years
      Do you need to extract those 3 groups or just check that the pattern matches?
    • rosi97
      rosi97 about 9 years
      Just to check that the given input matches the format the RegExp pattern describes.
    • rosi97
      rosi97 about 9 years
      Sorry. I've added an example of what the pattern would/should match. The hyphens are part of the pattern. Your break down of the pattern is correct.
    • K1773R
      K1773R about 9 years
      @StéphaneChazelas my regex includes a ^, should be correct.
    • rosi97
      rosi97 about 9 years
      @K1773R is correct. I made a mistake with my RegExp rule. The beginning part should only allow two or three letter characters so: ^[a-z]{2,3} would work.
    • K1773R
      K1773R about 9 years
      @James [a-z][a-z]* is the same as [a-z]+
    • Wildcard
      Wildcard over 7 years
      You say you don't have Bash functionality available, but you list bash in your available commands.
    • rosi97
      rosi97 over 7 years
      @Wildcard It's an alias that points to ash.
  • rosi97
    rosi97 about 9 years
    Good point about the locale with LC_ALL=C
  • rosi97
    rosi97 about 9 years
    Thanks for showing me where I can optimise my pattern test.
  • Stéphane Chazelas
    Stéphane Chazelas about 9 years
    or grep -xE '[a-z]{2,3}-[a-z]+[0-9]*-[0-9]+'
  • K1773R
    K1773R about 9 years
    @StéphaneChazelas good one ;)
  • Wildcard
    Wildcard over 7 years
    Just for completeness, he actually has more than three tools that can do things with regexes; you've just given the most applicable ones. (vi, less and more can all handle regexes also.)