How can I match a string with a regex in Bash?
Solution 1
To match regexes you need to use the =~ operator.
Try this:
[[ sed-4.2.2.tar.bz2 =~ tar.bz2$ ]] && echo matched
Alternatively, you can use wildcards (instead of regexes) with the == operator:
[[ sed-4.2.2.tar.bz2 == *tar.bz2 ]] && echo matched
If portability is not a concern, I recommend using [[ instead of [ or test, as it is safer and more powerful. See "What is the difference between test, [ and [[ ?" for details.
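As a rough sketch of the options mentioned above, the following compares the regex match, the glob match, and a portable `case` fallback for shells without `[[`. The variable names (`file`, `regex_result`, `glob_result`, `case_result`) are only illustrative and do not come from the original answer.

```bash
file="sed-4.2.2.tar.bz2"
regex_result="no match"
glob_result="no match"

# Bash only: regex match with =~ (the pattern is left unquoted).
if [[ $file =~ tar\.bz2$ ]]; then regex_result=matched; fi

# Bash only: glob match with == (the pattern is left unquoted).
if [[ $file == *tar.bz2 ]]; then glob_result=matched; fi

# Portable (POSIX sh): case does glob matching without needing [[ ]].
case $file in
  *tar.bz2) case_result=matched ;;
  *)        case_result="no match" ;;
esac

echo "regex=$regex_result glob=$glob_result case=$case_result"
```

If the script has to run under a plain POSIX shell, only the `case` form is available; the two `[[ ]]` forms need bash (or a similar shell).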
Solution 2
A Function To Do This
extract () {
    # Pick the extraction command from the file name's extension.
    # "$1" is quoted so that names containing spaces or glob characters work.
    if [ -f "$1" ]; then
        case "$1" in
            *.tar.bz2) tar xvjf "$1" ;;
            *.tar.gz)  tar xvzf "$1" ;;
            *.bz2)     bunzip2 "$1" ;;
            *.rar)     rar x "$1" ;;
            *.gz)      gunzip "$1" ;;
            *.tar)     tar xvf "$1" ;;
            *.tbz2)    tar xvjf "$1" ;;
            *.tgz)     tar xvzf "$1" ;;
            *.zip)     unzip "$1" ;;
            *.Z)       uncompress "$1" ;;
            *.7z)      7z x "$1" ;;
            *)         echo "don't know '$1'..." ;;
        esac
    else
        echo "'$1' is not a valid file!"
    fi
}
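Since the question asked about regexes specifically, here is a minimal sketch of the same kind of dispatch done with =~ instead of case. The helper name `tar_flags` is hypothetical, and it only prints the tar switches rather than running tar, which makes it easy to try at the command line.

```bash
# Hypothetical helper: print the tar switches for a file name,
# or return 1 if the name does not look like a tar archive.
tar_flags() {
  local name=$1
  if [[ $name =~ \.(tar\.bz2|tbz2)$ ]]; then
    echo xjf
  elif [[ $name =~ \.(tar\.gz|tgz)$ ]]; then
    echo xzf
  elif [[ $name =~ \.tar$ ]]; then
    echo xf
  else
    return 1
  fi
}

tar_flags sed-4.2.2.tar.bz2   # prints xjf
tar_flags archive.tgz         # prints xzf
```

A caller could then run something like `tar "$(tar_flags "$f")" "$f"` once the helper has succeeded.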
Other Note
In response to Aquarius Power's comment above ("we need to store the regex on a var"):
The variable BASH_REMATCH is set after the expression matches, and ${BASH_REMATCH[n]} holds the text matched by the nth group wrapped in parentheses. In the following example, ${BASH_REMATCH[1]} = "compressed" and ${BASH_REMATCH[2]} = ".gz".
if [[ "compressed.gz" =~ ^(.*)(\.[a-z]{1,5})$ ]]; then
    echo "${BASH_REMATCH[2]}"
else
    echo "Not proper format"
fi
(The regex above isn't meant to be a valid one for file naming and extensions, but it works for the example)
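A small sketch of the "regex in a variable" approach, using the same pattern as above. The variable names (`re`, `name`, `full`, `base`, `ext`, `quoted`) are only illustrative. It also shows what happens when the variable is quoted on the right-hand side of =~: in bash 3.2 and later the quoted text is matched as a literal string, not as a regex.

```bash
# Store the regex in a variable and expand it unquoted after =~.
re='^(.*)(\.[a-z]{1,5})$'
name="compressed.gz"

if [[ $name =~ $re ]]; then
  full=${BASH_REMATCH[0]}   # the whole match
  base=${BASH_REMATCH[1]}   # first parenthesised group
  ext=${BASH_REMATCH[2]}    # second parenthesised group
  echo "full=$full base=$base ext=$ext"
fi

# Quoting the variable makes bash compare against the literal text of $re,
# so this test does not succeed for "compressed.gz".
if [[ $name =~ "$re" ]]; then quoted=matched; else quoted="no match"; fi
echo "quoted: $quoted"
```

The groups are copied into ordinary variables right away because BASH_REMATCH is overwritten by the next =~ test.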
Solution 3
I don't have enough rep to comment here, so I'm submitting a new answer to improve on dogbane's answer. The dot . in the regexp
[[ sed-4.2.2.tar.bz2 =~ tar.bz2$ ]] && echo matched
will actually match any character, not only the literal dot in 'tar.bz2'. For example, these also match:
[[ sed-4.2.2.tar4bz2 =~ tar.bz2$ ]] && echo matched
[[ sed-4.2.2.tar§bz2 =~ tar.bz2$ ]] && echo matched
or anything that doesn't require escaping with '\'. The strict syntax should then be
[[ sed-4.2.2.tar.bz2 =~ tar\.bz2$ ]] && echo matched
or you can go even stricter and also include the previous dot in the regex:
[[ sed-4.2.2.tar.bz2 =~ \.tar\.bz2$ ]] && echo matched
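To see the difference side by side, here is a short sketch that runs both the unescaped and the escaped pattern against a good and a bad file name. The `matches` helper is hypothetical and just wraps the =~ test.

```bash
# Hypothetical helper: succeed if STRING ($1) matches REGEX ($2).
matches() {
  [[ $1 =~ $2 ]]
}

for name in sed-4.2.2.tar.bz2 sed-4.2.2.tar4bz2; do
  for re in 'tar.bz2$' 'tar\.bz2$'; do
    if matches "$name" "$re"; then result=yes; else result=no; fi
    echo "$name =~ $re : $result"
  done
done
```

Only the combination of `sed-4.2.2.tar4bz2` with the escaped pattern `tar\.bz2$` reports "no"; the unescaped pattern accepts both names.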
Solution 4
Since you are using bash, you don't need to create a child process for doing this. Here is one solution which performs it entirely within bash:
[[ $TEST =~ ^(.*):\ +(.*)$ ]] && TEST=${BASH_REMATCH[1]}:${BASH_REMATCH[2]}
Explanation: The groups before and after the sequence "colon and one or more spaces" are stored by the pattern match operator in the BASH_REMATCH array.
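As an illustration of the one-liner above, here it is wrapped in a function so it can be tried on sample input. The function name `squeeze_after_colon` and the sample strings are made up for this sketch. Because the first group is greedy, it is the last "colon plus spaces" sequence in the string that gets collapsed.

```bash
# Hypothetical helper: drop the spaces that follow the last
# "colon followed by spaces" sequence in a string.
squeeze_after_colon() {
  local s=$1
  if [[ $s =~ ^(.*):\ +(.*)$ ]]; then
    s=${BASH_REMATCH[1]}:${BASH_REMATCH[2]}
  fi
  printf '%s\n' "$s"
}

squeeze_after_colon "name:    value"   # the spaces after the colon are removed
squeeze_after_colon "no colon here"    # printed unchanged
```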
Solution 5
shopt -s nocasematch
if [[ sed-4.2.2.$LINE =~ (yes|y)$ ]]
then exit 0
fi
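A possible way to package this is sketched below. The function name `is_yes` is hypothetical. Unlike the snippet above, this version anchors the pattern at both ends (so "nay" is not accepted) and restores the previous nocasematch setting afterwards; both are choices made for this sketch, not part of the original answer.

```bash
# Hypothetical helper: succeed if the argument is "y" or "yes", ignoring case.
is_yes() {
  local rc=1 was_set=0
  # Remember whether nocasematch was already enabled.
  shopt -q nocasematch && was_set=1
  shopt -s nocasematch
  if [[ $1 =~ ^(yes|y)$ ]]; then rc=0; fi
  # Only switch the option off again if it was off before.
  (( was_set )) || shopt -u nocasematch
  return "$rc"
}

if is_yes "YES"; then echo "accepted"; fi
if ! is_yes "nay"; then echo "rejected"; fi
```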
user1587462
Updated on July 08, 2022
Comments
-
user1587462 almost 2 years
I am trying to write a bash script that contains a function so that, when given a .tar, .tar.bz2, .tar.gz etc. file, it uses tar with the relevant switches to decompress the file.
I am using if elif then statements which test the filename to see what it ends with, and I cannot get it to match using regex metacharacters.
To save constantly rewriting the script I am using 'test' at the command line. I thought the statement below should work; I have tried every combination of brackets, quotes and metacharacters possible and still it fails.
test sed-4.2.2.tar.bz2 = tar\.bz2$; echo $? (this returns 1, false)
I'm sure the problem is a simple one and I've looked everywhere, yet I cannot fathom how to do it. Does someone know how I can do this?
-
Alan Porter over 10 years
Be careful with the glob wildcard matching in the second example. Inside [[ ]], the * is not expanded as it usually is, to match filenames in the current directory that match a pattern. Your example works, but it's really easy to over-generalize and mistakenly believe that * means to match anything in any context. It only works like that inside [[ ]]. Otherwise, it expands to the existing filenames.
-
Aquarius Power about 10 years
I tried to use quotes on the regex and failed; this answer helped me make it work:
check="^a.*c$";if [[ "abc" =~ $check ]];then echo match;fi
We need to store the regex in a var.
-
pevik over 9 years
Also note that the pattern (like in Perl) must NOT be in quotes:
[[ sed-4.2.2.tar.bz2 == "*tar.bz2" ]]
wouldn't work.
-
Good Person about 8 years
Also note that with BSD tar you can use "tar xf" for all formats and don't need separate commands or this function whatsoever.
-
Skippy le Grand Gourou over 7 years
FWIW, the syntax for negation (i.e. does not match) is [[ ! foo =~ bar ]].
-
Admin over 7 years
dash doesn't support the -n 1 parameter, nor does it put it automatically into a $REPLY variable. Watch out!
-
Mark K Cowan about 7 years
a on GNU tar or p on BSD tar to explicitly tell it to automatically infer compression type from extension. GNU tar will not do it automatically otherwise, and I'm guessing from @GoodPerson's comment that BSD tar does do it by default.
-
miken32 over 6 years
If portability is a concern, then don't use the =~ operator!
-
James Brown over 6 years
The page you linked to mentions "RegularExpression matching =~ [is] (not available) [in] old test [", so I guess it's not an option in the "instead of" part.
-
mosh over 6 years
7z can unpack .. AR, ARJ, CAB, CHM, CPIO, CramFS, DMG, EXT, FAT, GPT, HFS, IHEX, ISO, LZH, LZMA, MBR, MSI, NSIS, NTFS, QCOW2, RAR, RPM, SquashFS, UDF, UEFI, VDI, VHD, VMDK, WIM, XAR and Z. See 7-zip.org
-
void.pointer almost 6 years
Why do quotes cause the regex to not match? I thought it was a best practice to quote any variable usage, like "$foo", so [[ "$foo" == "^release/" ]] seems like it should work...
-
i336_ almost 6 years
This is extremely dangerous; it only behaves without undefined behavior for you because you have no files in the current directory named the literal substring "pattern". Go ahead, create some files named like that, and substring expansion will match the files and break everything horribly with multicolored heisenbugs.
-
Rainer Schwarze over 5 years
Note that index 0 contains the full match and index 1 and 2 contain the group matches.
-
Admin over 5 years
But I have done an experiment with files 1pattern, pattern pattern2 and pattern in the current directory. This script works as expected. Could you please provide me with your test result? @i336_
-
user1934428 over 5 years
@i336: I don't think so. Within [[ ... ]], the rhs glob pattern does not expand according to the current directory, as it would usually do.
-
rosshjb almost 4 years
@i336_ No. Within [[...]], Bash doesn't perform filename expansion. From the Bash manual: "Word splitting and filename expansion are not performed on the words between the [[ and ]]".
-
user1934428 about 3 years
@juancortez: It also does not really fulfil the requirements of the OP, who - for whatever reason - asked for matching a regexp.