Wildcard matching in Python
Solution 1
It looks like you're essentially implementing a subset of regular expressions. Luckily, Python has a library for that built-in! If you're not familiar with how regular expressions (or, as their friends call them, regexes) work, I highly recommend you read through the documentation for them.
In any event, the function re.search is, I think, exactly what you're looking for. It takes a pattern as its first argument and the string to search as its second. If the pattern matches, search returns a match object (re.Match in current Python 3; older versions show it as SRE_Match), which conveniently has a start() method that returns the index at which the match starts. If nothing matches, search returns None.
To use the data from your example:
import re
start_index = re.search(r'x.z', 'xxxxxgzg').start()
Note that, in regexes, . (not *) is the single-character wildcard, so you'll have to replace each * in the pattern you're using with a dot.
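As a rough sketch of that replacement step, the helper below turns a pattern with a custom wildcard character into a regex and returns the index of the first match. The name find_with_wildcard and the choice to return -1 when nothing matches are assumptions made for illustration, not part of the original question. Every non-wildcard character is passed through re.escape so that characters such as + or ( are matched literally.

```python
import re

def find_with_wildcard(pattern, text, wildcard='*'):
    """Return the index of the first match of pattern in text, or -1.

    Hypothetical helper: each wildcard character matches exactly one
    character; every other character is matched literally.
    """
    regex = ''.join(
        '.' if ch == wildcard else re.escape(ch)
        for ch in pattern
    )
    # re.DOTALL lets the wildcard match a newline as well
    match = re.search(regex, text, re.DOTALL)
    return match.start() if match else -1

print(find_with_wildcard('xyz', 'xxxxxyz'))    # 4
print(find_with_wildcard('x*z', 'xxxxxgzx'))   # 4
print(find_with_wildcard('a*c', 'xxxxxgzx'))   # -1
```

Returning -1 mirrors str.find; raising an exception or returning None would work just as well if that suits the calling code better.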
Solution 2
Regex, as the accepted answer suggests, is one way of handling the problem. However, if you need a simpler pattern syntax (such as Unix shell-style wildcards), the built-in fnmatch library can help:
Expressions:
- * - matches everything
- ? - matches any single character
- [seq] - matches any character in seq
- [!seq] - matches any character not in seq
So, for example, to find anything that starts with http://localhost:
import fnmatch
my_pattern = "http://localhost*"
name_to_check = "http://localhost:8080"
fnmatch.fnmatch(name_to_check, my_pattern) # True
The nice part of this is that / is not considered a special character, so for filename/URL matching this works out quite well without having to pre-escape all slashes!
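To illustrate the other expressions listed above, here is a small sketch using fnmatch.fnmatchcase, which compares case-sensitively on every platform. The file names and strings are made-up examples. Note that fnmatch patterns must match the whole string, and the functions only report whether there is a match, not where it starts.

```python
import fnmatch

# ? matches exactly one character
print(fnmatch.fnmatchcase("file1.txt", "file?.txt"))        # True

# [seq] matches any one character in seq (ranges are allowed)
print(fnmatch.fnmatchcase("file1.txt", "file[0-9].txt"))    # True

# [!seq] matches any one character not in seq
print(fnmatch.fnmatchcase("fileA.txt", "file[!0-9].txt"))   # True

# fnmatchcase never normalizes case
print(fnmatch.fnmatchcase("File1.txt", "file?.txt"))        # False

# The pattern has to cover the entire string...
print(fnmatch.fnmatchcase("xxxxxgzx", "x?z"))               # False

# ...so surround it with * to test for a match anywhere
print(fnmatch.fnmatchcase("xxxxxgzx", "*x?z*"))             # True
```

If you need the index of the match rather than a yes/no answer, a regex with re.search is probably the better fit.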
User
Updated on June 05, 2022
Comments
-
User almost 2 years
I have a class called Pattern with two methods, equates and setwildcard. equates returns the index at which a substring first appears in a string, and setwildcard sets a wildcard character in a substring.
So
p = Pattern('xyz')
t = 'xxxxxyz'
p.equates(t)
returns 4.
Also
p = Pattern('x*z', '*')
t = 'xxxxxgzx'
p.equates(t)
returns 4, because * is the wildcard and can match any letter within t, as long as x and z match. What's the best way to implement this?
-
EddieOffermann about 2 years
Note that fnmatch.fnmatch applies os.path.normcase() to both arguments before the comparison, so it is case-insensitive on platforms where normcase lowercases its input (such as Windows). If case sensitivity is important, Lib/fnmatch provides fnmatch.fnmatchcase for this.