How to add a variable into my re.compile expression


Solution 1

You can interpolate the variable into the pattern with printf-style formatting:

>>> import re
>>> regex2 = re.compile('.*(%s).*' % what2look4)

Or you can use str.format():

>>> regex2 = re.compile('.*({}).*'.format(what2look4))
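As a quick check, the sketch below shows that both forms build the same pattern. The keyword and the sample line are hypothetical values chosen only for illustration; they are not from the question.

```python
import re

# Hypothetical keyword and sample line, chosen only for illustration.
what2look4 = "error"
line = "2022-07-28 error: disk full"

regex_percent = re.compile('.*(%s).*' % what2look4)
regex_format = re.compile('.*({}).*'.format(what2look4))

# Both approaches produce the same pattern text.
same_pattern = regex_percent.pattern == regex_format.pattern

# Group 1 captures the keyword itself.
match = regex_format.match(line)
captured = match.group(1) if match else None
print(same_pattern, captured)
```

Note that this only behaves as expected when the keyword contains no regular expression metacharacters.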

Solution 2

Use string formatting:

import re

search = "whattolookfor"
regex2 = re.compile(".*({}).*".format(search))

The {} inside the string is replaced with the value of search, here whattolookfor.

Solution 3

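If you're not careful, the answers above can get you into trouble. In most cases, you will want to use re.escape() to escape any regular expression metacharacters that appear in the string variable you are inserting; otherwise a character such as "." or "\d" in the variable is interpreted as part of the pattern rather than as literal text. In addition, both f-strings and the .format() method use curly braces {} for placeholders, and curly braces are regular expression metacharacters themselves (as in the quantifier \d{4}). To mix the two, every literal brace in the pattern has to be doubled ({{ and }}), which is easy to get wrong. At the very least, your linter will throw a fit if you try.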
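To make both pitfalls concrete, here is a minimal sketch. The search string and the test inputs are made up for illustration.

```python
import re

# Hypothetical user-supplied text that contains a regex metacharacter (".").
search = "a.b"

unescaped = re.compile(".*({}).*".format(search))
escaped = re.compile(".*({}).*".format(re.escape(search)))

# Unescaped, "." matches any character, so "axb" is a false positive.
false_positive = unescaped.match("axb") is not None

# Escaped, only the literal text "a.b" matches.
literal_only = (
    escaped.match("axb") is None and escaped.match("a.b") is not None
)

# With .format() or f-strings, a regex quantifier such as {4} must be
# written with doubled braces so it is not treated as a placeholder.
year_pattern = re.compile(r"{}_\d{{4}}".format(re.escape(search)))
quantifier_ok = year_pattern.fullmatch("a.b_2022") is not None

print(false_positive, literal_only, quantifier_ok)
```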

Although it's much uglier, I would recommend building the regex pattern using string addition. It's the clearest, least error-prone method in this case. The printf style should work fine in Python, but I personally don't recommend it because the "%" symbol is the wildcard operator in SQL, and I find it confusing to see in a regex.

Consider the example below where we are looking for a file name that could be in any folder and that we expect to end with a date.

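import re

# Note that "\d" is a regular expression metacharacter!
# A raw string keeps the backslash literal and avoids an invalid-escape warning.
file_name_var = r"\data"

# Option 1: string addition (the dot before "csv" is escaped to match a literal dot)
re.compile(r'^.*' + re.escape(file_name_var) + r'_\d{4}-\d{2}-\d{2}\.csv$')

# Option 2: printf style
re.compile(r'^.*%s_\d{4}-\d{2}-\d{2}\.csv$' % re.escape(file_name_var))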
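The following sketch exercises the string-addition option and compares it with an unescaped version. The two paths are hypothetical, and the dot before "csv" is escaped here so that it matches only a literal dot.

```python
import re

# Raw string so the backslash stays a literal character in the variable.
file_name_var = r"\data"
date_suffix = r'_\d{4}-\d{2}-\d{2}\.csv$'

escaped = re.compile(r'^.*' + re.escape(file_name_var) + date_suffix)
# Without re.escape(), "\d" in the variable is read as "any digit".
unescaped = re.compile(r'^.*' + file_name_var + date_suffix)

# Hypothetical paths, used only to exercise the patterns.
good = r"C:\reports\data_2022-07-28.csv"
bad = r"C:\reports\7ata_2022-07-28.csv"

escaped_matches_good = escaped.match(good) is not None
escaped_matches_bad = escaped.match(bad) is not None
unescaped_matches_good = unescaped.match(good) is not None
unescaped_matches_bad = unescaped.match(bad) is not None

print(escaped_matches_good, escaped_matches_bad)
print(unescaped_matches_good, unescaped_matches_bad)
```

The escaped pattern accepts only the intended file name, while the unescaped one rejects it and accepts a name that starts with a digit instead.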
Author by

Kprakash

Updated on July 28, 2022

Comments

  • Kprakash
    Kprakash almost 2 years

    I am trying to search a file for a keyword that is stored in the variable what2look4. Whenever I run this program, however, it keeps returning blank data. The code is as follows:

    regex2=re.compile(".*(what2look4).*")
    

    I believe the problem is that the file is being searched for the literal string what2look4 instead of the value that variable holds. Please correct me if I'm wrong. Thanks for the help.
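The diagnosis in the question can be checked with a small sketch. The keyword and the sample line below are hypothetical stand-ins for the file contents, which the question does not show.

```python
import re

# Hypothetical keyword and line, for illustration only.
what2look4 = "keyword"
line = "a line containing keyword somewhere"

# The variable name inside the quotes is just literal text.
literal = re.compile(".*(what2look4).*")

# Interpolating the (escaped) value searches for what the variable holds.
interpolated = re.compile(".*(" + re.escape(what2look4) + ").*")

literal_found = literal.match(line) is not None
interpolated_found = interpolated.match(line) is not None
print(literal_found, interpolated_found)
```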