Python 3.7.4: 're.error: bad escape \s at position 0'


Solution 1

Try the third-party regex module: use import regex as re instead of import re.

Solution 2

Escape the backslash in the replacement string so that re does not try to interpret \s as an escape sequence:

spaced_pattern = re.sub(r"\\\s+", r"\\s+", escaped_str)

Now:

>>> spaced_pattern
'The\\s+quick\\s+brown\\s+fox\\s+jumped'
>>> print(spaced_pattern)
The\s+quick\s+brown\s+fox\s+jumped
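Another option, not part of the original answer, is to pass a function as the replacement. According to the re documentation, backslash escapes are only processed when the replacement is a string, so the return value of a callable is inserted as-is. A minimal sketch, reusing the variable names from the question:

```python
import re

my_str = "The quick brown fox jumped"
escaped_str = re.escape(my_str)

# The callable's return value is inserted verbatim, so r"\s+" needs no extra escaping.
spaced_pattern = re.sub(r"\\\s+", lambda match: r"\s+", escaped_str)
print(spaced_pattern)

# The resulting pattern accepts any run of whitespace between the words.
matched = re.fullmatch(spaced_pattern, "The  quick\tbrown\nfox jumped")
print(matched is not None)
```

This avoids having to count backslashes in the replacement, at the cost of a slightly less obvious call.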

But why?

It seems that re.sub processes backslash escapes in the replacement string itself, so it tries to interpret \s the same way it would interpret r"\n", instead of leaving it alone as Python normally does in a raw string. For example:

re.sub(r"\\\s+", r"\n+", escaped_str)

yields:

The
+quick
+brown
+fox
+jumped

even though \n was written in a raw string.
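The three cases can be compared side by side. This is a small sketch assuming Python 3.7 or later (on 3.6 the middle case only emits a DeprecationWarning); the variable names other than escaped_str are made up for illustration:

```python
import re

escaped_str = re.escape("The quick brown fox jumped")

# A known escape such as \n is expanded by re.sub, raw string or not.
with_newlines = re.sub(r"\\\s+", r"\n+", escaped_str)

# An unknown escape made of a backslash and an ASCII letter is rejected.
try:
    re.sub(r"\\\s+", r"\s+", escaped_str)
    outcome = "no error"
except re.error as exc:
    outcome = "re.error: %s" % exc

# A doubled backslash yields one literal backslash followed by "s+".
with_pattern = re.sub(r"\\\s+", r"\\s+", escaped_str)

print(repr(with_newlines))
print(outcome)
print(with_pattern)
```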

The change was introduced in Issue #27030: Unknown escapes consisting of '\' and ASCII letter in regular expressions now are errors.

The code that does the replacement is in sre_parse.py (Python 3.7):

        else:
            try:
                this = chr(ESCAPES[this][1])
            except KeyError:
                if c in ASCIILETTERS:
                    raise s.error('bad escape %s' % this, len(this))

This code looks at what follows a literal \ and tries to replace the pair with the corresponding special character (a newline for \n, for example). \s is not in the ESCAPES dictionary, so the KeyError handler runs and, because s is an ASCII letter, raises the error you're getting.

In previous versions it only issued a warning:
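The lookup can be imitated with a toy function. This is only a simplified model of the excerpt, not CPython's code: KNOWN_ESCAPES and expand_escape are hypothetical names, the table is trimmed to three entries, and ValueError stands in for re.error.

```python
# Hypothetical, trimmed-down stand-in for the ESCAPES table in sre_parse.py.
KNOWN_ESCAPES = {r"\n": "\n", r"\t": "\t", r"\\": "\\"}


def expand_escape(escape):
    """Expand a two-character escape, roughly as the template parser does."""
    try:
        return KNOWN_ESCAPES[escape]
    except KeyError:
        letter = escape[1]
        if letter.isascii() and letter.isalpha():
            # Python 3.7 raises re.error here; ValueError keeps the sketch standalone.
            raise ValueError("bad escape %s" % escape)
        # Other unknown escapes are passed through unchanged.
        return escape


print(repr(expand_escape(r"\n")))
print(repr(expand_escape(r"\\")))
try:
    expand_escape(r"\s")
except ValueError as exc:
    print(exc)
```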

import warnings
warnings.warn('bad escape %s' % this,
              DeprecationWarning, stacklevel=4)

It looks like we're not alone in suffering from the 3.6 to 3.7 upgrade: https://github.com/gi0baro/weppy/issues/227

Author by

Steele Farnsworth

Updated on March 04, 2021

Comments

  • Steele Farnsworth (over 3 years ago)

    My program looks something like this:

    import re
    # Escape the string, in case it happens to have re metacharacters
    my_str = "The quick brown fox jumped"
    escaped_str = re.escape(my_str)
    # "The\\ quick\\ brown\\ fox\\ jumped"
    # Replace escaped space patterns with a generic white space pattern
    spaced_pattern = re.sub(r"\\\s+", r"\s+", escaped_str)
    # Raises error
    

    The error is this:

    Traceback (most recent call last):
      File "<input>", line 1, in <module>
      File "/home/swfarnsworth/programs/pycharm-2019.2/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
        pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
      File "/home/swfarnsworth/programs/pycharm-2019.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "/home/swfarnsworth/projects/medaCy/medacy/tools/converters/con_to_brat.py", line 255, in <module>
        content = convert_con_to_brat(full_file_path)
      File "/home/swfarnsworth/projects/my_file.py", line 191, in convert_con_to_brat
        start_ind = get_absolute_index(text_lines, d["start_ind"], d["data_item"])
      File "/home/swfarnsworth/projects/my_file.py", line 122, in get_absolute_index
        entity_pattern_spaced = re.sub(r"\\\s+", r"\s+", entity_pattern_escaped)
      File "/usr/local/lib/python3.7/re.py", line 192, in sub
        return _compile(pattern, flags).sub(repl, string, count)
      File "/usr/local/lib/python3.7/re.py", line 309, in _subx
        template = _compile_repl(template, pattern)
      File "/usr/local/lib/python3.7/re.py", line 300, in _compile_repl
        return sre_parse.parse_template(repl, pattern)
      File "/usr/local/lib/python3.7/sre_parse.py", line 1024, in parse_template
        raise s.error('bad escape %s' % this, len(this))
    re.error: bad escape \s at position 0
    

    I get this error even if I remove the two backslashes before the '\s+' or if I make the raw string (r"\\\s+") into a regular string. I checked the Python 3.7 documentation, and it appears that \s is still the escape sequence for white space.