python re.sub group: number after \number

106,251

The answer is:

re.sub(r'(foo)', r'\g<1>123', 'foobar')

Relevant excerpt from the docs:

In addition to character escapes and backreferences as described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'. The backreference \g<0> substitutes in the entire substring matched by the RE.
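
As a rough illustration of the excerpt, the sketch below contrasts the ambiguous replacement from the question with the \g<...> forms. The group name `word` and the variable names are made up for this example; the pattern and input string come from the question.

```python
import re

text = "foobar"

# Ambiguous: "\1123" is read as the octal escape \112 ('J') followed by '3',
# not as group 1 followed by the literal "123".
ambiguous = re.sub(r"(foo)", r"\1123", text)

# Unambiguous: \g<1> is group 1, and "123" is literal text after it.
by_number = re.sub(r"(foo)", r"\g<1>123", text)

# The same idea with a named group (the name "word" is hypothetical).
by_name = re.sub(r"(?P<word>foo)", r"\g<word>123", text)

# \g<0> stands for the entire match, so no capturing group is needed.
whole_match = re.sub(r"foo", r"\g<0>123", text)

print(ambiguous)    # J3bar
print(by_number)    # foo123bar
print(by_name)      # foo123bar
print(whole_match)  # foo123bar
```

Using \g<number> whenever the replacement continues with a digit avoids having to reason about how many digits the parser will consume.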

Author: zhigang (programmer)

Updated on January 08, 2020

Comments

  • zhigang
    zhigang over 3 years

    How can I replace foobar with foo123bar?

    This doesn't work:

    >>> re.sub(r'(foo)', r'\1123', 'foobar')
    'J3bar'
    

    This works:

    >>> re.sub(r'(foo)', r'\1hi', 'foobar')
    'foohibar'
    

    I think it's a common issue when having something like \number. Can anyone give me a hint on how to handle this?

    • aliteralmind
      aliteralmind about 9 years
      This question has been added to the Stack Overflow Regular Expression FAQ, under "Groups".
    • Mark Ch
      Mark Ch almost 4 years
      this question took me quite a long time to find, because it doesn't feature the terms 'capture group' or 'numbered group reference', but I'm here eventually and glad you asked it.
    • smci
      smci almost 4 years
      Your issue is that r'\112' is interpreted as the octal escape \112, i.e. ASCII 'J' (decimal 74), which leaves the trailing '3' as a literal. Can't see how to force the backreference '\1' to get evaluated before string concatenation or ''.join()
  • speedplane
    speedplane over 7 years
    Don't be so hard on yourself. It's buried so deep in the documentation that it would take most people far more time to read the docs than to google their question and have this answer come up on SO.
  • patrick
    patrick about 5 years
    The exact quote provided is found here in case you are looking for context
  • Fred Dufresne
    Fred Dufresne over 1 year
    @EricBellet You will most likely have to do that in a few lines. Even if it were possible to do it in one line, the result wouldn't be maintainable enough to be worth the risk. If you're playing code golf, there is a way to conditionally match and reference the matched character using named groups. For example, to find single- or double-quoted text in Python, you could use (?P<q>['"])(.*)(?P=q), where (?P=q) references the named group (?P<q>['"]). If the first character is a single quote, the last group will only match a single quote.
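
The named-group backreference from the last comment can be sketched as follows. This is only an illustration: the group name `body`, the sample strings, and the lazy `.*?` (instead of the greedy `.*` in the comment) are choices made for this example.

```python
import re

# (?P<q>['"]) captures the opening quote; (?P=q) requires the same
# character again, so a single quote can only be closed by a single quote.
quoted = re.compile(r"""(?P<q>['"])(?P<body>.*?)(?P=q)""")

single = "say 'hello' now"
double = 'say "it\'s fine" now'
mismatched = "'oops\""

single_body = quoted.search(single).group("body")
double_body = quoted.search(double).group("body")
no_match = quoted.search(mismatched)

print(single_body)  # hello
print(double_body)  # it's fine
print(no_match)     # None
```

Note that (?P=q) is the backreference syntax inside a pattern, while \g<q> is the form used in a replacement string passed to re.sub.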
