Python: How can I include the delimiter(s) in a string split?


Solution 1

You can do that with Python's re module.

import re
s='(twoplusthree)plusfour'
list(filter(None, re.split(r"(plus|[()])", s)))

You can leave out the list if you only need an iterator.
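As a minimal sketch of the iterator variant, assuming Python 3 (where filter returns a lazy iterator rather than a list), the tokens can be consumed directly in a loop. The names tokens and collected are only illustrative:

```python
import re

s = '(twoplusthree)plusfour'

# filter(None, ...) drops the empty strings that re.split produces
# when two delimiters are adjacent or a delimiter starts the string.
tokens = filter(None, re.split(r"(plus|[()])", s))

collected = []
for token in tokens:  # consumed lazily, one token at a time
    collected.append(token)

print(collected)
```

Note that the iterator can only be consumed once; wrap it in list() if the tokens are needed more than once.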

Solution 2

import re
s = '(twoplusthree)plusfour'
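# Capture each delimiter (the word "plus" or either parenthesis) so it is kept.
l = re.split(r"(plus|\(|\))", s)
# Drop the empty strings produced where delimiters are adjacent.
a = [x for x in l if x != '']
print(a)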

Output:

['(', 'two', 'plus', 'three', ')', 'plus', 'four']

Solution 3

Here is an easy way using re.split:

import re

s = '(twoplusthree)plusfour'
re.split(r'(plus)', s)

Output:

['(two', 'plus', 'three)', 'plus', 'four']

re.split is very similar to str.split, except that instead of a literal delimiter you pass a regex pattern. The trick here is to put () around the pattern so that it is captured as a group; re.split includes the text matched by capturing groups in the result list.
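To make the effect of the parentheses concrete, here is a small sketch comparing the same split with and without a capturing group. The variable names are only for illustration:

```python
import re

s = '(twoplusthree)plusfour'

# Without a capturing group the delimiter is discarded, like str.split.
without_group = re.split(r'plus', s)

# With a capturing group the matched delimiter is kept in the result.
with_group = re.split(r'(plus)', s)

print(without_group)  # ['(two', 'three)', 'four']
print(with_group)     # ['(two', 'plus', 'three)', 'plus', 'four']
```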

Bear in mind that you'll get empty strings in the result if the delimiter pattern matches twice in a row, or at the very start or end of the string.
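A short sketch of that caveat, using a made-up input with two adjacent delimiters (the string below is hypothetical, not from the question), along with one way to drop the empty entries:

```python
import re

s = 'twoplusplusfour'  # hypothetical input with two adjacent delimiters

parts = re.split(r'(plus)', s)
print(parts)  # ['two', 'plus', '', 'plus', 'four']

# Drop the empty strings with a list comprehension.
cleaned = [p for p in parts if p]
print(cleaned)  # ['two', 'plus', 'plus', 'four']
```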

Author by

Bill

Updated on June 05, 2022

Comments

  • Bill
    Bill almost 2 years

    I would like to split a string on multiple delimiters but keep the delimiters in the resulting list. I think this is a useful thing to do as an initial step in parsing any kind of formula, and I suspect there is a nice Python solution.

    Someone asked a similar question in Java here.

    For example, a typical split looks like this:

    >>> s='(twoplusthree)plusfour'
    >>> s.split('plus')
    ['(two', 'three)', 'four']
    

    But I'm looking for a nice way to add the plus back in (or retain it):

    ['(two', 'plus', 'three)', 'plus', 'four']
    

    Ultimately I'd like to do this for each operator and bracket, so if there's a way to get

    ['(', 'two', 'plus', 'three', ')', 'plus', 'four']
    

    all in one go, then all the better.