Python: Splitting by a certain pattern
Solution 1
I prefer to use re.findall and specify what I want, instead of trying to describe the delimiter for re.split:
>>> import re
>>> s = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'
>>> re.findall(r"\[[^\]]*\]",s)
['[5.955894, 45.817792]', '[10.49238, 45.817792]', '[10.49238, 47.808381]', '[5.955894, 47.808381]']
\[ matches a literal [
[^\]]* matches any run of characters other than ]
\] matches a literal ]
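As a minimal sketch, the same findall pattern can be wrapped in a small script. The names PAIR_PATTERN and extract_pairs are hypothetical and only used for illustration; they are not part of the original answer.

```python
import re

# Pattern from the answer: a literal [, any run of non-] characters, a literal ].
PAIR_PATTERN = re.compile(r"\[[^\]]*\]")


def extract_pairs(text):
    """Return every bracketed chunk found in text, brackets included."""
    return PAIR_PATTERN.findall(text)


s = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'
pairs = extract_pairs(s)
print(pairs)
```

Because findall describes the wanted pieces and not the separator, whatever sits between the brackets (commas, spaces, newlines) is simply skipped.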
Solution 2
You need to use re.split with look-ahead:
>>> import re
>>> s = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'
>>> re.split(r",[ ]*(?=\[)", s)
['[5.955894, 45.817792]', '[10.49238, 45.817792]', '[10.49238, 47.808381]', '[5.955894, 47.808381]']
And don't use str as a variable name. It shadows the built-in.
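To illustrate the shadowing problem, here is a small sketch. The function name shadowing_demo is hypothetical; the shadowing is kept inside a function so the built-in stays usable elsewhere.

```python
def shadowing_demo():
    """Rebind the name str locally and show that the built-in is hidden."""
    str = '[5.955894, 45.817792]'  # the local name now hides the built-in type
    try:
        str(42)  # this tries to call a string object, not the str type
    except TypeError as exc:
        return type(exc).__name__
    return None


print(shadowing_demo())
```

Using a different name, such as s, avoids the problem entirely.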
The pattern below:
,[ ]*(?=\[)
matches a comma (,) and any spaces after it, but only when they are followed by a [. The look-ahead does not consume the [, so it stays in the next element.
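A short sketch of why the look-ahead matters, comparing it with a pattern that consumes the bracket. The shortened input string and the variable names are assumptions made for illustration.

```python
import re

s = '[5.955894, 45.817792], [10.49238, 45.817792]'

# Look-ahead: the [ must follow, but it is not part of the match,
# so it remains at the start of the next piece.
with_lookahead = re.split(r",[ ]*(?=\[)", s)

# No look-ahead: the [ belongs to the delimiter and is discarded.
without_lookahead = re.split(r",[ ]*\[", s)

print(with_lookahead)
print(without_lookahead)
```

The commas inside each pair are left alone in both cases, because they are followed by a digit and not by a [.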
You can even do it with look-behind: (?<=\]),[ ]* will also work.
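As a quick check, the look-behind variant can be run next to the look-ahead one on the same input. The variable names here are hypothetical.

```python
import re

s = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'

# Look-behind: split on a comma plus spaces only when a ] comes right before it.
lookbehind_parts = re.split(r"(?<=\]),[ ]*", s)

# Look-ahead: split on a comma plus spaces only when a [ comes right after it.
lookahead_parts = re.split(r",[ ]*(?=\[)", s)

print(lookbehind_parts == lookahead_parts)
```

Both assertions are zero-width, so neither bracket is removed from the resulting pieces.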
Author by
grssnbchr
Updated on June 15, 2022
I have the following:
str = '[5.955894, 45.817792], [10.49238, 45.817792], [10.49238, 47.808381], [5.955894, 47.808381]'
I want to split it so that I have an array of strings like
['[5.955894, 45.817792]', '[10.49238, 45.817792]', ...]
So that the [...] objects are elements of the array. It is important that the enclosing [ and ] are included. I've gotten this far:
re.split('\D,\s\D', str)
But that gives me:
['[5.955894, 45.817792', '10.49238, 45.817792', '10.49238, 47.808381', '5.955894, 47.808381]']
Not really what I want.
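A small sketch of why this attempt loses the brackets: each \D matches one non-digit character, so the ] and [ around the comma become part of the delimiter and are dropped by re.split. The shortened input and the variable names are assumptions for illustration.

```python
import re

s = '[5.955894, 45.817792], [10.49238, 45.817792]'

# Show exactly what the attempted pattern treats as the delimiter.
delimiters = re.findall(r"\D,\s\D", s)

# Splitting on that delimiter removes the brackets along with the comma.
attempt = re.split(r"\D,\s\D", s)

print(delimiters)
print(attempt)
```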