How can I split by 1 or more occurrences of a delimiter in Python?
Solution 1
Just don't pass any delimiter?
>>> a="test result"
>>> a.split()
['test', 'result']
Solution 2
>>> import re
>>> a="test result"
>>> re.split(" +",a)
['test', 'result']
>>> a.split()
['test', 'result']
Solution 3
Just this should work:
a.split()
Example:
>>> 'a      b'.split(' ')
['a', '', '', '', '', '', 'b']
>>> 'a      b'.split()
['a', 'b']
From the documentation:
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].
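The behaviour described in that passage can be checked directly. A minimal sketch, in which the sample string `line` is made up for illustration:

```python
# Hypothetical sample: leading, trailing, and repeated whitespace (spaces and a tab).
line = "  test \t  result  "

# No separator: runs of whitespace count as one delimiter, no empty strings appear.
default_split = line.split()

# Explicit single-space separator: every space is a boundary, so empty strings appear.
explicit_split = line.split(" ")

print(default_split)    # ['test', 'result']
print(explicit_split)   # contains empty strings and a lone tab
print("".split())       # []
print("   ".split())    # []
```

Note that the tab is treated as a delimiter only in the no-argument form; with `" "` as the separator it survives as its own token.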
Solution 4
Any problem with a simple a.split()?
Solution 5
If you want to split by 1 or more occurrences of a delimiter and don't want to just count on the default split() with no parameters happening to match your use case, you can use regex to match the delimiter. The following will use one or more occurrences of . as the delimiter:
import re

s = 'a.b....c......d.ef...g'
# Raw string so the backslash reaches the regex engine unchanged.
sp = re.compile(r'\.+').split(s)
print(sp)
which gives:
['a', 'b', 'c', 'd', 'ef', 'g']
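The same idea can be generalised to an arbitrary delimiter. The sketch below is one possible way to do it; `split_on_runs` is a hypothetical helper name, not something from the standard library, and dropping empty strings at the ends is a design choice rather than a requirement:

```python
import re

def split_on_runs(text, delimiter):
    """Split text on one or more consecutive occurrences of delimiter.

    Hypothetical helper for illustration only.
    """
    # re.escape makes characters such as '.' or '|' match literally;
    # the non-capturing group lets '+' apply to multi-character delimiters.
    pattern = "(?:" + re.escape(delimiter) + ")+"
    # A leading or trailing delimiter yields an empty string; drop those.
    return [part for part in re.split(pattern, text) if part]

print(split_on_runs("a.b....c......d.ef...g", "."))
# ['a', 'b', 'c', 'd', 'ef', 'g']
print(split_on_runs("--x--y----z--", "--"))
# ['x', 'y', 'z']
```

If empty strings at the start or end are meaningful for your data, return the result of `re.split` unfiltered instead.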
Adam Matan
Updated on February 26, 2020
Comments
-
Adam Matan over 3 years
I have a formatted string from a log file, which looks like:
>>> a="test result"
That is, the test and the result are separated by some spaces. It was probably created using a formatted string which gave test some constant spacing. Simple splitting won't do the trick:
>>> a.split(" ")
['test', '', '', '', ... '', '', '', '', '', '', '', '', '', '', '', 'result']
split(DELIMITER, COUNT) cleared some unnecessary values:
>>> a.split(" ",1)
['test', ' result']
This helped - but of course, I really need:
['test', 'result']
I can use split() followed by map + strip(), but I wondered if there is a more Pythonic way to do it.
Thanks,
Adam
UPDATE: Such a simple solution! Thank you all.
-
Adam Matan over 13 years
Cool. Might help with other, non-whitespace delimiters.
-
Sakie over 13 years
As for why this works: a.split(None) is a special case, which in Python means "split on one or more whitespace chars". re.split() is the general case solution.
-
Sakie over 13 years
re.split('\W+',mystring) is closer to an equivalent of string.split(None).
-
Wowbagger and his liquid lunch over 10 years
This is the only answer to the actual request, "split by 1 or more occurrences of a delimiter".
-
tbrittoborges over 8 years
One needs to use str.split(None, maxsplit) since the function does not accept keyword arguments. I wonder why.
-
Risinek over 7 years
This should be the accepted answer. The other ones are not answering the real question.
-
Risinek over 7 years
The question was how to split with delimiter+ (one or more). Your answer says that any whitespace will be taken as the delimiter, which is not a correct answer.
-
BarathVutukuri over 4 years
re.split() gives me an extra token if the string ends with a space.
-
theferrit32 over 3 years
@BarathVutukuri that is the correct behavior of a split function. If the input sequence ends with a delimiter, there is an empty term after that delimiter. Java's handling of this case is out of the ordinary, where the API documentation specifically says it discards trailing empty terms (but not leading ones) when no term limit is applied. Python, JavaScript, and C# do not discard trailing terms.
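The trailing-token behaviour discussed in the last two comments can be reproduced with a short sketch. The sample string is made up, and filtering or stripping are just two possible ways to get rid of the empty term:

```python
import re

s = "test   result  "   # hypothetical sample ending in spaces

with_regex = re.split(r" +", s)     # trailing delimiter leaves a final empty string
with_default = s.split()            # whitespace mode discards it

# Two ways to drop the empty term while still using re.split:
filtered = [tok for tok in re.split(r" +", s) if tok]
stripped_first = re.split(r" +", s.strip(" "))

print(with_regex)      # ['test', 'result', '']
print(with_default)    # ['test', 'result']
print(filtered)        # ['test', 'result']
print(stripped_first)  # ['test', 'result']
```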