Splitting a string by list of indices
Solution 1
s = 'long string that I want to split up'
indices = [0,5,12,17]
parts = [s[i:j] for i,j in zip(indices, indices[1:]+[None])]
returns
['long ', 'string ', 'that ', 'I want to split up']
which you can print using:
print '\n'.join(parts)
Another possibility (without copying indices) would be:
s = 'long string that I want to split up'
indices = [0,5,12,17]
indices.append(None)
parts = [s[indices[i]:indices[i+1]] for i in xrange(len(indices)-1)]
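For readers on Python 3, the first variant above can be sketched as a small function. This is a minimal sketch rather than the answer's own code: the function name split_at is hypothetical, and print is called as a function because the print statement and xrange exist only in Python 2.

```python
def split_at(s, indices):
    """Split s into pieces that start at each index and end before the next one."""
    # Pair each index with its successor; the final index is paired with None,
    # so the last slice runs to the end of the string.
    bounds = zip(indices, indices[1:] + [None])
    return [s[i:j] for i, j in bounds]


s = 'long string that I want to split up'
parts = split_at(s, [0, 5, 12, 17])
print('\n'.join(parts))
```

Note that indices has to be a list here, since indices[1:] + [None] relies on list concatenation.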
Solution 2
Here is a short solution with heavy usage of the itertools module. The tee function is used to iterate pairwise over the indices. See the Recipes section of the module's documentation for more help.
>>> from itertools import tee, izip_longest
>>> s = 'long string that I want to split up'
>>> indices = [0,5,12,17]
>>> start, end = tee(indices)
>>> next(end)
0
>>> [s[i:j] for i,j in izip_longest(start, end)]
['long ', 'string ', 'that ', 'I want to split up']
Edit: This is a version that does not copy the indices list, so it should be faster.
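In Python 3, izip_longest was renamed to zip_longest. Under that assumption, the tee-based approach could be wrapped in a generator roughly as follows; the name split_pairwise is hypothetical and not part of the original answer.

```python
from itertools import tee, zip_longest


def split_pairwise(s, indices):
    """Yield slices of s between consecutive indices without copying the index list."""
    start, end = tee(indices)
    # Advance the second iterator by one so it runs one step ahead of the first.
    # The default argument avoids StopIteration when indices is empty.
    next(end, None)
    # zip_longest pads the shorter iterator with None, which makes the
    # final slice s[last:None], i.e. everything up to the end of the string.
    for i, j in zip_longest(start, end):
        yield s[i:j]


s = 'long string that I want to split up'
print(list(split_pairwise(s, [0, 5, 12, 17])))
# ['long ', 'string ', 'that ', 'I want to split up']
```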
Solution 3
You can write a generator if you don't want to make any modifications to the list of indices:
>>> def split_by_idx(S, list_of_indices):
...     left, right = 0, list_of_indices[0]
...     yield S[left:right]
...     left = right
...     for right in list_of_indices[1:]:
...         yield S[left:right]
...         left = right
...     yield S[left:]
...
>>> s = 'long string that I want to split up'
>>> indices = [5,12,17]
>>> [i for i in split_by_idx(s, indices)]
['long ', 'string ', 'that ', 'I want to split up']
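The generator above reads list_of_indices[0] up front, so it raises an IndexError when the list is empty. A condensed variant that starts from 0 and loops over every cut point might look like the sketch below. It is an illustration only, not the answer's code, and the names split_by_cut_points and cut_points are hypothetical.

```python
def split_by_cut_points(s, cut_points):
    """Yield the pieces of s obtained by cutting at each position in cut_points."""
    left = 0
    for right in cut_points:
        yield s[left:right]
        left = right
    # Whatever remains after the last cut point (the whole string if there are none).
    yield s[left:]


s = 'long string that I want to split up'
print(list(split_by_cut_points(s, [5, 12, 17])))
# ['long ', 'string ', 'that ', 'I want to split up']
```

As in the original generator, the leading 0 is implied, so the list of cut points starts at 5 rather than 0.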
Yarin
Updated on May 26, 2020
Comments
-
Yarin almost 4 years
I want to split a string by a list of indices, where the split segments begin with one index and end before the next one.
Example:
s = 'long string that I want to split up'
indices = [0,5,12,17]
parts = [s[index:] for index in indices]
for part in parts:
    print part
This will return:
long string that I want to split up
string that I want to split up
that I want to split up
I want to split up
I'm trying to get:
long
string
that
I want to split up
-
jamylak almost 12 years
Another way is,
[s[i:j] for i,j in izip_longest(indices,indices[1:])]
but I like your way better!
-
schlamar almost 12 years
This copies the indices list with indices[1:] and creates a new list of double the size with the zip function, so performance and memory consumption are bad.
-
jamylak almost 12 years
@ms4py This is fine; performance is not an issue in this case, and this is a very readable solution. If performance is an issue, my suggestion can be used.
-
Yarin almost 12 years
eumiro - thank you, this works great. Can you explain how the +[None] part works?
-
eumiro almost 12 years
@ms4py - ok, there's an updated version without copying the list and without zip, although your itertools version is probably more performant.
-
eumiro almost 12 years
@Yarin - indices[1:] + [None] copies the list without the first element and adds a None at the end. So for your indices it looks like [5,12,17,None]. I am using it to be able to access the last part of the string with s[17:None] (the same as s[17:], just using two variables I have anyway).
-
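The explanation above can be checked directly. A minimal sketch, reusing s and indices from the question; the variable name ends is introduced here for illustration only.

```python
s = 'long string that I want to split up'
indices = [0, 5, 12, 17]

# Shifted copy of the indices with None appended as the final end bound.
ends = indices[1:] + [None]
print(ends)                    # [5, 12, 17, None]

# A slice bound of None means "no bound", so both spellings give the same tail.
print(s[17:None] == s[17:])    # True
```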
jamylak almost 12 years
@Yarin [1:None] for example is the same as [1:]
-
Yarin almost 12 years
Thanks for the alt approach - I'll have to check out itertools sometime
-
jamylak almost 12 years
@ms4py What do you mean by that?
-
Levon almost 12 years
Neat approach, learned something new. Is there an easy way to get rid of the extra blank at the end of the first 3 strings inside the expression? I tried s[i:j].strip() but that didn't work at all (not sure why not)
-
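The thread does not say why strip() failed for the commenter. For what it is worth, calling strip() on each slice inside the comprehension does remove the trailing blank; a minimal sketch, reusing s and indices from the question:

```python
s = 'long string that I want to split up'
indices = [0, 5, 12, 17]

# Strip surrounding whitespace from each piece as it is sliced.
parts = [s[i:j].strip() for i, j in zip(indices, indices[1:] + [None])]
print(parts)
# ['long', 'string', 'that', 'I want to split up']
```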
jamylak almost 12 years
If you are going to use this, you may as well use the pairwise function straight from the itertools docs. Also, using next(end) is preferred to end.next() for Python 3 compatibility.
-
lonewarrior556 about 4 years
Not sure it's your forte, but how would one do this in NodeJs?
-
Siva Sankar about 2 years
This had been hectic for me for an hour and a half. Thanks @eumiro