Is there a better way to use strip() on a list of strings? - python
Solution 1
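You probably shouldn't use list as a variable name, since that shadows the built-in type. Regardless:
list = map(str.strip, list)
This applies str.strip to every element of list and stores the result back in list. In Python 2, map returns a new list; in Python 3 it returns a lazy iterator instead, which has to be converted to a list explicitly if you need one (another reason not to shadow the built-in list).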
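As a minimal sketch of the same call under Python 3, where map returns a lazy iterator rather than a list (the name raw_lines and its sample values are made up for illustration and are not from the original answer):

```python
# Hypothetical sample data; the name avoids shadowing the built-in `list`.
raw_lines = ["  alpha ", "beta\n", "\tgamma"]

# In Python 3, map() returns a lazy iterator, so wrap it in list()
# to get an actual list back.
stripped = list(map(str.strip, raw_lines))

print(stripped)  # ['alpha', 'beta', 'gamma']
```

Calling str.strip with no arguments removes leading and trailing whitespace, including newlines and tabs.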
Solution 2
You could use a list comprehension:
stripped_list = [j.strip() for j in initial_list]
Solution 3
Some intriguing discussions on performance happened here, so let me provide a benchmark:
noslice_map : 0.0814900398254
slice_map : 0.084676027298
noslice_comprehension : 0.0927240848541
slice_comprehension : 0.124806165695
iter_manual : 0.133514881134
iter_enumerate : 0.142778873444
iter_range : 0.160353899002
So:
- map(str.strip, my_list) is the fastest way; it's just a little bit faster than comprehensions.
  - Use map or itertools.imap (Python 2) if there's a single function that you want to apply (like str.split).
  - Use comprehensions if there's a more complicated expression.
- Manual iteration is the slowest way; a reasonable explanation is that it requires the interpreter to do more work while the efficient C runtime does less.
- Go ahead and assign the result like my_list[:] = map...; the slice notation introduces only a small overhead and is likely to spare you some bugs if there are multiple references to that list.
  - Know the difference between mutating a list and re-creating it.
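The difference between mutating a list and re-creating it can be sketched as follows. This is an illustrative example, not the benchmark code; the names data and alias and the sample strings are assumptions:

```python
data = ["  a ", " b  "]
alias = data  # a second reference to the same list object

# Re-creating: `data` is rebound to a brand-new list,
# so `alias` still points at the old, unstripped list.
data = [s.strip() for s in data]
alias_after_rebind = list(alias)  # snapshot of what `alias` sees now
print(data is alias)              # False
print(alias_after_rebind)         # ['  a ', ' b  ']

# Mutating: slice assignment replaces the contents of the existing
# list object, so every reference to it sees the stripped strings.
data = alias
data[:] = map(str.strip, data)
print(data is alias)              # True
print(alias)                      # ['a', 'b']
```

Which of the two is wanted depends on whether other code holds a reference to the same list.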
Solution 4
I think you mean
a_list = [s.strip() for s in a_list]
Using a generator expression may be a better approach, like this:
stripped_list = (s.strip() for s in a_list)
This offers the benefit of lazy evaluation, so strip only runs when a given stripped element is actually needed.
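To make the lazy evaluation visible, here is a small sketch. The helper noisy_strip and the calls list are hypothetical and exist only to record when stripping actually happens:

```python
calls = []

def noisy_strip(s):
    # Hypothetical helper: records each call so the laziness is observable.
    calls.append(s)
    return s.strip()

a_list = [" one ", " two ", " three "]
stripped = (noisy_strip(s) for s in a_list)

# Creating the generator expression has not stripped anything yet.
calls_before = len(calls)        # 0

# Asking for one element strips exactly one string.
first = next(stripped)           # 'one'
calls_after_first = len(calls)   # 1

# Consuming the rest strips the remaining strings.
rest = list(stripped)            # ['two', 'three']

# A generator can only be consumed once; it is now exhausted.
leftover = list(stripped)        # []
```

The one-shot nature is the trade-off: if the stripped values are needed more than once, a list is the safer choice.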
If you need references to the list to remain intact outside the current scope, you might want to use list slice syntax:
a_list[:] = [s.strip() for s in a_list]
For commenters interested in the speed of various approaches, it looks as if in CPython the generator-to-slice approach is the least efficient:
>>> from timeit import timeit as t
>>> t("""a[:]=(s.strip() for s in a)""", """a=[" %d " % s for s in range(10)]""")
4.35184121131897
>>> t("""a[:]=[s.strip() for s in a]""", """a=[" %d " % s for s in range(10)]""")
2.9129951000213623
>>> t("""a=[s.strip() for s in a]""", """a=[" %d " % s for s in range(10)]""")
2.47947096824646
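For anyone who wants to repeat the comparison, the session above can be sketched as a standalone Python 3 script. The map-to-slice variant and the reduced iteration count (number=100_000 instead of timeit's default of one million) are additions for illustration, and absolute timings will differ by machine and interpreter version:

```python
from timeit import timeit

# Same setup as in the session above: ten short padded strings.
SETUP = 'a = [" %d " % n for n in range(10)]'

statements = {
    "generator to slice": "a[:] = (s.strip() for s in a)",
    "list comp to slice": "a[:] = [s.strip() for s in a]",
    "list comp rebind": "a = [s.strip() for s in a]",
    # Not in the original session; added for comparison.
    "map to slice": "a[:] = map(str.strip, a)",
}

# Reduced repetition count to keep the run short (an assumption, not the
# setting used for the numbers quoted above).
results = {
    label: timeit(stmt, setup=SETUP, number=100_000)
    for label, stmt in statements.items()
}

# Print the variants from fastest to slowest.
for label, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{label:<20} {seconds:.4f} s")
```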
Comments
-
alvas over 3 years
So far I've been trying to perform strip() on a list of strings, and I did this:
i = 0
for j in alist:
    alist[i] = j.strip()
    i += 1
Is there a better way of doing that?
-
KRyan over 11 years
Upvoting for random anonymous uncommented downvote. If there is something wrong with the question, it's utterly meaningless to downvote without telling the author what.
-
Kos over 11 years
If you want to iterate using indices, do
for (i, value) in enumerate(alist)
-
Kos over 11 years
I've added a benchmark which compares some options described here.
-
Kos over 11 years
Why say "supposedly slightly more efficient" instead of profiling and checking? And BTW, [:] is useful because then it alters the same list, not re-assigns the variable to a new list.
-
Admin over 11 years
It's less efficient because it has to copy N items instead of replacing the reference to the list. The only "advantage", which you may not need or want, is that the change is visible to anyone who has another reference to the original list object.
-
Kos over 11 years
+1 that's the way. And if you want to alter the same list instance instead of binding the variable to a new one (say, not to break other references to this list), use the slice syntax like @kojiro said.
-
Marcin over 11 years
An example where map is an excellent choice. (itertools.imap might or might not be better, of course, as for example when assigning to a slice.)
-
Sean W. over 11 years
imho, that's unpythonic.
-
Marcin over 11 years
@Kos In that case, an iterator-based solution would be even better (as it avoids creating a whole list which is then unreferenced and awaiting garbage collection).
-
Marcin over 11 years
I've changed this to a generator expression, as it's vastly more appropriate.
-
Surya over 11 years
Do you think list comprehensions make code work faster?? or just smaller??
-
kojiro over 11 years
@Marcin it might be a more appropriate approach, but it's an incorrect answer to the question asked. I edited the question to describe both options.
-
alvas over 11 years
no worries, memory shouldn't be a problem since i'm reading a file, searching a string and dumping it away once i've found the index of a string. =)
-
karthikr over 11 years
List comprehensions are very efficient for iterable objects with simple rules. You may use maps and list comprehensions depending on the complexity. But yes, they do provide a quick and efficient implementation.
-
Izkata over 11 years
Do you mean my_list = map(str.strip, list[:])? 'Cause the other way gives me a NameError.
-
Marcin over 11 years
@kojiro If you are assigning to a slice, a generator is more appropriate. You have edited your question to eliminate slice assignment.
-
kojiro over 11 years
@Marcin how is it more appropriate? I've added timeits and it doesn't seem to be as efficient. (I'm not equating efficiency and appropriateness, but I genuinely don't know why it would be more appropriate in the absence of efficiency.)
-
Marcin over 11 years
@kojiro You'll likely see better efficiency for larger lists, as less memory allocation will occur; secondly in real usage it is likely to lead to better overall performance, as there will be less in the way of garbage collectible, but uncollected objects hanging around.
-
Marcin over 11 years
Also, it's generally nicer for everybody else if you don't copy the interpreter prompts.
-
kojiro over 11 years
@Marcin if I don't copy the interpreter prompts, how can you tell the difference between a command and its output?
-
Marcin over 11 years
@kojiro I usually comment the output like so: # =>, but a simple comment will suffice.
-
Kos over 11 years
I mean my_list[:] = map(str.strip, my_list). See the code under the link.
-
shantanoo over 10 years
Instead of using map and storing the data in the list again, itertools.imap is better in the case of Python 2.x. In Python 3.x, map returns an iterator.