How to check if an element is an empty list in pandas?
Solution 1
You can do this:
df[df["col"].str.len() != 0]
Example:
import pandas as pd
df = pd.DataFrame({"col": [[1], [2, 3], [], [4, 5, 6], []]}, dtype=object)
print(df[df["col"].str.len() != 0])
#          col
# 0        [1]
# 1     [2, 3]
# 3  [4, 5, 6]
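As for why .str works on a column of lists: .str.len() ends up calling len on each element, so anything with a length is accepted. Here is a small sketch of that, assuming a made-up mixed column (the names s, via_accessor and via_builtin are only for illustration):

```python
import pandas as pd

# Hypothetical mixed object column: a string, a list, an empty tuple and a dict.
s = pd.Series(["ab", [1, 2], (), {"k": 1}], dtype=object)

# The .str accessor computes the length of each element...
via_accessor = s.str.len()
# ...which matches applying the built-in len to each element directly.
via_builtin = s.apply(len)

print(via_accessor.tolist())
print(via_builtin.tolist())
```

Both calls should report the lengths 2, 2, 0 and 1, which is why comparing the result against 0 picks out the empty containers.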
Solution 2
This is probably the most efficient solution.
df[df["col"].astype(bool)]
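A minimal sketch of what that mask looks like, using the same example data as Solution 1 (mask and non_empty are illustrative names, not part of the answer):

```python
import pandas as pd

df = pd.DataFrame({"col": [[1], [2, 3], [], [4, 5, 6], []]})

# astype(bool) converts each list by its truthiness: [] -> False, non-empty -> True.
mask = df["col"].astype(bool)
non_empty = df[mask]

print(mask.tolist())             # [True, True, False, True, False]
print(non_empty.index.tolist())  # [0, 1, 3]
```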
Solution 3
Try this:
df[df['col'].apply(len).gt(0)]
Solution 4
bool
An empty list in a boolean context is False. An empty list is what we call falsey. It does a programmer well to know which objects are falsey and which are truthy.
You can also slice a dataframe with a boolean list (not just a boolean series). And so, I'll use a comprehension to speed up the checking.
df[[bool(x) for x in df.col]]
Or with even fewer characters:
df[[*map(bool, df.col)]]
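As a quick sanity check, the sketch below builds the mask from each of the four solutions on the example data and compares them. The masks dict and the expected list are only for illustration. It uses df["col"] rather than df.col, since attribute access only works when the column name is a valid identifier that does not clash with a DataFrame attribute.

```python
import pandas as pd

df = pd.DataFrame({"col": [[1], [2, 3], [], [4, 5, 6], []]})

# One boolean mask per approach, converted to plain Python lists for comparison.
masks = {
    "str.len": (df["col"].str.len() != 0).tolist(),
    "astype(bool)": df["col"].astype(bool).tolist(),
    "apply(len)": df["col"].apply(len).gt(0).tolist(),
    "comprehension": [bool(x) for x in df["col"]],
    "map": [*map(bool, df["col"])],
}

expected = [True, True, False, True, False]
for name, mask in masks.items():
    print(name, mask == expected)
```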
Blaszard
Updated on October 31, 2022
Comments
-
Blaszard less than a minute
One of the columns in my df stores lists, and some of the rows contain an empty list. For example:
[]
["X", "Y"]
[]
etc...
How can I select only the rows whose list is not empty?
The following code does not work.
df[df["col"] != []]  # ValueError: Lengths must match to compare
df[pd.notnull(df["col"])]  # The code doesn't issue an error but the result includes an empty list
df[len(df["col"]) != 0]  # KeyError: True
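A small sketch of why the last two attempts behave this way, assuming a three-row frame like the one described above (the variable names are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"col": [[], ["X", "Y"], []]})

# pd.notnull only looks for None/NaN; an empty list is a real object, so every row passes.
notnull_mask = pd.notnull(df["col"])
print(notnull_mask.tolist())  # [True, True, True]

# len(df["col"]) is the number of rows (one int), not a per-row length,
# so the comparison yields a single Python bool instead of a mask.
n_rows = len(df["col"])
scalar_key = n_rows != 0

lookup_error = None
try:
    df[scalar_key]  # pandas looks for a column labelled True
except KeyError as exc:
    lookup_error = exc
    print("KeyError:", exc)
```

The common thread is that the check has to run once per row, which is what the element-wise approaches in the answers do.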
-
Blaszard over 3 years
The code works, thanks. But could you give me more explanation, especially why you need .str here? It is very unintuitive and nearly impossible to arrive at this code unless you read the official docs from top to bottom.
-
jdehesa over 3 years
@Blaszard It is a bit of a "trick". Functions under .str are meant to be used with string data. They are not really vectorized; they just apply a function to each data item. In the case of len, it just applies the function len to each object, so it works fine for strings, lists, or any other object to which len can be applied. Quang Hoang's answer may be more meaningful.
-
Quang Hoang over 3 years
@piRSquared Thanks. Certainly, and some other answers say exactly that.
-
piRSquared over 3 years
Ahh, I see that now
-
weezilla over 1 year
Agree with jdehesa, this str len method is a good easy trick to remember. But beware of execution times on large dataframes. The Quang Hoang method seems to be vectorized and is MUCH faster.
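If you want to check the speed difference on your own machine, here is a rough timing sketch. The data size, the every-third-row-empty pattern and the names candidates and timings are assumptions made up for this example. Actual numbers will vary with pandas version and hardware, so none are shown.

```python
import timeit

import pandas as pd

# Hypothetical test data: 20,000 rows, every third list empty.
data = [[] if i % 3 == 0 else [i, i + 1] for i in range(20_000)]
df = pd.DataFrame({"col": data})

# Each candidate builds the "list is not empty" mask in a different way.
candidates = {
    "str.len": lambda: df["col"].str.len() != 0,
    "astype(bool)": lambda: df["col"].astype(bool),
    "apply(len)": lambda: df["col"].apply(len).gt(0),
    "comprehension": lambda: [bool(x) for x in df["col"]],
}

# Best of 3 repeats, 5 calls per repeat, to reduce noise a little.
timings = {
    name: min(timeit.repeat(fn, number=5, repeat=3))
    for name, fn in candidates.items()
}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:>15}: {seconds:.4f} s for 5 calls")
```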