How can I check if a Pandas dataframe's index is sorted
Solution 1
How about:
df.index.is_monotonic
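On recent pandas releases the explicit spelling is needed: is_monotonic was deprecated in pandas 1.5 and removed in 2.0, so is_monotonic_increasing (or is_monotonic_decreasing) is the safer choice. A minimal sketch, where the sample frames are made up for illustration:

```python
import pandas as pd

# Hypothetical sample frames, used only for illustration.
sorted_df = pd.DataFrame({"x": [10, 20, 30]}, index=[1, 2, 3])
unsorted_df = pd.DataFrame({"x": [10, 20, 30]}, index=[2, 1, 3])
descending_df = pd.DataFrame({"x": [10, 20, 30]}, index=[3, 2, 1])

# These are properties, so there are no parentheses.
sorted_ok = sorted_df.index.is_monotonic_increasing
unsorted_ok = unsorted_df.index.is_monotonic_increasing
descending_ok = descending_df.index.is_monotonic_decreasing

print(sorted_ok, unsorted_ok, descending_ok)
```

Neither property creates a sorted copy of the frame.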
Solution 2
If sorting is allowed, try
all(df.sort_index().index == df.index)
If not, try
all(a <= b for a, b in zip(df.index, df.index[1:]))
The first one is more readable, while the second one has smaller time complexity: it avoids the sort, so it is O(n) rather than O(n log n).
EDIT
Adding another method I've just found. It is similar to the second one, but the comparison is vectorized:
all(df.index[:-1] <= df.index[1:])
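To see that the three expressions above agree with each other (and with is_monotonic_increasing), here is a small sketch. The helper name index_checks and the sample labels are made up for illustration; it assumes a flat index whose labels can be compared with <=.

```python
import pandas as pd

def index_checks(labels):
    """Run each check on a frame indexed by `labels` (hypothetical helper)."""
    df = pd.DataFrame({"x": range(len(labels))}, index=labels)
    return {
        # Sort a copy and compare it label by label with the original.
        "sort_and_compare": bool(all(df.sort_index().index == df.index)),
        # Pure-Python pairwise comparison of neighbouring labels.
        "pairwise": all(a <= b for a, b in zip(df.index, df.index[1:])),
        # Same neighbour comparison, done on whole arrays at once.
        "vectorized": bool(all(df.index[:-1] <= df.index[1:])),
        # Built-in property, for reference.
        "is_monotonic_increasing": bool(df.index.is_monotonic_increasing),
    }

sorted_result = index_checks([1, 2, 2, 5])
unsorted_result = index_checks([3, 1, 2])
print(sorted_result)
print(unsorted_result)
```

All of these use <=, so repeated labels still count as sorted.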
Solution 3
For non-indices:
df.equals(df.sort())
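Note that DataFrame.sort() no longer exists in current pandas; sort_values() is its replacement for sorting by column values. A rough sketch of the same idea with the current API, assuming a frame with hypothetical columns "a" and "b":

```python
import pandas as pd

# Hypothetical data; "a" is ascending, "b" is descending.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4, 3, 2, 1]})

# Sort-and-compare: a stable sort keeps tied rows in their original order,
# so an already-sorted frame compares equal to its sorted copy.
sorted_by_a = df.equals(df.sort_values("a", kind="mergesort"))
sorted_by_b = df.equals(df.sort_values("b", kind="mergesort"))

# Without sorting: a column (Series) has the same monotonic properties as an index.
a_increasing = df["a"].is_monotonic_increasing
b_increasing = df["b"].is_monotonic_increasing
b_decreasing = df["b"].is_monotonic_decreasing

print(sorted_by_a, sorted_by_b, a_increasing, b_increasing, b_decreasing)
```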
Solution 4
Just for the sake of completeness, this is how to check whether the dataframe index is monotonic increasing and also unique and, if it is not, to make it so:
if not (df.index.is_monotonic_increasing and df.index.is_unique):
    df.reset_index(inplace=True, drop=True)
NOTE
df.index.is_monotonic_increasing returns True even if there are repeated indices, so it has to be complemented with df.index.is_unique.
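A quick illustration of the note, using a made-up index with one repeated label:

```python
import pandas as pd

idx = pd.Index([1, 2, 2, 3])  # hypothetical index with a repeated label

non_decreasing = idx.is_monotonic_increasing  # repeated labels are allowed here
unique = idx.is_unique                        # False because 2 appears twice
strictly_increasing = non_decreasing and unique

print(non_decreasing, unique, strictly_increasing)
```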
Pablojim
Updated on June 06, 2022
Comments
-
Pablojim almost 2 years
I have a vanilla pandas dataframe with an index. I need to check if the index is sorted. Preferably without sorting it again.
E.g. I can test whether an index is unique with index.is_unique; is there a similar way to test whether it is sorted?
-
Wes McKinney almost 11 years
Strongly recommend using is_monotonic
-
Tim Diels over 7 years
Use is_monotonic_increasing to check for ascending order and is_monotonic_decreasing to check for descending order. is_monotonic has been deprecated; it's a misnomer as it only checks for increasing monotonicity.
-
Joseph Garvin about 7 years
Does it do a test on the fly, or is it just telling you that at some earlier point you somehow promised pandas that it would be monotonic?
-
Mithril over 5 years
Is it better to check is_monotonic before sort_index? Or would sort_index check it automatically?
-
Asclepius over 3 years
@timdiels It is incorrect to say that is_monotonic is deprecated. It is not. Where does it say that it is?
-
Tim Diels over 3 years
@Acumenus It was deprecated at the time I posted that comment, but it seems it no longer is. Still, I wouldn't use it, as it only checks whether the index is monotonically increasing, while a monotonically decreasing function should also be considered monotonic. To be clear, when is_monotonic_decreasing is True, is_monotonic may be False; surprise!
-
Ahmed Fasih over 2 years
Do note, is_monotonic_increasing and is_monotonic_decreasing work for non-indexes also.
-
Eli S over 2 years
I don't think this is what reset_index does. It will change the index to the default, which in many or all cases is just the row number. I think, as others pointed out, df.sort_index() is the right tool. It can be done in place if desired.
-
Manu Na Eira over 2 years
Well @Eli, the difference lies in whether index.is_unique holds or not. If the dataframe index has repeated values, df.sort_index() won't get rid of them. So I think my answer above is still valid: if you want a dataframe index that is monotonic increasing and unique (the desired index format in most cases, MultiIndexes aside), df.reset_index is a quick way to get it.
-
Eli S about 2 years
This may work if you start with a default (integer) index, but it will not work with, say,
df = pd.DataFrame(index=[pd.Timestamp(2000,1,1),pd.Timestamp(2000,1,2),pd.Timestamp(2000,1,2),pd.Timestamp(2000,1,3)],data=[0,1,2,3])
reset_index will keep all four values and replace the timestamps with integer indices [0,1,2,3]. There would be few cases where this is the desired behavior, no?
-
Manu Na Eira about 2 years
Yes, watch out: reset_index will replace your indices with increasing integers! I think having the dataframe index monotonic increasing and unique is good, as it makes the dataframe rows uniquely identifiable by the index itself. In the example you provide, @Eli, I would rather move the timestamps to a data column and create a new index with df.reset_index(inplace=True).
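The behaviour discussed in the last few comments can be sketched with the timestamp example given above. The last variant, dropping repeated labels with Index.duplicated, is not suggested anywhere in the thread; it is only one possible alternative for the case where the timestamps should stay in the index.

```python
import pandas as pd

df = pd.DataFrame(
    index=[pd.Timestamp(2000, 1, 1), pd.Timestamp(2000, 1, 2),
           pd.Timestamp(2000, 1, 2), pd.Timestamp(2000, 1, 3)],
    data=[0, 1, 2, 3],
)

# drop=True discards the timestamps and leaves integer labels 0..3.
dropped = df.reset_index(drop=True)

# Without drop, the timestamps are moved into a data column named "index".
moved = df.reset_index()

# Assumption (not from the thread): keep the timestamps as the index and
# drop rows whose label repeats an earlier one.
deduped = df[~df.index.duplicated(keep="first")]

print(list(dropped.index))
print(list(moved.columns))
print(deduped.index.is_monotonic_increasing and deduped.index.is_unique)
```

All four rows survive in both reset_index variants; only the deduplicated frame loses a row.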