Check if all elements in a group are equal using pandas GroupBy
Solution 1
Use groupby
and nunique
, and check whether the result is 1:
df.groupby([df.datetime.dt.date, 'rating']).signal.nunique().eq(1)
datetime rating
2018-12-20 HY True
IG False
2018-12-27 HY True
IG True
Name: signal, dtype: bool
Or, similarly, using apply
with set
conversion:
(df.groupby([df.datetime.dt.date, 'rating']).signal
.apply(lambda x: len(set(x)) == 1))
datetime rating
2018-12-20 HY True
IG False
2018-12-27 HY True
IG True
Name: signal, dtype: bool
PS., you don't need to assign a temp column, groupby
takes arbitrary grouper arguments.
Solution 2
Try to find out alternative without using groupby
just for fun
df.datetime=df.datetime.dt.date
s=pd.crosstab(df.datetime,[df.rating,df.signal])
s.eq(s.sum(axis=1,level=0),1).any(level=0,axis=1).stack()
Out[556]:
datetime rating
2018-12-20 HY True
IG False
2018-12-27 HY True
IG True
dtype: bool
Yuca
Applied Math and CS. Stack Overflow has been a life-saving, career-building tool throughout my professional years. I look forward to giving back to the community that has given me so much!
Updated on June 29, 2022Comments
-
Yuca almost 2 years
Is there a pythonic way to group by a field and check if all elements of each resulting group have the same value?
Sample data:
datetime rating signal 0 2018-12-27 11:33:00 IG 0 1 2018-12-27 11:33:00 HY -1 2 2018-12-27 11:49:00 IG 0 3 2018-12-27 11:49:00 HY -1 4 2018-12-27 12:00:00 IG 0 5 2018-12-27 12:00:00 HY -1 6 2018-12-27 12:49:00 IG 0 7 2018-12-27 12:49:00 HY -1 8 2018-12-27 14:56:00 IG 0 9 2018-12-27 14:56:00 HY -1 10 2018-12-27 15:12:00 IG 0 11 2018-12-27 15:12:00 HY -1 12 2018-12-20 15:14:00 IG 0 13 2018-12-20 15:14:00 HY -1 14 2018-12-20 15:50:00 IG -1 15 2018-12-20 15:50:00 HY -1 16 2018-12-27 13:26:00 IG 0 17 2018-12-27 13:26:00 HY -1 18 2018-12-27 13:44:00 IG 0 19 2018-12-27 13:44:00 HY -1 20 2018-12-27 15:06:00 IG 0 21 2018-12-27 15:06:00 HY -1 22 2018-12-20 15:48:00 IG 0 23 2018-12-20 15:48:00 HY -1
The grouping part can be done by
df.groupby([datetime.dt.date,'rating'])
However, I'm sure there must be a simple way to leverage the grouper and use a transform statement to return 1 if all the values from
signal
are the same.Desired output
2018-12-20 HY True IG False 2018-12-27 HY True IG True
-
JacobIRR over 5 yearsCould you check
len(set(your_values)) == 1
? -
RomanPerekhrest over 5 yearsI don't see any
'temp'
key in your input df -
cs95 over 5 yearsShould this be
[True, False, True, False]
? -
Yuca over 5 yearsI should have
False
for 2018-12-20, IG, andTrue
for everything else. -
Yuca over 5 years@RomanPerekhrest it is being generated from the assign statement
-
cs95 over 5 yearsHmm, I see 2018-12-27/IG has [0, -1] as the unique values. Can you take a look?
-
Yuca over 5 years@coldspeed fixed the sample data, thanks!
-
-
Yuca over 5 yearsI have a challenge for you, make it
false
only if it's not the last entry of the day :) -
Yuca over 5 yearsfollow up: is there an easy way to recover the index of the 'odd-one-out'? we can assume there's only one per day
-
cs95 over 5 years@Yuca do you mean per day there would only be one odd one out (either true or false)? Is it possible there can be no odd one out?
-
Yuca over 5 yearsyes, it is possible that there is no odd one out, I have a pretty ugly way of doing it that involves using grouby twice :S