How to sum negative and positive values separately when using groupby in pandas?
Solution 1
In [14]:
df.groupby(df['A'])['C'].agg([('negative' , lambda x : x[x < 0].sum()) , ('positive' , lambda x : x[x > 0].sum())])
Out[14]:
negative positive
A
bar -1.418788 2.603452
foo -0.504695 2.880512
Solution 2
You may groupby
on A
and df['C'] > 0
, and unstack
the result:
>>> right = df.groupby(['A', df['C'] > 0])['C'].sum().unstack()
>>> right = right.rename(columns={True:'positive', False:'negative'})
>>> right
C negative positive
A
bar -3.4423 NaN
foo -2.6277 0.857
The NaN
value is because all the A == bar
rows have negative value for C
.
if you want to add these to the original frame corresponding to values of groupby
key, i.e. A
, it would require a left join
:
>>> df.join(right, on='A', how='left')
A B C D negative positive
0 foo one 0.3742 0.3197 -2.6277 0.857
1 bar one -0.3563 -0.6296 -3.4423 NaN
2 foo two -0.3902 -1.3879 -2.6277 0.857
3 bar three -0.7834 -0.9597 -3.4423 NaN
4 foo two -1.2686 -0.2509 -2.6277 0.857
5 bar two -2.3025 -1.2960 -3.4423 NaN
6 foo one -0.9688 1.2477 -2.6277 0.857
7 foo three 0.4828 1.0047 -2.6277 0.857
Alireza
Updated on June 26, 2022Comments
-
Alireza almost 2 years
How to sum positive and negative values differently in
pandas
and put them let's say inpositive
andnegative
columns?I have this dataframe like below:
df = pandas.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'], 'B' : ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'], 'C' : np.random.randn(8), 'D' : np.random.randn(8)})
Output is as below:
df A B C D 0 foo one 0.374156 0.319699 1 bar one -0.356339 -0.629649 2 foo two -0.390243 -1.387909 3 bar three -0.783435 -0.959699 4 foo two -1.268622 -0.250871 5 bar two -2.302525 -1.295991 6 foo one -0.968840 1.247675 7 foo three 0.482845 1.004697
I used the below code to get negatives:
df['negative'] = df.groupby('A')['C'].apply(lambda x: x[x<0].sum()).reset_index()]
But the problem is when I want to add it to one of
dataframe
columns callednegative
it gives error:ValueError: Wrong number of items passed 2, placement implies 1
Again I know what it says that
groupby
has returned more than one column and cannot assign it todf['negatives']
but I don't know how to solve this part of the problem. I need to have positive col too.The desired outcome would be:
A Positive Negative 0 foo 0.374156 -0.319699 1 bar 0.356339 -0.629649
What is the right solution to the problem?
-
Alireza over 8 yearsThe final df should be grouped by
A
too. I dont want to group by two criteria, I just want to group by columnA
in yourright
dataframe. -
Alireza over 8 yearsWhen I groupby
A
I expect to see justfoo
andbar
in the result. Your solution doesn't meet that. I dont need columnB
if it is removed I would be OK with that -
behzad.nouri over 8 years@AlirezaHos you do not make sense. you need to edit the question and show what is the output you are looking for
-
Nader Hisham over 8 yearspost a sample of your desired result , and it would be better not to generate random data so that we make sure we have the same result
-
Alireza over 8 yearsI've added
The desired outcome would
part to question, please have a look. -
PV8 almost 4 yearsthis solution is a way faster then the other one! thx for that