Sum all columns with a wildcard name search using Python Pandas
Solution 1
I found the answer.
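Using the dataframe data from the question:
import pandas as pd
# Keep the columns whose names start with "P1" (^ anchors the match at the start)
P1Channels = data.filter(regex="^P1")
# Sum the selected columns row by row
P1Sum = P1Channels.sum(axis=1)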
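As a self-contained sketch of the approach above, the snippet below rebuilds the sample table from the question by hand (the values are copied from it). The anchored pattern "^P1" is an assumption made here so that only names beginning with P1 are selected; an unanchored pattern would also match P1 appearing later in a column name.

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "Day":  [1, 2],
    "P1S1": [1, 2],
    "P1S2": [2, 2],
    "P1S3": [2, 3],
    "P2S1": [3, 5],
    "P2S2": [1, 4],
    "P2S3": [2, 2],
})

# Keep only the columns whose names start with "P1".
P1Channels = data.filter(regex="^P1")

# Row-wise sum over the selected columns.
P1Sum = P1Channels.sum(axis=1)
print(P1Sum.tolist())  # [5, 7]
```

If a plain substring match is enough, data.filter(like="P1") should select the same columns for this table without a regular expression.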
Solution 2
A list comprehension over the columns allows more flexible filters in the if
condition:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: df = pd.DataFrame(np.arange(15).reshape(5, 3), columns=['P1S1', 'P1S2', 'P2S1'])
In [4]: df
Out[4]:
   P1S1  P1S2  P2S1
0     0     1     2
1     3     4     5
2     6     7     8
3     9    10    11
4    12    13    14
In [5]: df.loc[:, [x for x in df.columns if x.startswith('P1')]].sum(axis=1)
Out[5]:
0     1
1     7
2    13
3    19
4    25
dtype: int64
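To illustrate the "more filters" point, here is a minimal sketch that combines two conditions in the same comprehension. The particular rule (names starting with "P" and ending with "S1") is only a hypothetical example, not something asked for in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(15).reshape(5, 3), columns=["P1S1", "P1S2", "P2S1"])

# Two conditions in one comprehension: starts with "P" and ends with "S1".
cols = [c for c in df.columns if c.startswith("P") and c.endswith("S1")]

# Row-wise sum over the columns that passed both tests.
row_sums = df[cols].sum(axis=1)
print(cols)               # ['P1S1', 'P2S1']
print(row_sums.tolist())  # [2, 8, 14, 20, 26]
```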
Solution 3
Thanks for the tip, jbssm. For anyone else looking for a grand total, I ended up adding .sum()
at the end:
P1Sum = P1Channels.sum(axis=1).sum()
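A short sketch of the grand total, again using the question's sample values typed in by hand. The names row_totals and grand_total are introduced here only for illustration.

```python
import pandas as pd

# P1 columns from the question's sample data.
P1Channels = pd.DataFrame({
    "P1S1": [1, 2],
    "P1S2": [2, 2],
    "P1S3": [2, 3],
})

# First sum across columns (one value per row), then sum those row totals.
row_totals = P1Channels.sum(axis=1)
grand_total = row_totals.sum()
print(grand_total)  # 12

# Summing down the columns first gives the same total.
assert grand_total == P1Channels.sum().sum()
```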
Author by
jbssm
I'm an Astrophysicist researcher in the field of solar atmosphere and heliospherical particle propagation.
Updated on June 06, 2022
Comments
-
jbssm almost 2 years
I have a dataframe in python pandas with several columns taken from a CSV file.
For instance, data is:
Day  P1S1  P1S2  P1S3  P2S1  P2S2  P2S3
1    1     2     2     3     1     2
2    2     2     3     5     4     2
What I need is the sum of all columns whose names start with P1, something like P1* with a wildcard.
Something like the following, which gives an error:
P1Sum = data["P1*"]
Is there any way to do this with pandas?
-
Brian Larsen over 3 years
I find that using fnmatch from the standard library gives a lot of power in these "if" list comprehensions.
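A minimal sketch of the fnmatch idea, assuming the sample table from the question. It uses fnmatchcase, which compares case-sensitively on every platform, so the shell-style pattern "P1*" from the original attempt can be used directly inside the comprehension.

```python
from fnmatch import fnmatchcase

import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "Day":  [1, 2],
    "P1S1": [1, 2],
    "P1S2": [2, 2],
    "P1S3": [2, 3],
    "P2S1": [3, 5],
    "P2S2": [1, 4],
    "P2S3": [2, 2],
})

# Shell-style wildcard match on each column name.
p1_cols = [c for c in data.columns if fnmatchcase(c, "P1*")]

# Row-wise sum over the matching columns.
P1Sum = data[p1_cols].sum(axis=1)
print(p1_cols)         # ['P1S1', 'P1S2', 'P1S3']
print(P1Sum.tolist())  # [5, 7]
```

Other shell-style patterns work the same way; for example, "P?S1" would match both P1S1 and P2S1.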