Pandas KeyError using pivot
13,155
You want .pivot_table
, not .pivot
.
import pandas
from io import StringIO
x = StringIO("""\
Date Stage SubStage Value
12/31/2015 1.00 a 0.896882891
1/1/2016 1.00 a 0.0458843
1/2/2016 1.00 a 0.126805588
1/3/2016 1.00 b 0.615824461
1/4/2016 1.00 b 0.245092069
1/5/2016 1.00 c 0.121936318
1/6/2016 1.00 c 0.170198128
1/7/2016 1.00 c 0.735872415
1/8/2016 1.00 c 0.542361912
1/4/2016 2.00 a 0.723769247
1/5/2016 2.00 a 0.305570257
1/6/2016 2.00 b 0.47461605
1/7/2016 2.00 b 0.173702623
1/8/2016 2.00 c 0.969260251
1/9/2016 2.00 c 0.017170798
""")
df = pandas.read_table(x, sep='\s+')
xtab = df.pivot_table(index='Date', columns=['Stage','SubStage'], values='Value')
print(xtab.to_string(na_rep='--'))
And that gives me:
Stage 1.0 2.0
SubStage a b c a b c
Date
1/1/2016 0.045884 -- -- -- -- --
1/2/2016 0.126806 -- -- -- -- --
1/3/2016 -- 0.615824 -- -- -- --
1/4/2016 -- 0.245092 -- 0.723769 -- --
1/5/2016 -- -- 0.121936 0.305570 -- --
1/6/2016 -- -- 0.170198 -- 0.474616 --
1/7/2016 -- -- 0.735872 -- 0.173703 --
1/8/2016 -- -- 0.542362 -- -- 0.969260
1/9/2016 -- -- -- -- -- 0.017171
12/31/2015 0.896883 -- -- -- -- --
Author by
trob
Updated on June 14, 2022Comments
-
trob almost 2 years
I'm new to Python and I would like to use Python to replicate a common excel task. If such a question has already been answered, please let me know. I've been unable to find it. I have the following pandas dataframe (data):
Date Stage SubStage Value 12/31/2015 1.00 a 0.896882891 1/1/2016 1.00 a 0.0458843 1/2/2016 1.00 a 0.126805588 1/3/2016 1.00 b 0.615824461 1/4/2016 1.00 b 0.245092069 1/5/2016 1.00 c 0.121936318 1/6/2016 1.00 c 0.170198128 1/7/2016 1.00 c 0.735872415 1/8/2016 1.00 c 0.542361912 1/4/2016 2.00 a 0.723769247 1/5/2016 2.00 a 0.305570257 1/6/2016 2.00 b 0.47461605 1/7/2016 2.00 b 0.173702623 1/8/2016 2.00 c 0.969260251 1/9/2016 2.00 c 0.017170798
In excel, I can use a pivot table to produce the following:
It seems reasonable to do the following in python:
data.pivot(index='Date', columns=['Stage', 'SubStage'], values='Value')
But that produces:
KeyError: 'Level Stage not found'
What gives?
-
Paul H almost 8 yearsdo your column labels have trailing or leading whitespace? check by inspecting with
data.columns
-
trob almost 8 yearsThey do not: data.columns Index(['Date', 'Stage', 'SubStage', 'Value'], dtype='object')
-
-
trob almost 8 yearsThis is helpful. Generally speaking, is pivot_table as opposed to pivot the agreed upon best practice? Why in the world are there two functions for the same concept?
-
Paul H almost 8 yearsThey serve different purposes, but I couldn't tell you what they are. I always
stack/unstack
with a multi-level index to do something like this. @trob -
trob almost 8 yearsExcellent. Thank you!