Assign new values to slice from MultiIndex DataFrame
Sort the frame first, then select or set values using a tuple as the MultiIndex key:
In [12]: df = pd.DataFrame(randn(6, 3), index=arrays, columns=['A', 'B', 'C'])
In [13]: df
Out[13]:
                  A         B         C
bar one 0 -0.694240  0.725163  0.131891
    two 1 -0.729186  0.244860  0.530870
baz one 2  0.757816  1.129989  0.893080
qux one 3 -2.275694  0.680023 -1.054816
    two 4  0.291889 -0.409024 -0.307302
bar one 5  1.697974 -1.828872 -1.004187
In [14]: df = df.sortlevel(0)
In [15]: df
Out[15]:
                  A         B         C
bar one 0 -0.694240  0.725163  0.131891
        5  1.697974 -1.828872 -1.004187
    two 1 -0.729186  0.244860  0.530870
baz one 2  0.757816  1.129989  0.893080
qux one 3 -2.275694  0.680023 -1.054816
    two 4  0.291889 -0.409024 -0.307302
In [16]: df.loc[('bar','two'),'A'] = 9999
In [17]: df
Out[17]:
                    A         B         C
bar one 0   -0.694240  0.725163  0.131891
        5    1.697974 -1.828872 -1.004187
    two 1 9999.000000  0.244860  0.530870
baz one 2    0.757816  1.129989  0.893080
qux one 3   -2.275694  0.680023 -1.054816
    two 4    0.291889 -0.409024 -0.307302
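The session above was written against pandas 0.11. `sortlevel` was later deprecated and, as far as I know, removed in favour of `sort_index`. The following is a minimal sketch of the same sort-then-assign pattern, assuming a recent pandas release; the seeded random data is only illustrative and is not the data shown above.

```python
import numpy as np
import pandas as pd

# Same three-level index as in the question; values are illustrative.
arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(6),
]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((6, 3)), index=arrays, columns=['A', 'B', 'C'])

# Sort on all index levels (assumed modern replacement for sortlevel(0)).
df = df.sort_index()

# Partial key (first two levels) plus column label, in a single .loc call.
df.loc[('bar', 'two'), 'A'] = 9999

print(df)
```

Doing the row selection and the column selection in one `.loc` call is what lets the assignment reach the original frame.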
You can also do it without sorting if you specify the complete index key, e.g.:
In [23]: df.loc[('bar','two',1),'A'] = 999
In [24]: df
Out[24]:
                   A         B         C
bar one 0  -0.113216  0.878715 -0.183941
    two 1 999.000000 -1.405693  0.253388
baz one 2   0.441543  0.470768  1.155103
qux one 3  -0.008763  0.917800 -0.699279
    two 4   0.061586  0.537913  0.380175
bar one 5   0.857231  1.144246 -2.369694
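As a hedged illustration of the full-key case, the sketch below leaves the frame unsorted and assigns through a key that names all three index levels. It assumes a recent pandas release and uses seeded illustrative data rather than the values printed above.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(6),
]
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.standard_normal((6, 3)), index=arrays, columns=['A', 'B', 'C'])

# The index is deliberately left unsorted here.
# A complete key identifies exactly one row, so no sorting is needed.
df.loc[('bar', 'two', 1), 'A'] = 999

print(df)
```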
To check the sort depth:
In [27]: df.index.lexsort_depth
Out[27]: 0
In [28]: df.sortlevel(0).index.lexsort_depth
Out[28]: 3
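On newer pandas versions `lexsort_depth` may be deprecated or unavailable as a public attribute. A rough equivalent check, offered as an assumption rather than a drop-in replacement, is to ask whether the MultiIndex is monotonically increasing before and after sorting:

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(6),
]
index = pd.MultiIndex.from_arrays(arrays)

# The last ('bar', 'one', 5) entry comes after the 'qux' entries,
# so the index as given is not in sorted order.
unsorted_ok = bool(index.is_monotonic_increasing)

# sort_values() returns a new, fully sorted MultiIndex.
sorted_ok = bool(index.sort_values().is_monotonic_increasing)

print(unsorted_ok, sorted_ok)
```

This only reports whether the whole index is sorted; it does not give a per-level depth the way `lexsort_depth` does.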
For the last part of your question, you can assign with a list. Note that the list must have the same number of elements as the values you are replacing, and the frame MUST be sorted for this to work:
In [12]: df.loc[('bar','one'),'A'] = [999,888]
In [13]: df
Out[13]:
                   A         B         C
bar one 0 999.000000 -0.645641  0.369443
        5 888.000000 -0.990632 -0.577401
    two 1  -1.071410  2.308711  2.018476
baz one 2   1.211887  1.516925  0.064023
qux one 3  -0.862670 -0.770585 -0.843773
    two 4  -0.644855 -1.431962  0.232528
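The list assignment can be sketched the same way on a recent pandas release. This is an illustrative example with seeded data, assuming `sort_index` in place of `sortlevel`; the list is matched to the selected rows in their sorted order.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(6),
]
rng = np.random.default_rng(2)
df = pd.DataFrame(rng.standard_normal((6, 3)), index=arrays, columns=['A', 'B', 'C'])

# Sort first so that the two ('bar', 'one') rows are adjacent.
df = df.sort_index()

# The partial key selects two rows, so the list needs exactly two values.
df.loc[('bar', 'one'), 'A'] = [999, 888]

print(df)
```

After sorting, the rows with third-level labels 0 and 5 come in that order, so 999 lands on label 0 and 888 on label 5.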
hadim
Researcher in biology. Working at the cross of multiple fields : #physics #biology #bioimaging #microscopy #modeling #datascience #opendata #bioinformatics
Updated on September 16, 2022
Comments
-
hadim over 1 year
I would like to modify some values from a column in my DataFrame. At the moment I have a view obtained by selecting via the multi-index of my original df (and modifying it does change df).
Here's an example:
In [1]: arrays = [np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']), np.array(['one', 'two', 'one', 'one', 'two', 'one']), np.arange(0, 6, 1)]
In [2]: df = pd.DataFrame(randn(6, 3), index=arrays, columns=['A', 'B', 'C'])
In [3]: df
                  A         B         C
bar one 0 -0.088671  1.902021 -0.540959
    two 1  0.782919 -0.733581 -0.824522
baz one 2 -0.827128 -0.849712  0.072431
qux one 3 -0.328493  1.456945  0.587793
    two 4 -1.466625  0.720638  0.976438
bar one 5 -0.456558  1.163404  0.464295
I try to modify a slice of df to a scalar value:
In [4]: df.ix['bar', 'two', :]['A']
Out[4]:
1    0.782919
Name: A, dtype: float64
In [5]: df.ix['bar', 'two', :]['A'] = 9999 # df is unchanged
I really want to modify several values in the column (and since indexing returns a vector, not a scalar value, I think this would make more sense):
In [6]: df.ix['bar', 'one', :]['A'] = [999, 888] # again df remains unchanged
I'm using pandas 0.11. Is there a simple way to do this?
My current solution is to recreate df from a new frame and modify the values I want. But it's not elegant and can be very expensive for a complex DataFrame. In my opinion the problem comes from .ix and .loc returning a copy rather than a view.
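The pattern in In [5] and In [6] is chained indexing: the first indexing step produces an intermediate object, and the assignment is applied to that object, which may be a copy. The sketch below is only an illustration under assumptions: it targets a recent pandas release (where `.ix` no longer exists), and it calls `.copy()` explicitly so that the "intermediate copy" behaviour is deterministic rather than dependent on pandas internals.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(6),
]
rng = np.random.default_rng(3)
df = pd.DataFrame(rng.standard_normal((6, 3)), index=arrays, columns=['A', 'B', 'C'])
df = df.sort_index()

original = df.loc[('bar', 'two', 1), 'A']

# Two-step (chained) version: the assignment hits the intermediate object.
sub = df.loc[('bar', 'two')].copy()  # explicit copy stands in for the implicit one
sub['A'] = 9999
unchanged = df.loc[('bar', 'two', 1), 'A'] == original

# Single-step version: rows and column in one .loc call write to df itself.
df.loc[('bar', 'two'), 'A'] = 9999
changed = df.loc[('bar', 'two', 1), 'A'] == 9999

print(unchanged, changed)
```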
-
hadim almost 11 years
Sorry for the title, but I am not a native English speaker and the topic is a bit complex, so it's hard to find a good one :-) If you want to suggest a title, I can change the current one.
-
Andy Hayden almost 11 years
I tweaked it, but I wouldn't worry about a downvote like that. Happy pandaing.
-