KeyError in for loop of dataframe in pandas

python for-loop pandas dataframe keyerror

10,191

Solution 1

I've tried the following code and it works on my PC. I use .loc with the aim to avoid potential key error.

import pandas as pd
import numpy as np

# just following your previous post to simulate your data
np.random.seed(0)
dates = np.random.choice(pd.date_range('2015-01-01 00:00:00', '2015-06-30 00:00:00', freq='1h'), 10000)
company = np.random.choice(['company' + x for x in '1 2 3 4 5'.split()], 10000)
df = pd.DataFrame(dict(recvd_dttm=dates, CompanyName=company)).set_index('recvd_dttm').sort_index()
df['C'] = 1
df.columns = ['CompanyName', '']
result = df.groupby([lambda idx: idx.month, 'CompanyName']).agg({df.columns[1]: sum}).reset_index()
result.columns = ['Month', 'CompanyName', 'counts']
pivot_table = result.pivot(index='CompanyName', columns='Month', values='counts')




colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
    "#ddb7b1", "#cc7878", "#933b41", "#550b1d" ]

month = []
company = []
color = []
rate = []
for y in pivot_table.index:
    for m in pivot_table.columns:
        month.append(m)
        company.append(y)
        num_calls = pivot_table.loc[y, m]
        rate.append(num_calls)
        color.append(colors[min(int(num_calls)-2, 8)])

Solution 2

Try changing the loop to

for m in pivot_table.columns:

It seems you can achieve the same thing without any loops though. You're looping through the row index and column index to access each entry individually and appending them to a list, so rate is just a list of all elements in the data frame. You can achieve this by

rate= pivot_table.stack().astype(int).tolist()
color = [colours[min(x - 2, 8)] for x in rate]

Am i missing something here?

10,191

Author by

jenryb

Updated on June 05, 2022

Comments

jenryb almost 2 years

I am putting my data into a bokeh layout of a heat map, but am getting a KeyError: '1'. It occurs right at the line num_calls = pivot_table[m][y] does anybody know why this would be?

The pivot table I am using is below:

pivot_table.head()
Out[101]: 
Month                            1    2    3    4    5    6    7    8    9   CompanyName                                                                   
Company 1                 182  270  278  314  180  152  110  127  129   
Company 2           163  147  192  142  186  231  214  130  112   
Company 3       126   88   99  139   97   97   96   37   79   
Company 4   84   89   71   95   80   89   83   88  104   
Company 5       91   96   94   66   81   77   87   83   68   

Month                            10   11   12  
CompanyName                                    
Company 1               117  127   81  
Company 2            117   93  101  
Company 3       116  111   95  
Company 4   93   78   64  
Company 5        83   95   65

Below is the section of code leading up to the error:

pivot_table = pivot_table.reset_index()
pivot_table['CompanyName'] = [str(x) for x in pivot_table['CompanyName']]
Companies = list(pivot_table['CompanyName'])
months = ["1","2","3","4","5","6","7","8","9","10","11","12"]
pivot_table = pivot_table.set_index('CompanyName')

# this is the colormap from the original plot
colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
    "#ddb7b1", "#cc7878", "#933b41", "#550b1d" ]

# Set up the data for plotting. We will need to have values for every
# pair of year/month names. Map the rate to a color.
month = []
company = []
color = []
rate = []
for y in Companies:
    for m in months:
        month.append(m)
        company.append(y)
        num_calls = pivot_table[m][y]
        rate.append(num_calls)
        color.append(colors[min(int(num_calls)-2, 8)])

and upon request:

pivot_table.info()
<class 'pandas.core.frame.DataFrame'>
Index: 46 entries, Company1 to LastCompany
Data columns (total 12 columns):
1.0     46 non-null float64
2.0     46 non-null float64
3.0     46 non-null float64
4.0     46 non-null float64
5.0     46 non-null float64
6.0     46 non-null float64
7.0     46 non-null float64
8.0     46 non-null float64
9.0     46 non-null float64
10.0    46 non-null float64
11.0    46 non-null float64
12.0    46 non-null float64
dtypes: float64(12)
memory usage: 4.5+ KB

and

pivot_table.columns
Out[103]: Index([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0], dtype='object')

Also the bokeh code is here: http://docs.bokeh.org/en/latest/docs/gallery/unemployment.html

Recents

Why Is PNG file with Drop Shadow in Flutter Web App Grainy?

How to troubleshoot crashes detected by Google Play Store for Flutter app

Cupertino DateTime picker interfering with scroll behaviour

Why does awk -F work for most letters, but not for the letter "t"?

Flutter change focus color and icon color but not works

How to print and connect to printer using flutter desktop via usb?

Critical issues have been reported with the following SDK versions: com.google.android.gms:play-services-safetynet:17.0.0

Flutter Dart - get localized country name from country code

navigatorState is null when using pushNamed Navigation onGenerateRoutes of GetMaterialPage

Android Sdk manager not found- Flutter doctor error

Flutter Laravel Push Notification without using any third party like(firebase,onesignal..etc)

How to change the color of ElevatedButton when entering text in TextField

Python DataFrames For Loop with If Statement not working

Error iterating through a Pandas series

Appending multiple dataframes in pandas with for loops

KeyError in Dataframe

How to append rows in a pandas dataframe in a for loop?

Is there a way in Pandas to use previous row value in dataframe.apply when previous value is also calculated in the apply?

What is the most efficient way to loop through dataframes with pandas?

Pandas dataframe to excel: AttributeError: 'list' object has no attribute 'to_excel'

Plot line graph from Pandas dataframe (with multiple lines)

Best way to subset a pandas dataframe