Key error on pandas merge (left join)

The KeyError arises from this:

df = pd.merge(df_purchase, df_login[['count']],  # selects only the 'count' column
              how='left', on=['age', 'gender', 'region'])

You've selected just a single column from df_login, so the key columns 'age', 'gender' and 'region' no longer exist on the right-hand side of the merge. You need this:

df = pd.merge(df_purchase, df_login,
                       how='left', on=['age', 'gender', 'region'])
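
If you only want the count column from df_login, the selection has to include the key columns too. Here is a minimal sketch using small made-up frames (the values, the `keys` variable and the suffix names are illustrative assumptions, not your data). Both frames have a 'count' column, so suffixes are used to tell them apart:

```python
import pandas as pd

# Illustrative stand-ins for the real frames (values are made up).
df_purchase = pd.DataFrame({
    'age': [63, 49],
    'gender': ['M', 'M'],
    'count': [2, 2],
    'region': ['Omaha, Nebraska', 'Catasauqua, Pennsylvania'],
})
df_login = pd.DataFrame({
    'age': [63, 84],
    'gender': ['M', 'F'],
    'count': [13900, 7619],
    'region': ['Omaha, Nebraska', 'New York, New York'],
})

keys = ['age', 'gender', 'region']

# Selecting only 'count' drops the key columns, so the merge cannot find 'age'.
try:
    pd.merge(df_purchase, df_login[['count']], how='left', on=keys)
    merge_failed = False
except KeyError:
    merge_failed = True

# Keep the keys alongside 'count'; suffixes disambiguate the two 'count' columns.
df = pd.merge(df_purchase, df_login[keys + ['count']],
              how='left', on=keys, suffixes=('_purchase', '_login'))
print(df)
```

Rows of df_purchase with no match in df_login keep their values and get NaN in count_login.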

I'm assuming that this is not your complete data, as the rows shown have no common values in the age and region columns of df_purchase and df_login.
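
One way to check how many rows actually find a match is the indicator argument of pd.merge. This is a sketch with made-up frames (the names `left`, `right` and `checked` are placeholders):

```python
import pandas as pd

# Made-up frames: only the first row of `left` has a partner in `right`.
left = pd.DataFrame({'age': [63, 49], 'gender': ['M', 'M'],
                     'region': ['Omaha, Nebraska', 'De Soto, Missouri']})
right = pd.DataFrame({'age': [63, 84], 'gender': ['M', 'F'],
                      'region': ['Omaha, Nebraska', 'New York, New York'],
                      'count': [13900, 7619]})

# indicator=True adds a '_merge' column describing where each row came from.
checked = pd.merge(left, right, how='left',
                   on=['age', 'gender', 'region'], indicator=True)

# '_merge' is 'both' for matched rows and 'left_only' for unmatched ones.
match_counts = checked['_merge'].value_counts()
print(match_counts)
```

If nearly every row comes back as left_only, the key columns probably don't line up (different values, dtypes or whitespace).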

Author by

metersk

Updated on July 20, 2022

Comments

  • metersk almost 2 years

    I have two dataframes below, df_purchase (1) and df_login (2):

    +--------+-----+--------+------------+--------------------+-------------+--------------------------+
    |        | age | gender |    ttp     |       count        | sum(amount) |          region          |
    +--------+-----+--------+------------+--------------------+-------------+--------------------------+
    |  49427 | 63  | M      | 824.731412 | 2                  | 25.00       | Omaha, Nebraska          |
    |  28433 | 49  | M      | 1.166250   | 2                  | 41.94       | Catasauqua, Pennsylvania |
    |   4162 | 29  | M      | 5.620949   | 2                  | 51.78       | Eagle Center, Iowa       |
    |  18747 | 43  | M      | 153.502072 | 2                  | 23.84       | Pacific, Washington      |
    |  45173 | 59  | M      | 0.027257   | 2                  | 13.98       | De Soto, Missouri        |
    +--------+-----+--------+------------+--------------------+-------------+--------------------------+
    
    +--------+-----+--------+-------+--------------------+
    |        | age | gender | count |       region       |
    +--------+-----+--------+-------+--------------------+
    | 671766 | 84  | M      | 13900 | New York, New York |
    | 671166 | 84  | F      | 7619  | New York, New York |
    | 672209 | 85  | F      | 6483  | New York, New York |
    | 672671 | 85  | M      | 5808  | New York, New York |
    | 195201 | 34  | M      | 3817  | New York, New York |
    +--------+-----+--------+-------+--------------------+
    

    I am trying to join df_login to df_purchase on age, gender and region with the following pandas code:

    df = pd.merge(df_purchase, df_login[['count']],
                           how='left', on=['age', 'gender', 'region'])
    

    However, I keep getting this error: KeyError: 'age'. Any thoughts?