Index - Match using Pandas

18,999

Solution 1

The best equivalent of INDEX MATCH is DataFrame.lookup:

df2 = df2.set_index('dates')
df1['price'] = df2.lookup(df1['dates'], df1['currency'])

Solution 2

Reshaping your df2 makes it a lot easier to do a straightforward merge:

In [42]: df2.set_index("dates").unstack().to_frame("value")
Out[42]:
            value
    dates
aud 01-Jan   0.72
    02-Jan   0.73
    03-Jan   0.74
    30-Jan   0.71
gbp 01-Jan   1.29
    02-Jan   1.30
    03-Jan   1.40
    30-Jan   1.26
eur 01-Jan   1.15
    02-Jan   1.16
    03-Jan   1.17
    30-Jan   1.18

In this form, you just need to match the df1 fields with df2's new index as such:

In [43]: df1.merge(df2.set_index("dates").unstack().to_frame("value"), left_on=["currency", "dates"], right_index=True)
Out[43]:
    dates currency  amount  value
0  02-Jan      aud     100   0.73
1  03-Jan      gbp     330   1.40

You can also left merge it if you don't want to lose missing data (I had to fix your df1 a little for this:

In [44]: df1.merge(df2.set_index("dates").unstack().to_frame("value"), left_on=["currency", "dates"], right_index=True, how="left")
Out[44]:
    dates currency  amount  value
0  02-Jan      aud     100   0.73
1  03-Jan      gbp     330   1.40
2  04-Jan      eur     500    NaN
Share:
18,999

Related videos on Youtube

Viraj Desai
Author by

Viraj Desai

Updated on June 04, 2022

Comments

  • Viraj Desai
    Viraj Desai almost 2 years

    I have the following 2 data frames:

    df1 = pd.DataFrame({
        'dates': ['02-Jan','03-Jan','30-Jan'],
        'currency': ['aud','gbp','eur'],
        'amount': [100,330,500]
    })
    
    df2 = pd.DataFrame({
        'dates': ['01-Jan','02-Jan','03-Jan','30-Jan'],
        'aud': [0.72,0.73,0.74,0.71],
        'gbp': [1.29,1.30,1.4,1.26],
        'eur': [1.15,1.16,1.17,1.18]
    })
    

    I want to obtain the intersection of df1.dates & df1.currency. For eg: Looking up the prevalent 'aud' exchange rate on '02-Jan'

    It can be solved using the Index + Match functionality of excel. What shall be the best way to replicate it in Pandas.

    Desired Output: add a new column 'price'

    dates  currency amount price
    02-Jan aud      100    0.73
    03-Jan gbp      330    1.4
    30-Jan eur      500    1.18
    
    • shadowtalker
      shadowtalker over 5 years
      Can you give an example of the desired output, or the equivalent Excel code? It's hard to tell exactly what you're looking for from the description.
    • Viraj Desai
      Viraj Desai over 5 years
      Updated the desired output @shadowtalker
  • Viraj Desai
    Viraj Desai over 5 years
    It is giving me a KeyError:Date What I am missing here?
  • Viraj Desai
    Viraj Desai over 5 years
    Gives me the following error : 'One or more row labels was not found'
  • shadowtalker
    shadowtalker over 5 years
    @VirajDesai it works with the data you posted. If this gives you an error, you need to provide data that reproduces the error.
  • Viraj Desai
    Viraj Desai over 5 years
    I had a few lookup values which were missing. This is resolved now.Thanks!
  • Randy
    Randy over 5 years
    It sounds like the column in your actual data might be named Date rather than dates.
  • Viraj Desai
    Viraj Desai over 5 years
    Hey! I checked this out. There was an error related to index assignment. Thanks!
  • Pherdindy
    Pherdindy almost 3 years
    Deprecated since version 1.2.0: DataFrame.lookup is deprecated, use DataFrame.melt and DataFrame.loc instead. pandas.pydata.org/pandas-docs/stable/user_guide/…
  • shadowtalker
    shadowtalker almost 3 years
    @Pherdindy huh, that's disappointing. Why was it deprecated?
  • Pherdindy
    Pherdindy almost 3 years
    @shadowtalker No idea, I was just looking for solutions and noticed it was deprecated