vlookup in Pandas using join

125,445

Solution 1

Perform a left merge, this will use sku column as the column to join on:

In [26]:

df.merge(df1, on='sku', how='left')
Out[26]:
   sku  loc   flag dept
0  122   61   True    b
1  122   62   True    b
2  122   63  False    b
3  123   61   True    b
4  123   62  False    b
5  113   62   True    a
6  301   63   True    c

If sku is in fact your index then do this:

In [28]:

df.merge(df1, left_index=True, right_index=True, how='left')
Out[28]:
     loc   flag dept
sku                 
113   62   True    a
122   61   True    b
122   62   True    b
122   63  False    b
123   61   True    b
123   62  False    b
301   63   True    c

Another method is to use map, if you set sku as the index on your second df, so in effect it becomes a Series then the code simplifies to this:

In [19]:

df['dept']=df.sku.map(df1.dept)
df
Out[19]:
   sku  loc   flag dept
0  122   61   True    b
1  123   61   True    b
2  113   62   True    a
3  122   62   True    b
4  123   62  False    b
5  122   63  False    b
6  301   63   True    c

Solution 2

A more generic application would be to use apply and lambda as follows:

dict1 = {113:'a',
         122:'b',
         123:'b',
         301:'c'}

df = pd.DataFrame([['1', 113],
                   ['2', 113],
                   ['3', 301],
                   ['4', 122],
                   ['5', 113]], columns=['num', 'num_letter'])

Add as a new dataframe column

 **df['letter'] = df['num_letter'].apply(lambda x: dict1[x])**

  num  num_letter letter
0   1         113      a
1   2         113      a
2   3         301      c
3   4         122      b
4   5         113      a

OR replace the existing ('num_letter') column

 **df['num_letter'] = df['num_letter'].apply(lambda x: dict1[x])**

  num num_letter
0   1          a
1   2          a
2   3          c
3   4          b
4   5          a

Solution 3

VLookup in VBA is just like pandas.dataframe.merge

I always look for so many procedures for VBA in the past and now python dataframe saves me a ton of work, good thing is I don't need write a vlookup method.

pandas.DataFrame.merge

>>> A              >>> B
    lkey value         rkey value
0   foo  1         0   foo  5
1   bar  2         1   bar  6
2   baz  3         2   qux  7
3   foo  4         3   bar  8
>>> A.merge(B, left_on='lkey', right_on='rkey', how='outer')
   lkey  value_x  rkey  value_y
0  foo   1        foo   5
1  foo   4        foo   5
2  bar   2        bar   6
3  bar   2        bar   8
4  baz   3        NaN   NaN
5  NaN   NaN      qux   7

You can also try the following to do a left merge.

import pandas as pd
pd.merge(left, right, left_on = 'key', right_on = 'key', how='left')

outer or left act like SQL, python's built-in class DataFrame has the method merge taking many args, which is very detailed and handy.

Share:
125,445

Related videos on Youtube

Alex Kinman
Author by

Alex Kinman

Updated on July 09, 2022

Comments

  • Alex Kinman
    Alex Kinman almost 2 years

    I have the following 2 dataframes

    Example1
    sku loc flag  
    122  61 True 
    123  61 True
    113  62 True 
    122  62 True 
    123  62 False
    122  63 False
    301  63 True 
    
    Example2 
    sku dept 
    113 a
    122 b
    123 b
    301 c 
    

    I want to perform a merge, or join opertation using Pandas (or whichever Python operator is best) to produce the below data frame.

    Example3
    sku loc flag   dept  
    122  61 True   b
    123  61 True   b
    113  62 True   a
    122  62 True   b
    123  62 False  b
    122  63 False  b
    301  63 True   c
    
    Both 
    df_Example1.join(df_Example2,lsuffix='_ProdHier')
    df_Example1.join(df_Example2,how='outer',lsuffix='_ProdHier')
    

    Aren't working. What am I doing wrong?