Pandas merge df error

pd.merge merges exactly two DataFrames per call. Its third positional parameter is how, so block_data in your call is interpreted as how. Because you also pass how='outer' by keyword, how receives two values, and that is why you see the error message. Solution to your problem: merge the first two DataFrames, then merge the result with the third one.
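As a minimal sketch of that two-step merge: the frames below are tiny stand-ins with made-up values and placeholder keys (t1, t2, t3), not the real data from the question. The functools.reduce variant is just one convenient way to chain the same call over a list of frames.

```python
from functools import reduce

import pandas as pd

# Tiny stand-in frames (made-up values) that share a 'time' key.
btc_price = pd.DataFrame({"time": ["t1", "t2"], "btc_price": [4389.6, 4389.1]})
eth_price = pd.DataFrame({"time": ["t1", "t3"], "eth_price": [344.0, 343.8]})
block_data = pd.DataFrame({"time": ["t2", "t3"], "n_tx": [273392, 271638]})

# Step 1: merge the first two frames.
all_data = pd.merge(btc_price, eth_price, on="time", how="outer")
# Step 2: merge that result with the third frame.
all_data = pd.merge(all_data, block_data, on="time", how="outer")

# Equivalent chaining for any number of frames.
frames = [btc_price, eth_price, block_data]
all_data_reduce = reduce(
    lambda left, right: pd.merge(left, right, on="time", how="outer"),
    frames,
)

print(all_data)
```

With how='outer', rows whose key is missing from one of the frames are kept and padded with NaN, so a later dropna() would remove them.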

Author: zsad512

Business Analytics Masters Student with specialization in Data Science.

Updated on July 09, 2022

Comments

  • zsad512
    zsad512 almost 2 years

    I have 3 DataFrames I am trying to merge in pandas. One has 20 columns; the other two have 2 columns each. They are organized as follows:

    eth_price.head(n=3)
    
    Out[6]: 
                time  eth_price
    0  8/28/17 16:19    344.021
    2  8/28/17 16:24    343.833
    3  8/28/17 16:29    343.643
    btc_price.head(n=3)
    
    Out[7]: 
                      time   btc_price
    0  2017-08-27 22:50:00  4,389.6113
    1  2017-08-27 22:51:00  4,389.0850
    2  2017-08-27 22:52:00  4,388.8625
    
    block_data.head(n=3)
    Out[8]: 
                       time  block_size    difficulty  estimated_btc_sent  \
    0   2017-08-30 22:55:03   165261989  888171856257      22433058065308   
    5   2017-08-30 23:02:03   165261989  888171856257      22433058065308   
    12  2017-08-30 23:09:03   164262692  888171856257      22210602766312   
    
        estimated_transaction_volume_usd     hash_rate  market_price_usd  \
    0                       1.030796e+09  7.417412e+09           4594.98   
    5                       1.030796e+09  7.417412e+09           4594.98   
    12                      1.020574e+09  7.373261e+09           4594.98   
    
        miners_revenue_btc  miners_revenue_usd  minutes_between_blocks  \
    0                 2495         11467926.77                    7.98   
    5                 2495         11467926.77                    7.98   
    12                2478         11388475.85                    8.01   
    
        n_blocks_mined  n_blocks_total   n_btc_mined    n_tx  nextretarget  \
    0              168          482713  210000000000  273392        483839   
    5              168          482713  210000000000  273392        483839   
    12             167          482713  208750000000  271638        483839   
    
         total_btc_sent  total_fees_btc          totalbtc  trade_volume_btc  \
    0   164688219250248     39574691936  1653391250000000          44110.58   
    5   164688219250248     39574691936  1653391250000000          44110.58   
    12  163455939539341     39095614135  1653391250000000          44110.58   
    
        trade_volume_usd  
    0       2.026876e+08  
    5       2.026876e+08  
    12      2.026876e+08  
    

    I am trying to merge them using all_data = pd.merge(btc_price, eth_price, block_data, on='time', how='outer'). However, when I do this I get the following error:

    File "", line 1, in all_data = pd.merge(btc_price, eth_price, block_data, on = 'time', how = 'outer')

    TypeError: merge() got multiple values for argument 'how'

    What does this mean and how can I fix it?

    The end result should be one data frame with 22 columns, including all rows from all 3 df. I will then drop the rows with missing values.

    EDIT: If you look at the timestamps, the first 2 DataFrames have times that fall exactly on the minute, whereas the third has times at 03 seconds past the minute. Is there a way of fixing this? I have a script that pulls these 3 files from JSON every minute, and I am trying to align the 3 DataFrames accordingly.
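One possible way to line the timestamps up, sketched here under assumptions: the time columns are strings in the two formats shown in the question, the sample rows are made up, and simply dropping the seconds is acceptable for your data. The idea is to parse both columns as datetimes and floor the block data to the minute before merging.

```python
import pandas as pd

# Made-up sample rows mimicking the two time formats shown in the question.
eth_price = pd.DataFrame({
    "time": ["8/30/17 22:55", "8/30/17 23:02"],
    "eth_price": [344.021, 343.833],
})
block_data = pd.DataFrame({
    "time": ["2017-08-30 22:55:03", "2017-08-30 23:02:03"],
    "n_tx": [273392, 273392],
})

# Parse the strings into datetimes so both key columns have the same dtype.
# The explicit format assumes month/day/two-digit-year, as in the eth sample.
eth_price["time"] = pd.to_datetime(eth_price["time"], format="%m/%d/%y %H:%M")
block_data["time"] = pd.to_datetime(block_data["time"])

# Drop the seconds so the block data also falls exactly on the minute.
block_data["time"] = block_data["time"].dt.floor("min")

aligned = pd.merge(eth_price, block_data, on="time", how="outer")
print(aligned)
```

If the offset is not always a few seconds, pd.merge_asof with a tolerance may be a better fit than flooring, since it matches each row to the nearest earlier key instead of requiring exact equality.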