Pandas merge df error
pd.merge can merge only two DataFrames. The third positional parameter (block_data in your case) is interpreted as "how." You also supply the named how='outer', and that's why you see the error message. Solution to your problem: merge the first two DataFrames, then merge the result with the third one.
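A minimal sketch of that two-step merge follows. The three small frames are made-up stand-ins for the real btc_price, eth_price and block_data (only the shared 'time' column matters here), and the intermediate name prices is just for illustration. The functools.reduce variant at the end is an optional alternative for when there are more than three frames.

```python
from functools import reduce

import pandas as pd

# Made-up stand-ins for the frames in the question; only the shared
# 'time' key column is relevant to the merge itself.
btc_price = pd.DataFrame({"time": ["t1", "t2"], "btc_price": [1.0, 2.0]})
eth_price = pd.DataFrame({"time": ["t2", "t3"], "eth_price": [3.0, 4.0]})
block_data = pd.DataFrame({"time": ["t1", "t3"], "block_size": [10, 20]})

# Step 1: merge the first two frames on the shared key.
prices = pd.merge(btc_price, eth_price, on="time", how="outer")

# Step 2: merge that result with the third frame.
all_data = pd.merge(prices, block_data, on="time", how="outer")

# Optional alternative: fold the same pairwise merge over a list of frames.
frames = [btc_price, eth_price, block_data]
all_data_reduced = reduce(
    lambda left, right: pd.merge(left, right, on="time", how="outer"),
    frames,
)
```

Because how='outer' keeps every key from both sides, rows missing from one frame show up with NaN in that frame's columns and can be dropped afterwards with dropna().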
zsad512
Updated on July 09, 2022
zsad512 almost 2 years
I have 3 dataframes I am trying to merge in pandas. One has 20 columns; the other two have 2 columns each. They are organized as such:
eth_price.head(n=3)
Out[6]:
            time  eth_price
0  8/28/17 16:19    344.021
2  8/28/17 16:24    343.833
3  8/28/17 16:29    343.643

btc_price.head(n=3)
Out[7]:
                  time   btc_price
0  2017-08-27 22:50:00  4,389.6113
1  2017-08-27 22:51:00  4,389.0850
2  2017-08-27 22:52:00  4,388.8625

block_data.head(n=3)
Out[8]:
                   time  block_size    difficulty  estimated_btc_sent  \
0   2017-08-30 22:55:03   165261989  888171856257      22433058065308
5   2017-08-30 23:02:03   165261989  888171856257      22433058065308
12  2017-08-30 23:09:03   164262692  888171856257      22210602766312

    estimated_transaction_volume_usd     hash_rate  market_price_usd  \
0                       1.030796e+09  7.417412e+09           4594.98
5                       1.030796e+09  7.417412e+09           4594.98
12                      1.020574e+09  7.373261e+09           4594.98

    miners_revenue_btc  miners_revenue_usd  minutes_between_blocks  \
0                 2495         11467926.77                    7.98
5                 2495         11467926.77                    7.98
12                2478         11388475.85                    8.01

    n_blocks_mined  n_blocks_total   n_btc_mined    n_tx  nextretarget  \
0              168          482713  210000000000  273392        483839
5              168          482713  210000000000  273392        483839
12             167          482713  208750000000  271638        483839

     total_btc_sent  total_fees_btc          totalbtc  trade_volume_btc  \
0   164688219250248     39574691936  1653391250000000          44110.58
5   164688219250248     39574691936  1653391250000000          44110.58
12  163455939539341     39095614135  1653391250000000          44110.58

    trade_volume_usd
0       2.026876e+08
5       2.026876e+08
12      2.026876e+08
I am trying to merge using:

all_data = pd.merge(btc_price, eth_price, block_data, on = 'time', how = 'outer')

However, when I do this I get the following error:

File "", line 1, in
    all_data = pd.merge(btc_price, eth_price, block_data, on = 'time', how = 'outer')
TypeError: merge() got multiple values for argument 'how'
What does this mean and how can I fix it?
The end result should be one data frame with 22 columns, including all rows from all 3 df. I will then drop the rows with missing values.
EDIT: If you look at the timestamps, the first 2 df fall exactly on the minute, whereas the third falls at 03 seconds past the minute. Is there a way of fixing this? I have a script that pulls these 3 files from json every minute, and I am trying to align the 3 df accordingly.
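On the timestamp question in the edit, one possible approach is to parse the 'time' columns into real datetimes and truncate them to the start of the minute before merging. The sketch below assumes that flooring (rather than rounding, or a nearest-match join) is acceptable for this data. The two small frames and their price values are invented for illustration. Since the 'time' columns shown above use different text formats, parsing all three with pd.to_datetime first is likely needed anyway.

```python
import pandas as pd

# Invented sample rows: block_data timestamps land at :03 seconds,
# while the price frame sits exactly on the minute.
block_data = pd.DataFrame({
    "time": ["2017-08-30 22:55:03", "2017-08-30 23:02:03"],
    "block_size": [165261989, 165261989],
})
btc_price = pd.DataFrame({
    "time": ["2017-08-30 22:55:00", "2017-08-30 23:02:00"],
    "btc_price": [1.0, 2.0],  # placeholder values, not real prices
})

# Parse the strings into datetimes, then drop the seconds by flooring
# each timestamp to the start of its minute.
for df in (block_data, btc_price):
    df["time"] = pd.to_datetime(df["time"]).dt.floor("min")

# With the keys aligned, the usual two-frame merge matches the rows.
aligned = pd.merge(btc_price, block_data, on="time", how="outer")
```

If the offsets were not a fixed few seconds, pd.merge_asof (which joins each row to the nearest earlier key on sorted data) might be a better fit than flooring.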