pandas merge on date column issue
27,740
I think you need first convert both columns to_datetime
because need same dtypes
:
df1.month = pd.to_datetime(df1.month)
df2.month = pd.to_datetime(df2.month)
print (pd.merge(df2, df1, how='outer', on='month'))
factor month amt
0 1.000000 2015-02-01 1.549368e+06
1 1.000000 2015-03-01 2.175802e+06
2 1.000000 2015-04-01 1.915613e+06
3 1.000000 2015-05-01 1.703064e+06
4 0.998897 2015-06-01 1.770041e+06
#convert to str date column
df2.month = df2.month.astype(str)
print (pd.merge(df2, df1, how='outer', on='month'))
factor month amt
0 1.000000 2015-02-01 1.549368e+06
1 1.000000 2015-03-01 2.175802e+06
2 1.000000 2015-04-01 1.915613e+06
3 1.000000 2015-05-01 1.703064e+06
4 0.998897 2015-06-01 1.770041e+06
Comments
-
muon about 4 years
I am trying to merge two dataframes on date column (tried both as type
object
ordatetime.date
, but fails to give desired merge output:import pandas as pd df1 = pd.DataFrame({'amt': {0: 1549367.9496070854, 1: 2175801.78219801, 2: 1915613.1629125737, 3: 1703063.8323954903, 4: 1770040.7987461537}, 'month': {0: '2015-02-01', 1: '2015-03-01', 2: '2015-04-01', 3: '2015-05-01', 4: '2015-06-01'}}) print(df1) amt month 0 1.549368e+06 2015-02-01 1 2.175802e+06 2015-03-01 2 1.915613e+06 2015-04-01 3 1.703064e+06 2015-05-01 4 1.770041e+06 2015-06-01 df2 = {'factor': {datetime.date(2015, 2, 1): 1.0, datetime.date(2015, 3, 1): 1.0, datetime.date(2015, 4, 1): 1.0, datetime.date(2015, 5, 1): 1.0, datetime.date(2015, 6, 1): 0.99889679025914435}, 'month': {datetime.date(2015, 2, 1): datetime.date(2015, 2, 1), datetime.date(2015, 3, 1): datetime.date(2015, 3, 1), datetime.date(2015, 4, 1): datetime.date(2015, 4, 1), datetime.date(2015, 5, 1): datetime.date(2015, 5, 1), datetime.date(2015, 6, 1): datetime.date(2015, 6, 1)}} df2 = pd.DataFrame(df2) print(df2) factor month 2015-02-01 1.000000 2015-02-01 2015-03-01 1.000000 2015-03-01 2015-04-01 1.000000 2015-04-01 2015-05-01 1.000000 2015-05-01 2015-06-01 0.998897 2015-06-01 pd.merge(df2, df1, how='outer', on='month') factor month amt 0 1.000000 2015-02-01 NaN 1 1.000000 2015-03-01 NaN 2 1.000000 2015-04-01 NaN 3 1.000000 2015-05-01 NaN 4 0.998897 2015-06-01 NaN 5 NaN 2015-02-01 1.549368e+06 6 NaN 2015-03-01 2.175802e+06 7 NaN 2015-04-01 1.915613e+06 8 NaN 2015-05-01 1.703064e+06 9 NaN 2015-06-01 1.770041e+06
-
jezrael about 7 yearsThere is problem you cannot match string column with date column - need same dtypes.
-
muon about 7 yearsbut I get
df1.month.dtype
=dtype('O')
anddf2.month.dtype
=dtype('O')
, so if both are string types then why would it matter -
jezrael about 7 years
-
muon about 7 yearsconversion to str did not help, conversion to
pd.to_datetime()
did -
jezrael about 7 yearsand it is a bit confuse -
dtype
is object, buttype
not -
Chris8447 about 4 yearsI had the same issue...converting it both to a datetime object did it