pandas to_sql method gives error with date column


Update: starting with pandas 0.15, to_sql supports writing datetime values for both sqlite connections and sqlalchemy engines, so the workaround described below should no longer be needed.
Pandas 0.15 will be released this coming October, and the feature is already merged in the development version.


The reason for the above error is that the df 'date' column is a datetime64 column, and this type is not supported by sqlite3. So you should first convert it to a string (that this is not done automatically for sqlite is maybe a bug/missing feature), or to a datetime.date object (which is recognized by sqlite3, though it will also be stored as a string, as sqlite has no native datetime type).

You did that in your code example with df['date'] = df.index.map(lambda x: x.strftime('%d-%m-%Y')), but then you overwrite the column again with df['date'] = df.index, so maybe that was an error in your code example. If you first convert it to strings, it works:

from numpy.random import rand
import pandas as pd

# db: your existing sqlite3 connection
df = pd.DataFrame(index=pd.date_range('2014-01-01', periods=10))
df['date'] = df.index.map(lambda x: x.strftime('%d-%m-%Y'))
df['profit'] = rand(10)
df['perf_period_id'] = 2

df.to_sql(name='fee_profit', index=False, con=db, if_exists='append')
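The other option mentioned above, converting to datetime.date objects instead of strings, can be sketched like this. This is only an illustration: the in-memory database and the CREATE TABLE statement stand in for your real fee_profit table.

```python
import sqlite3

import numpy as np
import pandas as pd

# Hypothetical in-memory stand-in for the real fee_profit table.
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE fee_profit (id INTEGER PRIMARY KEY, '
           'perf_period_id INTEGER, date DATE, profit REAL)')

df = pd.DataFrame(index=pd.date_range('2014-01-01', periods=10))
df['date'] = df.index.date          # datetime.date objects, not datetime64
df['profit'] = np.random.rand(10)
df['perf_period_id'] = 2

df.to_sql(name='fee_profit', index=False, con=db, if_exists='append')
```

sqlite3's default adapter stores each datetime.date value as an ISO-format string (e.g. '2014-01-01'), which is what ends up in the DATE column.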

Starting from pandas 0.14, the main sql functions have been refactored to use sqlalchemy to deal with the different database flavors. If you use this, it works correctly with the datetime column (it is converted automatically):

from numpy.random import rand
import pandas as pd
import sqlalchemy

df = pd.DataFrame(index=pd.date_range('2014-01-01', periods=10))
df['profit'] = rand(10)
df['perf_period_id'] = 2

db2 = sqlalchemy.create_engine('...')  # your database URL
df.to_sql(name='fee_profit', index=False, con=db2, if_exists='append')
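As a self-contained sketch of the sqlalchemy route, using an in-memory sqlite engine purely for illustration (substitute your own database URL), the datetime64 column can be written directly, without any manual conversion:

```python
import numpy as np
import pandas as pd
import sqlalchemy

# Illustrative in-memory sqlite engine; replace the URL with your database's.
engine = sqlalchemy.create_engine('sqlite://')

df = pd.DataFrame(index=pd.date_range('2014-01-01', periods=10))
df['date'] = df.index               # datetime64 column, written as-is
df['profit'] = np.random.rand(10)
df['perf_period_id'] = 2

df.to_sql(name='fee_profit', index=False, con=engine, if_exists='append')

# Round-trip to confirm the dates survived.
back = pd.read_sql_table('fee_profit', engine)
```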

Using a plain sqlite connection object instead of a sqlalchemy engine, like you did, will still be supported in the future (but only for sqlite!).

Author: Joop
Updated on June 15, 2022

Comments

  • Joop almost 2 years

    I have a dataframe that looks like this:

    df = pd.DataFrame(index= pd.date_range('2014-01-01', periods=10))
    df['date'] = df.index.map(lambda x: x.strftime('%d-%m-%Y'))
    df['date'] = df.index
    df['profit']= rand(10)
    df['perf_period_id']=2
    

    I also have a sqlite3 db with a table called fee_profit.

    fee_profit has 4 fields:

    • id - integer - primary key
    • perf_period_id - integer
    • date - date
    • profit - real

    When I try to write the dataframe to the database with (not showing db connection):

    df.to_sql(name='fee_profit', index=False, con=db, if_exists='append')
    

    I get the following error:

    252     else:
    253         data = [tuple(x) for x in frame.values.tolist()]
    --> 254     cur.executemany(insert_query, data)
    255 
    256 
    InterfaceError: Error binding parameter 0 - probably unsupported type.
    

    I am not passing the primary key (could this be the problem?). I have jumbled the table around and it definitely looks like the date is the problem. I have tried various combinations of passing the date as the index and also as a string, but nothing works.

    Any idea where I am going wrong? I cannot find examples of using this method anywhere.

    Using pandas 0.13.1 and sqlite3 2.6.0. The database was created through a Django 1.6 model.

  • Joop almost 10 years
    I have done some additional testing. If I just populate the date column with df['date'] = datetime.date(2014,1,1), i.e. date objects, it works. It seems that it struggles to convert the datetime64[ns] type natively used by pandas dataframes.
  • Joop almost 10 years
    Thanks. I did some reading overnight and did not realize that sqlite does not have a date type, as it does a pretty good job of pretending to have one! My string code now works, as I have turned it into the same representation as the original data.
  • joris almost 10 years
    Ah, thanks, interesting! Yes, then it is only datetime64 that is not recognized by sqlite3; for datetime.date it works. But it is still converted to a string, so you can choose to convert it to datetime.date (with df.index.date) or directly to a string yourself.
  • Joop almost 10 years
    For some reason the conversion happens if it is in an object column. I am getting the date from my dataframe index and cannot even coerce it with .astype(datetime.date), as it still stays datetime64. The string workaround works, though.
  • joris almost 10 years
    Pandas uses datetime64 as the standard way to store datetime values, so depending on how you assign it to the column, it may try to convert it back to datetime64 (but with df['date'] = df.index.date it seems to keep the type).
  • Kyle B. over 3 years
    I'm using pandas 0.24 (so after 0.15) and it seems this is not the case. At least, trying to run to_sql with a Snowflake connection will still result in any datetime column throwing an error: Binding data in type (timestamp) is not supported. Not sure if there's any way to solve this. I know this is a very old question, but for future users: if using Snowflake, expect this error. Still not sure of a workaround...
  • joris over 3 years
    I would advise you to post it as a new question, specifically about Snowflake (and it is probably best to also test with the latest version of pandas).
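The dtype behavior discussed in the comments above can be checked directly. A minimal sketch (the column names a and b are arbitrary):

```python
import datetime

import pandas as pd

df = pd.DataFrame(index=pd.date_range('2014-01-01', periods=3))

df['a'] = df.index        # converted to a datetime64[ns] column
df['b'] = df.index.date   # stays an object column of datetime.date values

print(df['a'].dtype)           # datetime64[ns]
print(df['b'].dtype)           # object
print(type(df['b'].iloc[0]))   # <class 'datetime.date'>
```

This is why assigning the index directly re-triggers the sqlite3 binding error on old pandas versions, while assigning df.index.date does not.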