to_datetime Value Error: at least that [year, month, day] must be specified Pandas

35,717

Solution 1

You can stack / pd.to_datetime / unstack

pd.to_datetime(dte.stack()).unstack()

enter image description here

explanation
pd.to_datetime works on a string, list, or pd.Series. dte is a pd.DataFrame and is why you are having issues. dte.stack() produces a a pd.Series where all rows are stacked on top of each other. However, in this stacked form, because it is a pd.Series, I can get a vectorized pd.to_datetime to work on it. the subsequent unstack simply reverses the initial stack to get the original form of dte

Solution 2

For me works apply function to_datetime:

print (dtd)
            1           2           3           4           5           6
0                                                                        
0  2004-01-02  2004-01-02  2004-01-09  2004-01-16  2004-01-23  2004-01-30
1  2004-01-05  2004-01-09  2004-01-16  2004-01-23  2004-01-30  2004-02-06
2  2004-01-06  2004-01-09  2004-01-16  2004-01-23  2004-01-30  2004-02-06
3  2004-01-07  2004-01-09  2004-01-16  2004-01-23  2004-01-30  2004-02-06
4  2004-01-08  2004-01-09  2004-01-16  2004-01-23  2004-01-30  2004-02-06


dtd = dtd.apply(pd.to_datetime)

print (dtd)
           1          2          3          4          5          6
0                                                                  
0 2004-01-02 2004-01-02 2004-01-09 2004-01-16 2004-01-23 2004-01-30
1 2004-01-05 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
2 2004-01-06 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
3 2004-01-07 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
4 2004-01-08 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06

Solution 3

It works for me:

dtd.apply(lambda x: pd.to_datetime(x,errors = 'coerce', format = '%Y-%m-%d'))

This way you can use function attributes like above (errors and format). See more https://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_datetime.html

Share:
35,717

Related videos on Youtube

Jed
Author by

Jed

Updated on April 03, 2020

Comments

  • Jed
    Jed about 4 years

    I am reading from two different CSVs each having date values in their columns. After read_csv I want to convert the data to datetime with the to_datetime method. The formats of the dates in each CSV are slightly different, and although the differences are noted and specified in the to_datetime format argument, the one converts fine, while the other returns the following value error.

    ValueError: to assemble mappings requires at least that [year, month, day] be sp
    ecified: [day,month,year] is missing
    

    first dte.head()

    0  10/14/2016  10/17/2016  10/19/2016    8/9/2016  10/17/2016   7/20/2016
    1   7/15/2016   7/18/2016   7/20/2016    6/7/2016   7/18/2016   4/19/2016
    2   4/15/2016   4/14/2016   4/18/2016   3/15/2016   4/18/2016   1/14/2016
    3   1/15/2016   1/19/2016   1/19/2016  10/19/2015   1/19/2016  10/13/2015
    4  10/15/2015  10/14/2015  10/19/2015   7/23/2015  10/14/2015   7/15/2015
    

    this dataframe converts fine using the following code:

    dte = pd.to_datetime(dte, infer_datetime_format=True)
    

    or

    dte = pd.to_datetime(dte[x], format='%m/%d/%Y')
    

    the second dtd.head()

    0   2004-01-02 2004-01-02  2004-01-09 2004-01-16  2004-01-23  2004-01-30
    1   2004-01-05 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
    2   2004-01-06 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
    3   2004-01-07 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
    4   2004-01-08 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
    

    this csv doesn't convert using either:

    dtd = pd.to_datetime(dtd, infer_datetime_format=True)
    

    or

    dtd = pd.to_datetime(dtd, format='%Y-%m-%d')
    

    It returns the value error above. Interestingly, however, using the parse_dates and infer_datetime_format as arguments of the read_csv method work fine. What is going on here?

  • Jed
    Jed over 7 years
    Awesome. Very simple. Thanks!
  • jezrael
    jezrael over 7 years
    Glad can help you!
  • Jed
    Jed over 7 years
    Sorry, I'm new here and didn't realize I needed to accept it. However, I'm still curious what the reason was for the error. Any ideas?
  • Jed
    Jed over 7 years
    how does this work? I don't understand the logic of the operation
  • jezrael
    jezrael over 7 years
    I think problem is to_datetime need Series in old versions of pandas, in newer I get error AttributeError: 'numpy.int64' object has no attribute 'lower', because it need minimal 3 column with year, month and day - see first example in to_datetime.
  • piRSquared
    piRSquared over 7 years
    @Dorian821 also notice that jezrael's answer uses apply which takes each column of the dtd and uses pd.to_datetime. This works because each column is a pd.Series and is perfect for using pd.to_datetime.
  • jezrael
    jezrael over 7 years
    @piRSquared - Thank you for better explanation.
  • Jed
    Jed over 7 years
    ahh, ok. Thanks a lot for the explanation.
  • jezrael
    jezrael over 7 years
    @piRSquared - I think you can add your comment to answer ;)
  • LonelySoul
    LonelySoul almost 3 years
    Generally, Errors = 'coerce' should be last choice since the error may be due to "NULL" values