Converting different date time formats to MM/DD/YYYY format in pandas dataframe

13,998

Solution 1

Given your sample data, with the addition of a NaT, this works:

Code:

df.date.apply(lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')[0])

Test Code:

import pandas as pd

df = pd.DataFrame([
    [['']],
    [['May 23rd, 2011']],
    [['January 1st, 2010']],
    [['Apr. 15, 2008']],
    [['07-11-2013']],
    [['9/01/1995']],
    [['04/15/2000']],
    [['11/22/68']],
    [['12/1997']],
    [['08/2002']],
    [['2014']],
    [['2016']],
], columns=['date'])

df['clean_date'] = df.date.apply(
    lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')[0])

print(df)

Results:

                   date  clean_date
0                    []         NaT
1      [May 23rd, 2011]  05/23/2011
2   [January 1st, 2010]  01/01/2010
3       [Apr. 15, 2008]  04/15/2008
4          [07-11-2013]  07/11/2013
5           [9/01/1995]  09/01/1995
6          [04/15/2000]  04/15/2000
7            [11/22/68]  11/22/1968
8             [12/1997]  12/01/1997
9             [08/2002]  08/01/2002
10               [2014]  01/01/2014
11               [2016]  01/01/2016

Solution 2

It would be better if you use this it'll give you the date format in MM-DD-YYYY the you can apply strftime:

df['Date_ColumnName'] = pd.to_datetime(df['Date_ColumnName'], dayfirst = False, yearfirst = False)
Share:
13,998
Chris T.
Author by

Chris T.

Updated on June 22, 2022

Comments

  • Chris T.
    Chris T. almost 2 years

    I have a date column in a pandas.DataFrame in various date time formats and stored as list object, like the following:

                date
    1    [May 23rd, 2011]
    2    [January 1st, 2010]
        ...
    99   [Apr. 15, 2008]
    100  [07-11-2013]
        ...
    256  [9/01/1995]
    257  [04/15/2000]
    258  [11/22/68]
        ...
    360  [12/1997]
    361  [08/2002]
         ...
    463  [2014]
    464  [2016]
    

    For the sake of convenience, I want to convert them all to MM/DD/YYYY format. It doesn't seem possible to use regex replace() function to do this, since one cannot execute this operation over list objects. Also, to use strptime() for each cell will be too time-consuming.

    What will be the easier way to convert them all to the desired MM/DD/YYYY format? I found it very hard to do this on list objects within a dataframe.

    Note: for cell values of the form [YYYY] (e.g., [2014] and [2016]), I will assume they are the first day of that year (i.e., January 1, 1968) and for cell values such as [08/2002] (or [8/2002]), I will assume they the first day of the month of that year (i.e., August 1, 2002).

    • Jon Rose
      Jon Rose over 6 years
      why would strptime() for each cell will be too time-consuming? how large is your list? Doesn't it only need to run once?
    • John Zwinck
      John Zwinck over 6 years
      How close do you get with just pd.to_datetime(df.date)?
    • Chris T.
      Chris T. over 6 years
      My dataframe is only about 500 rows. It's manageable, but there are certainly better ways to convert all these formats in one go.
    • Chris T.
      Chris T. over 6 years
      It says "<class 'list'> is not convertible to datetime" when I applied pd.to_datetime(df.date) because each row in the dataframe is a list object.
    • Zero
      Zero over 6 years
      Does df.date.apply(lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')) work?
    • Chris T.
      Chris T. over 6 years
      No, it does not work. It says "NaTType does not support strftime."
  • Chris T.
    Chris T. over 6 years
    Thanks for helping on this. I suppose this solves my problem, although I could not apply this really straightforward code to my own data (the data is a pandas series contains hundred of list objects like the one shown in the example data). I thought I could have converted that to dataframe and apply you code, but it says 'Series' object has no attribute 'to_DataFrame'
  • Stephen Rauch
    Stephen Rauch over 6 years
    So this code works on a panda.Series of lists, each list only having one item. The expression df.date.apply does an apply to the Series which is in the column date. If you have only a Series and not a DataFrame you can should be able to do my_series.apply(....).
  • Chris T.
    Chris T. over 6 years
    Yes, this code does work on pandas dataframe with list objects. I am trying to think of a way to convert my series object to dataframe, so that I can apply your code. When I used my_series.apply(), I got this following error message "index 0 is out of bounds for axis 0 with size 0"
  • Stephen Rauch
    Stephen Rauch over 6 years
    You do not need to convert. In fact this code converts the column of a DataFrame to a Series. That is what df.date does. If takes one column named date and makes it a series. So as I tried to indicate in my previous comment, take this code, and change df.date to be the name of your Series. Should just work.
  • Chris T.
    Chris T. over 6 years
    Yes, I did, but I received the same error message "index 0 is out of bounds for axis 0 with size 0."
  • Stephen Rauch
    Stephen Rauch over 6 years
    Well it appears that there is something else non-uniform about the data in your Series.
  • Chris T.
    Chris T. over 6 years
    I couldn't solve it. The dataset just doesn't make much sense to me. But the original data series does look every bit like the example data shown in this thread.
  • Stephen Rauch
    Stephen Rauch over 5 years
    @KhalilAlHooti, I just tested with that string and it works fine for me. I got 08/24/2014
  • Khalil Al Hooti
    Khalil Al Hooti over 5 years
    @StephenRauch Thank you it worked finally with me. not sure what was the problem!!
  • vikrant rana
    vikrant rana almost 5 years
    @Stephen Rauch. This one works great. how we can apply same function to date columns of same dataframe. for example; process_date & order_date. Do we need to specify or call this lambda function twice?