Python: reduce precision pandas timestamp dataframe

Solution 1

You could convert the underlying datetime64[ns] values to datetime64[s] values using astype:

In [11]: df['Time'] = df['Time'].astype('datetime64[s]')

In [12]: df
Out[12]: 
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55

Note that in pandas versions before 2.0, Series and DataFrames store all datetime values as datetime64[ns], so these datetime64[s] values are automatically converted back to datetime64[ns]. The end result is still stored as datetime64[ns], but the call to astype removes the fractional part of the seconds. From pandas 2.0 onward, second resolution is supported, so the column keeps the datetime64[s] dtype instead.

If you wish to have a NumPy array of datetime64[s] values, you could use df['Time'].values.astype('datetime64[s]').
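As a minimal, self-contained sketch of the NumPy route (the two-row frame below is rebuilt from the question's sample data and is only an illustration), the conversion might look like this. Going through `.values` avoids relying on how a given pandas version handles `Series.astype('datetime64[s]')`:

```python
import numpy as np
import pandas as pd

# Sample rows taken from the question; the frame itself is rebuilt for illustration.
df = pd.DataFrame({
    "Record_ID": [94704, 94705],
    "Time": pd.to_datetime([
        "2014-03-10 07:19:19.647342",
        "2014-03-10 07:21:44.479363",
    ]),
})

# .values exposes the underlying NumPy datetime64 array;
# casting it to second resolution drops the fractional seconds.
seconds = df["Time"].values.astype("datetime64[s]")

print(seconds.dtype)  # datetime64[s]
print(seconds)
```

The result is a plain NumPy array, not a pandas column, so whether it keeps second resolution when assigned back to the DataFrame depends on the pandas version.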

Solution 2

To remove the microsecond part of each datetime, you can use the Timestamp.replace method together with Series.apply to set the microsecond field to 0 across the whole series. Example -

df['Time'] = df['Time'].apply(lambda x: x.replace(microsecond=0))

Demo -

In [25]: df
Out[25]:
   Record_ID                       Time
0      94704 2014-03-10 07:19:19.647342
1      94705 2014-03-10 07:21:44.479363
2      94706 2014-03-10 07:21:45.479581
3      94707 2014-03-10 07:21:54.481588
4      94708 2014-03-10 07:21:55.481804

In [26]: type(df['Time'][0])
Out[26]: pandas.tslib.Timestamp

In [27]: df['Time'] = df['Time'].apply(lambda x: x.replace(microsecond=0))

In [28]: df
Out[28]:
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55
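
For anyone who wants to run this outside of an IPython session, here is a small standalone sketch of the same idea. The frame is rebuilt from the question's sample rows, and passing `nanosecond=0` as well is my own assumption: it only matters if the timestamps carry nanosecond digits, which `microsecond=0` alone would leave untouched.

```python
import pandas as pd

# Sample rows taken from the question; rebuilt here so the snippet runs on its own.
df = pd.DataFrame({
    "Record_ID": [94704, 94705],
    "Time": pd.to_datetime([
        "2014-03-10 07:19:19.647342",
        "2014-03-10 07:21:44.479363",
    ]),
})

# Timestamp.replace returns a new Timestamp with the given fields overwritten.
# nanosecond=0 is an extra precaution (an assumption, not part of the answer above).
truncated = df["Time"].apply(lambda ts: ts.replace(microsecond=0, nanosecond=0))

print(truncated)
```

Keep in mind that apply calls the lambda once per row, so this is likely to be slower than a vectorised approach on large frames.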

Solution 3

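For pandas version 0.24.0 or later, you can simply set the freq parameter of the dt.floor() method to get the precision you want. Use floor rather than ceil here: ceil(freq='s') rounds up to the next whole second (07:19:19.647342 would become 07:19:20), whereas floor drops the fractional part:

df['Time'] = df['Time'].dt.floor(freq='s')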

In [28]: df
Out[28]:
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55
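
Since floor, ceil and round all take the same freq argument but move the value in different directions, a quick side-by-side comparison may help when picking one. This is only a sketch using two of the question's timestamps; the variable names are made up for the example.

```python
import pandas as pd

# Two timestamps from the question: one with a fraction above .5, one below.
times = pd.Series(pd.to_datetime([
    "2014-03-10 07:19:19.647342",
    "2014-03-10 07:21:44.479363",
]))

floored = times.dt.floor("s")  # truncates:        07:19:19, 07:21:44
ceiled = times.dt.ceil("s")    # rounds up:        07:19:20, 07:21:45
rounded = times.dt.round("s")  # rounds to nearest: 07:19:20, 07:21:44

print(pd.DataFrame({"floor": floored, "ceil": ceiled, "round": rounded}))
```

Only floor reproduces the output asked for in the question, where the fractional seconds are simply cut off.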
Author by

emax

Updated on September 18, 2020

Comments

  • emax
    emax over 3 years

    Hello, I have the following dataframe:

    df = 
    
           Record_ID       Time
            94704   2014-03-10 07:19:19.647342
            94705   2014-03-10 07:21:44.479363
            94706   2014-03-10 07:21:45.479581
            94707   2014-03-10 07:21:54.481588
            94708   2014-03-10 07:21:55.481804
    

    Is it possible to have the following?

    df1 = 
    
           Record_ID       Time
            94704   2014-03-10 07:19:19
            94705   2014-03-10 07:21:44
            94706   2014-03-10 07:21:45
            94707   2014-03-10 07:21:54
            94708   2014-03-10 07:21:55
    
  • mnky9800n
    mnky9800n almost 8 years
    Does pandas only support datetime64[ns]? Is there a built-in way to change the precision, say, to dates only?
  • unutbu
    unutbu almost 8 years
    @mnky9800n: Currently, Pandas NDFrames only support datetime64[ns].
  • mnky9800n
    mnky9800n almost 8 years
    Is there a plan to support other datetime resolutions? I think there are obvious reasons to, since datetime64[ns] ranges can only be ~600 years long.
  • Shaido
    Shaido over 4 years
    This is a clean solution. Depending on the situation, round and floor are also available.
  • Sam
    Sam over 4 years
    I'm getting TypeError: Cannot cast DatetimeArray to dtype datetime64[s] now with pandas '0.25.3'.