Remove 'seconds' and 'minutes' from a Pandas dataframe column

13,004

dt.round

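Use dt.round to round each timestamp to the nearest hour. Note that this rounds rather than truncates, so 21:54 becomes 22:00; if the minutes and seconds should simply be dropped, use dt.floor with the same argument. (pandas 2.2 and later prefer the lowercase 'h' alias; the uppercase 'H' is deprecated.)

df.assign(Date=df.Date.dt.round('h'))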

                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827
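
The sample data never gets past five minutes into the hour, so rounding and truncating give the same output here. A minimal sketch of where they differ, using two made-up timestamps (not from the question) and assuming the lowercase 'h' frequency alias:

```python
import pandas as pd

# Hypothetical sample: one timestamp before the half hour, one after it.
s = pd.Series(pd.to_datetime(["2030-01-01 21:14:00", "2030-01-01 21:54:00"]))

rounded = s.dt.round("h")  # nearest hour: 21:54 goes up to 22:00
floored = s.dt.floor("h")  # truncation: 21:54 goes down to 21:00

print(pd.DataFrame({"original": s, "round": rounded, "floor": floored}))
```

If the goal is strictly to remove minutes and seconds, dt.floor is probably the closer match.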

OLD ANSWER

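One approach is to set Date as the index and resample into hourly bins, keeping the last value in each bin. Unlike dt.round or dt.floor, this aggregates: rows that fall in the same hour collapse into one, and hours with no rows show up as NaN.

df.set_index('Date').resample('h').last().reset_index()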

                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827
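
The resample approach matches the output above only because the sample has exactly one row per hour. A small sketch with made-up data (two rows in one hour, none in the next) shows how it can differ from relabelling each row with dt.floor:

```python
import pandas as pd

# Hypothetical data: two rows in hour 00, none in hour 01, one in hour 02.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2011-01-01 00:10:00",
        "2011-01-01 00:50:00",
        "2011-01-01 02:05:00",
    ]),
    "Num": [1.0, 2.0, 3.0],
})

# Aggregates per hourly bin: keeps the last row of hour 00, adds a NaN row for hour 01.
resampled = df.set_index("Date").resample("h").last().reset_index()

# Relabels each row in place: all three rows survive, two of them share 00:00.
floored = df.assign(Date=df["Date"].dt.floor("h"))

print(resampled)
print(floored)
```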

Another alternative is to rebuild each timestamp from its date and hour components, discarding everything finer than the hour

df.assign(
    Date=pd.to_datetime(df.Date.dt.date) +
         pd.to_timedelta(df.Date.dt.hour, unit='h'))

                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827
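
A variation on the same idea, offered as a sketch rather than as part of the original answer: dt.normalize() resets each timestamp to midnight and keeps the datetime dtype, so the detour through Python date objects is not needed. The frame below mirrors the one in the question, with range(5) standing in for the random Num column.

```python
import pandas as pd

# Same shape as the question's frame; Num is a placeholder, not the original random data.
df = pd.DataFrame({
    "Date": pd.date_range("2011-01-01", periods=5, freq="3675s"),
    "Num": range(5),
})

# Midnight of each day plus the hour of day, as a timedelta.
hourly = df["Date"].dt.normalize() + pd.to_timedelta(df["Date"].dt.hour, unit="h")
result = df.assign(Date=hourly)

print(result)
```

For this data the result should agree with df.Date.dt.floor('h').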
Author by

Dustin Helliwell


Updated on June 24, 2022

Comments

  • Dustin Helliwell
    Dustin Helliwell almost 2 years

    Given a dataframe like:

    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame(
    {'Date' : pd.date_range('1/1/2011', periods=5, freq='3675s'),
     'Num' : np.random.rand(5)})
                     Date       Num
    0 2011-01-01 00:00:00  0.580997
    1 2011-01-01 01:01:15  0.407332
    2 2011-01-01 02:02:30  0.786035
    3 2011-01-01 03:03:45  0.821792
    4 2011-01-01 04:05:00  0.807869
    

    I would like to remove the 'minutes' and 'seconds' information.

    The following (mostly stolen from: How to remove the 'seconds' of Pandas dataframe index?) works okay,

    df = df.assign(Date = lambda x: pd.to_datetime(x['Date'].dt.strftime('%Y-%m-%d %H')))
                     Date       Num
    0 2011-01-01 00:00:00  0.580997
    1 2011-01-01 01:00:00  0.407332
    2 2011-01-01 02:00:00  0.786035
    3 2011-01-01 03:00:00  0.821792
    4 2011-01-01 04:00:00  0.807869
    

    but it feels strange to convert a datetime to a string then back to a datetime. Is there a way to do this more directly?

  • Dustin Helliwell
    Dustin Helliwell about 7 years
    Turns out dt.floor worked better for my case although I expect dt.round is better in general. -Thanks
  • Marco Cerliani
    Marco Cerliani almost 4 years
    ATTENTION: the round of 2030-01-01 21:54:00 is 2030-01-01 22:00:00 and NOT 2030-01-01 21:00:00 --- to do this use dt.floor
  • m_h
    m_h over 3 years
    Alternatively: df.Date = df.Date.dt.floor('H')
  • Pam
    Pam about 2 years
    Just to add, I needed pd.datetime but otherwise this worked perfectly.