Remove 'seconds' and 'minutes' from a Pandas dataframe column
dt.round
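The direct way to do this is dt.round. Note that it rounds to the nearest hour, so a time such as 21:54 becomes 22:00; to always truncate the minutes and seconds, use dt.floor instead.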
df.assign(Date=df.Date.dt.round('h'))  # lowercase 'h'; the uppercase 'H' alias is deprecated since pandas 2.2
Date Num
0 2011-01-01 00:00:00 0.577957
1 2011-01-01 01:00:00 0.995748
2 2011-01-01 02:00:00 0.864013
3 2011-01-01 03:00:00 0.468762
4 2011-01-01 04:00:00 0.866827
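The sample data above never passes the half hour, so rounding and truncating happen to agree. As a minimal sketch of where they differ, the snippet below uses two made-up timestamps (the values and the names `dates`, `rounded`, and `floored` are illustrative assumptions, not part of the original answer):

```python
import pandas as pd

# Hypothetical sample: the second timestamp is past the half hour.
dates = pd.Series(pd.to_datetime(["2030-01-01 21:14:00",
                                  "2030-01-01 21:54:00"]))

rounded = dates.dt.round("h")   # nearest hour: 21:54 goes up to 22:00
floored = dates.dt.floor("h")   # start of the hour: 21:54 goes down to 21:00

print(pd.DataFrame({"original": dates, "round": rounded, "floor": floored}))
```

If the goal is strictly to drop the minutes and seconds, `dt.floor` is probably the safer choice of the two.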
OLD ANSWER
One approach is to set the index and use resample
df.set_index('Date').resample('h').last().reset_index()  # 'h' replaces the 'H' alias deprecated in pandas 2.2
Date Num
0 2011-01-01 00:00:00 0.577957
1 2011-01-01 01:00:00 0.995748
2 2011-01-01 02:00:00 0.864013
3 2011-01-01 03:00:00 0.468762
4 2011-01-01 04:00:00 0.866827
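One thing worth keeping in mind with the resample approach is that it aggregates: rows that fall in the same hour collapse into one, and hours with no rows show up as NaN. A small sketch with made-up data (the three timestamps and `Num` values here are assumptions chosen to show the effect) compares it against a plain floor:

```python
import pandas as pd

# Hypothetical frame: two rows in hour 00, none in hour 01, one in hour 02.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2011-01-01 00:10:00",
                            "2011-01-01 00:40:00",
                            "2011-01-01 02:05:00"]),
    "Num": [1.0, 2.0, 3.0],
})

# resample bins by hour and keeps the last value per bin.
resampled = df.set_index("Date").resample("h").last().reset_index()

# floor keeps every original row and only truncates its timestamp.
floored = df.assign(Date=df["Date"].dt.floor("h"))

print(resampled)
print(floored)
```

With one row per hour, as in the question, the two give the same result; they diverge only when hours are duplicated or missing.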
Another alternative is to extract the date and hour components and add them back together
# midnight of each date, plus the hour as a timedelta
df.assign(
    Date=pd.to_datetime(df.Date.dt.date) +
         pd.to_timedelta(df.Date.dt.hour, unit='h'))
Date Num
0 2011-01-01 00:00:00 0.577957
1 2011-01-01 01:00:00 0.995748
2 2011-01-01 02:00:00 0.864013
3 2011-01-01 03:00:00 0.468762
4 2011-01-01 04:00:00 0.866827
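As a quick sanity check, the component-based approach can be compared against truncating with `dt.floor` on the frame from the question. This is only a sketch; the names `rebuilt`, `floored`, and `same` are introduced here for illustration:

```python
import numpy as np
import pandas as pd

# Same construction as in the question (lowercase frequency alias).
df = pd.DataFrame({
    "Date": pd.date_range("1/1/2011", periods=5, freq="3675s"),
    "Num": np.random.rand(5),
})

# Rebuild each timestamp from its date and hour components.
rebuilt = (pd.to_datetime(df["Date"].dt.date)
           + pd.to_timedelta(df["Date"].dt.hour, unit="h"))

# Truncate directly to the start of the hour.
floored = df["Date"].dt.floor("h")

# Element-wise comparison of the two results.
same = bool((rebuilt == floored).all())
print(same)
```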
Author by
Dustin Helliwell
Updated on June 24, 2022
Comments
-
Dustin Helliwell almost 2 years
Given a dataframe like:
import numpy as np
import pandas as pd
df = pd.DataFrame(
    {'Date' : pd.date_range('1/1/2011', periods=5, freq='3675S'),
     'Num' : np.random.rand(5)})
Date Num
0 2011-01-01 00:00:00 0.580997
1 2011-01-01 01:01:15 0.407332
2 2011-01-01 02:02:30 0.786035
3 2011-01-01 03:03:45 0.821792
4 2011-01-01 04:05:00 0.807869
I would like to remove the 'minutes' and 'seconds' information.
The following (mostly stolen from: How to remove the 'seconds' of Pandas dataframe index?) works okay,
df = df.assign(Date = lambda x: pd.to_datetime(x['Date'].dt.strftime('%Y-%m-%d %H')))
Date Num
0 2011-01-01 00:00:00 0.580997
1 2011-01-01 01:00:00 0.407332
2 2011-01-01 02:00:00 0.786035
3 2011-01-01 03:00:00 0.821792
4 2011-01-01 04:00:00 0.807869
but it feels strange to convert a datetime to a string then back to a datetime. Is there a way to do this more directly?
-
Dustin Helliwell about 7 years
Turns out dt.floor worked better for my case, although I expect dt.round is better in general. Thanks
-
Marco Cerliani almost 4 years
ATTENTION: rounding 2030-01-01 21:54:00 gives 2030-01-01 22:00:00 and NOT 2030-01-01 21:00:00. To get 21:00:00, use dt.floor
-
m_h over 3 years
Alternatively: df.Date = df.Date.dt.floor('H')
-
Pam about 2 years
Just to add, I needed pd.datetime but otherwise this worked perfectly.