Pandas 0.15 DataFrame: Remove or reset time portion of a datetime64
Solution 1
Pandas has a builtin function pd.datetools.normalize_date
for that purpose:
df['date_col'] = df['date_col'].apply(pd.datetools.normalize_date)
It's implemented in Cython and does the following:
if PyDateTime_Check(dt):
return dt.replace(hour=0, minute=0, second=0, microsecond=0)
elif PyDate_Check(dt):
return datetime(dt.year, dt.month, dt.day)
else:
raise TypeError('Unrecognized type: %s' % type(dt))
Solution 2
Use dt
methods, which is vectorized to yield faster results.
# There are better ways of converting it in to datetime column.
# Ignore those to keep it simple
data['date_column'] = pd.to_datetime(data['date_column'])
data['date_column'].dt.date
Solution 3
pd.datetools.normalize_date
has been deprecated. Use df['date_col'] = df['date_col'].dt.normalize()
instead.
See https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.dt.normalize.html
Solution 4
I can think of two ways, setting or assigning to a new column just the date()
attribute, or calling replace
on the datetime object and passing param hour=0, minute=0
:
In [106]:
# example data
t = """datetime
2014-06-30 21:50:00"""
df = pd.read_csv(io.StringIO(t), parse_dates=[0])
df
Out[106]:
datetime
0 2014-06-30 21:50:00
In [107]:
# apply a lambda accessing just the date() attribute
df['datetime'] = df['datetime'].apply( lambda x: x.date() )
print(df)
# reset df
df = pd.read_csv(io.StringIO(t), parse_dates=[0])
# call replace with params hour=0, minute=0
df['datetime'] = df['datetime'].apply( lambda x: x.replace(hour=0, minute=0) )
df
datetime
0 2014-06-30
Out[107]:
datetime
0 2014-06-30
Solution 5
Since pd.datetools.normalize_date
has been deprecated and you are working with the datetime64
data type, use:
df.your_date_col = df.your_date_col.apply(lambda x: x.replace(hour=0, minute=0, second=0, microsecond=0))
This way you don't need to convert to pandas datetime first. If it's already a pandas datetime, then see answer from Phil.
df.your_date_col = df.your_date_col.dt.normalize()
Related videos on Youtube
n4cer500
Updated on June 14, 2022Comments
-
n4cer500 almost 2 years
I have imported a CSV file into a pandas DataFrame and have a datetime64 column with values such as:
2014-06-30 21:50:00
I simply want to either remove the time or set the time to midnight:
2014-06-30 00:00:00
What is the easiest way of doing this?
-
EdChum over 9 yearsYou could just set the column to just the date portion:
df['datetime'] = df['datetime'].apply( lambda x: x.date() )
-
EdChum over 9 yearsOr you could do this:
df['datetime'] = df['datetime'].apply( lambda x: x.replace(hour=0, minute=0) )
-
-
jpm over 2 yearsNote @phil's answer with
df['date_col'].dt.normalize()
sincepd.datetools.normalize_date
has been deprecated and avoid the resulting error.