Python Pandas Series of Datetimes to Seconds Since the Epoch
Update:
In 0.15.0 Timedeltas
became a full-fledged dtype.
So this becomes possible (as well as the methods below)
In [45]: s = Series(pd.timedelta_range('1 day',freq='1S',periods=5))
In [46]: s.dt.components
Out[46]:
days hours minutes seconds milliseconds microseconds nanoseconds
0 1 0 0 0 0 0 0
1 1 0 0 1 0 0 0
2 1 0 0 2 0 0 0
3 1 0 0 3 0 0 0
4 1 0 0 4 0 0 0
In [47]: s.astype('timedelta64[s]')
Out[47]:
0 86400
1 86401
2 86402
3 86403
4 86404
dtype: float64
Original Answer:
I see that you are on master (and 0.13 is coming out very shortly), so assuming you have numpy >= 1.7. Do this. See here for the docs (this is frequency conversion)
In [5]: df = DataFrame(dict(date = date_range('20130101',periods=10)))
In [6]: df
Out[6]:
date
0 2013-01-01 00:00:00
1 2013-01-02 00:00:00
2 2013-01-03 00:00:00
3 2013-01-04 00:00:00
4 2013-01-05 00:00:00
5 2013-01-06 00:00:00
6 2013-01-07 00:00:00
7 2013-01-08 00:00:00
8 2013-01-09 00:00:00
9 2013-01-10 00:00:00
In [7]: df['date']+timedelta(hours=2)-datetime.datetime(1970,1,1)
Out[7]:
0 15706 days, 02:00:00
1 15707 days, 02:00:00
2 15708 days, 02:00:00
3 15709 days, 02:00:00
4 15710 days, 02:00:00
5 15711 days, 02:00:00
6 15712 days, 02:00:00
7 15713 days, 02:00:00
8 15714 days, 02:00:00
9 15715 days, 02:00:00
Name: date, dtype: timedelta64[ns]
In [9]: (df['date']+timedelta(hours=2)-datetime.datetime(1970,1,1)) / np.timedelta64(1,'s')
Out[9]:
0 1357005600
1 1357092000
2 1357178400
3 1357264800
4 1357351200
5 1357437600
6 1357524000
7 1357610400
8 1357696800
9 1357783200
Name: date, dtype: float64
The contained values are np.timedelta64[ns]
objects, they don't have the same methods as timedelta
objects, so no total_seconds()
.
In [10]: s = (df['date']+timedelta(hours=2)-datetime.datetime(1970,1,1))
In [11]: s[0]
Out[11]: numpy.timedelta64(1357005600000000000,'ns')
You can astype them to int, and you get back a ns
unit.
In [12]: s[0].astype(int)
Out[12]: 1357005600000000000
You can do this as well (but only on an individual unit element).
In [18]: s[0].astype('timedelta64[s]')
Out[18]: numpy.timedelta64(1357005600,'s')
Related videos on Youtube
Chris
Updated on May 25, 2022Comments
-
Chris almost 2 years
Following in the spirit of this answer, I attempted the following to convert a DataFrame column of datetimes to a column of seconds since the epoch.
df['date'] = (df['date']+datetime.timedelta(hours=2)-datetime.datetime(1970,1,1)) df['date'].map(lambda td:td.total_seconds())
The second command causes the following error which I do not understand. Any thoughts on what might be going on here? I replaced map with apply and that didn't help matters.
--------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-99-7123e823f995> in <module>() ----> 1 df['date'].map(lambda td:td.total_seconds()) /Users/cpd/.virtualenvs/py27-ipython+pandas/lib/python2.7/site-packages/pandas-0.12.0_937_gb55c790-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in map(self, arg, na_action) 1932 return self._constructor(new_values, index=self.index).__finalize__(self) 1933 else: -> 1934 mapped = map_f(values, arg) 1935 return self._constructor(mapped, index=self.index).__finalize__(self) 1936 /Users/cpd/.virtualenvs/py27-ipython+pandas/lib/python2.7/site-packages/pandas-0.12.0_937_gb55c790-py2.7-macosx-10.8-x86_64.egg/pandas/lib.so in pandas.lib.map_infer (pandas/lib.c:43628)() <ipython-input-99-7123e823f995> in <lambda>(td) ----> 1 df['date'].map(lambda td:td.total_seconds()) AttributeError: 'float' object has no attribute 'total_seconds'
-
Zeugma over 10 yearsIt seems the 'date' column may not be a datetime64 in the first place?
-
Abe over 10 yearsDoes the column contain any missing values? Missing values usually cause pandas Series to be cast to floats, causing weirdness when you try to interpret them as datetimes.
-
Chris over 10 years@Abe Indeed you were right. Turns out there is some missing data in some of the records. Not what I anticipated... Ack.
-
-
Chris over 10 yearsPerfect Jeff, thank you! Once I got rid of the rows with missing data, I could directly apply this to the new problem. ;-)
-
Jeff over 10 yearsyou don't need to remove missing values (NaT); they will be returned as nan
-
TimStaley about 10 yearsI don't have time right now to edit (and re-verify) the answer, but in fact version 0.13 now allows 'astype' operations to be applied to an entire Series. See pandas.pydata.org/pandas-docs/stable/…, or pandas.pydata.org/pandas-docs/stable/whatsnew.html and do a find-on-page for 'astype'.
-
soupault almost 9 yearsI'm searching for a method to convert
Series
ofTimedelta
s toseconds
for plotting. Your approach with/ pd.Timedelta(seconds=1)
is a nice one. Stiil, I wonder if there is a better method. What do you think? -
Jeff almost 9 yearsthis answer actually is pretty outdated; Timedeltas became fully fledged types in 0.15.0. I will add another method.