From TimeDelta to float days in Pandas
Solution 1
You can use pd.to_timedelta
or np.timedelta64
to define a duration and divide by this:
# set up as per @EdChum
df['total_days_td'] = df['time_delta'] / pd.to_timedelta(1, unit='D')
df['total_days_td'] = df['time_delta'] / np.timedelta64(1, 'D')
Solution 2
You can use dt.total_seconds
and divide this by the total number of seconds in a day, example:
In [25]:
df = pd.DataFrame({'dates':pd.date_range(dt.datetime(2016,1,1, 12,15,3), periods=10)})
df
Out[25]:
dates
0 2016-01-01 12:15:03
1 2016-01-02 12:15:03
2 2016-01-03 12:15:03
3 2016-01-04 12:15:03
4 2016-01-05 12:15:03
5 2016-01-06 12:15:03
6 2016-01-07 12:15:03
7 2016-01-08 12:15:03
8 2016-01-09 12:15:03
9 2016-01-10 12:15:03
In [26]:
df['time_delta'] = df['dates'] - pd.datetime(2015,11,6,8,10)
df
Out[26]:
dates time_delta
0 2016-01-01 12:15:03 56 days 04:05:03
1 2016-01-02 12:15:03 57 days 04:05:03
2 2016-01-03 12:15:03 58 days 04:05:03
3 2016-01-04 12:15:03 59 days 04:05:03
4 2016-01-05 12:15:03 60 days 04:05:03
5 2016-01-06 12:15:03 61 days 04:05:03
6 2016-01-07 12:15:03 62 days 04:05:03
7 2016-01-08 12:15:03 63 days 04:05:03
8 2016-01-09 12:15:03 64 days 04:05:03
9 2016-01-10 12:15:03 65 days 04:05:03
In [27]:
df['total_days_td'] = df['time_delta'].dt.total_seconds() / (24 * 60 * 60)
df
Out[27]:
dates time_delta total_days_td
0 2016-01-01 12:15:03 56 days 04:05:03 56.170174
1 2016-01-02 12:15:03 57 days 04:05:03 57.170174
2 2016-01-03 12:15:03 58 days 04:05:03 58.170174
3 2016-01-04 12:15:03 59 days 04:05:03 59.170174
4 2016-01-05 12:15:03 60 days 04:05:03 60.170174
5 2016-01-06 12:15:03 61 days 04:05:03 61.170174
6 2016-01-07 12:15:03 62 days 04:05:03 62.170174
7 2016-01-08 12:15:03 63 days 04:05:03 63.170174
8 2016-01-09 12:15:03 64 days 04:05:03 64.170174
9 2016-01-10 12:15:03 65 days 04:05:03 65.170174
Related videos on Youtube
alpagarou
Engineer in bioinformatics and modeling. Very pluridisciplinary field, so I know a bit of many topics but nothing deeply. I've done an internship on metagenomics data and one in swarming robotics. I now work on modeling population dynamics of tsetse flies to evaluate control strategies.
Updated on November 04, 2020Comments
-
alpagarou over 3 years
I have a TimeDelta column with values that look like this:
2 days 21:54:00.000000000
I would like to have a float representing the number of days, let's say here 2+21/24 = 2.875, neglecting the minutes. Is there a simple way to do this ? I saw an answer suggesting
res['Ecart_lacher_collecte'].apply(lambda x: float(x.item().days+x.item().hours/24.))
But I get "AttributeError: 'str' object has no attribute 'item' "
Numpy version is '1.10.4' Pandas version is u'0.17.1'
The columns has originally been obtained with:
lac['DateHeureLacher'] = pd.to_datetime(lac['Date lacher']+' '+lac['Heure lacher'],format='%d/%m/%Y %H:%M:%S') cap['DateCollecte'] = pd.to_datetime(cap['Date de collecte']+' '+cap['Heure de collecte'],format='%d/%m/%Y %H:%M:%S')
in a first script. Then in a second one:
res = pd.merge(lac, cap, how='inner', on=['Loc']) res['DateHeureLacher'] = pd.to_datetime(res['DateHeureLacher'],format='%Y-%m-%d %H:%M:%S') res['DateCollecte'] = pd.to_datetime(res['DateCollecte'],format='%Y-%m-%d %H:%M:%S') res['Ecart_lacher_collecte'] = res['DateCollecte'] - res['DateHeureLacher']
Maybe saving it to csv change their types back to string? The transformation I'm trying to do is in a third script.
Sexe_x PiegeLacher latL longL Loc Col_x DateHeureLacher Nb envolees PiegeCapture latC longC Col_y Sexe_y Effectif DateCollecte DatePose Ecart_lacher_collecte Dist_m M Q0-002 1629238 237877 H Rouge 2011-02-04 17:15:00 928 Q0-002 1629238 237877 Rouge M 1 2011-02-07 15:09:00 2011-02-07 12:14:00 2 days 21:54:00.000000000 0 M Q0-002 1629238 237877 H Rouge 2011-02-04 17:15:00 928 Q0-002 1629238 237877 Rouge M 4 2011-02-07 12:14:00 2011-02-07 09:42:00 2 days 18:59:00.000000000 0 M Q0-002 1629238 237877 H Rouge 2011-02-04 17:15:00 928 Q0-003 1629244 237950 Rouge M 1 2011-02-07 15:10:00 2011-02-07 12:16:00 2 days 21:55:00.000000000 75
res.info():
Sexe_x 922 non-null object PiegeLacher 922 non-null object latL 922 non-null int64 longL 922 non-null int64 Loc 922 non-null object Col_x 922 non-null object DateHeureLacher 922 non-null object Nb envolees 922 non-null int64 PiegeCapture 922 non-null object latC 922 non-null int64 longC 922 non-null int64 Col_y 922 non-null object Sexe_y 922 non-null object Effectif 922 non-null int64 DateCollecte 922 non-null object DatePose 922 non-null object Ecart_lacher_collecte 922 non-null object Dist_m 922 non-null int64
-
Anton Protopopov about 8 yearsI think you could use
float(res['columnName'].dt.days + res['columnName'].dt.hours / 24)
-
EdChum about 8 yearsIIUC you can do
res['columnname'].dt.total_seconds()/ (24 * 60 * 60)
-
alpagarou about 8 years@AntonProtopopov I get "AttributeError: Can only use .dt accessor with datetimelike values"
-
alpagarou about 8 years@EdChum I get "AttributeError: 'Series' object has no attribute 'total_seconds' "
-
Anton Protopopov about 8 years@alpagarou first you need to convert you columns to
datetime
object. You could do that withpd.to_datetime
-
EdChum about 8 yearsWhat's your version of pandas and can you post raw data, code and your numpy version and edit this into your question
-
alpagarou about 8 years@AntonProtopopov pd.to_datetime raises "ValueError: Unknown string format"
-
Anton Protopopov about 8 years@alpagarou post your date please. Without that it's hard to say what could cause the problem
-
EdChum about 8 yearsAre you saying your column is a string of timedeltas?
-
EdChum about 8 yearspost output from
res.info()
, also doesres['Ecart_lacher_collecte'].dt.total_seconds()/ (24 * 60 * 60)
work? -
alpagarou about 8 years@EdChum First I get "AttributeError: Can only use .dt accessor with datetimelike values" but when I try "res['Ecart_lacher_collecte'] = pd.to_datetime(res['Ecart_lacher_collecte'])" I get "ValueError: Unknown string format"
-
EdChum about 8 yearsIt looks you have invalid values in your columns, can you try
pd.to_datetime(res['DateHeureLacher'],format='%Y-%m-%d %H:%M:%S', errors='coerce')
also you need to post minimal data and code that we can run that reproduces your error as this should work, also when you read the csv back in and setparse_dates=['DateHeureLacher', 'DateCollecte']
inread_csv
does it fix thedtypes
? -
alpagarou about 8 yearsparse_dates fixes the type of 'DateHeureLacher' and 'DateCollecte' but 'Ecart_lacher_collecte' which was timedelta64[ns] at the end of the precedent script, goes back to object when I read it from csv (I also tried to include it in the parse_dates but it doesn't work) I should probably use errors='coerce' to convert 'Ecart_lacher_collecte to datetime before using dt.days, but what format should I precise?
-
EdChum about 8 yearsOK, I just tried this and found that you can't specify the dtype in
read_csv
to parse it back as a timedelta64, so you have a couple options, convert the timedelta64 toint64
usingastype
and then when reading it back use a custom converter to convert it or after reading it back in you can doastype(np.timedelta64)
and this will work -
alpagarou about 8 yearsI put my last two scripts in one and managed to it. However, I get a lot of SettingWithCopyWarning ?
-
-
Jingwei Yu over 7 yearsyes it works but I'm shocked they don't have a dt.total_days() or dt.total_hours available...
-
EdChum over 7 years@LedgerYu because it's not a standard method for timedelta, additionally it's trivial to calculate this yourself but besides if we followed your logic we'd then have methods for total_days, total_hours, total_quarters, total_years, total_milliseconds etc.
-
Guillochon almost 6 yearsYou can just do
dt.days
, not as a function. Works for me! -
Taylor over 5 years@Guillochon
dt.days
will only give you the number of days and will throw out all the time information -
Elias Hasle over 5 yearsWhy separate the division and then join it like that? The only difference from x.total_seconds()/(3600*24) is that incomplete hours don't count at all, which I doubt is the desired behavior.
-
sharinganSawant over 5 yearsYes, that's right. the incomplete hours aren't counted. That's because the original poster wanted it that way: 'I would like to have a float representing the number of days, let's say here 2+21/24 = 2.875, neglecting the minutes.'
-
sharinganSawant over 5 years@EliasHasle I have edited the answer where I have mentioned why the incomplete hours weren't accounted for.
-
MarMat almost 4 yearsYou can also use the datetime module, e.g., datetime.timedelta(days=1)