How do you interpolate from an array containing datetime objects?
Solution 1
You can convert them to timestamps (edited to reflect the use of calendar.timegm
to avoid timezone-related pitfalls).
# Python 2.7
import calendar
import datetime

import numpy as np

def toTimestamp(d):
    # Interpret the naive datetime as UTC; returns whole seconds since the epoch
    return calendar.timegm(d.timetuple())

arr1 = np.array([toTimestamp(datetime.datetime(2008,1,d)) for d in range(1,10)])
arr2 = np.arange(1,10)
result = np.interp(toTimestamp(datetime.datetime(2008,1,5,12)),arr1,arr2)
print result # Prints 5.5
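To make the time-zone pitfall concrete, here is a minimal Python 3 sketch that puts calendar.timegm next to time.mktime. The helper names to_timestamp_utc and to_timestamp_local are made up for this illustration, and the sketch assumes the naive datetimes are meant as UTC, as in the question.

```python
import calendar
import time
from datetime import datetime

import numpy as np


def to_timestamp_utc(d):
    # Treats the naive datetime as UTC; whole seconds only,
    # because timetuple() carries no microseconds.
    return calendar.timegm(d.timetuple())


def to_timestamp_local(d):
    # Treats the naive datetime as local time, so the value
    # depends on the time zone of the machine running the code.
    return time.mktime(d.timetuple())


query = datetime(2008, 1, 5, 12)
utc_seconds = to_timestamp_utc(query)
local_seconds = to_timestamp_local(query)

# Zero only when the local time zone is UTC at that date.
offset_hours = (utc_seconds - local_seconds) / 3600.0

xs = np.array([to_timestamp_utc(datetime(2008, 1, day)) for day in range(1, 10)])
ys = np.arange(1, 10)
result = np.interp(utc_seconds, xs, ys)  # approximately 5.5
print(result, offset_hours)
```

As long as the same conversion is applied to both the sample points and the query, the interpolation stays consistent on days without a DST change; the UTC version simply removes the dependence on the machine's settings.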
Solution 2
The numpy.interp() function expects arr1 and arr2 to be 1D sequences of floats, i.e., you should convert the sequence of datetime objects to a 1D sequence of floats if you want to use np.interp().
If the input data uses the same UTC offset for all datetime objects, you can get a float by subtracting a reference date from each value. This holds if your input is UTC (the offset is always zero):
# Python 2 syntax; in Python 3 use print(f) and wrap map() in list()
from datetime import datetime

import numpy as np

arr1 = np.array([datetime(2008, 1, d) for d in range(1, 10)])
arr2 = np.arange(1, 10)

def to_float(d, epoch=arr1[0]):
    # Seconds elapsed since the reference date (the first sample by default)
    return (d - epoch).total_seconds()

f = np.interp(to_float(datetime(2008,1,5,12)), map(to_float, arr1), arr2)
print f # -> 5.5
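The same subtract-a-reference idea can also be expressed with NumPy's own datetime64 type, which avoids the per-element Python call. This is only a sketch of an alternative, not part of the answer above; it assumes the dates can be stored as datetime64 values, which, like naive datetimes, carry no time zone.

```python
import numpy as np

# Daily samples from 2008-01-01 to 2008-01-09 (the stop value is exclusive).
dates = np.arange("2008-01-01", "2008-01-10", dtype="datetime64[D]")
values = np.arange(1, 10)

epoch = dates[0]
one_second = np.timedelta64(1, "s")


def to_float(d):
    # Dividing a timedelta64 by a timedelta64 yields plain floats,
    # here the number of seconds since the reference date.
    return (d - epoch) / one_second


query = np.datetime64("2008-01-05T12:00")
result = np.interp(to_float(query), to_float(dates), values)  # approximately 5.5
print(result)
```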
Solution 3
I'm providing this as a complement to @rchang's answer for those wanting to do this all in Pandas. This function takes a pandas series containing dates and returns a new series with the values converted to 'number of days' after a specified date.
import pandas as pd

def convert_dates_to_days(dates, start_date=None, name='Day'):
    """Converts a series of dates to a series of float values that
    represent days since start_date.
    """

    if start_date:
        ts0 = pd.Timestamp(start_date).timestamp()
    else:
        ts0 = 0

    # Seconds since the epoch, shifted by ts0 and scaled to days
    return ((dates.apply(pd.Timestamp.timestamp) -
             ts0)/(24*3600)).rename(name)
Not sure it will work with times or if it is immune to the time-zone pitfalls mentioned above. But I think as long as you provide a start date in the same time zone, which is subtracted from all the timestamp values, you should be okay.
Here's how I used it:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

# Four weekly samples with random values
data = pd.DataFrame({
    'Date': pd.date_range('2018-01-01', '2018-01-22', freq='7D'),
    'Value': np.random.randn(4)
})

x = convert_dates_to_days(data.Date, start_date='2018-01-01')
y = data.Value
f2 = interp1d(x, y, kind='cubic')

# Evaluate the interpolant on every day in the range
all_dates = pd.Series(pd.date_range('2018-01-01', '2018-01-22'))
x_all = convert_dates_to_days(all_dates, start_date='2018-01-01')

plt.plot(all_dates, f2(x_all), '-')
data.set_index('Date')['Value'].plot(style='o')
plt.grid()
plt.savefig("interp_demo.png")
plt.show()
It seems to work...
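For comparison, the conversion done by convert_dates_to_days can also be sketched without apply, by subtracting a reference Timestamp from the whole series. The name dates_to_days is hypothetical, and the sketch assumes the series and the start date are either both time-zone-naive or both time-zone-aware, since pandas refuses to subtract one kind from the other.

```python
import pandas as pd


def dates_to_days(dates, start_date=None, name="Day"):
    # Hypothetical helper: days elapsed since start_date, or since
    # 1970-01-01 when no start date is given.
    if start_date is not None:
        start = pd.Timestamp(start_date)
    else:
        start = pd.Timestamp(0)
    # Series of datetimes minus a Timestamp gives a Series of timedeltas;
    # dividing by a one-day Timedelta gives floats.
    return ((dates - start) / pd.Timedelta(days=1)).rename(name)


dates = pd.Series(pd.date_range("2018-01-01", "2018-01-22", freq="7D"))
days = dates_to_days(dates, start_date="2018-01-01")
print(days.tolist())
```

Because only differences between timestamps enter the result, the local time zone of the machine plays no role here.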
Solution 4
If you have or need sub-second precision in your timestamps, here's a slightly edited version of rchang's answer (basically just a different toTimestamp method):
import datetime, numpy as np

def toTimestamp(d):
    # datetime.timestamp() requires Python 3.3+ and keeps microseconds
    return d.timestamp()

arr1 = np.array([toTimestamp(datetime.datetime(2000,1,2,3,4,5) + datetime.timedelta(0,d)) for d in np.linspace(0,1,9)])
arr2 = np.arange(1,10) # 1, 2, ..., 9
result = np.interp(toTimestamp(datetime.datetime(2000,1,2,3,4,5,678901)),arr1,arr2)
print(result) # Prints 6.431207656860352
I can't say anything about timezone issues, as I haven't tested this with other timezones.
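On the time-zone question: timestamp() interprets a naive datetime as local time, so the absolute values above depend on the machine's settings. A minimal sketch that removes this dependence, under the assumption that the naive datetimes are meant as UTC, attaches timezone.utc before converting (the helper name to_timestamp is illustrative):

```python
from datetime import datetime, timedelta, timezone

import numpy as np


def to_timestamp(d):
    # Assumption: naive datetimes represent UTC. Aware datetimes are
    # converted using their own offset.
    if d.tzinfo is None:
        d = d.replace(tzinfo=timezone.utc)
    return d.timestamp()


start = datetime(2000, 1, 2, 3, 4, 5)

# Nine samples spread evenly over one second.
xs = np.array([to_timestamp(start + timedelta(seconds=float(s)))
               for s in np.linspace(0, 1, 9)])
ys = np.arange(1, 10)

query = datetime(2000, 1, 2, 3, 4, 5, 678901)
result = np.interp(to_timestamp(query), xs, ys)  # roughly 6.4312
print(result)
```

The last digits of the result vary slightly because a float holding seconds since 1970 resolves only about a tenth of a microsecond at this magnitude.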
Kieran Hunt
Updated on June 14, 2022
Comments
- Kieran Hunt almost 2 years
I'm looking for a function analogous to np.interp that can work with datetime objects. For example:
import datetime, numpy as np
arr1 = np.array([datetime.datetime(2008,1,d) for d in range(1,10)])
arr2 = np.arange(1,10)
np.interp(datetime.datetime(2008,1,5,12),arr1,arr2)
would ideally return 5.5, but numpy raises TypeError: array cannot be safely cast to required type. Is there a nice pythonic way around this?
- jfs over 9 years
mktime() assumes that d is a local time. mktime() may fail to get the correct timestamp. OP should work with UTC or aware datetime objects.
- jfs over 9 years
OP input is UTC. It is incorrect to use time.mktime() unless the local timezone is UTC; use calendar.timegm() instead.
- rchang over 9 years
@J.F.Sebastian Well-spotted as always, the answer has been updated to use calendar.timegm instead.
- 3kstc over 7 years
@rchang I am having a similar problem, could you please help me?