Python: "Pandas data cast to numpy dtype of object. Check input data with np.asarray(data)."
Solution 1
You need to set the "time_field" column as the index of your data frame (for the ARIMA model, the date-time column should always be the index of the data frame):
frame = frame.set_index('time_field')
model = ARIMA(frame, order=(5,1,0))
model_fit = model.fit(disp=0)
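To see why this helps, here is a minimal sketch that uses a few rows from the question as sample data (the inline DataFrame is an assumption standing in for the SQL result). With the timedelta column left as a regular column, converting the frame to a NumPy array gives the object dtype named in the error message; once that column is the index, only the numeric values remain.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the real frame comes from pd.read_sql.
frame = pd.DataFrame({
    "time_field": pd.to_timedelta(["00:00:14", "00:01:14", "00:02:14"]),
    "value_field": [283.80, 271.97, 320.53],
})

# Timedelta and float columns together have no common numeric dtype.
before = np.asarray(frame).dtype
print(before)  # object

# With the time column moved to the index, only the float column is left.
indexed = frame.set_index("time_field")
after = np.asarray(indexed).dtype
print(after)  # float64
```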
Note: When you set the index column, you may get an error if that column contains duplicate values. In that case, aggregate the duplicates first with a group-by sum:
frame = frame.groupby(['time_field']).agg({'value_field': 'sum'})
or
frame = frame.groupby(['time_field']).sum()
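As a rough illustration of the group-by approach, the sketch below uses made-up rows in which one timestamp appears twice (the second 00:00:14 row and its value are invented for the example). After the aggregation, the time column becomes a unique index.

```python
import pandas as pd

# Hypothetical data: the timestamp 00:00:14 is duplicated on purpose.
frame = pd.DataFrame({
    "time_field": pd.to_timedelta(["00:00:14", "00:00:14", "00:01:14"]),
    "value_field": [283.80, 10.00, 271.97],
})

# Setting the index directly keeps both rows, so the index is not unique.
print(frame.set_index("time_field").index.is_unique)

# Summing per timestamp collapses the duplicates and makes time_field the index.
deduped = frame.groupby("time_field").agg({"value_field": "sum"})
print(deduped.index.is_unique)
print(deduped)
```

Whether a sum is the right aggregation depends on what the values measure; a mean or taking the last reading may fit some data better.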
Solution 2
I had a similar problem, and it worked for me using a pandas Series instead of the DataFrame, with the timestamp column as the index:
data = pd.Series(frame.value_field.values, index=frame.time_field)
model = ARIMA(data, order=(5,1,0))
model_fit = model.fit(disp=0)
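One detail is worth checking when building the Series. If a pandas Series is passed as the data together with a new index, pandas aligns the values by label rather than by position, which can leave every value as NaN. Passing the underlying array avoids the alignment. A minimal sketch, assuming the column names from the question and a few of its rows as sample data:

```python
import pandas as pd

# Sample rows copied from the question.
frame = pd.DataFrame({
    "time_field": pd.to_timedelta(["00:00:14", "00:01:14", "00:02:14"]),
    "value_field": [283.80, 271.97, 320.53],
})

# The column is labelled 0, 1, 2; none of those labels exist in the
# timedelta index, so the result is reindexed to all NaN.
misaligned = pd.Series(frame["value_field"], index=frame["time_field"])
print(misaligned.isna().all())

# A plain array has no labels, so values are assigned by position.
data = pd.Series(frame["value_field"].values, index=frame["time_field"])
print(data)
```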
Julian Almanzar
Updated on July 11, 2022
Comments
-
Julian Almanzar almost 2 years
I'm trying to create an ARIMA model to forecast a time series with some data from my server, and I keep getting the error in the title. I don't know what type of object I need. Here's the code:
frame = pd.read_sql(query, con=connection)
connection.close()
frame['time_field'] = pd.to_timedelta(frame['time_field'])
print(frame.head(10))
#fitting
model = ARIMA(frame, order=(5,1,0))
model_fit = model.fit(disp=0)
I've seen examples like this one: https://machinelearningmastery.com/arima-for-time-series-forecasting-with-python/
where they use dates instead of times with the respective values. This is the output of the frame:
  time_field  value_field
0   00:00:14       283.80
1   00:01:14       271.97
2   00:02:14       320.53
3   00:03:14       346.78
4   00:04:14       280.72
5   00:05:14       277.41
6   00:06:14       308.65
7   00:07:14       321.27
8   00:08:14       320.68
9   00:09:14       332.32
-
hd1 over 6 years
Why are you connecting to MySQL? Pandas abstracts this away.
-
Julian Almanzar over 6 years
I'm connecting to MySQL because that's the only server I have available right now, and I'm formatting its output as a frame because that's the input format for the ARIMA function.
-
Rafael P. Miranda over 6 years
Have you found the answer?
-
José almost 4 years
data = pd.Series(frame.value_fields.values, index=frame.time_field) worked for me