Pandas convert columns type from list to np.array
Use apply to convert each element to its equivalent array:
df['col1'] = df['col1'].apply(lambda x: np.array(x))
type(df['col1'].iloc[0])
numpy.ndarray
Data:
df = pd.DataFrame({'col1': [[1,2,3],[0,0,0]]})
df
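As a self-contained check of the approach above, here is a minimal sketch with the imports included. The second column `col2` and the loop over column names are assumptions borrowed from the question's example, not part of the original answer. Note that the column dtype stays `object`; only the individual cells become arrays.

```python
import numpy as np
import pandas as pd

# Sample data: every cell holds a plain Python list.
# `col2` is an assumption taken from the question's example table.
df = pd.DataFrame({'col1': [[1, 2, 3], [0, 0, 0]],
                   'col2': [[4, 5, 6], [1, 2, 3]]})

# Convert each cell of each list-valued column to a NumPy array.
for name in ['col1', 'col2']:
    df[name] = df[name].apply(lambda x: np.array(x))

first_cell = df['col1'].iloc[0]
print(type(first_cell))   # the cell is now a numpy.ndarray
print(df['col1'].dtype)   # the column itself is still dtype object
```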
LeoCella
Updated on July 12, 2022
LeoCella almost 2 years
I'm trying to apply a function to a pandas DataFrame. The function requires two np.array inputs and fits them using a well-defined model.
The problem is that I can't apply this function to the selected columns, because their "rows" contain lists read from a JSON file, not np.array objects.
Now, I've tried different solutions:
# Here is where I discovered the problem
train_df['result'] = train_df.apply(my_function(train_df['col1'],train_df['col2']))
# So I've tried to cast the Series before passing them to the function in both these ways:
X_col1_casted = train_df['col1'].dtype(np.array)
X_col2_casted = train_df['col2'].dtype(np.array)
doesn't work.
X_col1_casted = train_df['col1'].astype(np.array)
X_col2_casted = train_df['col2'].astype(np.array)
doesn't work.
X_col1_casted = train_df['col1'].dtype(np.array)
X_col2_casted = train_df['col2'].dtype(np.array)
doesn't work.
What I'm thinking of doing now is a longer procedure:
starting from the uncast column Series, convert them to lists with list(), iterate over them, apply the function to the individual np.array() elements, and append the results to a temporary list. Once done, I will convert this list into a new column (clearly, I don't know if it will work).
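The manual procedure described above could be sketched roughly as follows. The body of `my_function` is a hypothetical placeholder (an element-wise sum), since the real model-fitting function is not shown; the sample values are taken from the example table in the question.

```python
import numpy as np
import pandas as pd

def my_function(a, b):
    # Hypothetical placeholder for the real model-fitting function.
    return a + b

train_df = pd.DataFrame({'col1': [[1, 2, 3], [0, 0, 0]],
                         'col2': [[4, 5, 6], [1, 2, 3]]})

# Walk both columns in parallel, convert each pair of lists to arrays,
# and collect the results in a temporary list.
results = []
for left, right in zip(train_df['col1'].tolist(), train_df['col2'].tolist()):
    results.append(my_function(np.array(left), np.array(right)))

# One result per row, so the list can be assigned as a new column.
train_df['result'] = results
```

This works, but it is mostly bookkeeping that `apply` can do for you.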
Does anyone know how to help me?
EDIT: I'm adding an example to be clear:
The function assumes it receives two np.arrays as input. Currently it gets two lists, since they are retrieved from a JSON file. The situation is this:
col1      col2      result
[1,2,3]   [4,5,6]   [5,7,9]
[0,0,0]   [1,2,3]   [1,2,3]
Clearly the function is not a sum but a custom function of my own. For a moment, assume that this sum only works on arrays and not on lists: what should I do?
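One possible way to handle the example above in a single step is a row-wise `DataFrame.apply` with `axis=1`, converting the lists inside the lambda. This is only a sketch: `my_function` here is a hypothetical stand-in that rejects anything other than arrays, to mimic the stated constraint. Note that `apply` must receive a callable; writing `train_df.apply(my_function(...))` calls the function once on the whole columns and passes its return value to `apply` instead.

```python
import numpy as np
import pandas as pd

def my_function(a, b):
    # Hypothetical stand-in: like the real function, it only accepts arrays.
    if not (isinstance(a, np.ndarray) and isinstance(b, np.ndarray)):
        raise TypeError("both inputs must be np.ndarray")
    return a + b

train_df = pd.DataFrame({'col1': [[1, 2, 3], [0, 0, 0]],
                         'col2': [[4, 5, 6], [1, 2, 3]]})

# axis=1 hands each row to the lambda; the lists are converted per row
# and the function is called with two arrays.
train_df['result'] = train_df.apply(
    lambda row: my_function(np.array(row['col1']), np.array(row['col2'])),
    axis=1,
)

result_as_lists = [r.tolist() for r in train_df['result']]
```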
Thanks in advance