Create pandas dataframe from numpy array
Solution 1
You need to transpose your NumPy array:
df_1 = pd.DataFrame(data.T, columns=columns)
To see why this is necessary, consider the shape of your array:
print(data.shape)
(2, 3)
The second number in the shape tuple, or the number of columns in the array, must be equal to the number of columns in your dataframe.
When we transpose the array, its rows and columns are swapped, so its shape becomes (3, 2) and it can be passed into a dataframe with two columns:
print(data.T.shape)
(3, 2)
print(data.T)
[[1 1]
 [2 5]
 [2 3]]
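Putting the pieces above together, here is a minimal self-contained sketch. It assumes the same `columns` and `data` as in the question; the inline comments are illustrative rather than taken from the original answer.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
# Shape (2, 3): each inner list holds the values for one intended column.
data = np.array([[1, 2, 2], [1, 5, 3]])

# pandas maps axis 1 of a 2-D array to DataFrame columns,
# so transpose to shape (3, 2) before passing two column labels.
df_1 = pd.DataFrame(data.T, columns=columns)
print(df_1)
#    1  2
# 0  1  1
# 1  2  5
# 2  2  3
```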
Solution 2
When a DataFrame is built from a 2-D array, each row of the array becomes a row of the DataFrame and each column of the array becomes a column.
Either way, you need to transpose something.
One option is to pass index=columns and then transpose the resulting DataFrame. This gives the same output.
columns = ['1','2']
data = np.array([[1,2,2] , [1,5,3]])
df_1 = pd.DataFrame(data, index=columns).T
df_1
Passing in data.T as mentioned above is also perfectly acceptable (assuming the data is an ndarray type).
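To check that the two routes agree, the following sketch builds the frame both ways and compares the results. The variable names `via_index` and `via_array_t` are hypothetical and only used for illustration.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2, 2], [1, 5, 3]])

# Route A: label the rows with the column names, then transpose the DataFrame.
via_index = pd.DataFrame(data, index=columns).T

# Route B: transpose the ndarray first, then label the columns.
via_array_t = pd.DataFrame(data.T, columns=columns)

# Compare the column labels and the underlying values.
same_labels = list(via_index.columns) == list(via_array_t.columns)
same_values = np.array_equal(via_index.to_numpy(), via_array_t.to_numpy())
print(same_labels, same_values)
```

Route B avoids building an intermediate DataFrame, which may matter for large arrays, but both are reasonable for small inputs like this one.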
Solution 3
For the second example in the question, where each inner array holds one column of data, you can use:
df_1 = pd.DataFrame(dict(zip(columns, data)))
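As a rough illustration of why this works: `zip` pairs each column label with one row of the array, and a dict of 1-D arrays is interpreted by pandas as one column per key. The sketch below assumes the same `columns` and `data` as in the question.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2, 2], [1, 5, 3]])

# Iterating over a 2-D array yields its rows, so zip pairs
# '1' with [1, 2, 2] and '2' with [1, 5, 3].
mapping = dict(zip(columns, data))

# Each dict key becomes a column label; each value becomes that column's data.
df_1 = pd.DataFrame(mapping)
print(df_1)
```

Note that `zip` stops at the shorter input, so if `columns` and `data` have different lengths the extra labels or rows are dropped silently rather than raising an error.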
Comments
blue-sky almost 2 years
To create a pandas dataframe from a NumPy array I can use:
columns = ['1','2']
data = np.array([[1,2] , [1,5] , [2,3]])
df_1 = pd.DataFrame(data,columns=columns)
df_1
If I instead use:
columns = ['1','2']
data = np.array([[1,2,2] , [1,5,3]])
df_1 = pd.DataFrame(data,columns=columns)
df_1
where each inner array is a column of data, this throws an error:
ValueError: Wrong number of items passed 3, placement implies 2
Does pandas support this data format, or must I use the format in example 1?