Expected 2D array, got 1D array instead, Reshape Data
Solution 1
OK, I finally got the code to work. Please see the solution below:
# Data Preprocessing
# Import Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Import Dataset
dataset = pd.read_csv('Data2.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 5].values
df_X = pd.DataFrame(X)
df_y = pd.DataFrame(y)
# Replace Missing Values
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
imputer = imputer.fit(X[:, 3:5 ])
X[:, 3:5] = imputer.transform(X[:, 3:5])
# Encoding Categorical Data "Name"
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_x = LabelEncoder()
X[:, 0] = labelencoder_x.fit_transform(X[:, 0])
# Encoding Categorical Data "University"
from sklearn.preprocessing import LabelEncoder
labelencoder_x1 = LabelEncoder()
X[:, 1] = labelencoder_x1.fit_transform(X[:, 1])
# Transform Name into a Matrix
onehotencoder1 = OneHotEncoder(categorical_features = [0])
X = onehotencoder1.fit_transform(X).toarray()
# Transform University into a Matrix
onehotencoder2 = OneHotEncoder(categorical_features = [6])
X = onehotencoder2.fit_transform(X).toarray()
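A note for readers on newer scikit-learn: the code above relies on Imputer and on the categorical_features argument of OneHotEncoder, both of which were deprecated in scikit-learn 0.20 and, as far as I know, removed in 0.22, so it may not run on a current installation. Below is a minimal sketch of one possible equivalent, assuming SimpleImputer and ColumnTransformer are available. The small DataFrame and its column names are hypothetical stand-ins for Data2.csv, whose real contents are not shown in this thread.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for Data2.csv (the real file is not shown in the thread).
df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy", "Ann"],
    "University": ["MIT", "UCLA", "MIT", "NYU"],
    "GPA": [3.2, 3.9, 4.0, 3.5],
    "Age": [27.0, np.nan, 30.0, 25.0],
    "Salary": [2300.0, 2900.0, np.nan, 2500.0],
})

preprocess = ColumnTransformer(
    transformers=[
        # One-hot encode the string columns directly; no LabelEncoder step is needed.
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["Name", "University"]),
        # Replace missing numeric values with the column mean.
        ("impute", SimpleImputer(strategy="mean"), ["Age", "Salary"]),
    ],
    remainder="passthrough",  # keep the untouched columns (here: GPA)
    sparse_threshold=0.0,     # always return a dense NumPy array
)

X = preprocess.fit_transform(df)
print(X.shape)
```

Because ColumnTransformer hands each transformer a 2D slice of the data, the "Expected 2D array" error does not come up in this layout.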
Solution 2
Try changing your code to this:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Import Dataset
dataset = pd.read_csv('Data2.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 5].values
df_X = pd.DataFrame(X)
df_y = pd.DataFrame(y)
# Replace Missing Values
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
imputer = imputer.fit(X[:, 3:5 ])
X[:, 3:5] = imputer.transform(X[:, 3:5])
# Encoding Categorical Data "Name"
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_x = LabelEncoder()
X[:, 0] = labelencoder_x.fit_transform(X[:, 0])
# Transform into a Matrix
onehotencoder1 = OneHotEncoder(categorical_features = [0])
res_0 = onehotencoder1.fit_transform(X[:, 0].reshape(-1, 1)) # <=== Change
X[:, 0] = res_0.ravel()
# Encoding Categorical Data "University"
from sklearn.preprocessing import LabelEncoder
labelencoder_x1 = LabelEncoder()
X[:, 1] = labelencoder_x1.fit_transform(X[:, 1])
If you get an error at labelencoder_x1.fit_transform(X[:, 1]),
change it to labelencoder_x1.fit_transform(X[:, 1].reshape(-1, 1)).
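To make the 1D versus 2D distinction concrete, here is a small sketch (the names array is a hypothetical example column, not the asker's data). As far as I can tell, LabelEncoder is designed for a 1D array of labels, while OneHotEncoder validates its input as a 2D array of shape (n_samples, n_features), which is where the error in this question comes from.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical categorical column.
names = np.array(["Ann", "Bob", "Cy", "Ann"], dtype=object)

# LabelEncoder takes a 1D array and returns 1D integer codes.
codes = LabelEncoder().fit_transform(names)

# OneHotEncoder rejects the same 1D array of codes...
try:
    OneHotEncoder().fit_transform(codes)
    raised = False
except ValueError:
    raised = True

# ...but accepts it once it is reshaped into a single-feature 2D column.
onehot = OneHotEncoder().fit_transform(codes.reshape(-1, 1))

print(codes.shape, raised, onehot.shape)
```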
wolfbagel
Updated on June 05, 2022
Comments
-
wolfbagel almost 2 years
I'm really stuck on this problem. I'm trying to use OneHotEncoder to encode my data into a matrix after using LabelEncoder, but I'm getting this error: Expected 2D array, got 1D array instead.
At the end of the error message (included below) it said to "Reshape my data", which I thought I did, but it's still not working. If I understand reshaping correctly, is it just for when you want to literally reshape some data into a different matrix size? For example, if I want to change a 3 x 2 matrix into a 4 x 6?
My code is failing on these 2 lines:
X = X.reshape(-1, 1) # I added this after I saw the error
X[:, 0] = onehotencoder1.fit_transform(X[:, 0]).toarray()
Here is the code I have so far:
# Data Preprocessing
# Import Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Import Dataset
dataset = pd.read_csv('Data2.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 5].values
df_X = pd.DataFrame(X)
df_y = pd.DataFrame(y)
# Replace Missing Values
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
imputer = imputer.fit(X[:, 3:5 ])
X[:, 3:5] = imputer.transform(X[:, 3:5])
# Encoding Categorical Data "Name"
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_x = LabelEncoder()
X[:, 0] = labelencoder_x.fit_transform(X[:, 0])
# Transform into a Matrix
onehotencoder1 = OneHotEncoder(categorical_features = [0])
X = X.reshape(-1, 1)
X[:, 0] = onehotencoder1.fit_transform(X[:, 0]).toarray()
# Encoding Categorical Data "University"
from sklearn.preprocessing import LabelEncoder
labelencoder_x1 = LabelEncoder()
X[:, 1] = labelencoder_x1.fit_transform(X[:, 1])
Here is the full error message:
  File "/Users/jim/anaconda3/lib/python3.6/site-packages/sklearn/preprocessing/data.py", line 1809, in _transform_selected
    X = check_array(X, accept_sparse='csc', copy=copy, dtype=FLOAT_DTYPES)
  File "/Users/jim/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 441, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[ 2.00000000e+00 7.00000000e+00 3.20000000e+00 2.70000000e+01 2.30000000e+03
 1.00000000e+00 6.00000000e+00 3.90000000e+00 2.80000000e+01 2.90000000e+03
 3.00000000e+00 4.00000000e+00 4.00000000e+00 3.00000000e+01 2.76700000e+03
 2.00000000e+00 8.00000000e+00 3.20000000e+00 2.70000000e+01 2.30000000e+03
 3.00000000e+00 0.00000000e+00 4.00000000e+00 3.00000000e+01 2.48522222e+03
 5.00000000e+00 9.00000000e+00 3.50000000e+00 2.50000000e+01 2.50000000e+03
 5.00000000e+00 1.00000000e+00 3.50000000e+00 2.50000000e+01 2.50000000e+03
 0.00000000e+00 2.00000000e+00 3.00000000e+00 2.90000000e+01 2.40000000e+03
 4.00000000e+00 3.00000000e+00 3.70000000e+00 2.77777778e+01 2.30000000e+03
 0.00000000e+00 5.00000000e+00 3.00000000e+00 2.90000000e+01 2.40000000e+03].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Any help would be great.
-
Jai over 6 years
Your array is 1D; it has to be 2D. Wherever you are getting the error, just add
numpy.asmatrix(data)
where data is the data that you are passing, or you can reshape it. Passing 1D arrays has been deprecated in recent versions of sklearn.
-
wolfbagel over 6 years
Hi @JayShah, in my code I added: X = X.reshape(-1, 1). Is this the correct way to reshape data?
-
Jai over 6 years
Yes,
X = X.reshape(-1, 1)
is the right way to reshape data, but this will only work if your X is a NumPy array and not a list. If it is a list, then make it a list of lists. From the error message I can clearly see that
array = [ ]
is 1D, because it has only one pair of opening and closing brackets. After reshaping, please remove
X[:, 1]
in transform and just pass X.
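A short NumPy sketch of what reshape does and does not do may help here. The 10 x 5 array is a hypothetical stand-in with the same number of elements as the 50-value array printed in the error message. It shows that reshape only rearranges the existing elements (so a 3 x 2 array cannot become 4 x 6), and that reshaping the whole matrix to (-1, 1) and then indexing X[:, 0] yields a 1D array again.

```python
import numpy as np

# Hypothetical feature matrix: 10 samples, 5 features.
X = np.arange(50, dtype=float).reshape(10, 5)

# Selecting one column with an integer index drops a dimension.
column = X[:, 0]                     # shape (10,)  -> 1D

# Two ways to make it 2D, matching the hint in the error message.
as_feature = column.reshape(-1, 1)   # shape (10, 1): 10 samples, 1 feature
as_sample = column.reshape(1, -1)    # shape (1, 10): 1 sample, 10 features

# Reshaping the WHOLE matrix flattens all 50 values into one column...
flattened = X.reshape(-1, 1)         # shape (50, 1)
# ...and indexing column 0 of that gives a 1D array of 50 values again.
again_1d = flattened[:, 0]           # shape (50,)

# reshape must preserve the number of elements: 3 * 2 != 4 * 6.
try:
    np.zeros((3, 2)).reshape(4, 6)
    size_error = False
except ValueError:
    size_error = True

print(column.shape, as_feature.shape, as_sample.shape, again_1d.shape, size_error)
```

Under these assumptions, the thing to reshape is the single column being passed to the encoder, not the whole feature matrix.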
-
wolfbagel over 6 years
Thanks for the solution! I'm running it line by line but it's failing on this line: "X[:, 0] = res_0.ravel()", saying "ravel not found".
-
Jai over 6 years
Try np.ravel(res_0)
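On the "ravel not found" error: by default OneHotEncoder returns a SciPy sparse matrix rather than a NumPy array, which is the likely reason the ravel call failed. A one-hot result also has one column per category, so it generally cannot be written back into the single column X[:, 0]. The sketch below shows one possible way around both issues, using hypothetical data: convert the result with toarray() and stack it next to the remaining columns.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical label-encoded column and one remaining numeric column.
codes = np.array([2, 0, 1, 2])
rest = np.array([[3.2], [3.9], [4.0], [3.5]])

encoder = OneHotEncoder()
encoded = encoder.fit_transform(codes.reshape(-1, 1))  # sparse matrix by default

# Convert to a dense NumPy array: one row per sample, one column per category.
dense = encoded.toarray()

# Build a new feature matrix instead of assigning into a single column.
X_new = np.hstack([dense, rest])

print(dense.shape, X_new.shape)
```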