TypeError: argument must be a string or number

Here is a solution to the problem.

This is the code I wrote (PS: luckily I have the house price prediction dataset with me :D):

import os

import pandas as pd
from sklearn.preprocessing import LabelEncoder

path=r"....\house pricing"
filepath=os.path.join(path,"train.csv")

dataset_train=pd.read_csv(filepath)
dataset_train

# Categorical columns are the ones pandas loaded with dtype "object"
cat_features=[x for x in dataset_train.columns if dataset_train[x].dtype=="object"]

le=LabelEncoder()

for col in cat_features:
    # Cast to str first so that missing values (NaN) become the string "nan"
    dataset_train[col] = le.fit_transform(dataset_train[col].astype(str))

So you just have to replace this:

dataset_train.iloc[:,i] =le.fit_transform(dataset_train.iloc[:,i])

with

dataset_train[col] = le.fit_transform(dataset_train[col].astype(str))

Calling astype(str) converts every value in the column, including missing values (NaN, which pandas stores as a float), to "str" before the column is passed to "le", the LabelEncoder, whose "fit_transform" then label-encodes it. The TypeError most likely came from columns that mix strings with NaN, which the encoder cannot sort.
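To see why the cast to "str" helps, here is a minimal pure-Python sketch. It is not scikit-learn's implementation: label_encode is a hypothetical helper that only mimics the idea of numbering the sorted distinct values, and the sample column is made up for illustration.

```python
def label_encode(values):
    """Map each value to its position among the sorted distinct values."""
    classes = sorted(set(values))  # raises TypeError if str and float are mixed
    codes = {c: i for i, c in enumerate(classes)}
    return [codes[v] for v in values], classes


# Made-up column: two string categories plus a missing value (NaN is a float).
column = ["Pave", float("nan"), "Grvl", "Pave"]

try:
    label_encode(column)
    mixed_ok = True
except TypeError:
    # '<' is not defined between str and float, so sorting fails.
    mixed_ok = False

# Casting every value to str first, as astype(str) does, makes the column uniform.
as_text = [str(v) for v in column]
codes, classes = label_encode(as_text)

print(mixed_ok)  # False
print(classes)   # ['Grvl', 'Pave', 'nan']
print(codes)     # [1, 2, 0, 1]
```

Note that the missing value ends up as its own category, "nan". If you would rather have a more readable label, you could fill the missing values with a placeholder string of your choice before encoding.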

Author: Mamta Gupta

Updated on June 27, 2022

Comments

  • Mamta Gupta (almost 2 years)

    I'm using the code below:

    cat_cols = ['MSZoning','Alley','LotShape','LandContour','Utilities','LotConfig','LandSlope','Neighborhood','Condition1','Condition2','BldgType','HouseStyle','RoofStyle','RoofMatl','Exterior1st','Exterior2nd','MasVnrType','ExterQual','ExterCond','Foundation','BsmtQual','BsmtCond','BsmtExposure','BsmtFinType1','BsmtFinType2','Heating','HeatingQC','CentralAir','Electrical','KitchenQual','Functional','FireplaceQu','GarageType','GarageFinish','GarageQual','GarageCond','PavedDrive','PoolQC','Fence','MiscFeature','SaleType','SaleCondition']
    
    from sklearn.preprocessing import LabelEncoder
    le=LabelEncoder()
    
    for col in cat_cols:
        if col in dataset_train.columns:
            i = dataset_train.columns.get_loc(col)
            dataset_train.iloc[:,i] =le.fit_transform(dataset_train.iloc[:,i])
    

    It gives an error as shown below:

    TypeError: argument must be a string or number