ValueError: Columns must be same length as key

This looks like a dimensionality problem. Consider an analogy with plain lists:

Say I have a list like so:

mylist = [0, 0, 0, 0]

It has length 4. Suppose I try to map the elements of a shorter list into it one-to-one:


otherlist = ['a', 'b']

for i in range(len(mylist)):
    mylist[i] = otherlist[i]

This raises an IndexError, because the loop asks for elements that otherlist doesn't have.
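To see the analogy run, here is a small sketch of the same loop with the exception caught. The name error_message is only an illustration and is not part of the original snippet.

```python
mylist = [0, 0, 0, 0]
otherlist = ['a', 'b']

try:
    # Walk over every index of the longer list
    for i in range(len(mylist)):
        mylist[i] = otherlist[i]  # fails once i reaches len(otherlist)
except IndexError as exc:
    # error_message is a hypothetical name used only for this demo
    error_message = str(exc)

print(mylist)
print(error_message)
```

The first two slots are overwritten before the loop fails, so the target ends up only partly updated.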

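Much the same is happening here. data_final[X] selects len(X) columns, but turn_dummy returns a DataFrame with a different number of columns (the original column is dropped and one dummy column is added per category), so pandas cannot map the result onto the key one-to-one. Try assigning the result to the whole DataFrame instead:

data_final = turn_dummy(data_final, col)

This assumes every column named in L is still present in data_final when the loop reaches it.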
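The mismatch can be reproduced on a toy frame. The sketch below is only an illustration: the data, the contents of X, and the names encoded and one_shot are assumptions, and turn_dummy is rewritten without the in-place drop so that it does not modify its input.

```python
import pandas as pd

# Hypothetical toy data standing in for data_final
data_final = pd.DataFrame({
    "A": [1, 2, 1],
    "B": [0, 1, 0],
    "num": [0.5, 1.5, 2.5],
})
X = ["A", "B", "num"]  # assumed: the list of feature columns


def turn_dummy(df, prop):
    # Variant of the question's helper that leaves df untouched
    dummies = pd.get_dummies(df[prop], prefix=prop)
    return pd.concat([df.drop(columns=prop), dummies], axis=1)


# Reproduce the error: the key has 3 columns, the result has 4
try:
    data_final[X] = turn_dummy(data_final[X], "A")
except ValueError as exc:
    error_message = str(exc)

# One possible fix: rebind the whole frame instead of a column subset
encoded = data_final.copy()
for col in ["A", "B"]:
    encoded = turn_dummy(encoded, col)

# get_dummies can also encode several columns in a single call
one_shot = pd.get_dummies(data_final, columns=["A", "B"])

print(error_message)
print(list(encoded.columns))
```

Rebinding the whole frame sidesteps the length check, because no fixed list of column labels has to match the result.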

Author by

Minila S

Updated on June 04, 2022

Comments

  • Minila S almost 2 years

    I have a problem running the code below.

    data_final is my dataframe, X is the list of columns for the training data, and L is a list of categorical features with numeric values.

    I want to one-hot encode my categorical features, so I do the following. But the last line throws "ValueError: Columns must be same length as key", and I still don't understand why after a lot of research.

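    import pandas as pd

    def turn_dummy(df, prop):
        # One-hot encode column `prop`, then swap it for the dummy columns
        dummies = pd.get_dummies(df[prop], prefix=prop, sparse=True)
        df.drop(prop, axis=1, inplace=True)
        return pd.concat([df, dummies], axis=1)

    L = ['A', 'B', 'C']

    for col in L:
        # This assignment is the line that raises the ValueError
        data_final[X] = turn_dummy(data_final[X], col)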
    
    • C.Nivs over 5 years
      What is X in this case?