pandas ValueError: Cannot setitem on a Categorical with a new category, set the categories first


The problem is that decision is a categorical column. If you replace only some rows, pandas keeps the column categorical, and because 0 and 1 do not exist in its categories, the error is raised.
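
As a minimal sketch of that behaviour (the sample values are made up for illustration), the same refusal can be reproduced directly on a Categorical array. The exception type appears to differ between pandas versions, so both ValueError and TypeError are caught here:

```python
import pandas as pd

# Illustrative values only; the categories are inferred from the data.
cat = pd.Categorical(['Yes', 'No', 'Yes'])
print(list(cat.categories))

try:
    # 1 is not one of the existing categories, so the assignment is refused.
    cat[0] = 1
    raised = False
except (ValueError, TypeError) as exc:
    raised = True
    print(exc)

# Assigning a value that already is a category works without error.
cat[0] = 'No'
print(list(cat))
```

Only values already present in the categories can be assigned, which is why the replacement with 1 and 0 fails.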

Sample data with categorical column:

import pandas as pd

df = pd.DataFrame({'decision':['Yes','No']})

df['decision'] = pd.Categorical(df['decision'])

Solutions that keep a categorical output, using Series.map and cat.rename_categories:

df['decision1'] = df['decision'].map({'Yes':1, 'No':0})
df['decision2'] = df['decision'].cat.rename_categories({'Yes':1, 'No':0})

If only Yes and No values are possible, you can recreate the whole column by comparing with Yes and casting the boolean result to integer (True, False map to 1, 0), as @arhr mentioned. The categorical dtype is lost:

df['decision3'] = (df['decision'] == 'Yes').astype(int)
print (df)
  decision decision1  decision2 decision3
0      Yes         1          1         1
1       No         0          0         0

print (df.dtypes)
decision     category
decision1    category
decision2    category  
decision3       int32
dtype: object
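
The error message itself hints at another route: set the categories first. The following is a sketch of that idea rather than part of the original answer, assuming the same kind of Yes/No sample data. It uses cat.add_categories so that 1 and 0 become valid values, then drops the unused Yes and No categories afterwards:

```python
import pandas as pd

# Illustrative sample data with a categorical column.
df = pd.DataFrame({'decision': pd.Categorical(['Yes', 'No', 'Yes'])})

# Build the masks before anything is modified.
is_yes = df['decision'] == 'Yes'
is_no = df['decision'] == 'No'

# Register 1 and 0 as categories so that assigning them is allowed.
s = df['decision'].cat.add_categories([1, 0])
s.loc[is_yes] = 1
s.loc[is_no] = 0

# 'Yes' and 'No' are no longer used, so remove them from the categories.
s = s.cat.remove_unused_categories()
df['decision'] = s

print(df['decision'].tolist())
print(df['decision'].dtype)
```

This keeps the partial, row-by-row assignment from the question, at the cost of more steps than cat.rename_categories.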
Khawar Islam
Author by

Updated on June 04, 2022

Comments

  • Khawar Islam
    Khawar Islam almost 2 years

    I am changing the values inside a DataFrame by replacing Yes with 1 and No with 0. My code previously worked fine, but I have now made some changes because of a memory problem.

    Previous code (raises the traceback shown below):

    df.loc[df[df.decision == 'Yes'].index, 'decision'] = 1
    df.loc[df[df.decision == 'No'].index, 'decision'] = 0
    

    Changed to:

    df.loc['Yes', "decision"] = 1
    df.loc['No', "decision"] = 0
    

    Still, the problem remains the same.

    Traceback

    Traceback (most recent call last):
      File "/snap/pycharm-community/226/plugins/python-ce/helpers/pydev/pydevd.py", line 1477, in _exec
        pydev_imports.execfile(file, globals, locals)  # execute the script
      File "/snap/pycharm-community/226/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "/home/khawar/deepface/tests/Ensemble-Face-Recognition.py", line 148, in <module>
        df.loc['Yes', "decision"] = 1
      File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 670, in __setitem__
        iloc._setitem_with_indexer(indexer, value)
      File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1763, in _setitem_with_indexer
        isetter(loc, value)
      File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1689, in isetter
        ser._mgr = ser._mgr.setitem(indexer=plane_indexer, value=v)
      File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 543, in setitem
        return self.apply("setitem", indexer=indexer, value=value)
      File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 409, in apply
        applied = getattr(b, f)(**kwargs)
      File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 1688, in setitem
        self.values[indexer] = value
      File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/arrays/categorical.py", line 2011, in __setitem__
        "Cannot setitem on a Categorical with a new "
    ValueError: Cannot setitem on a Categorical with a new category, set the categories first
    python-BaseException
    

    As suggested, I implemented the new code:

    df['decision'] = (df['decision'] == 'Yes').astype(int)
    

    Traceback

    Traceback (most recent call last):
      File "/home/khawar/deepface/tests/Ensemble-Face-Recognition.py", line 174, in <module>
        gbm = lgb.train(params, lgb_train, num_boost_round=1000, early_stopping_rounds=15, valid_sets=lgb_test)
      File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/engine.py", line 231, in train
        booster = Booster(params=params, train_set=train_set)
      File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/basic.py", line 2053, in __init__
        train_set.construct()
      File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/basic.py", line 1325, in construct
        categorical_feature=self.categorical_feature, params=self.params)
      File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/basic.py", line 1123, in _lazy_init
        self.__init_from_np2d(data, params_str, ref_dataset)
      File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/basic.py", line 1162, in __init_from_np2d
        data = np.array(mat.reshape(mat.size), dtype=np.float32)
    ValueError: could not convert string to float: 'deepface/tests/dataset/029A33.JPG'
    
    • arhr
      arhr about 3 years
      Could you try the following: df['decision'] = (df['decision'] == 'Yes').astype(int). It should work for a binary categorical variable.
    • Khawar Islam
      Khawar Islam about 3 years
      I have added the above line and am now getting an error about datatype conversion.
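
The second traceback above fails while converting the data to float32 on a value that looks like a file path. This suggests that a string column is still among the features, which is a separate issue from the categorical one. A minimal way to check for this, assuming a hypothetical frame (the column names file_x and distance are invented for illustration and are not taken from the question), might be:

```python
import pandas as pd

# Hypothetical frame: 'file_x' stands in for an image-path column.
df = pd.DataFrame({
    'file_x': ['a.jpg', 'b.jpg'],
    'distance': [0.21, 0.87],
    'decision': ['Yes', 'No'],
})
df['decision'] = (df['decision'] == 'Yes').astype(int)

# List the columns that are not numeric and so cannot be cast to float.
non_numeric = df.select_dtypes(exclude='number').columns.tolist()
print(non_numeric)

# Keep only numeric feature columns; the target is handled separately.
features = df.drop(columns=non_numeric + ['decision'])
target = df['decision']
print(features.columns.tolist())
```

Whether dropping such columns is appropriate depends on the actual data; the sketch only shows how to locate them.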