How to convert string labels to numeric values


Assuming every label in the column is present in your list, you can use apply and pass it the list's index method, which returns the ordinal position of each class in the list:

In[5]:
df['labels'].apply(data_classes.index)

Out[5]: 
0    0
1    1
2    2
Name: labels, dtype: int64
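
For reference, a self-contained sketch of the same call, with the DataFrame rebuilt from the sample rows in the question (the variable names encoded and raised are only illustrative). It also shows why the "all classes present" assumption matters: list.index raises ValueError for a label that is not in the list.

```python
import pandas as pd

data_classes = ["cat", "dog", "mouse"]

# Sample rows taken from the question; a real run would load them from the CSV.
df = pd.DataFrame({
    "filename": ["xyz.png", "pqz.png", "abc.png"],
    "labels": ["cat", "dog", "mouse"],
})

# list.index is called once per row and returns the position of the label.
encoded = df["labels"].apply(data_classes.index)

# A label that is missing from the list makes list.index raise ValueError.
try:
    pd.Series(["cat", "bird"]).apply(data_classes.index)
    raised = False
except ValueError:
    raised = True
```

Note that list.index scans the list on every call, so this approach gets slower as the number of classes grows.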

However, IMO it is better to define a dict of your mapping and pass it to map, as this is cython-ised and so should be faster:

In[7]:
d = dict(zip(data_classes, range(len(data_classes))))
d

Out[7]: {'cat': 0, 'dog': 1, 'mouse': 2}

In[8]:
df['labels'].map(d, na_action='ignore')

Out[8]: 
0    0
1    1
2    2
Name: labels, dtype: int64

If a label in the column is not a key in the dict, map returns NaN for that row, which also turns the result into float64.
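
The question also asks for the result to be saved back to CSV. Below is a minimal sketch of the full round trip, assuming the file has a header row of filename,labels. An in-memory buffer stands in for the real file here, and the names csv_text, mapping, out and result are illustrative only.

```python
import io

import pandas as pd

data_classes = ["cat", "dog", "mouse"]

# Stand-in for the CSV on disk (assumed header: filename,labels).
# With a real file, pass its path to pd.read_csv instead.
csv_text = "filename,labels\nxyz.png,cat\npqz.png,dog\nabc.png,mouse\n"
df = pd.read_csv(io.StringIO(csv_text))

# Build the label -> index mapping from the class list.
mapping = {name: i for i, name in enumerate(data_classes)}
df["labels"] = df["labels"].map(mapping)

# Labels missing from the mapping come back as NaN, so check before saving.
if df["labels"].isna().any():
    raise ValueError("found labels that are not in data_classes")

# Write the converted table; index=False keeps the row index out of the file.
# With a real file, pass an output path to to_csv instead of the buffer.
out = io.StringIO()
df.to_csv(out, index=False)
result = out.getvalue()
```

Failing on unknown labels is a design choice for this sketch; depending on the data, dropping those rows or assigning them a placeholder index may be more appropriate.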

Author by

T T

Updated on June 05, 2022

Comments

  • T T
    T T almost 2 years

    I have a CSV file (delimiter=,) containing the following fields:

    filename labels
    xyz.png  cat
    pqz.png  dog
    abc.png  mouse           
    

    There is a list containing all the classes:

    data_classes = ["cat", "dog", "mouse"]
    

    Question: How do I replace the string labels in the CSV with the index of each label in data_classes (i.e. if label == cat, the label should change to 0) and save the result to a CSV file?