ValueError: could not convert string to float:

Solution 1

Try skipping the header row. An empty header cell in the first column is causing the error, because float() cannot convert an empty or blank string:

>>> float(' ')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float:
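
To confirm which cells trigger the error, one option is to scan the rows and report every value that float() rejects. This is only a minimal sketch: the helper name find_unconvertible_cells and the sample rows are made up for illustration and are not taken from the question's file.

```python
import csv
import io


def find_unconvertible_cells(rows):
    """Return (row_index, column_index, value) for each cell float() rejects."""
    problems = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            try:
                float(value)
            except ValueError:
                problems.append((r, c, value))
    return problems


# Hypothetical sample: a header whose first cell is empty, then one data row.
sample = io.StringIO(",area,perimeter\n0,12.5,14.2\n")
problems = find_unconvertible_cells(csv.reader(sample))
for r, c, value in problems:
    print(f"row {r}, column {c}: {value!r}")
```

If every reported cell is in row 0, the header is the only problem. Anything reported in a later row points to text inside the data itself.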

(1) To skip the header, call next() on the reader before reading the rows:

import csv

def loadDatasetNB(filename):
    lines = csv.reader(open(filename, "rt"))
    next(lines, None)  # <<- skip the header row (None avoids an error on an empty file)
    dataset = list(lines)
    for i in range(len(dataset)):
        # convert every remaining cell to a float
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

(2) Or you can just ignore the exception:

try:
    float(element)
except ValueError:
    pass

If you decide to go with option (2), make sure you skip only the first row, or only rows that you know for certain contain text. Otherwise a genuinely malformed data row will be dropped without any warning.
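
A cautious version of option (2) might keep a record of every row it skips, so that a bad data row does not disappear silently. The following is a sketch under that assumption. The name load_numeric_rows and the sample data are hypothetical.

```python
import csv
import io


def load_numeric_rows(rows):
    """Convert each row to floats and collect the rows that fail to convert."""
    dataset, skipped = [], []
    for index, row in enumerate(rows):
        try:
            dataset.append([float(x) for x in row])
        except ValueError:
            # Remember what was skipped instead of discarding it silently.
            skipped.append((index, row))
    return dataset, skipped


# Hypothetical sample: a header row followed by two numeric rows.
sample = io.StringIO(",area,perimeter\n0,12.5,14.2\n1,9.0,11.3\n")
dataset, skipped = load_numeric_rows(csv.reader(sample))
print(dataset)
print(skipped)
```

If skipped holds anything other than row 0, the file has text in a data row, and ignoring it would hide a real problem.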

Solution 2

Looking at the image of your data, Python cannot convert the last column to float because it holds the text values square and circle. Your data also has a header row that you need to skip.

Try using this code:

import csv

def loadDatasetNB(filename):
    with open(filename, 'r') as fp:
        reader = csv.reader(fp)
        # skip the header line
        header = next(reader)
        # save the features and the labels as different lists
        data_features = []
        data_labels = []
        for row in reader:
            # convert everything except the label to a float
            data_features.append([float(x) for x in row[:-1]])
            # save the labels separately
            data_labels.append(row[-1])
    return data_features, data_labels
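
If the rest of your code expects a single list of all-numeric rows, as the original loadDatasetNB returned, one possible follow-up is to map each text label to a number and append it to its feature row. This is a sketch, not part of the answer above: encode_labels, the numeric codes, and the sample values are assumptions made for illustration.

```python
def encode_labels(data_features, data_labels):
    """Append a numeric code for each label to the end of its feature row."""
    codes = {}  # label -> float code, assigned in order of first appearance
    dataset = []
    for features, label in zip(data_features, data_labels):
        code = codes.setdefault(label, float(len(codes)))
        dataset.append(features + [code])
    return dataset, codes


# Hypothetical values standing in for the output of loadDatasetNB.
features = [[12.5, 14.2], [9.0, 11.3], [12.1, 14.0]]
labels = ["square", "circle", "square"]
dataset, codes = encode_labels(features, labels)
print(dataset)
print(codes)
```

Returning the codes dictionary alongside the dataset lets you translate numeric predictions back into the original label names.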
Author: Thom Elliott

New to coding, learning python for shape recognition using opencv and machine learning.

Updated on March 27, 2020

Comments

  • Thom Elliott, about 4 years ago

    I am following this tutorial to write a Naive Bayes classifier: http://machinelearningmastery.com/naive-bayes-classifier-scratch-python/

    I keep getting this error:

    dataset[i] = [float(x) for x in dataset[i]]
    ValueError: could not convert string to float: 
    

    Here is the part of my code where the error occurs:

    def loadDatasetNB(filename):
        lines = csv.reader(open(filename, "rt"))
        dataset = list(lines)
        for i in range(len(dataset)):
            dataset[i] = [float(x) for x in dataset[i]]
        return dataset
    

    And here is how the file is called:

    def NB_Analysis():
        filename = 'fvectors.csv'
        splitRatio = 0.67
        dataset = loadDatasetNB(filename)
        trainingSet, testSet = splitDatasetNB(dataset, splitRatio)
        print('Split {0} rows into train={1} and test={2} rows').format(len(dataset), len(trainingSet), len(testSet))
        # prepare model
        summaries = summarizeByClassNB(trainingSet)
        # test model
        predictions = getPredictionsNB(summaries, testSet)
        accuracy = getAccuracyNB(testSet, predictionsNB)
        print('Accuracy: {0}%').format(accuracy)
    
    NB_Analysis()
    

    My file fvectors.csv looks like this

    What is going wrong here and how do I fix it?