Invalid literal for float(): 0.000001, how to fix error?

python csv floating-point literals

17,413

Solution 1

From what you've posted, it's not clear whether there is something subtly wrong with the string you're trying to pass to float() (because it looks perfectly reasonable). Try adding a debug print statement:

print(repr(items[2]))
p_value = float(items[2])

Then you can determine exactly what is being passed to float(). The call to repr() will make even normally invisible characters visible. Add the result to your question and we will be able to comment further.

Solution 2

Your file most likely contains an unprintable character that ends up in the string passed to float(). For example:

>>> a = '0.00001\x00'
>>> a
'0.00001\x00'
>>> print(a)
0.00001
>>> float(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): 0.00001

You can see that a contains a NUL character, which is shown neither by print nor by the exception message from float().
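To locate such characters programmatically, something like the following sketch might help. It assumes Python 3, and the helper name find_hidden_chars is hypothetical (it is not part of any library). It treats anything outside printable ASCII as suspicious, which may be too strict for data that legitimately contains non-ASCII text.

```python
def find_hidden_chars(text):
    """Return (index, repr) pairs for characters outside printable ASCII."""
    return [(i, repr(ch)) for i, ch in enumerate(text)
            if not (32 <= ord(ch) < 127)]

a = '0.00001\x00'

# Report where the invisible characters sit in the string.
hidden = find_hidden_chars(a)
print(hidden)

# Drop the offending characters, then convert.
cleaned = ''.join(ch for ch in a if 32 <= ord(ch) < 127)
value = float(cleaned)
print(value)
```

Silently dropping characters can hide a real data problem, so it is usually worth printing what was found before cleaning it away.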

Author by

student001

Updated on June 05, 2022

Comments

student001 almost 2 years
I have a .csv file containing 3 columns of data. I need to create a new output file that includes a specific set of data from the first and third columns of the original file. The third column contains decimal values, and I believe that in such a case I have to use Python's float() function. I have tried the following code:
```
in_file = open("filename.csv", "r")
out_file = open("output.csv", "w")

while True:
    line = in_file.readline()
    if line == '':
        break
    line = line.strip()
    items = line.split(',')
    gi_name = items[0]
    # Skip rows whose first column starts with an underscore.
    if gi_name.startswith("_"):
        continue
    p_value = float(items[2])
    # Keep only rows with a p-value of at most 0.05.
    if p_value > 0.05:
        continue
    out_file.write(','.join([gi_name, str(p_value)]) + '\n')
in_file.close()
out_file.close()
```
When I run the above, I receive the following error:

Error: invalid literal for float(): 0.000001

The value 0.000001 is the first value in the third column of my data set, and I guess the code cannot read beyond that value, but I'm not sure why. I am new to Python and don't really understand why I am getting this error or how to fix it. I have tried other ways of passing the value to float(), but without success. Does anyone know how I might be able to fix this?
student001 about 12 years

Thank you Greg. When I added print(repr(items[2])), it printed the following: '1.10E-06\rGene2' Traceback (most recent call last): File "s6help.py", line 13, in <module> p_value = float(items[2]) So it seems there is a \rGene2 hidden in items[2]. My code calls .strip(), which I thought would remove the \r and \n. I changed it to .strip(\r), but that still did not remove it. I don't know what else to do; do you have any more ideas?
Greg Hewgill about 12 years

Well, that's definitely the problem. Note that .strip() only removes whitespace from the ends of the string, while your \r is in the middle of the string. You're now going to have to look at the CSV file format and the code you use to read the file. It's possible that your file might have only \r line endings, which isn't supported by default in Python. Does that seem likely?
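A small sketch of the difference, assuming Python 3 and using the string from the repr() output above: strip() leaves the embedded \r alone, while splitlines() treats a lone \r as a line boundary.

```python
field = '1.10E-06\rGene2'

# strip() only trims whitespace at the two ends, so nothing changes here.
stripped = field.strip()
print(repr(stripped))

# splitlines() recognises \r, \n and \r\n as line boundaries.
parts = field.splitlines()
print(parts)

# The first piece is now a clean float literal.
p_value = float(parts[0])
print(p_value)
```

This only illustrates why the value failed to parse. Splitting individual fields after the fact is a workaround, and reading the file with the right line-ending handling is the cleaner fix.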
student001 about 12 years

Yes, this is possible, and I believe this is the problem. My line endings contain \r, and any attempt to remove or replace them only results in one long line, which is not what I want. Any suggestion on how to remove the \r but still maintain separate rows?
Greg Hewgill about 12 years

Use \n instead of \r. The \r by itself is not a usual line terminator. Python normally handles both \n and \r\n (but \n is preferred).
student001 about 12 years

Thank you so much! I was able to get the code to work simply by using the 'rU' read argument instead of just 'r', which basically removes the \r issue. Thank you so much, I don't know if I ever would have figured that out on my own!
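For readers on Python 3: the 'U' mode flag is deprecated there and was removed in Python 3.11, but text mode already handles \r, \n and \r\n line endings by default. Below is a rough sketch of the same filtering task using the csv module, assuming Python 3. The function name filter_rows, the temporary file paths, and the sample rows are made up for illustration and are not from the original data.

```python
import csv
import os
import tempfile

def filter_rows(in_path, out_path, threshold=0.05):
    """Copy columns 1 and 3 of rows whose third column is <= threshold."""
    # newline='' lets the csv module handle \r, \n and \r\n itself.
    with open(in_path, newline='') as in_file, \
         open(out_path, 'w', newline='') as out_file:
        writer = csv.writer(out_file, lineterminator='\n')
        for items in csv.reader(in_file):
            # Skip blank rows and rows whose first column starts with "_".
            if not items or items[0].startswith('_'):
                continue
            p_value = float(items[2])
            if p_value > threshold:
                continue
            writer.writerow([items[0], p_value])

# Demo with invented rows that use bare \r line endings.
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, 'filename.csv')
out_path = os.path.join(tmp_dir, 'output.csv')
with open(in_path, 'wb') as f:
    f.write(b'Gene1,x,1.10E-06\rGene2,y,0.2\r_skip,z,0.01\rGene3,w,0.03\r')

filter_rows(in_path, out_path)

with open(out_path) as f:
    result = f.read()
print(result)
```

Using with blocks also closes both files even if float() raises partway through.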