Pandas: cannot safely convert passed user dtype of int32 for float64

python pandas validation numpy dataframe

10,878

The problem was that I was using a single space as the delimiter and every line of the csv ended with a trailing space, so each row ended with an extra empty field. Removing the trailing spaces solved the issue.
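To see why a trailing space matters, here is a minimal sketch in plain Python. It assumes a line modelled on the first row of test.csv with one trailing space added (the trailing space is an assumption, and `split_fields` is a hypothetical helper, not a pandas function). It only approximates what a strict single-space separator does: every space ends a field, so a space at the end of the line produces one more, empty, field than there are column names.

```python
# Hypothetical line modelled on test.csv; the trailing space is assumed.
line = "2270433 3 21322.889 11924.667 5228.753 1.0 -1 "


def split_fields(text, delimiter=" "):
    """Split on every single delimiter character (no whitespace collapsing)."""
    return text.split(delimiter)


raw_fields = split_fields(line)             # trailing space kept
clean_fields = split_fields(line.rstrip())  # trailing space removed

# Compare the field counts with the 7 column names passed to read_table().
print(len(raw_fields), repr(raw_fields[-1]))
print(len(clean_fields))
```

With one field too many per row, the values no longer line up with the seven names, which is consistent with a float column ending up under a name declared as int32.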

To trim the trailing spaces and tabs from every line of every .csv file under the current directory, I ran this command (GNU sed; it edits the files in place):

find . -name "*.csv" | xargs sed -i 's/[ \t]*$//'
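If editing the files in place is not an option, the same cleanup can be done in memory before parsing. The following is a sketch only, not the method used above: `read_trimmed`, `COLUMNS`, `DTYPES` and the `sample` string are hypothetical names introduced for illustration, and the trailing spaces in `sample` are assumed.

```python
import io

import numpy as np
import pandas as pd

COLUMNS = ["A", "B", "C", "D", "E", "F", "G"]
DTYPES = {"A": np.int32, "B": np.int32,
          "C": np.float64, "D": np.float64,
          "E": np.float64, "F": np.float64,
          "G": np.int32}


def read_trimmed(text):
    """Strip trailing whitespace from each line, then parse with sep=' '."""
    cleaned = "\n".join(line.rstrip() for line in text.splitlines())
    return pd.read_csv(io.StringIO(cleaned), sep=" ",
                       comment="#", header=None,
                       skip_blank_lines=True,
                       names=COLUMNS, dtype=DTYPES)


# Rows taken from test.csv; the trailing spaces are an assumption.
sample = (
    "2270433 3 21322.889 11924.667 5228.753 1.0 -1 \n"
    "2270432 3 21322.297 11924.667 5228.605 1.0 2270433 \n"
)

df = read_trimmed(sample)
print(df.dtypes)
```

For a file on disk, the text could be read with `open(csv_path).read()` and passed to `read_trimmed`. Depending on the pandas version, a regex separator such as `sep=r"\s+"` may also cope with trailing whitespace, but that is worth checking against your own data.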


crypdick


Updated on June 04, 2022

Comments

crypdick almost 2 years

I am stumped by a problem loading my data into a Pandas dataframe with read_table(). It raises two errors: TypeError: Cannot cast array from dtype('float64') to dtype('int32') according to the rule 'safe', and ValueError: cannot safely convert passed user dtype of int32 for float64 dtyped data in column 2.

test.py:

import numpy as np
import os
import pandas as pd

# put test.csv in same folder as script
mydir = os.path.dirname(os.path.abspath(__file__))
csv_path = os.path.join(mydir, "test.csv")

df = pd.read_table(csv_path, sep=' ',
                   comment='#',
                   header=None,
                   skip_blank_lines=True,
                   names=["A", "B", "C", "D", "E", "F", "G"],
                   dtype={"A": np.int32,
                          "B": np.int32,
                          "C": np.float64,
                          "D": np.float64,
                          "E": np.float64,
                          "F": np.float64,
                          "G": np.int32})

test.csv:

2270433 3 21322.889 11924.667 5228.753 1.0 -1
2270432 3 21322.297 11924.667 5228.605 1.0 2270433