Pandas: cannot safely convert passed user dtype of int32 for float64

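The problem was that I was using a single space as the delimiter (sep=' ') and the CSV had trailing spaces at the end of each line. Each trailing space is then treated as a delimiter followed by an extra empty field, so the values apparently no longer lined up with the dtypes I had declared for the columns. Removing the trailing spaces solved the issue.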
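To illustrate why a trailing space matters, here is a minimal sketch using plain Python string splitting. It is only an analogy for single-character delimiter parsing, not pandas' actual parser, and the sample row is copied from the test.csv data in the question with one trailing space added as an assumption.

```python
# Sample row with one trailing space (assumed, mirroring the described test.csv).
row = "2270433 3 21322.889 11924.667 5228.753 1.0 -1 "

# Splitting on a single space keeps the empty string after the last delimiter,
# so the row yields one more field than the 7 declared column names.
fields = row.split(" ")
print(len(fields), repr(fields[-1]))

# Stripping trailing whitespace first restores the expected 7 fields.
clean_fields = row.rstrip().split(" ")
print(len(clean_fields))
```

With the extra field present, the number of values per row no longer matches the number of names, which is consistent with the dtype mismatch reported in the error.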

To trim all of the trailing spaces on every line of every file in a directory, I ran this command:

    find . -name "*.csv" | xargs sed -i 's/[ \t]*$//'
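For anyone who would rather do the same cleanup from Python, the following is a rough sketch of an equivalent, not the command I actually ran. The function name strip_trailing_whitespace and its parameters are hypothetical; it works on raw bytes so that file encodings and line endings are left untouched.

```python
import re
from pathlib import Path

# Matches runs of spaces or tabs at the end of each line, like sed's [ \t]*$.
TRAILING_WS = re.compile(rb"[ \t]+$", flags=re.MULTILINE)


def strip_trailing_whitespace(root=".", pattern="*.csv"):
    """Remove trailing spaces and tabs from every matching file under root.

    Returns the list of files that were rewritten.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        data = path.read_bytes()
        cleaned = TRAILING_WS.sub(b"", data)
        if cleaned != data:  # only rewrite files that actually change
            path.write_bytes(cleaned)
            changed.append(path)
    return changed
```

Calling strip_trailing_whitespace(".") from the data directory would rewrite the affected files in place, so it is worth running on a copy first.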

Author: crypdick

Updated on June 04, 2022

Comments

  • crypdick, almost 2 years ago

    I am stumped by a problem loading my data into a pandas DataFrame with read_table(). It raises two errors: "TypeError: Cannot cast array from dtype('float64') to dtype('int32') according to the rule 'safe'", followed by "ValueError: cannot safely convert passed user dtype of int32 for float64 dtyped data in column 2".

    test.py:

    import numpy as np
    import os
    import pandas as pd
    
    # put test.csv in same folder as script
    mydir = os.path.dirname(os.path.abspath(__file__))
    csv_path = os.path.join(mydir, "test.csv")
    
    df = pd.read_table(csv_path, sep=' ',
                       comment='#',
                       header=None,
                       skip_blank_lines=True,
                       names=["A", "B", "C", "D", "E", "F", "G"],
                       dtype={"A": np.int32,
                              "B": np.int32,
                              "C": np.float64,
                              "D": np.float64,
                              "E": np.float64,
                              "F": np.float64,
                              "G": np.int32})
    

    test.csv:

    2270433 3 21322.889 11924.667 5228.753 1.0 -1
    2270432 3 21322.297 11924.667 5228.605 1.0 2270433