Reading a CSV file into a pandas DataFrame as float

python csv pandas parsing

33,828

Solution 1

The original code was correct

df = pd.read_csv(filename,index_col=0)

but the .csv file had been constructed incorrectly.

As @juanpa.arrivillaga pointed out, pandas will infer the dtypes without any arguments, provided all the data in a column is of the same dtype. The columns were being interpreted as strings because although most of the data was numeric, one row contained non-numeric data (actually dates). Removing this row from the .csv solved the problem.
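To track down a stray row like this, one option is to coerce every column to numeric and look for values that fail to convert. The sketch below assumes a small made-up CSV (the names `csv_text`, `coerced` and `bad_rows` are illustrative, not from the original post) in which one row holds dates.

```python
import io
import pandas as pd

# Hypothetical CSV: the row labelled "c" holds dates instead of numbers.
csv_text = """name,x,y
a,1.5,2.5
b,3.0,4.0
c,2020-01-01,2020-01-02
d,5.5,6.5
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Because of the date row, the columns are not inferred as float64.
print(df.dtypes)

# Coerce every column to numeric; values that cannot be parsed become NaN.
coerced = df.apply(pd.to_numeric, errors="coerce")

# A cell that was present originally but is NaN after coercion is non-numeric.
bad_mask = coerced.isna() & df.notna()
bad_rows = df[bad_mask.any(axis=1)]
print(bad_rows)
```

Once the offending rows are known, they can be fixed in the .csv file or dropped before casting.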

Solution 2

Get the list of all column names, remove the first one, and cast the remaining columns to float.

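cols = list(df.columns)  # df.columns is an Index, which has no remove(), so copy it to a list
cols.remove('firstcolumn')  # placeholder: use the name of the first column
for col in cols:
    df[col] = df[col].astype(float)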
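If the file was read with index_col=0, the first column is already the index and no longer appears in df.columns, so the whole frame can probably be cast in one call. A minimal sketch, assuming a made-up frame whose values were read in as strings:

```python
import pandas as pd

# Hypothetical frame: numeric values stored as strings, first column used as the index.
df = pd.DataFrame(
    {"x": ["1.5", "3.0"], "y": ["2.5", "4.0"]},
    index=pd.Index(["a", "b"], name="name"),
)

# Cast every remaining column to float64 in one step.
# This raises ValueError if any value cannot be parsed as a float.
df = df.astype("float64")
print(df.dtypes)
```

The index is left untouched by astype, so its string labels are preserved.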

Author by

doctorer

Updated on July 12, 2020

Comments

doctorer almost 4 years

I have a .csv file with strings in the top row and first column, with the rest of the data as floating point numbers. I want to read it into a dataframe with the first row and column as column names and index respectively, and all the floating values as float64.

If I use df = pd.read_csv(filename,index_col=0) all the numeric values are left as strings.

If I use df = pd.read_csv(filename, index_col=0, dtype=np.float64), I get an exception, ValueError: could not convert string to float, because it attempts to parse the first column as float.

There are a large number of columns, and I do not have the column names, so I don't want to identify each column for parsing as float; I want to parse every column except the first one.
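One way to express "every column except the first" without knowing the names in advance is to read only the header row, then build a dtype mapping from it. This is a sketch under the assumption that the file has a single header row; `csv_text`, `header` and `dtypes` are illustrative names, and io.StringIO stands in for the real filename.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical file contents standing in for the real .csv.
csv_text = "name,x,y\na,1.5,2.5\nb,3.0,4.0\n"

# nrows=0 reads just the header, giving the column names without any data.
header = pd.read_csv(io.StringIO(csv_text), nrows=0).columns

# Ask for float64 on every column except the first, which becomes the index.
dtypes = {col: np.float64 for col in header[1:]}

df = pd.read_csv(io.StringIO(csv_text), index_col=0, dtype=dtypes)
print(df.dtypes)
```

Like dtype=np.float64, this still raises an error if any data column contains a value that cannot be parsed as a float.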