Reading a CSV file into a pandas DataFrame as float
Solution 1
The original code was correct:
df = pd.read_csv(filename, index_col=0)
but the .csv file had been constructed incorrectly.
As @juanpa.arrivillaga pointed out, pandas will infer the dtypes without any arguments, provided all the data in a column is of the same dtype. The columns were being interpreted as strings because, although most of the data was numeric, one row contained non-numeric data (actually dates). Removing this row from the .csv solved the problem.
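To find such a stray row without inspecting the file by hand, one option is to coerce every column to numeric and look for cells that fail to parse. The following is a minimal sketch, not the method used in the original answer; the inline CSV text and the names csv_text, numeric, bad_cells and bad_rows are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical data: the date strings in row r2 stand in for the stray row.
csv_text = """name,a,b
r1,1.5,2.5
r2,2020-07-12,2020-07-13
r3,3.0,4.0
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Coerce every column to numeric; cells that cannot be parsed become NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# A cell is suspicious if it held a value originally but is NaN after coercion.
bad_cells = numeric.isna() & df.notna()
bad_rows = df.index[bad_cells.any(axis=1)].tolist()
print(bad_rows)
```

Once the offending rows are known, they can be fixed in the file or dropped before converting the columns.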
Solution 2
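Get the list of all column names, remove the first one, and cast the other columns to float. This assumes df was read without index_col, so the first column is still among df.columns.
cols = list(df.columns)  # a pandas Index has no .remove(), so copy it to a list
cols.remove('firstcolumn')  # placeholder: the name of the first (string) column
for col in cols:
    df[col] = df[col].astype(float)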
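If the file is read with index_col=0, as in the question, the first column becomes the index and is no longer in df.columns, so the whole frame can probably be cast in one call. This is a sketch under that assumption; the inline CSV text is made up, and dtype=str is used only to reproduce values arriving as strings.

```python
import io

import pandas as pd

# Hypothetical file contents: a header row of strings and a first column of labels.
csv_text = """name,a,b
r1,1.5,2.5
r2,3.0,4.0
"""

# dtype=str forces every value to be read as a string, mimicking the problem.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, dtype=str)

# The label column is now the index, so df.columns holds only data columns
# and no column has to be named or removed before casting.
converted = df.astype(float)
print(converted.dtypes)
```

Note that astype(float) raises a ValueError if any cell is not numeric, which is the failure a stray non-numeric row would trigger.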
doctorer
Updated on July 12, 2020
Comments
doctorer, almost 4 years ago:
I have a .csv file with strings in the top row and first column, with the rest of the data as floating point numbers. I want to read it into a dataframe with the first row and column as column names and index respectively, and all the floating values as float64.
If I use
df = pd.read_csv(filename, index_col=0)
all the numeric values are left as strings.
If I use
df = pd.read_csv(filename, index_col=0, dtype=np.float64)
I get an exception: ValueError: could not convert string to float, as it attempts to parse the first column as float.
There are a large number of columns, and I do not have the column names, so I don't want to identify each column for parsing as float; I want to parse every column except the first one.
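One way to request float64 for every column except the first, without typing the column names, is to read only the header row and build the dtype mapping from it. This is a minimal sketch rather than an answer given in the thread; the inline CSV text and the names header and float_dtypes are assumptions made for illustration.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical file contents standing in for the real .csv file.
csv_text = """name,a,b
r1,1.5,2.5
r2,3.0,4.0
"""

# Read only the header row (nrows=0) to learn the column names.
header = pd.read_csv(io.StringIO(csv_text), nrows=0).columns

# Ask for float64 on every column except the first, which becomes the index.
float_dtypes = {col: np.float64 for col in header[1:]}

df = pd.read_csv(io.StringIO(csv_text), index_col=0, dtype=float_dtypes)
print(df.dtypes)
```

With a real file, the two io.StringIO(csv_text) arguments would be replaced by the filename. The second read still raises a ValueError if a data cell cannot be parsed as a float.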