R function read.csv failing with "scan() expected 'a real', got..." message
read.csv2 also changes the decimal point indicator from . to , (see dec=","). Thus a "real" value in this format would look like 4,216, not 4.216. Better just stick to read.csv(..., sep=";"):
read.csv("dataset.txt", sep=";", stringsAsFactors=FALSE, na.strings='NULL')
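A minimal sketch of the difference, using the single row quoted in the question as inline text. The text= argument and header=FALSE are choices made for this illustration only, since the question does not say whether the file has a header line:

```r
# One row in the question's format: ';' separators, '.' decimal points.
row <- "16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000"
classes <- c(rep("character", 2), rep("numeric", 7))

# read.csv with sep=";" keeps dec=".", so 4.216 is read as a numeric value.
ok <- read.csv(text = row, sep = ";", header = FALSE,
               stringsAsFactors = FALSE, na.strings = "NULL",
               colClasses = classes)

# read.csv2 defaults to dec=",", so the same row is rejected by scan().
# The error is caught here and kept as a string instead of stopping the script.
bad <- tryCatch(
  read.csv2(text = row, header = FALSE, colClasses = classes),
  error = function(e) conditionMessage(e)
)

str(ok)
print(bad)
```

Passing dec="." to read.csv2 should also work, but at that point it does nothing that read.csv(..., sep=";") does not already do.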
Admin
Updated on June 05, 2022
-
Admin almost 2 years
This issue has been raised before and I’ve tried the suggestions given there, but I think my case is of special interest. I’ve used read.table, read.csv, and read.csv2, all to no avail. I chose read.csv2 because the fields/variables are separated with ‘;’, which is the default separator for read.csv2 (although, as you can see, I’ve also set it explicitly as a workaround).
The first row of the dataset is:
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
My read.csv2 is:
foo <- read.csv2("dataset.txt", sep=";", stringsAsFactors=FALSE, na.strings='NULL', colClasses=c(rep("character",2), rep("numeric",7)))
I’m looking to import the date and time values as strings and explicitly coerce them into date and time:
y <- as.Date(foo[,1], "%d/%m/%Y")
x <- strptime(foo[,2], "%H:%M:%S")
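One thing to keep in mind with this coercion: strptime on a time-only string fills in the current date for the missing day, month and year. A small sketch of parsing the two columns together instead, assuming the values look like the first row shown above. The second sample row and the tz="UTC" setting are made up for illustration and are not taken from the dataset:

```r
# Sample values in the question's format (the second pair is hypothetical).
dates <- c("16/12/2006", "17/12/2006")
times <- c("17:24:00", "00:05:00")

# Date only, as in the question.
d <- as.Date(dates, format = "%d/%m/%Y")

# Date and time parsed together, so the time keeps its real calendar date.
stamp <- as.POSIXct(paste(dates, times),
                    format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

print(d)
print(stamp)
```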
My problem is that I cannot get past the read.csv2. The error is:
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : scan() expected 'a real', got '4.216'
Here’s what’s cool. Note the message says “expected 'a real', got '4.216'”. Folks, 4.216 is a real. And note 4.216 is indeed the third value of the row. I’ve also tried:
foo <- read.csv2("dataset.txt", sep=";", stringsAsFactors=FALSE, na.strings='NULL', colClasses=c("character", "character", rep("numeric",7)))
My version of R is R 3.4.1 GUI 1.70 El Capitan build
Anyone have any ideas of what the problem is? Or is this just flat out a bug?
-
joran over 6 years
Personally, I hate all the .csv and .delim "helpers" and just always use read.table so that I know exactly what I'm asking for.
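For what it's worth, a rough sketch of what that could look like for the row in the question, with every relevant setting spelled out. The inline text= input and header=FALSE are assumptions made for this illustration; for the real file you would pass "dataset.txt" and whatever header setting matches it:

```r
# The row quoted in the question, supplied inline instead of from a file.
row <- "16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000"

# Separator and decimal mark are stated explicitly, so nothing depends on
# the defaults of the read.csv / read.csv2 wrappers.
foo <- read.table(text = row,
                  sep = ";", dec = ".", header = FALSE,
                  na.strings = "NULL", stringsAsFactors = FALSE,
                  colClasses = c(rep("character", 2), rep("numeric", 7)))

str(foo)
```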