R function read.csv failing with "scan() expected 'a real', got..." message

10,545

read.csv2 also changes the decimal point indicator from . to , (see dec=","). Thus a "real" value in this format would look like 4,216, not 4.216. It is better to just stick with read.csv(..., sep=";"):

read.csv("dataset.txt", sep=";", stringsAsFactors=FALSE, na.strings='NULL')
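
To see the difference concretely, here is a minimal sketch that inlines the sample row from the question through the `text` argument instead of reading `dataset.txt`. The second data row, the `header = FALSE` setting, and the helper names `txt`, `classes`, and `msg` are assumptions made for illustration. They are not taken from the original dataset.

```r
# Sample row from the question; the second row is made up for illustration.
txt <- "16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000"

classes <- c(rep("character", 2), rep("numeric", 7))

# read.csv2 defaults to dec = ",", so "4.216" is rejected as a numeric.
# tryCatch captures the error message instead of stopping the script.
msg <- tryCatch(
  {
    read.csv2(text = txt, header = FALSE, colClasses = classes)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# read.csv defaults to dec = ".", so only the separator is overridden.
foo <- read.csv(text = txt, sep = ";", header = FALSE,
                stringsAsFactors = FALSE, na.strings = "NULL",
                colClasses = classes)
str(foo)
```

Keeping read.csv2 and passing dec="." explicitly should work as well, since both functions are thin wrappers around read.table that differ only in their defaults.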
Author by

Admin

Updated on June 05, 2022

Comments

  • Admin
    Admin almost 2 years

    This issue has been raised before and I’ve tried the suggestions given, but I think my case is of special interest. I’ve tried read.table, read.csv, and read.csv2, all to no avail. I chose read.csv2 because the fields/variables are separated with ‘;’, which is the default separator for read.csv2 (although, as you can see, I’ve also set it explicitly as a workaround).

    The first row of the dataset is:

    16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
    

    My read.csv2 is:

    foo <- read.csv2("dataset.txt",sep=";",stringsAsFactors=FALSE,na.strings='NULL',colClasses=c(rep("character",2),rep("numeric",7)))
    

    I’m looking to import the date and time values as strings and explicitly coerce them into date and time:

    y <- as.Date(foo[,1],"%d/%m/%Y")
    x <- strptime(foo[,2],"%H:%M:%S")
    

    My problem is that I cannot get past the read.csv2. The error is:

    Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
      scan() expected 'a real', got '4.216'
    

    Here’s what’s cool. Note the message says “expected 'a real', got '4.216'”. Folks, 4.216 is a real. And note that 4.216 is indeed the third value of the row. I’ve also tried:

    foo <- read.csv2("dataset.txt",sep=";",stringsAsFactors=FALSE,na.strings='NULL',colClasses=c("character","character",rep("numeric",7)))
    

    My version of R is R 3.4.1 GUI 1.70 El Capitan build

    Anyone have any ideas of what the problem is? Or is this just flat out a bug?

  • joran
    joran over 6 years
    Personally, I hate all the .csv and .delim "helpers" and just always use read.table so that I know exactly what I'm asking for.
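
In the spirit of that comment, the same import can be sketched with read.table and every relevant argument spelled out, followed by the date and time coercion the question is aiming for. This is only a sketch under assumptions: the data is inlined via `text` instead of read from `dataset.txt`, the second row is made up, the file is assumed to have no header row, and `tz = "UTC"` and the `datetime` column name are hypothetical choices.

```r
# Sample row from the question; the second row is made up for illustration.
txt <- "16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000"

# Spell out separator and decimal mark so no wrapper default can interfere.
foo <- read.table(text = txt, sep = ";", dec = ".", header = FALSE,
                  stringsAsFactors = FALSE, na.strings = "NULL",
                  colClasses = c(rep("character", 2), rep("numeric", 7)))

# Dates on their own convert cleanly with as.Date.
y <- as.Date(foo[, 1], format = "%d/%m/%Y")

# strptime() on a time-only string fills in the current date, so pasting
# date and time together first gives one unambiguous timestamp.
# tz = "UTC" is an assumption; use the zone the data was recorded in.
foo$datetime <- as.POSIXct(paste(foo[, 1], foo[, 2]),
                           format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

str(foo)
```

Combining the two columns before parsing avoids carrying around time values that silently inherit the date on which the script happens to run.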