Specify custom Date format for colClasses argument in read.table/read.csv
Solution 1
You can write your own function that accepts a string and converts it to a Date using the format you want, then use the setAs
to set it as an as
method. Then you can use your function as part of the colClasses.
Try:
setAs("character","myDate", function(from) as.Date(from, format="%d/%m/%Y") )
tmp <- c("1, 15/08/2008", "2, 23/05/2010")
con <- textConnection(tmp)
tmp2 <- read.csv(con, colClasses=c('numeric','myDate'), header=FALSE)
str(tmp2)
Then modify if needed to work for your data.
Edit ---
You might want to run setClass('myDate')
first to avoid the warning (you can ignore the warning, but it can get annoying if you do this a lot and this is a simple call that gets rid of it).
Solution 2
If there is only 1 date format you want to change, you could use the Defaults
package to change the default format within as.Date.character
library(Defaults)
setDefaults('as.Date.character', format = '%d/%M/%Y')
dataImport <- read.csv("data.csv", colClasses = c("factor","factor","Date"))
str(dataImport)
## 'data.frame': 10 obs. of 3 variables:
## $ func_loc : Factor w/ 5 levels "3076WAG0003",..: 1 2 3 3 3 3 3 4 4 5
## $ order_type : Factor w/ 3 levels "PM01","PM02",..: 3 3 1 1 1 1 2 2 3 1
## $ actual_finish: Date, format: "2008-10-15" "2009-10-19" NA "2008-10-11" ...
I think @Greg Snow's answer is far better, as it does not change the default behaviour of an often used function.
Solution 3
In case you need time also:
setClass('yyyymmdd-hhmmss')
setAs("character","yyyymmdd-hhmmss", function(from) as.POSIXct(from, format="%Y%m%d-%H%M%S"))
d <- read.table(colClasses="yyyymmdd-hhmmss", text="20150711-130153")
str(d)
## 'data.frame': 1 obs. of 1 variable:
## $ V1: POSIXct, format: "2015-07-11 13:01:53"
Solution 4
A long time ago, in the meantime the problem has been solved by Hadley Wickham. So nowadays the solution is reduced to a oneliner:
library(readr)
data <- read_csv("data.csv",
col_types = cols(actual_finish = col_datetime(format = "%d/%m/%Y")))
Maybe we want even to get rid of unnecessary stuff:
data <- as.data.frame(data)
Related videos on Youtube
Tommy O'Dell
Updated on May 28, 2020Comments
-
Tommy O'Dell almost 4 years
Question:
Is there a way to specify the Date format when using the colClasses argument in read.table/read.csv?
(I realise I can convert after importing, but with many date columns like this, it would be easier to do it in the import step)
Example:
I have a .csv with date columns in the format
%d/%m/%Y
.dataImport <- read.csv("data.csv", colClasses = c("factor","factor","Date"))
This gets the conversion wrong. For example,
15/07/2008
becomes0015-07-20
.
Reproducible code:
data <- structure(list(func_loc = structure(c(1L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 5L), .Label = c("3076WAG0003", "3076WAG0004", "3076WAG0007", "3076WAG0009", "3076WAG0010"), class = "factor"), order_type = structure(c(3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 1L), .Label = c("PM01", "PM02", "PM03"), class = "factor"), actual_finish = structure(c(4L, 6L, 1L, 2L, 3L, 7L, 1L, 8L, 1L, 5L), .Label = c("", "11/03/2008", "14/08/2008", "15/07/2008", "17/03/2008", "19/01/2009", "22/09/2008", "6/09/2007"), class = "factor")), .Names = c("func_loc", "order_type", "actual_finish"), row.names = c(NA, 10L), class = "data.frame") write.csv(data,"data.csv", row.names = F) dataImport <- read.csv("data.csv") str(dataImport) dataImport dataImport <- read.csv("data.csv", colClasses = c("factor","factor","Date")) str(dataImport) dataImport
And here's what the output looks like:
-
joran over 11 yearsA hackish way to do this would be to create your own version of
read.table
and add aformat
argument that is passed on toas.Date
. I wouldn't be surprised if there's a better way that I'm not thinking of, though.
-
-
Tommy O'Dell over 11 yearsThanks Greg - I dind't know about the
as
orsetAs
functions. Works great. -
user295691 over 11 yearsWow -- setAs is a lifesaver! How have I never seen this function before?
-
Danny D'Amours over 11 yearsNote that you might get a 'no definition for class "myDate"' warning as detailed in this question.
-
Josh O'Brien over 10 yearsWhat is
setMethod('myDate')
supposed to do? Running it just gives me an error... -
Greg Snow over 10 years@JoshO'Brien, sorry that should have been
setClass
(fixed now). What it does is preventsetAs
from issuing a warning about 'myDate' not existing as a class. The warning is harmless and everything still works, but setting the class means that you don't even see the warning. -
MySchizoBuddy about 10 yearsIs it better to just save it as character and then do the conversion of the column to date, or do it as mentioned above?
-
Greg Snow about 10 years@MySchizoBuddy, If you only have one date column and you are doing this one time then it probably does not matter which way you do it. But if you have several columns in your dataset that are dates, then I think this approach would probably be simpler than changing each of the columns after reading.
-
jks612 over 5 yearsFor those like me who came this thread hoping to implement it within
data.table::fread
,fread
doesn't implement date types in it's conversions so this will still not work at time of this writing infread
.