Convert factor to date object R without NA

The problem is that your format string says the dates include the year with century, whereas your dates contain only the two-digit year without century. You need to use the %y placeholder, not %Y.

dates <- factor(c("2/27/14","2/28/14","2/27/14","2/27/14","2/27/14"))
as.Date(dates, format = "%m/%d/%y") # correct lowercase y
as.Date(dates, format = "%m/%d/%Y") # incorrect uppercase y

> as.Date(dates, format = "%m/%d/%y")
[1] "2014-02-27" "2014-02-28" "2014-02-27" "2014-02-27" "2014-02-27"
> as.Date(dates, format = "%m/%d/%Y")
[1] "14-02-27" "14-02-28" "14-02-27" "14-02-27" "14-02-27"

Notice that R gets it right when you use the correct placeholder: lowercase y.

What happens with %Y when the input lacks a year with century seems to be OS-dependent. As you can see, on Linux (Fedora 22) I get no padding of the year part, whereas you are seeing zero-padding.
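
Because %Y does not fail loudly on two-digit years, it can help to check the parsed result instead of eyeballing it. The following is a minimal sketch using the toy factor from above; the names `parsed` and `century_ok` are illustrative, and the date window reflects R's documented rule for %y on input (00 to 68 become 20xx, 69 to 99 become 19xx).

```r
# Toy data copied from the example above; `parsed` and `century_ok`
# are illustrative names, not part of the original question.
dates <- factor(c("2/27/14", "2/28/14", "2/27/14", "2/27/14", "2/27/14"))

# %y reads a two-digit year; as.Date() accepts a factor directly.
parsed <- as.Date(dates, format = "%m/%d/%y")

# Fail early if any element could not be parsed.
stopifnot(!anyNA(parsed))

# With %y, every year must land between 1969 and 2068.
century_ok <- all(parsed >= as.Date("1969-01-01") &
                  parsed <= as.Date("2068-12-31"))
stopifnot(century_ok)

# Show the four-digit years that were actually stored.
format(parsed, "%Y")
```

If the second check fails on real data, the format string probably uses %Y where %y was intended.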

Author by

Scott Davis

Updated on July 09, 2022

Comments

  • Scott Davis

    Question: How can I convert a factor to a date object without getting NA values?

    Here's a similar post: Convert Factor to Date/Time in R

    In that post, the user converted the factor to a character object before converting it to a date. I get NA values when I convert to a character object using as.character inside the as.Date function.

    I have a column in the data frame that stores dates as a factor, with each date occurring a different number of times. Here is the information contained in the data frame.

    > head(fraud, 5)
      TRANSACTION.DATE TRANSACTION.AMOUNT AIR.TRAVEL.DATE POSTING.DATE
    1 2/27/14                  25.00                 <NA>          2/28/14
    2 2/28/14                  25.00                 <NA>          2/28/14
    3 2/27/14                  25.00                 <NA>          2/28/14
    4 2/27/14                  20.00              2/27/14          2/28/14
    5 2/27/14                  12.13                 <NA>          2/28/14
    
    > str(fraud$TRANSACTION.DATE)
     Factor w/ 519 levels "1/1/14","1/1/15",..: 228 230 228 228 228 230 226 228 230 228 ...
    
    > summary(fraud$TRANSACTION.DATE, 5)
    9/30/14 9/17/14 11/4/14 9/23/14 (Other) 
        197     187     171     160   19221 
    

    Converting the factor to a date object resulted in NA values.

    > fraud$TRANSACTION.DATE <- as.Date(as.character(fraud$TRANSACTION.DATE), 
    +                                       format = "%m/%d/%Y")
    > head(fraud$TRANSACTION.DATE, 5)
    [1] NA NA NA NA NA
    

    Checking if the as.character function worked.

    > fraud$TRANSACTION.DATE <- as.character(fraud$TRANSACTION.DATE)
    > head(fraud$TRANSACTION.DATE)
    [1] NA NA NA NA NA NA
    

    EDIT: I used the as.Date function on the factor directly but got the wrong formatting.

    > fraud$TRANSACTION.DATE <- as.Date(fraud$TRANSACTION.DATE, format = "%m/%d/%Y")
    > str(fraud$TRANSACTION.DATE)
     Date[1:19936], format: "0014-02-27" "0014-02-28" "0014-02-27" "0014-02-27" "0014-02-27" ...
    > head(fraud$TRANSACTION.DATE, 5)
    [1] "0014-02-27" "0014-02-28" "0014-02-27" "0014-02-27" "0014-02-27"
    

    EDIT 2: Here's the dput value

    > dput(droplevels(head(fraud$TRANSACTION.DATE)))
    structure(c(1L, 2L, 1L, 1L, 1L, 2L), .Label = c("2/27/14", "2/28/14"
    ), class = "factor")
    

    Solution: using %y instead of %Y

    > fraud$TRANSACTION.DATE <- as.Date(fraud$TRANSACTION.DATE, "%m/%d/%y")
    > head(fraud$TRANSACTION.DATE, 5)
    [1] "2014-02-27" "2014-02-28" "2014-02-27" "2014-02-27" "2014-02-27"
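
The same %y format can be applied to every date column at once. This is only a sketch under assumptions: `fraud_demo` is a hypothetical toy data frame rebuilt from the head(fraud, 5) output above, and `date_cols`, `converted`, and `new_nas` are illustrative names. It writes into a copy rather than overwriting the original columns, so a conversion with the wrong format string can be retried from the untouched factors.

```r
# Hypothetical toy data frame mirroring the head(fraud, 5) output above.
fraud_demo <- data.frame(
  TRANSACTION.DATE   = c("2/27/14", "2/28/14", "2/27/14", "2/27/14", "2/27/14"),
  TRANSACTION.AMOUNT = c(25.00, 25.00, 25.00, 20.00, 12.13),
  AIR.TRAVEL.DATE    = c(NA, NA, NA, "2/27/14", NA),
  POSTING.DATE       = c("2/28/14", "2/28/14", "2/28/14", "2/28/14", "2/28/14"),
  stringsAsFactors   = TRUE  # store the date columns as factors, as in the question
)

# Columns to convert (illustrative name).
date_cols <- c("TRANSACTION.DATE", "AIR.TRAVEL.DATE", "POSTING.DATE")

# Convert into a copy so the original factor columns stay available.
converted <- fraud_demo
converted[date_cols] <- lapply(fraud_demo[date_cols], as.Date,
                               format = "%m/%d/%y")

# Parsing should not create new NAs: compare NA counts before and after.
new_nas <- sapply(date_cols, function(col) {
  sum(is.na(converted[[col]])) - sum(is.na(fraud_demo[[col]]))
})
stopifnot(all(new_nas == 0))

str(converted)
```

Values that were already missing, such as most of AIR.TRAVEL.DATE, stay NA after conversion; the check only flags values that were present but failed to parse.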