R: How to get the maximum value of a datetime column in a time series data
Solution 1
Here's a one-liner with base R (it ranks by InsertDate; substitute EditDate to rank by that column instead):
df1[which.max(as.POSIXct(df1$InsertDate)), ]
# EditDate ID Avg Sig InsertDate FW
# 3 2015-04-07 11:40:13 DL1X8 38.1517 11.4588 2015-04-10 9:40:00 40
Or with data.table
library(data.table)
setDT(df1)[which.max(as.POSIXct(InsertDate))]
# EditDate ID Avg Sig InsertDate FW
# 1: 2015-04-07 11:40:13 DL1X8 38.1517 11.4588 2015-04-10 9:40:00 40
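One caveat with which.max() is that it returns only the first position of the maximum, so tied timestamps are dropped. The sketch below rebuilds the question's sample data and shows both variants on EditDate. The UTC time zone and the names edit_time, latest_first and latest_all are assumptions chosen for illustration.

```r
# Rebuild the question's sample data (third InsertDate value quoted).
df1 <- data.frame(
  EditDate   = c("2015-04-01 11:40:13", "2015-04-03 02:54:45", "2015-04-07 11:40:13"),
  ID         = c("DL1X8", "DL1X8", "DL1X8"),
  Avg        = c(38.1517, 38.1517, 38.1517),
  Sig        = c(11.4588, 11.4588, 11.4588),
  InsertDate = c("2015-04-03 9:40:00", "2015-04-03 9:40:00", "2015-04-10 9:40:00"),
  FW         = c("39", "39", "40"),
  stringsAsFactors = FALSE
)

# Parse the strings once; the UTC time zone is an assumption.
edit_time <- as.POSIXct(df1$EditDate, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# which.max() keeps only the first row that holds the maximum.
latest_first <- df1[which.max(edit_time), ]

# Comparing against max() keeps every row tied for the latest timestamp.
latest_all <- df1[edit_time == max(edit_time), ]
```

With the sample data both results contain the same single row, since the latest EditDate is unique.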
Solution 2
Using just lubridate:
library(lubridate)
df1[ymd_hms(df1$EditDate) == max(ymd_hms(df1$EditDate)), ]
# or, comparing against the original character values:
df1[df1$EditDate == as.character(max(ymd_hms(df1$EditDate))), ]
Solution 3
Use the data.table and lubridate libraries as follows:
library(data.table)
library(lubridate)
setDT(df1)
df1[,EditDate := ymd_hms(EditDate)]
res <- df1[EditDate == max(EditDate)]
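The question also asks about dplyr, which none of the solutions cover. A minimal sketch, assuming dplyr (version 1.0.0 or later for slice_max()) is installed; the EditTime column, the UTC time zone, and the names latest and latest_alt are assumptions for illustration, and only two of the question's columns are rebuilt to keep it short.

```r
library(dplyr)

# Trimmed copy of the question's sample data.
df1 <- data.frame(
  EditDate = c("2015-04-01 11:40:13", "2015-04-03 02:54:45", "2015-04-07 11:40:13"),
  FW       = c("39", "39", "40"),
  stringsAsFactors = FALSE
)

# Parse the timestamps into a helper column, then keep the latest row(s).
latest <- df1 %>%
  mutate(EditTime = as.POSIXct(EditDate, tz = "UTC")) %>%
  filter(EditTime == max(EditTime))

# Alternative: slice_max() also keeps ties by default (with_ties = TRUE).
latest_alt <- df1 %>%
  mutate(EditTime = as.POSIXct(EditDate, tz = "UTC")) %>%
  slice_max(EditTime, n = 1)
```

Parsing into a real date-time column avoids relying on string comparison, which only orders correctly when every timestamp is zero-padded.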
Author: Sharath
Updated on September 16, 2022
Question
I am working with time series data that has two date-time columns and one fiscal week column. In the example below, I need to get the row with the maximum EditDate.
EditDate <- c("2015-04-01 11:40:13", "2015-04-03 02:54:45", "2015-04-07 11:40:13")
ID <- c("DL1X8", "DL1X8", "DL1X8")
Avg <- c(38.1517, 38.1517, 38.1517)
Sig <- c(11.45880000, 11.45880000, 11.45880000)
InsertDate <- c("2015-04-03 9:40:00", "2015-04-03 9:40:00", "2015-04-10 9:40:00")
FW <- c("39", "39", "40")
df1 <- data.frame(EditDate, ID, Avg, Sig, InsertDate, FW)
This returns
+---------------------+-------+---------+-------------+--------------------+----+
| EditDate            | ID    | Avg     | Sig         | InsertDate         | FW |
+---------------------+-------+---------+-------------+--------------------+----+
| 2015-04-01 11:40:13 | DL1X8 | 38.1517 | 11.45880000 | 2015-04-03 9:40:00 | 39 |
| 2015-04-03 02:54:45 | DL1X8 | 38.1517 | 11.45880000 | 2015-04-03 9:40:00 | 39 |
| 2015-04-07 11:40:13 | DL1X8 | 38.1517 | 11.45880000 | 2015-04-10 9:40:00 | 40 |
+---------------------+-------+---------+-------------+--------------------+----+
The desired output is
+---------------------+-------+---------+-------------+--------------------+----+
| EditDate            | ID    | Avg     | Sig         | InsertDate         | FW |
+---------------------+-------+---------+-------------+--------------------+----+
| 2015-04-07 11:40:13 | DL1X8 | 38.1517 | 11.45880000 | 2015-04-10 9:40:00 | 40 |
+---------------------+-------+---------+-------------+--------------------+----+
I tried sqldf with library(RH2), but it takes a long time to run.
df2 <- sqldf("SELECT * FROM df1 WHERE (EditDate = (SELECT MAX(EditDate) FROM df1)) ORDER BY EditDate ASC")
I am not sure whether this can be done with the dplyr package. Could someone suggest how to optimize this using dplyr or another alternative?