How to convert dataframe into time series?

147,823

Solution 1

R has multiple ways of represeting time series. Since you're working with daily prices of stocks, you may wish to consider that financial markets are closed on weekends and business holidays so that trading days and calendar days are not the same. However, you may need to work with your times series in terms of both trading days and calendar days. For example, daily returns are calculated from sequential daily closing prices regardless of whether a weekend intervenes. But you may also want to do calendar-based reporting such as weekly price summaries. For these reasons the xts package, an extension of zoo, is commonly used with financial data in R. An example of how it could be used with your data follows.

Assuming the data shown in your example is in the dataframe df

  library(xts)
  stocks <- xts(df[,-1], order.by=as.Date(df[,1], "%m/%d/%Y"))
#
#  daily returns
#
   returns <- diff(stocks, arithmetic=FALSE ) - 1
#
#  weekly open, high, low, close reports
#
   to.weekly(stocks$Hero_close, name="Hero")

which gives the output

           Hero.Open Hero.High Hero.Low Hero.Close
2013-03-15    1669.1   1684.45   1669.1    1684.45
2013-03-22    1690.5   1690.50   1623.3    1659.60
2013-03-28    1617.7   1617.70   1542.0    1542.00

Solution 2

Input. We will start with the text of the input shown in the question since the question did not provide the csv input:

Lines <- "Dates   Bajaj_close Hero_close
3/14/2013   1854.8  1669.1
3/15/2013   1850.3  1684.45
3/18/2013   1812.1  1690.5
3/19/2013   1835.9  1645.6
3/20/2013   1840    1651.15
3/21/2013   1755.3  1623.3
3/22/2013   1820.65 1659.6
3/25/2013   1802.5  1617.7
3/26/2013   1801.25 1571.85
3/28/2013   1799.55 1542"

zoo. "ts" class series normally do not represent date indexes but we can create a zoo series that does (see zoo package):

library(zoo)
z <- read.zoo(text = Lines, header = TRUE, format = "%m/%d/%Y")

Alternately, if you have already read this into a data frame DF then it could be converted to zoo as shown on the second line below:

DF <- read.table(text = Lines, header = TRUE)
z <- read.zoo(DF, format = "%m/%d/%Y")

In either case above z ia a zoo series with a "Date" class time index. One could also create the zoo series, zz, which uses 1, 2, 3, ... as the time index:

zz <- z
time(zz) <- seq_along(time(zz))

ts. Either of these could be converted to a "ts" class series:

as.ts(z)
as.ts(zz)

The first has a time index which is the number of days since the Epoch (January 1, 1970) and will have NAs for missing days and the second will have 1, 2, 3, ... as the time index and no NAs.

Monthly series. Typically "ts" series are used for monthly, quarterly or yearly series. Thus if we were to aggregate the input into months we could reasonably represent it as a "ts" series:

z.m <- as.zooreg(aggregate(z, as.yearmon, mean), freq = 12)
as.ts(z.m)

Solution 3

Late to the party, but the tsbox package is designed to perform conversions like this. To convert your data into a ts-object, you can do:

dta <- data.frame(
  Dates = c("3/14/2013", "3/15/2013", "3/18/2013", "3/19/2013"),
  Bajaj_close = c(1854.8, 1850.3, 1812.1, 1835.9),
  Hero_close = c(1669.1, 1684.45, 1690.5, 1645.6)
)

dta
#>       Dates Bajaj_close Hero_close
#> 1 3/14/2013      1854.8    1669.10
#> 2 3/15/2013      1850.3    1684.45
#> 3 3/18/2013      1812.1    1690.50
#> 4 3/19/2013      1835.9    1645.60

library(tsbox)
ts_ts(ts_long(dta))
#> Time Series:
#> Start = 2013.1971293045 
#> End = 2013.21081883954 
#> Frequency = 365.2425 
#>          Bajaj_close Hero_close
#> 2013.197      1854.8    1669.10
#> 2013.200      1850.3    1684.45
#> 2013.203          NA         NA
#> 2013.205          NA         NA
#> 2013.208      1812.1    1690.50
#> 2013.211      1835.9    1645.60

It automatically parses the dates, detects the frequency and makes the missing values at the weekends explicit. With ts_<class>, you can convert the data to any other time series class.

Solution 4

With library fpp, you can easily create time series with date format: time_ser=ts(data,frequency=4,start=c(1954,2))

here we start at the 2nd quarter of 1954 with quarter fequency.

Solution 5

See this question: Converting data.frame to xts order.by requires an appropriate time-based object, which suggests looking at argument to order.by,

Currently acceptable classes include: ‘Date’, ‘POSIXct’, ‘timeDate’, as well as ‘yearmon’ and ‘yearqtr’ where the index values remain unique.

And further suggests an explicit conversion using order.by = as.POSIXct,

df$Date <- as.POSIXct(strptime(df$Date,format),tz="UTC")
xts(df[, -1], order.by=as.POSIXct(df$Date))

Where your format is assigned elswhere,

format <- "%m/%d/%Y" #see strptime for details
Share:
147,823
Neil
Author by

Neil

Data is the new Fuel.. By training,I am an Engineer in Computer Science and MBA in Information Systems. I code in Python as well as in R. Primarily as a data scientist,I work on end to end data platforms from gathering requirements till building and deploying machine learning models. I work on various statistical methodologies to generate insights from data which involves looking data from different angles,lots of data munging and data manipulation.

Updated on March 04, 2020

Comments

  • Neil
    Neil about 4 years

    I have one csv file in which I have 2 closing prices of stock(on daily basis)

    Dates   Bajaj_close Hero_close
    3/14/2013   1854.8  1669.1
    3/15/2013   1850.3  1684.45
    3/18/2013   1812.1  1690.5
    3/19/2013   1835.9  1645.6
    3/20/2013   1840    1651.15
    3/21/2013   1755.3  1623.3
    3/22/2013   1820.65 1659.6
    3/25/2013   1802.5  1617.7
    3/26/2013   1801.25 1571.85
    3/28/2013   1799.55 1542
    

    I want to convert above data into time series format. (start date is 3/14/2013 and end date is 3/13/2015) I have tried this but its giving me some weird output

    values <- bajaj_hero[, -1]  (excluded first column i.e date in real dataset)
    bajaj_hero_timeseries <- ts(values,start=c(2013,1),end=c(2015,3),frequency=365)
    

    Output is:

               Bajaj_close Hero_close
    2013.000     1854.80    1669.10
    2013.003     1850.30    1684.45
    2013.005     1812.10    1690.50
    2013.008     1835.90    1645.60
    2013.011     1840.00    1651.15
    2013.014     1755.30    1623.30
    2013.016     1820.65    1659.60
    2013.019     1802.50    1617.70
    2013.022     1801.25    1571.85
    
  • Neil
    Neil about 9 years
    @ WaltS I didnt get to.weekly(stocks$Hero_close, name="Hero")
  • WaltS
    WaltS about 9 years
    You'll need to use the statement library(xts) for that to work.
  • BMLopes
    BMLopes over 2 years
    If the object df is a tibble (it problably is, if you are using readr to import data), you should first convert to a vanilla dataframe, using the function "as,data.frame"
  • federico
    federico over 2 years
    @WaltS Do you know how this process is called in statistics?