R X-axis Date Labels using plot()


With plots, it's very hard to reproduce results without sample data. Here's a sample I'll use:

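set.seed(1)  # fix the random draws so the plot is reproducible
dd <- data.frame(
  saldt = seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by = "6 mon"),  # 31 half-yearly dates
  salpr = cumsum(rnorm(31))                                                 # random-walk "prices"
)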

A simple plot with

with(dd, plot(saldt, salpr))

produces a few year marks


If I wanted more control, I could use axis.Date, as you alluded to:

with(dd, plot(saldt, salpr, xaxt="n"))
axis.Date(1, at=seq(min(dd$saldt), max(dd$saldt), by="30 mon"), format="%m-%Y")

which gives a tick every 30 months, labelled in month-year form from 01-1999 through 01-2014.
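Since the question asks for annual labels, the same approach can be adapted. The following is a minimal sketch, not a definitive recipe: the `yrs` vector, the choice of 1 January as the tick position, and the seed are my own illustrative assumptions.

```r
# Same sample data as in the answer, seeded so the example is reproducible.
set.seed(1)
dd <- data.frame(
  saldt = seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by = "6 mon"),
  salpr = cumsum(rnorm(31))
)

# One tick position per year, on 1 January, spanning the data.
yrs <- seq(as.Date("1999-01-01"), as.Date("2014-01-01"), by = "year")

# Suppress the default x-axis, then draw one with year-only labels.
with(dd, plot(saldt, salpr, xaxt = "n"))
axis.Date(1, at = yrs, format = "%Y", cex.axis = 0.75)
```

If the labels would overlap at the current plot width, R may skip some of them; a smaller cex.axis or a wider device usually helps.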

Note that xlim only zooms in on part of the plot. It is not directly connected to the axis labels, but the labels adjust to provide a "pretty" range covering the data that is plotted. Passing just

xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01"))

is the correct way to zoom the plot. No need for conversion to numeric or POSIXct.
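To illustrate why a Date vector is enough: a Date is stored as a count of days since 1970-01-01, and that is the scale a Date x-axis uses. The sketch below reuses the sample data; the name `lims` and the seed are illustrative assumptions.

```r
set.seed(1)
dd <- data.frame(
  saldt = seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by = "6 mon"),
  salpr = cumsum(rnorm(31))
)

# Date limits, passed to xlim as they are.
lims <- as.Date(c("1999-01-01", "2014-01-01"))
as.numeric(lims)   # 10592 16071: days since 1970-01-01

# A bare character string is not a date, so coercing it to numeric yields NA.
suppressWarnings(as.numeric("1999-01-01"))   # NA

with(dd, plot(saldt, salpr, xlim = lims))
```

POSIXct values count seconds rather than days, so they are on a different scale from a Date x variable and are not a drop-in replacement here.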

Author by

jackw19

Updated on December 29, 2020

Comments

  • jackw19
    jackw19 over 3 years

    Using the plot() function in R, I'm trying to produce a scatterplot of points of the form (SaleDate, SalePrice) = (saldt, salpr) from a time-series, cross-section real estate sales dataset in dataframe format. My problem concerns the labels for the X-axis. Just about any series of annual labels would be adequate, e.g. 1999, 2000, ..., 2013 or 1999-01-01, ..., 2013-01-01. What I'm getting now, a single label, 2000, at what appears to be the proper location, won't work.

    The following is my call to plot():

    plot(r12rgr0$saldt, r12rgr0$salpr/1000, type="p", pch=20, col="blue", cex.axis=.75, 
         xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01")),
         ylim=c(100,650), 
         main="Heritage Square Sales Prices $000s 1990-2014",xlab="Sale Date",ylab="$000s")
    

    The xlim and ylim are called out to bound the date and price ranges of the data to be plotted; note prices are plotted as $000s. r12rgr0$saldt really is a date; str(r12rgr0$saldt) returns:

    Date[1:4190], format: "1999-10-26" "2013-07-06" "2003-08-25" NA NA "2000-05-24"  xx 
    

    I have reviewed several threads here concerning similar questions, and see that the solution probably lies in turning off the default X-axis behavior and using axis.Date, but i) at my current level of R skill, I'm not sure I'd be able to solve the problem, and ii) I wonder why the plotting defaults are producing these rather puzzling (to me, at least) results.

    Additional observations: The Y-axis labels are just fine: 100, 200, ..., 600. The general appearance of the scatterplot indicates that the requested date range is being observed and that the relative positions of the plotted points are correct. Replacing xlim=... as above with xlim=c("1999-01-01","2014-01-01")

    or

    xlim=c(as.numeric(as.character("1999-01-01")),as.numeric(as.character("2014-01-01")))
    

    or

    xlim=c(as.POSIXct("1999-01-01", format="%Y-%m-%d"),as.POSIXct("2014-01-01", format="%Y-%m-%d"))
    

    all result in error messages.

  • jackw19
    jackw19 almost 10 years
    Thanks very much for your helpful comment.
  • jackw19
    jackw19 almost 10 years
    I think maybe plot() draws the chart, including the "pretty" axis labels, before it zooms the view according to xlim or ylim. I have a couple of sale-date outliers in my dataframe, including an observation in 1909. When I plot() without using xlim, that is, including the outlier, I get a nice set of X-axis labels at 20-year intervals starting at 1920; all but one of the graphed points are squeezed way over on the right-hand side of the chart. And the only label under the big concentration of points is 2000, right in the same relative position in which it appears in the zoomed version using xlim. Can't be a coincidence.
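
One way to sidestep the question of how the default ticks are chosen is to drop the out-of-range dates before plotting and give the tick positions explicitly. This is only a sketch under assumptions: the `sales` data frame below is made up to mimic the situation described (recent sales plus one 1909 outlier), and `lims`, `keep` and `recent` are illustrative names.

```r
# Hypothetical data: a few recent sales plus one 1909 outlier.
sales <- data.frame(
  saldt = as.Date(c("1909-06-15", "1999-10-26", "2003-08-25",
                    "2008-03-01", "2013-07-06")),
  salpr = c(120, 180, 260, 310, 420)
)

# Keep only rows whose date is present and inside the window of interest.
lims <- as.Date(c("1999-01-01", "2014-01-01"))
keep <- !is.na(sales$saldt) & sales$saldt >= lims[1] & sales$saldt <= lims[2]
recent <- sales[keep, ]

# Plot without the default x-axis, then label every second year explicitly.
with(recent, plot(saldt, salpr, xlim = lims, xaxt = "n", pch = 20))
axis.Date(1, at = seq(lims[1], lims[2], by = "2 years"), format = "%Y")
```

With explicit `at` values the labels no longer depend on how the default axis picks its "pretty" breaks.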