subset a data.frame with multiple conditions

Solution 1

For this problem I would go with the approach in Apprentice Queue's answer of extracting the year from the date rather than doing generic string matching. I would suggest:

# $year counts years since 1900, so 106 corresponds to 2006
data[data$Analyte == "ATRAZINE"
     & as.POSIXlt(data$Date, format="%m/%d/%Y")$year == 106, ]
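To see why the comparison is against 106 rather than 2006, here is a minimal sketch using two date strings taken from the sample rows. The variable names are made up for illustration, and tz = "UTC" is an added assumption to keep the parse independent of the local time zone.

```r
# Two date strings in the same month/day/year layout as the question's data
dates <- c("05/07/2006", "02/11/2007")

# Parse to POSIXlt; tz = "UTC" is an assumption, used only to avoid local DST quirks
parsed <- as.POSIXlt(dates, format = "%m/%d/%Y", tz = "UTC")

# The year component is stored as years since 1900
years_since_1900 <- parsed$year

# Add 1900 to recover the calendar year
calendar_years <- years_since_1900 + 1900

# Logical vector usable as a row filter
is_2006 <- years_since_1900 == 106
```

Comparing `calendar_years == 2006` would work just as well and may read more clearly.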

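But if you really had to do regexp matching, you could use grepl, which returns a logical vector, rather than grep, which returns a vector of indices. Combining grep's indices with & does not work, because the indices are coerced to logical and recycled instead of being lined up with the rows.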

data[data$Analyte=="ATRAZINE" & grepl("2006",as.character(data$Date)),]
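The difference between the two functions is easier to see on a small example. The data frame below is a hypothetical cut-down version of the rows in the question (only three columns, with Date assumed to be stored as text), so treat it as a sketch rather than the asker's actual data.

```r
# Hypothetical sample modeled on the question's rows; Date is plain text here
data <- data.frame(
  Analyte = c("ATRAZINE", "ATRAZINE", "ATRAZINE", "METOLACHLOR", "METOLACHLOR"),
  Result  = c(1.3, 0.34, 0.33, 0.48, 0.3),
  Date    = c("05/07/2006", "07/23/2006", "02/11/2007", "05/12/2002", "05/07/2006"),
  stringsAsFactors = FALSE
)

# grep returns the positions of the matching elements
idx <- grep("2006", data$Date)

# grepl returns one TRUE/FALSE per element, the same length as the column
mask <- grepl("2006", data$Date)

# A logical mask lines up row by row with the Analyte comparison
atrazine_2006 <- data[data$Analyte == "ATRAZINE" & mask, ]
```

Here `idx` holds three positions while `mask` has one entry for each of the five rows, which is why only `mask` can be combined with the Analyte test using &.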

Solution 2

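One way using date literals (this assumes data$Date is already of class Date; a text column like the one in the sample would first need as.Date(data$Date, format="%m/%d/%Y")):

data[data$Analyte == "ATRAZINE"
     & (data$Date >= '2006-01-01' & data$Date < '2007-01-01'), ]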

Another way, using format (again on a Date column):

data[data$Analyte == "ATRAZINE"
     & format(data$Date, "%Y") == '2006', ]
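Both approaches in this solution rely on the Date column being of class Date. A minimal end-to-end sketch, assuming the column starts out as month/day/year text as in the question's sample (the data frame is a hypothetical cut-down version of it):

```r
# Hypothetical sample modeled on the question's rows; Date starts as text
data <- data.frame(
  Analyte = c("ATRAZINE", "ATRAZINE", "ATRAZINE", "METOLACHLOR", "METOLACHLOR"),
  Result  = c(1.3, 0.34, 0.33, 0.48, 0.3),
  Date    = c("05/07/2006", "07/23/2006", "02/11/2007", "05/12/2002", "05/07/2006"),
  stringsAsFactors = FALSE
)

# Convert once so that date comparisons and format() behave as intended
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")

# Range comparison against explicit Date values
by_range <- data[data$Analyte == "ATRAZINE"
                 & data$Date >= as.Date("2006-01-01")
                 & data$Date <  as.Date("2007-01-01"), ]

# Extract the four-digit year as text and compare
by_year <- data[data$Analyte == "ATRAZINE"
                & format(data$Date, "%Y") == "2006", ]
```

On this sample both filters select the same two ATRAZINE rows from 2006. The range form extends naturally to arbitrary intervals, while the format form is shorter when only the calendar year matters.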
Author by

pslice

Updated on June 27, 2022

Comments

  • pslice
    pslice almost 2 years

    Suppose my data looks like this:

    2372  Kansas KS2000111 HUMBOLDT, CITY OF    ATRAZINE    1.3 05/07/2006
    9104  Kansas KS2000111 HUMBOLDT, CITY OF    ATRAZINE   0.34 07/23/2006
    9212  Kansas KS2000111 HUMBOLDT, CITY OF    ATRAZINE   0.33 02/11/2007
    2094  Kansas KS2000111 HUMBOLDT, CITY OF    ATRAZINE    1.4 05/06/2007
    16763 Kansas KS2000111 HUMBOLDT, CITY OF    ATRAZINE   0.61 05/11/2009
    1076  Kansas KS2000111 HUMBOLDT, CITY OF METOLACHLOR   0.48 05/12/2002
    1077  Kansas KS2000111 HUMBOLDT, CITY OF METOLACHLOR    0.3 05/07/2006
    

    I want to be able to subset by the Analyte and a partial match on the date (namely, I just want the year). I have been trying this, but I know it isn't quite right.

     data[data$Analyte=="ATRAZINE" & grep("2006",as.character(data$Date)),]
    

    Any suggestions?