Sorting in R: Object not found


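In R you cannot just pass bare column names to the [.data.frame function, even though it may seem obvious that they refer to columns of the data frame being subset or reordered. order(Rate, Hospital) is evaluated in the calling environment, where no objects named Rate or Hospital exist. You need to refer to the columns explicitly, using either "[" or "$":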

data <- rawRelevant[ order( rawRelevant$Rate, rawRelevant$Hospital), ]
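As a minimal sketch of the fix, the snippet below rebuilds a small data frame from the sample output shown in the question (the hospital names and rates are copied from there; the names sorted1 and sorted2 are only illustrative). It shows the "$" form alongside with(), which is another common way to let order() see the columns:

```r
# Toy data frame built from the sample output in the question
rawRelevant <- data.frame(
  Hospital = c("PROVIDENCE ALASKA MEDICAL CENTER",
               "MAT-SU REGIONAL MEDICAL CENTER",
               "FAIRBANKS MEMORIAL HOSPITAL",
               "ALASKA REGIONAL HOSPITAL",
               "ALASKA NATIVE MEDICAL CENTER"),
  Rate = c(13.4, 17.7, 15.5, 14.5, 15.7),
  stringsAsFactors = FALSE
)

# Option 1: qualify each column with `$`
sorted1 <- rawRelevant[order(rawRelevant$Rate, rawRelevant$Hospital), ]

# Option 2: evaluate order() inside the data frame using with()
sorted2 <- rawRelevant[with(rawRelevant, order(Rate, Hospital)), ]

# Both approaches should give the same row ordering
print(identical(sorted1, sorted2))
print(sorted1)
```

Either form should work inside a function such as best(), because neither relies on Rate or Hospital existing as free-standing objects.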

Using non-specific object names like "data" is discouraged, especially when they are also the names of R functions, as data and df are. One situation that can arise is that higher up in someone's code there was an attach call for rawRelevant, which has the side effect of appearing to promote the column names to objects. But attach causes a lot of confusion: it is meant for interactive use, not for programming, so its use is also discouraged.
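One source of that confusion can be sketched as follows, assuming a made-up data frame called scores (not part of the question). attach places a copy of the columns on the search path, so later changes to the data frame are not reflected in the attached names:

```r
# Hypothetical data frame, used only for illustration
scores <- data.frame(id = 1:3, value = c(10, 20, 30))

attach(scores)                      # copies the columns onto the search path
scores$value <- scores$value * 2    # modify the data frame afterwards

attached_value <- value             # the stale attached copy
detach(scores)                      # clean up the search path

current_value <- with(scores, value)  # reads the data frame as it is now

print(attached_value)
print(current_value)
```

Here attached_value still holds the original column, while with() sees the updated one, which is why with() or explicit "$" references tend to be safer inside functions.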

Notice that the UCLA people used attach(hsb2). Several years ago the UCLA stats websites were advising against R in preference to SAS and SPSS. Now they seem to have come around, but I don't think they are really completely "with the program."

Author by

Zigu

Updated on July 09, 2022

Comments

  • Zigu (almost 2 years ago)

    I've been following a tutorial regarding ordering output for R dataframes:

    https://www.statmethods.net/management/sorting.html

    The problem I'm having is that when I use order the way it is presented in the tutorial, the code below fails with "object not found". I don't understand why it can't order the dataframe when the print statements seem to work fine.

    The following is the code I'm using:

    #hospital name is column 2
    #state is column 7
    #heart attack is column 11
    #heart failure is column 17
    #pneumonia is column 23
    best <- function(state, outcome){
        colNum <- -1
    
        ##Semi hard coded :(
        if(outcome == "heart attack"){
            colNum <- 11
        } else if(outcome == "heart failure"){
            colNum <- 17
        } else if(outcome == "pneumonia"){
            colNum <- 23
        } else {
            stop("invalid outcome")
        }
    
        raw <-  read.csv("outcome-of-care-measures.csv", colClasses = "character")
    
        if(sum(raw$State == state) <= 0){
            stop("invalid state") 
        }
    
        rawRelevant <- raw[with(raw, raw[,colNum] != "Not Available" & 
             raw[,7] == state),c(2,colNum)]
        rawRelevant[,2] <- as.numeric(rawRelevant[,2])
        names(rawRelevant) <- c("Hospital", "Rate")
        print(rawRelevant$Hospital)
        print(rawRelevant$Rate)
        data <- rawRelevant[order(Rate,Hospital),]
    }
    

    Sample Output:

    > trial <- best("AK", "heart attack")
    [1] "PROVIDENCE ALASKA MEDICAL CENTER" "MAT-SU REGIONAL MEDICAL CENTER"  
    [3] "FAIRBANKS MEMORIAL HOSPITAL"      "ALASKA REGIONAL HOSPITAL"        
    [5] "ALASKA NATIVE MEDICAL CENTER"    
    [1] 13.4 17.7 15.5 14.5 15.7
    Error in order(Rate, Hospital) : object 'Rate' not found