Accessing dataframe column causes Error in sort.list(y) : 'x' must be atomic for 'sort.list'

Pass in the classifications as a vector rather than as a data frame with one column:

knnMCN(data, classification$play_type, data2, K = 1, ShowObs = TRUE)

Explanation: the documentation for knnMCN says that the classification argument should be a "matrix or data frame", but this appears to be a documentation error, because the function's code treats the classification as a vector. The line that throws the error is:

OrigTrnG = as.factor(OrigTrnG)

since as.factor expects an atomic vector, whereas a data frame is a list of columns. That is why sort.list complains that 'x' must be atomic.
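
The difference between the one-column data frame and the vector it contains can be checked directly in base R. The snippet below is a minimal sketch that rebuilds `classification` from the question's dput() output; the variable name `play_type_factor` is introduced here only for illustration.

```r
# Rebuild the one-column 'classification' data frame from the question's dput() output.
classification <- data.frame(play_type = c(0L, 1L, 1L, 0L, 1L, 1L))

# A data frame is a list of columns, so it is not an atomic vector.
is.atomic(classification)            # FALSE
# The column itself is a plain integer vector.
is.atomic(classification$play_type)  # TRUE

# These three forms all extract the same vector from a one-column data frame.
identical(classification$play_type, classification[[1]])   # TRUE
identical(classification$play_type, classification[, 1])   # TRUE

# as.factor() works on the extracted vector.
play_type_factor <- as.factor(classification$play_type)
levels(play_type_factor)   # "0" "1"
```

Any of the three extraction forms should work as the second argument to knnMCN; `$play_type` is simply the most readable when the column name is known.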

Author: Hoser

Updated on June 04, 2022

Comments

  • Hoser, almost 2 years

    I'm currently trying to use knnMCN(), calling it like this:

    knnMCN(data, classification, data2, K=1, ShowObs=T)
    

    The objects data, classification, and data2 are all loaded from .csv files. 'data' is my training data, 'classification' is a single-column file of classifications (0, 1, or 2), and 'data2' is the dataset I want to classify.

    There are only numerical values in these files. Whenever I run this command I get the error:

    Error in sort.list(y) : 'x' must be atomic for 'sort.list'
    Have you called 'sort' on a list?
    

    Does anybody know what is going wrong here? Is there a better or different way to do K-Nearest Neighbors?

    EDIT: These are the results of dput(head(x)) for each of data, classification, and data2:

    data:

    structure(list(down = c(1L, 2L, 3L, 1L, 2L, 1L), yards_to_first = c(10L, 
    7L, 7L, 10L, 7L, 10L), yards_to_endzone = c(84L, 81L, 81L, 73L, 
    70L, 40L), score_difference = c(0L, 0L, 0L, 0L, 0L, 0L), quarter = c(1L, 
    1L, 1L, 1L, 1L, 1L), seconds_remaining = c(3595L, 3560L, 3554L, 
    3523L, 3476L, 3450L)), .Names = c("down", "yards_to_first", "yards_to_endzone", 
    "score_difference", "quarter", "seconds_remaining"), row.names = c(NA, 
    6L), class = "data.frame")
    

    classification:

    structure(list(play_type = c(0L, 1L, 1L, 0L, 1L, 1L)), .Names = "play_type",
    row.names = c(NA,6L), class = "data.frame")
    

    data2:

    structure(list(down = c(1L, 2L, 3L, 4L, 1L, 2L), yards_to_first = c(10L, 
    5L, 8L, 8L, 10L, 10L), yards_to_endzone = c(58L, 53L, 56L, 56L, 
    98L, 98L), score_difference = c(0L, 0L, 0L, 0L, 0L, 0L), quarter = c(1L, 
    1L, 1L, 1L, 1L, 1L), seconds_remaining = c(3593L, 3556L, 3515L, 
    3507L, 3496L, 3460L)), .Names = c("down", "yards_to_first", "yards_to_endzone", 
    "score_difference", "quarter", "seconds_remaining"), row.names = c(NA, 
    6L), class = "data.frame")
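
On the question of a different way to run K-Nearest Neighbors: one commonly used alternative is `knn()` from the `class` package, which ships with standard R installations as a recommended package. The sketch below is not what the answer above proposes. It is an illustration only, using the six-row samples from the dput() output, with the object names `train`, `test`, and `pred` chosen here for the example. Note that `knn()` also takes the class labels as a vector or factor, not as a data frame.

```r
library(class)  # recommended package; assumed to be installed alongside base R

# Training rows, copied from the question's dput(head(data)).
train <- data.frame(
  down              = c(1L, 2L, 3L, 1L, 2L, 1L),
  yards_to_first    = c(10L, 7L, 7L, 10L, 7L, 10L),
  yards_to_endzone  = c(84L, 81L, 81L, 73L, 70L, 40L),
  score_difference  = c(0L, 0L, 0L, 0L, 0L, 0L),
  quarter           = c(1L, 1L, 1L, 1L, 1L, 1L),
  seconds_remaining = c(3595L, 3560L, 3554L, 3523L, 3476L, 3450L)
)

# Labels for the training rows, from dput(head(classification)).
classification <- data.frame(play_type = c(0L, 1L, 1L, 0L, 1L, 1L))

# Rows to classify, copied from dput(head(data2)).
test <- data.frame(
  down              = c(1L, 2L, 3L, 4L, 1L, 2L),
  yards_to_first    = c(10L, 5L, 8L, 8L, 10L, 10L),
  yards_to_endzone  = c(58L, 53L, 56L, 56L, 98L, 98L),
  score_difference  = c(0L, 0L, 0L, 0L, 0L, 0L),
  quarter           = c(1L, 1L, 1L, 1L, 1L, 1L),
  seconds_remaining = c(3593L, 3556L, 3515L, 3507L, 3496L, 3460L)
)

# 1-nearest-neighbour prediction; 'cl' must be a vector or factor of labels.
pred <- knn(train, test, cl = factor(classification$play_type), k = 1)

# One predicted label per row of 'test'.
length(pred)
```

`knn()` uses plain Euclidean distance on the raw columns, so a wide-ranging feature such as seconds_remaining will dominate unless the columns are scaled first. Whether that matters depends on the data.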