R: as.numeric function not returning correct # from data.frame

You can try:

as.numeric(as.character(dfA1))

You can also prevent columns from being converted to factors automatically by setting stringsAsFactors = FALSE (see ?options).

This happens because factors are stored internally as integers, and the labels (things like "103316" in your case) are what is displayed when you print them. as.numeric returns that underlying integer representation rather than the labels.
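To make the difference between codes and labels concrete, here is a minimal sketch. The factor f is a hypothetical stand-in built from a few of the values in the question, not the asker's actual data.

```r
# Hypothetical factor; the values are borrowed from the question for illustration.
f <- factor(c("103316", "130720", "141808", "103316"))

levels(f)                    # the labels, sorted as strings
as.integer(f)                # the internal integer codes: 1 2 3 1
as.numeric(f)                # the same codes, not the labels
as.numeric(as.character(f))  # the labels as numbers: 103316 130720 141808 103316
as.numeric(levels(f))[f]     # same result; converts each level only once
```

The last form is the one suggested in ?factor. It gives the same result as going through as.character, but converts each distinct level once instead of converting every element.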

Author by

Amanda

Updated on August 03, 2020

Comments

  • Amanda
    Amanda over 3 years

    Possible Duplicate:
    R - How to convert a factor to an integer\numeric in R without a loss of information

    I am importing an Excel document using read.xls. I know this command uses read.table and returns everything as "factors". I am unable to tell read.xls directly which columns are numeric when loading my data, as all columns have previous categorical data. So I have been extracting the numeric columns I want and then trying to convert them from data.frames to numeric data. However, when I use as.numeric I get numbers that do not correspond to the original data.

    For example:

    These are the first 6 rows of my data.frame called dfA1, which is 96 x 1:

             [,1]
    [1,] "103316"
    [2,] "130720"
    [3,] "141808"
    [4,] "131864"
    [5,] "148144"
    [6,] "145760"
    

    When I perform as.numeric(dfA1) I receive:

    [1]  2  18  29  19  43  40
    

    I have absolutely no idea why I get these numbers or how it could be coming up with them. I checked my original xls document and they are marked as numeric with no decimals.

  • Brandon Bertelsen
    Brandon Bertelsen over 12 years
    Alternatively, you can open the file in Excel and format the column as a number. This should clear up the translation for R.
  • joran
    joran over 12 years
    @Brandon - True, although I somewhat regret answering this question now, as Joshua is correct, it should be closed as an exact dup.
  • Amanda
    Amanda over 12 years
    Thank you Joran, worked like a charm. I actually tried reformatting the column in Excel as a number; however, for some reason that did not fix the problem.
  • Amanda
    Amanda over 12 years
    So now that I've done that, when I try to call a number from one of the resulting cells I am unable to do so. I made a new variable A1 <- as.numeric(as.character(dfA1)) which produces the correct numbers that I expected - thanks! But then when I try to call a cell say A1(1,1) it gives me an Error: could not find function "A1". Any ideas? Thanks again!
  • joran
    joran over 12 years
    @Amanda - That sounds like a different problem, which may be best addressed in a new question. But you should try reading ?'[' first (although that's just a guess...not sure exactly what you mean by 'call').
  • Gavin Simpson
    Gavin Simpson over 12 years
    @joran Answers can be merged where appropriate. Don't feel bad, it is a good Answer.
  • Amanda
    Amanda over 12 years
    @Joran - I am just trying to obtain the numerical value in a position in the vector. But it is giving me that error: could not find function "A1"
  • joran
    joran over 12 years
    @Amanda - These comments are not really the appropriate venue for extended technical support. If you think your question is very quick/basic you can try asking in the R chat room (they're very friendly, honest!), or you can ask a new question here. Either way, no one will be able to help unless you are fairly clear about the commands you're typing that are generating an error.
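
On the follow-up error in the comments (could not find function "A1"): in R, parentheses call a function, while square brackets extract elements, which is what ?'[' documents. A minimal sketch, assuming A1 is a plain numeric vector; the values here are hypothetical stand-ins for the converted data.

```r
# Hypothetical stand-in for the converted vector from the comments.
A1 <- c(103316, 130720, 141808)

A1[1]       # first element of the vector: 103316
# A1(1, 1)  # fails: parentheses try to call a function named A1

# If two indices are wanted, the object needs two dimensions, e.g. a matrix.
m <- matrix(A1, ncol = 1)
m[1, 1]     # row 1, column 1: 103316
```

A vector has only one dimension, so A1[1, 1] would also fail with an "incorrect number of dimensions" error; a single index is enough.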