How to concatenate factors without them being converted to integer codes?

Solution 1

From the R Mailing list:

unlist(list(facs[1 : 3], facs[4 : 5]))

To 'cbind' factors, do

data.frame(facs[1 : 3], facs[4 : 5])
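
As a minimal sketch of why the unlist approach is robust: when every element of the list is a factor, unlist returns a factor whose levels are the union of the inputs' levels, so it also copes with factors that have different level sets. The vectors x and y below are illustrative only.

```r
# Two factors with different level sets (illustrative data)
x <- factor(c("c", "z"))
y <- factor(c("a", "b", "z"))

# unlist() on a list of factors returns a factor, not integer codes
combined <- unlist(list(x, y))

is.factor(combined)     # TRUE
levels(combined)        # union of the levels, in order of appearance
as.character(combined)  # the original labels, in order
```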

Solution 2

An alternative workaround is to convert the factor to a character vector, then convert back when you are finished concatenating.

cfacs <- as.character(facs)
x <- c(cfacs[1:3], cfacs[4:5]) 

# Now choose between
factor(x)
# and
factor(x, levels = levels(facs))
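
The two choices differ in which levels survive. A short sketch, assuming the facs vector from the question: factor(x) keeps only the levels that actually occur in x, while passing levels = levels(facs) preserves the full original level set and its order. The names dropped and kept are illustrative.

```r
facs <- as.factor(c("i", "want", "to", "be", "a", "factor", "not", "an", "integer"))
cfacs <- as.character(facs)
x <- c(cfacs[1:3], cfacs[4:5])

# Only the levels present in x are kept
dropped <- factor(x)
nlevels(dropped)

# All levels of the original factor are kept, including unused ones
kept <- factor(x, levels = levels(facs))
nlevels(kept)
```

Keeping the original levels matters if the result is later tabulated or compared with other subsets of the same factor.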

Solution 3

Use fct_c from the forcats package (part of the tidyverse).

> library(forcats)
> facs <- as.factor(c("i", "want", "to", "be", "a", "factor", "not", "an", "integer"))
> fct_c(facs[1:3], facs[4:5])
[1] i    want to   be   a
Levels: a an be factor i integer not to want

fct_c isn't fooled by concatenations of factors with discrepant numerical codings:

> x <- as.factor(c('c', 'z'))
> x
[1] c z
Levels: c z
> y <- as.factor(c('a', 'b', 'z'))
> y
[1] a b z
Levels: a b z
> c(x, y)
[1] 1 2 1 2 3
> fct_c(x, y)
[1] c z a b z
Levels: c z a b
> as.numeric(fct_c(x, y))
[1] 1 2 3 4 2

Solution 4

Wow, I never realized it did that. Here is a work-around:

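# as.integer() makes the integer codes explicit; since R 4.1.0, c() on factors returns a factor
x <- c(as.integer(facs[1 : 3]), as.integer(facs[4 : 5]))
x <- factor(x, levels=1:nlevels(facs), labels=levels(facs))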
x

With the output:

[1] i    want to   be   a   
Levels: a an be factor i integer not to want

This only works if the two vectors have the same levels, as they do here.
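
When the levels differ, one possible variation (a sketch, not part of the original answer) is to build a shared level set first and compute the integer codes against it rather than against each factor's own levels. The names all_levels and codes are illustrative.

```r
# Two factors whose integer codes are not comparable
x <- factor(c("c", "z"))
y <- factor(c("a", "b", "z"))

# Shared level set: levels of x first, then any new levels from y
all_levels <- union(levels(x), levels(y))

# Look up each label in the shared level set to get consistent codes
codes <- c(match(as.character(x), all_levels),
           match(as.character(y), all_levels))

# Rebuild a factor from the consistent codes
combined <- factor(codes, levels = seq_along(all_levels), labels = all_levels)
combined
```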

Solution 5

This is a really bad R gotcha. Along those lines, here's one that just swallowed several hours of my time.

x <- factor(c("Yes","Yes","No", "No", "Yes", "No"))
y <- c("Yes", x)

> y
[1] "Yes" "2"   "2"   "1"   "1"   "2"   "1"  
> is.factor(y)
[1] FALSE

It appears to me the better fix is Richie's, which coerces to character.

> y <- c("Yes", as.character(x))
> y
[1] "Yes" "Yes" "Yes" "No"  "No"  "Yes" "No" 
> y <- as.factor(y)
> y
[1] Yes Yes Yes No  No  Yes No 
Levels: No Yes

This works as long as you set the levels properly, as Richie mentions.
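
The coerce-to-character approach can be wrapped in a small helper. concat_as_factor below is a hypothetical name, not a base R function; it is a sketch that assumes the default sorted level order is acceptable.

```r
# Hypothetical helper: concatenate character vectors and factors into one factor
concat_as_factor <- function(...) {
  parts <- list(...)
  # Convert every piece to character so no integer codes leak into the result
  values <- unlist(lapply(parts, as.character), use.names = FALSE)
  factor(values)
}

x <- factor(c("Yes", "Yes", "No", "No", "Yes", "No"))
y <- concat_as_factor("Yes", x)
y
```

If a specific level order is needed, pass the combined values to factor() with an explicit levels argument instead.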

Author: Keith

I'm interested in machine learning, functional programming, compilers and computational biology.

Updated on July 20, 2021

Comments

  • Keith
    Keith almost 3 years

    I was surprised to see that R will coerce factors into a number when concatenating vectors. This happens even when the levels are the same. For example:

    > facs <- as.factor(c("i", "want", "to", "be", "a", "factor", "not", "an", "integer"))
    > facs
    [1] i       want    to      be      a       factor  not     an      integer
    Levels: a an be factor i integer not to want
    > c(facs[1 : 3], facs[4 : 5])
    [1] 5 9 8 3 1
    

    What is the idiomatic way to do this in R (in my case these vectors can be pretty large)? Thank you.

  • Keith
    Keith almost 14 years
    Great, thanks! I've just figured out that unlist(list(facs[1 : 3], facs[4 : 5])) also works, which is nice if you don't know ahead of time that facs is a factor.
  • David J.
    David J. over 11 years
    Setting levels manually in this way didn't work for my particular problem. (I have 0-based levels. I could have subtracted 1 and then reconstructed the factor, but that is brittle and on the lesser end of the scrutability spectrum, even for R.) Instead (hooray?) I went with unlist(list(...)).
  • DaveM
    DaveM over 3 years
    For me this answer is the right one, giving exactly the intended output! Great.