R -apply- convert many columns from numeric to factor

Solution 1

Try

df[,cols] <- lapply(df[,cols],as.factor)

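The problem is that apply() binds the results into a matrix, and a matrix cannot hold factors, so the columns are coerced to character:

class(apply(df[,cols], 2, as.factor))  ## "matrix" (plus "array" in R >= 4.0.0)
class(as.factor(df[,1]))  ## "factor"

In contrast, lapply() operates on the elements of a list (a data frame is a list of columns) and returns a list, so each column keeps its factor class.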
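A minimal sketch of the difference, using the example data from the question. The names `m` and `df2` are only illustrative. Indexing with `df2[cols]` rather than `df2[, cols]` is a small variation that also works when `cols` names a single column.

```r
# Example data from the question
df <- data.frame(A = 1:10, B = 2:11, C = 3:12)
cols <- c("A", "B")

# apply() simplifies its result to a matrix, so the factors become character
m <- apply(df[, cols], 2, as.factor)
is.matrix(m)
typeof(m)

# lapply() returns a list of factors, which is assigned back column by column
df2 <- df
df2[cols] <- lapply(df2[cols], as.factor)
sapply(df2, class)
```

After the assignment, A and B are factors and C is still an integer column.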

Solution 2

Updated Nov 9, 2017

purrr / purrrlyr are still in development

Similar to Ben's, but using purrrlyr::dmap_at:

library(purrrlyr)

df <- data.frame(A=1:10, B=2:11, C=3:12)

# selected cols to factor
cols <- c('A', 'B')

(dmap_at(df, factor, .at = cols))

A        B       C
<fctr>   <fctr>  <int>
1        2       3      
2        3       4      
3        4       5      
4        5       6      
5        6       7      
6        7       8      
7        8       9      
8        9       10     
9        10      11     
10       11      12 

Solution 3

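You can place the results back into a data frame, which converts the character columns to factors. Since R 4.0.0 the default is stringsAsFactors = FALSE, so set it explicitly:

df[,cols] <- data.frame(apply(df[,cols], 2, function(x){ as.factor(x)}), stringsAsFactors = TRUE)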
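One caveat with rebuilding factors from the character matrix that apply() returns is the level order: the levels are sorted as strings, not as numbers. The sketch below compares this with lapply(). The names `via_apply` and `via_lapply` are only illustrative, and `stringsAsFactors = TRUE` is passed explicitly so that the comparison does not depend on the R version.

```r
df <- data.frame(A = 1:10, B = 2:11, C = 3:12)
cols <- c("A", "B")

# Factors rebuilt from a character matrix: levels are sorted as strings
via_apply <- data.frame(apply(df[, cols], 2, as.factor), stringsAsFactors = TRUE)

# Factors created directly from the integer columns: levels keep numeric order
via_lapply <- lapply(df[cols], as.factor)

levels(via_apply$A)   # starts "1", "10", "2", ...
levels(via_lapply$A)  # "1", "2", "3", ..., "10"
```

If the level order matters (for plotting or modelling, say), converting each column directly is the safer choice.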

Solution 4

Another option uses purrr and dplyr. It is perhaps a little more readable than the base solutions, and it keeps the data in a data frame:

Here's the data:

df <- data.frame(A=1:10, B=2:11, C=3:12)

str(df)
'data.frame':   10 obs. of  3 variables:
 $ A: int  1 2 3 4 5 6 7 8 9 10
 $ B: int  2 3 4 5 6 7 8 9 10 11
 $ C: int  3 4 5 6 7 8 9 10 11 12

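We can easily operate on all columns with dmap (part of purrr when this was written; it has since moved to purrrlyr, so with a current purrr load purrrlyr as well):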

library(purrr)
library(dplyr)

# all cols to factor
dmap(df, as.factor)

Source: local data frame [10 x 3]

        A      B      C
   (fctr) (fctr) (fctr)
1       1      2      3
2       2      3      4
3       3      4      5
4       4      5      6
5       5      6      7
6       6      7      8
7       7      8      9
8       8      9     10
9       9     10     11
10     10     11     12

Similarly, use dmap on a subset of columns, selected with select from dplyr:

# selected cols to factor
cols <- c('A', 'B')

df[,cols] <- 
  df %>% 
  select(one_of(cols)) %>% 
  dmap(as.factor)

To get the desired result:

str(df)
'data.frame':   10 obs. of  3 variables:
 $ A: Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10
 $ B: Factor w/ 10 levels "2","3","4","5",..: 1 2 3 4 5 6 7 8 9 10
 $ C: int  3 4 5 6 7 8 9 10 11 12
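With a recent dplyr (1.0.0 or later, which is an assumption about the installed version), the same subset of columns can be converted without dmap, using mutate() with across() and all_of(). This is a sketch of an alternative technique, not the approach used in the answer above.

```r
library(dplyr)

df <- data.frame(A = 1:10, B = 2:11, C = 3:12)
cols <- c("A", "B")

# Convert only the columns named in `cols`; C is left untouched
df <- df %>%
  mutate(across(all_of(cols), as.factor))

str(df)
```

The result is still a data frame, so no assignment back into `df[,cols]` is needed.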

Solution 5

A simple but effective option is mapply():

df <- data.frame(A=1:10, B=2:11, C=3:12)
cols <- c('A', 'B')

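# SIMPLIFY = FALSE keeps the result as a list of factors; the default would
# simplify it to a character matrix, which R >= 4.0.0 no longer turns into factors
df[,cols] <- as.data.frame(mapply(as.factor, df[,cols], SIMPLIFY = FALSE))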

You can also use a for loop to achieve the same result:

for(col in cols){
  df[,col] <- as.factor(df[,col])
}

Author: GabyLP

Updated on June 12, 2022

Comments

  • GabyLP
    GabyLP almost 2 years

    I need to convert many columns that are numeric to factor type. An example table:

    df <- data.frame(A=1:10, B=2:11, C=3:12)
    

    I tried with apply:

    cols<-c('A', 'B')
    df[,cols]<-apply(df[,cols], 2, function(x){ as.factor(x)});
    

    But the resulting columns are of class character.

    > class(df$A)
    [1] "character"
    

    How can I do this without doing as.factor for each column?

  • GabyLP
    GabyLP over 8 years
    Thanks. So apply() returns a matrix, and a matrix doesn't recognize factors.
  • Seanosapien
    Seanosapien over 6 years
    Wouldn't map(df[cols], factor) work as well? I can't find any dmap_at() function.
  • Tanya Murphy
    Tanya Murphy over 6 years
    You're probably right. Understandably, purrr and its related packages are still going through a lot of changes. dmap_at has been moved to purrrlyr: purrr.tidyverse.org/news/index.html
  • Seanosapien
    Seanosapien over 6 years
    That explains it so! I have just been using the most recent version of purrr so thought I had missed something.