How can I change all factor variables into numeric variables in bulk?
You can use lapply:
dat2 <- data.frame(lapply(dat, function(x) as.numeric(as.character(x))))
TargetVar Tar_Var1 Var2 Var3
1 0 0 0 7
2 0 0 1 1
3 0 1 0 3
4 0 1 1 7
5 1 0 0 5
6 1 0 1 1
7 1 1 0 0
8 1 1 1 6
9 0 0 0 8
10 0 0 1 5
11 1 1 1 4
12 0 0 1 2
13 1 0 0 9
14 1 1 1 2
str(dat2)
'data.frame': 14 obs. of 4 variables:
$ TargetVar: num 0 0 0 0 1 1 1 1 0 0 ...
$ Tar_Var1 : num 0 0 1 1 0 0 1 1 0 0 ...
$ Var2 : num 0 1 0 1 0 1 0 1 0 1 ...
$ Var3 : num 7 1 3 7 5 1 0 6 8 5 ...
Author: mql4beginner
Updated on July 09, 2022
Comments
-
mql4beginner almost 2 years
I have a data frame that contains about 100 factor variables that I would like to convert to numeric type. How can I do this for the whole data frame? I know that I can do it for each variable individually, for example with this code:
dat$.Var2<-as.numeric(dat$.Var2)
but I would like to do it for a lot of variables. Here is an example data frame:
dat <- read.table(text = "
TargetVar Tar_Var1 Var2 Var3
0 0 0 7
0 0 1 1
0 1 0 3
0 1 1 7
1 0 0 5
1 0 1 1
1 1 0 0
1 1 1 6
0 0 0 8
0 0 1 5
1 1 1 4
0 0 1 2
1 0 0 9
1 1 1 2
", header = TRUE)
-
mql4beginner about 10 years
Thanks Sven, it works. Can I use sapply for this task?
-
alexwhan about 10 years
Just be careful using as.numeric() - see what happens with as.numeric(factor(c(7, 1))) - you might need as.numeric(as.character(x))
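To make the warning above concrete, here is a minimal sketch of what goes wrong. The variable names (f, codes, values, values2) are only illustrative and do not come from the thread. A factor stores integer codes that index into its sorted levels, so calling as.numeric() directly returns those codes rather than the original numbers.

```r
# A factor stores integer codes that index into its sorted levels.
f <- factor(c(7, 1))
levels(f)                              # "1" "7"

codes  <- as.numeric(f)                # internal codes: 2 1
values <- as.numeric(as.character(f))  # original values: 7 1

# An equivalent form that ?factor describes as slightly more efficient:
values2 <- as.numeric(levels(f))[f]    # 7 1
```

Either of the last two forms should recover the values; which one to use is mostly a matter of readability.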
-
Ben about 10 years
+1 @alexwhan, was going to say the same, see: stackoverflow.com/a/14717814/1036500, stackoverflow.com/a/6328860/1036500 & stackoverflow.com/a/2293313/1036500
-
A5C1D2H2I1M1N2O1R2T1 about 10 years
@user1024441, why do you want to use sapply for this? lapply is more appropriate. If you want to overwrite the values in dat (not create a new data.frame), you can also just use dat[] <- lapply(dat, function(x) as.numeric(as.character(x)))
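As a possible variation on the in-place replacement described in this comment, the sketch below converts only the columns that are actually factors and leaves the others alone. The example data frame, its column names (a, b, id) and the helper is_fac are assumptions made up for illustration; they are not from the question.

```r
# Hypothetical example data: two factor columns and one character column.
dat <- data.frame(
  a  = factor(c("10", "20", "10")),
  b  = factor(c("7", "1", "3")),
  id = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# Logical vector marking which columns are factors.
is_fac <- vapply(dat, is.factor, logical(1))

# Overwrite only those columns; dat stays a data.frame.
dat[is_fac] <- lapply(dat[is_fac], function(x) as.numeric(as.character(x)))

str(dat)
```

Selecting columns first avoids turning non-numeric text columns into NA, which a blanket conversion of every column would do.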
-
mql4beginner about 10 years
Thanks Ananda, can you please explain why lapply is more appropriate than sapply?
-
Sven Hohenstein about 10 years
@alexwhan Good point. I modified the code accordingly.
-
A5C1D2H2I1M1N2O1R2T1 about 10 years
@user1024441, a data.frame is essentially a special type of list. If you use the approach I describe, you are essentially directly replacing the columns. Also, lapply is generally faster than sapply because sapply calls lapply anyway, and then checks to see whether the output can be simplified to an array (using the simplify2array function). You can do a few benchmarks to check on your own, but my quick test shows that even with this small dataset, lapply is considerably faster.
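For anyone who wants to check this themselves, here is a rough sketch of the difference. The small data frame, the helper to_num and the iteration count are assumptions chosen for illustration, and the timings will vary by machine and R version, so no numbers are quoted.

```r
# Hypothetical small data frame with factor columns.
dat <- data.frame(
  Var2 = factor(c(0, 1, 0, 1)),
  Var3 = factor(c(7, 1, 3, 7))
)
to_num <- function(x) as.numeric(as.character(x))

res_l <- lapply(dat, to_num)  # a named list, one element per column
res_s <- sapply(dat, to_num)  # simplified to a numeric matrix

class(res_l)
is.matrix(res_s)

# Rough timing with base R only; results depend on the machine.
system.time(for (i in 1:10000) lapply(dat, to_num))
system.time(for (i in 1:10000) sapply(dat, to_num))
```

Because lapply returns a list, its result can be assigned straight back with dat[] <- ..., whereas the matrix from sapply would need to be converted back into a data.frame first.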