Converting a factor with 2 levels to binary values 0/1 in R
Solution 1
As an addition to @Dason's answer, note that...
test <- c("male","female")
as.factor(test)
#[1] male female
#Levels: female male
...will make female the reference group (level 1) and male the comparison group (level 2), because as.factor() sorts the levels alphabetically. To reverse this, specify the level order explicitly:
factor(test,levels=c("male","female"))
#[1] male female
#Levels: male female
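As a possible alternative when the factor already exists, base R's relevel() moves one chosen level to the front without retyping the full level list. This is only a minimal sketch; the names f and f2 are made up for illustration.

```r
test <- c("male", "female")

# as.factor() sorts levels alphabetically: "female" comes first
f <- as.factor(test)

# relevel() makes "male" the first (reference) level and keeps the rest in order
f2 <- relevel(f, ref = "male")

levels(f)
levels(f2)
```

With only two levels this gives the same ordering as the explicit levels= call; relevel() mainly saves typing when a factor has many levels.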
As @marius notes, contrasts() shows how the factor will be coded in a regression model:
contrasts(as.factor(test))
#       male
#female    0
#male      1
contrasts(factor(test,levels=c("male","female")))
#       female
#male        0
#female      1
Solution 2
Convert to a factor and let R take care of the rest. You should never have to take care of explicitly creating dummy variables when using R.
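To illustrate what "let R take care of the rest" can look like, here is a small sketch with a factor passed straight to lm(). The data frame and its height values are hypothetical toy numbers invented for this example, not taken from the question.

```r
# Hypothetical toy data, made up purely for illustration
dat <- data.frame(
  gender = factor(c("female", "male", "female", "male", "female", "male")),
  height = c(160, 175, 165, 180, 158, 172)
)

# lm() builds the 0/1 indicator column for the factor internally
fit <- lm(height ~ gender, data = dat)

# The coefficient name shows which level was coded as 1 ("male" here);
# "female", the first level, is absorbed into the intercept
names(coef(fit))
```

The intercept is then the mean for the reference level and the gendermale coefficient is the difference between the two group means.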
Solution 3
If you're doing this for real, you should absolutely follow @Dason's advice. I'm going to assume that you're teaching a class and want to demonstrate indicator variables (with thanks to this question):
dat <- data.frame(gender=sample(c("male", "female"), 10, replace=TRUE))
model.matrix(~gender, data=dat)
   (Intercept) gendermale
1            1          1
2            1          0
3            1          1
4            1          0
5            1          1
6            1          1
7            1          1
8            1          0
9            1          0
10           1          1
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$gender
[1] "contr.treatment"
If you don't want the intercept, use model.matrix(~gender - 1, data=dat) instead.
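If an explicit 0/1 integer vector really is needed, one possible sketch is a logical comparison converted with as.integer(). The example vector and the names female01 and male01 are assumptions made for illustration.

```r
# Hypothetical example factor; levels are sorted as "female", "male"
gender <- factor(c("female", "female", "male", "female", "male"))

# Explicit mapping: 1 for "female", 0 for "male", independent of level order
female01 <- as.integer(gender == "female")
female01

# Alternative that relies on level order: first level -> 0, second level -> 1
male01 <- as.integer(gender) - 1L
male01
```

The comparison form states the mapping directly, while the second form changes meaning if the levels are reordered. A numeric vector like this can also be passed to functions that do not accept factors, such as cor().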
Admin
Updated on July 09, 2022
Comments
-
Admin almost 2 years
I have a variable called gender with binary categorical values "female"/"male". I want to change its type to integer 0/1 so that I can use it in a regression analysis, i.e., I want the values "female" and "male" to be mapped to 1 and 0.
> str(gender)
gender : Factor w/ 2 levels "female","male": 1 1 1 0 0 0 0 1 1 0 ...
> gender[1]
[1] female
I would like to convert the type of gender so that I get the integer value 1 when I query an element, i.e.
> gender[1]
[1] 1
-
mnel over 11 years
+1, far better to address the real issue, not the exact problem!
-
Marius over 11 years
Or, to see even more explicitly how the levels will be treated in a regression model, use contrasts(factor(test)).
-
Kevin T about 3 years
@Dason, what about if you wanted to include gender in a correlation matrix? This will not work if gender is a factor.