Stata: combine multiple variables into one

Solution 1

From your example, which implies numeric variables and at most one variable non-missing in each observation, egen's rowmax() function is all you need.

egen d = rowmax(a b c)
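
As a minimal sketch of how this plays out, the example below uses made-up numeric values in the question's layout and first checks the assumption that at most one of the variables is nonmissing in each observation. The variable name n_nonmiss and the data values are illustrative only, not from the original post.

```stata
* Toy numeric data in the question's layout (values are made up)
clear
input ID a b c
1 10  .  .
2 20  .  .
3  . 30  .
4  . 40  .
5  .  . 50
end

* Check the assumption: at most one of a, b, c is nonmissing per observation
egen n_nonmiss = rownonmiss(a b c)
assert n_nonmiss <= 1

* rowmax() ignores missing values, so d picks up the single nonmissing value
egen d = rowmax(a b c)
list
```

With around 90 variables, a varlist range such as a-c or a wildcard should be able to stand in for listing each name, since rowmax() takes a varlist.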

Solution 2

You can loop over the variables, replacing the new variable with the nonmissing values of the other variables. This assumes your variables are strings. Nick's solution works better for numeric variables.

clear
input ID str5(a b c)
1  x "" ""
2  y "" ""
3  "" z ""
4  "" w ""
5  "" "" u
end
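* Start with an empty string variable
gen d = ""
* Fill d from each variable in turn, only where d is still empty
foreach v of varlist a-c {
    replace d = `v' if mi(d)
}
list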

Solution 3

You could similarly use stack as you were, while specifying the wide option:

clear
input ID str5(a b c)
1  x "" ""
2  y "" ""
3  "" z ""
4  "" w ""
5  "" "" u
end

stack a b c, into(d) wide clear
keep if !mi(d)

Author: Jay G

Updated on August 15, 2020

Comments

  • Jay G over 3 years

    I have a problem in Stata. What I want to do is to combine multiple variables into one. My data looks like the following (simplified):

    ID a b c
    1  x . .
    2  y . .
    3  . z .
    4  . w .
    5  . . u
    

    Now I want to generate a new variable d consisting of all values of variables a, b and c, such that d has no missing values:

    ID a b c d
    1  x . . x
    2  y . . y
    3  . z . z
    4  . w . w
    5  . . u u
    

    I tried to use the command stack a b c, into(d) but then Stata gives me a warning that data will be lost and what is left of my data is only the stacked variable and nothing else. Is there another way to do it without renaming the variables a, b and c?

    My dataset contains around 90 of these variables, which I want to combine into a single variable, so maybe there is an efficient way to do so.

  • Maximillian Laumeister over 8 years
    Thank you for posting an answer to this question! Code-only answers are discouraged on Stack Overflow, because a code dump with no context doesn't explain how or why the solution will work, making it difficult for the original poster (or any future readers) to understand the logic behind it. Please, edit your question and include an explanation of your code so that others can benefit from your answer. Thanks!
  • Nick Cox over 8 years
    If all variables are string, as in this example, egen's concat() function is an alternative.
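
A minimal sketch of the concat() route, reusing the string data from the answers above and assuming at most one of the variables is nonempty in each observation (otherwise the values would be joined together):

```stata
clear
input ID str5(a b c)
1  x "" ""
2  y "" ""
3  "" z ""
4  "" w ""
5  "" "" u
end

* concat() joins the string values of a, b and c; empty strings add nothing
egen d = concat(a b c)
list
```

This is only appropriate for string variables: with numeric variables, concat() converts the values to strings, so numeric missing values would show up in the result.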