Drop variables with all missing values


Solution 1

In an up-to-date Stata, either search dropmiss or search nmissing will tell you that both commands have been superseded by missings, published in the Stata Journal.

The following dialogue may illuminate your question:

. sysuse auto , clear
(1978 Automobile Data)

. generate empty = .
(74 missing values generated)

. missings dropvars
force option required with changed dataset
r(4);

. missings dropvars, force

Checking missings in make price mpg rep78 headroom trunk weight length turn
    displacement gear_ratio foreign empty:
74 observations with missing values

note: empty dropped

missings dropvars, once installed, will drop all variables that are entirely missing. The one catch is that you need the force option if the dataset in memory has changed since it was last saved, which is what the r(4) error above is reporting.
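
For completeness, the installation step and the full sequence might look like the sketch below. It assumes that you have internet access and that missings can be installed from SSC; if that does not work in your setup, search missings should point to the Stata Journal files instead.

```stata
* Assumption: missings is available from SSC and you are online.
ssc install missings, replace

sysuse auto, clear
generate empty = .

* Optional check first: report which variables have missing values.
missings report

* Drop variables that are missing in every observation.
* force is needed here because the dataset has changed since it was loaded.
missings dropvars, force
```

Running missings report before dropping anything is a cheap way to confirm that only the variables you expect are entirely missing.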

Solution 2

You can simply loop over all variables in your dataset and use the capture and assert commands to test which ones have all their values missing.

The advantage of this approach is that you can do this with only built-in Stata commands:

clear

input X1 X2 X3
1 2 .
. 3 .
3 . .
. 5 .
end

list
     +--------------+
     | X1   X2   X3 |
     |--------------|
  1. |  1    2    . |
  2. |  .    3    . |
  3. |  3    .    . |
  4. |  .    5    . |
     +--------------+

foreach var of varlist _all {
    capture assert missing(`var')
    if !_rc {
        drop `var'
    }
}

list
     +---------+
     | X1   X2 |
     |---------|
  1. |  1    2 |
  2. |  .    3 |
  3. |  3    . |
  4. |  .    5 |
     +---------+
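
A possible variation, sketched here under the same built-in-commands-only constraint, is to collect the names of the all-missing variables in a local macro and drop them with a single drop call. The macro name todrop is just an illustrative choice. With thousands of variables this also leaves you with a list you can inspect before anything is removed.

```stata
clear
input X1 X2 X3
1 2 .
. 3 .
3 . .
. 5 .
end

* Start with an empty list of variables to drop (todrop is a hypothetical name).
local todrop

foreach var of varlist _all {
    * Count the nonmissing values of this variable.
    quietly count if !missing(`var')
    if r(N) == 0 {
        local todrop `todrop' `var'
    }
}

* Show what would be dropped, then drop only if the list is not empty,
* because drop needs at least one variable name.
display "All missing: `todrop'"
if "`todrop'" != "" {
    drop `todrop'
}

describe, simple
```

Like the assert loop, this works for both numeric and string variables, because missing() accepts either type.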
Author: JodeCharger100

Updated on August 06, 2022

Comments

  • JodeCharger100 almost 2 years

    I have 5000 variables and 91,534 observations in my dataset.

    I want to drop all variables that have all their values missing:

    X1     X2    X3
    1      2      .
    .      3      .
    3      .      .
    .      5      .
    

    X1     X2
    1      2  
    .      3   
    3      . 
    .      5  
    

    I tried using the dropmiss community-contributed command, but it does not seem to be working for me even after reading the help file. For example:

    dropmiss 
    command dropmiss is unrecognized
    r(199);
    
    missings dropvars
    force option required with changed dataset
    

    Instead, as suggested in one of the solutions, I tried the following:

    ssc install nmissing
    nmissing, min(91534)  
    drop `r(varlist)'
    

    This alternative community-contributed command seems to work for me.

    However, I wanted to know if there is a more elegant solution, or a way to use dropmiss.

    • Nick Cox over 5 years
      So, you didn't install dropmiss or you installed it in the wrong place. That would be an explanation. However, I don't know how you could read the help file if you hadn't installed it. Either way, missings is now considered better (by all the program authors concerned).
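
      One way to check the explanation in this comment is sketched below: the built-in which command reports where Stata finds a command, and adopath lists the directories it searches. Nothing here is specific to dropmiss, so the same check applies to missings.

```stata
* Report the location of each command, or an error if Stata cannot find it.
* capture noisily shows the message but lets the remaining lines run.
capture noisily which dropmiss
capture noisily which missings

* List the directories Stata searches for ado-files.
adopath
```

      If which cannot find a command that you believe you installed, its files may be sitting in a directory that is not on the list shown by adopath.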