Converting R file to Stata with missing string values

I've had this error many times before, and it's easy to reproduce:

library(foreign)
test <- data.frame(a = "", b = 1, stringsAsFactors = FALSE)
write.dta(test, 'example.dta')

One solution is to use factor variables instead of character variables, e.g.,

for (colname in names(test)) {
  if (is.character(test[[colname]])) {
    test[[colname]] <- as.factor(test[[colname]])
  }
}

Another is to change the empty strings to something else and change them back in Stata.
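
A minimal sketch of that placeholder workaround, assuming the value "__EMPTY__" never occurs in the real data. The helper name replace_empty_strings and the placeholder itself are hypothetical, chosen only for illustration:

```r
library(foreign)

# Hypothetical helper: swap "" for a placeholder in every character column.
# NA values are left untouched.
replace_empty_strings <- function(df, placeholder = "__EMPTY__") {
  for (colname in names(df)) {
    col <- df[[colname]]
    if (is.character(col)) {
      col[!is.na(col) & col == ""] <- placeholder
      df[[colname]] <- col
    }
  }
  df
}

test <- data.frame(a = "", b = 1, stringsAsFactors = FALSE)
cleaned <- replace_empty_strings(test)

# Write to a temporary file so the sketch leaves nothing behind.
out_file <- tempfile(fileext = ".dta")
write.dta(cleaned, out_file)
```

After loading the file in Stata, something like `replace a = "" if a == "__EMPTY__"` should restore the empty strings, repeated for each affected string variable.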

This is purely a problem with write.dta, because Stata is perfectly fine with empty strings. But since foreign is frozen, there's not much you can do about that.

Update: (2015-12-04) A better solution is to use write_dta in the haven package:

library(haven)
test <- data.frame(a = "", b = 1, stringsAsFactors = FALSE)
write_dta(test, 'example.dta')

This way, Stata reads string variables properly as strings.

Author by

user3570187

Updated on July 03, 2022

Comments

  • user3570187 almost 2 years

    I am getting an error while converting an R data frame to Stata format. I can write the numeric variables to a Stata file, but when I include strings I get the following error:

    library(foreign)
    write.dta(newdata, "X.dta")
    
    Error in write.dta(newdata, "X.dta") : 
      empty string is not valid in Stata's documented format
    

    I have a few string variables, such as location and name, that contain missing values, which is probably what causes the problem. Is there a way to handle this?

    • Roberto Ferrer over 9 years
      Can you post example input data generating that error?
    • Roberto Ferrer over 9 years
      Just to be clear, if "empty string" is to be interpreted as "", then Stata does allow it. In fact, it corresponds to a missing value for a string variable.