Converting R file to Stata with missing string values
I've had this error many times before, and it's easy to reproduce:
library(foreign)
test <- data.frame(a = "", b = 1, stringsAsFactors = FALSE)
write.dta(test, 'example.dta')
One solution is to use factor variables instead of character variables, e.g.,
for (colname in names(test)) {
  if (is.character(test[[colname]])) {
    test[[colname]] <- as.factor(test[[colname]])
  }
}
Another is to change the empty strings to something else and change them back in Stata.
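A minimal sketch of that placeholder approach follows. The placeholder value "__EMPTY__" and the two-row example data are assumptions made for illustration; pick any value that cannot occur in your real data.

```r
# Hypothetical placeholder; choose a value that never appears in the real data.
placeholder <- "__EMPTY__"

test <- data.frame(a = c("", "x"), b = 1:2, stringsAsFactors = FALSE)

# Replace empty strings in every character column with the placeholder.
for (colname in names(test)) {
  if (is.character(test[[colname]])) {
    empty <- !is.na(test[[colname]]) & test[[colname]] == ""
    test[[colname]][empty] <- placeholder
  }
}

# foreign ships with standard R installations; the guard only keeps the sketch self-contained.
if (requireNamespace("foreign", quietly = TRUE)) {
  foreign::write.dta(test, file.path(tempdir(), "example.dta"))
}
```

After loading the file in Stata, something like `replace a = "" if a == "__EMPTY__"` should restore the empty strings, one variable at a time.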
This is purely a problem with write.dta, because Stata is perfectly fine with empty strings. But since foreign is frozen, there's not much you can do about that.
Update (2015-12-04): A better solution is to use write_dta in the haven package:
library(haven)
test <- data.frame(a = "", b = 1, stringsAsFactors = FALSE)
write_dta(test, 'example.dta')
This way, Stata reads string variables properly as strings.
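If you want a quick sanity check from R before opening the file in Stata, one option is to read the file back with haven's read_dta and inspect the column types. This is only a sketch: it assumes haven is installed, the temporary path is a hypothetical scratch location, and it checks the round trip in R rather than Stata's own behaviour.

```r
# Sketch only: assumes the haven package is installed.
if (requireNamespace("haven", quietly = TRUE)) {
  test <- data.frame(a = "", b = 1, stringsAsFactors = FALSE)
  path <- file.path(tempdir(), "example.dta")  # hypothetical scratch location

  haven::write_dta(test, path)
  back <- haven::read_dta(path)

  # Inspect the column types after the round trip.
  str(back)
}
```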
user3570187
Updated on July 03, 2022
Comments
-
user3570187 almost 2 years
I am getting an error while converting an R file into Stata format. I am able to convert the numbers into a Stata file, but when I include strings I get the following error:
library(foreign)
write.dta(newdata, "X.dta")
Error in write.dta(newdata, "X.dta") : empty string is not valid in Stata's documented format
I have a few string variables, such as location and name, that have missing values, which is probably what causes this problem. Is there a way to handle this?
-
Roberto Ferrer over 9 years
Can you post example input data generating that error?
-
Roberto Ferrer over 9 years
Just to be clear, if "empty string" is to be interpreted as "", then Stata does allow it. In fact, it corresponds to a missing value for a string variable.