Is there a way to get past the "too many values" error in Stata when using tabulate?
Solution 1
To some it might seem silly, or at least puzzling, that people want tables with more than 12,000 rows, as there must be a better way to display results or answer the question in mind.
That said, the limits of `tabulate` are hard-wired. But you just need to think of reproducing whatever you want to show. So, for one-way frequencies
. bysort rowvar : gen freq = _N
. by rowvar : gen tag = _n == 1
. gsort -freq rowvar
. list rowvar freq if tag, noobs
and for two-way frequencies
. bysort rowvar colvar : gen freq = _N
. by rowvar colvar : gen tag = _n == 1
. gsort -freq rowvar colvar
. list rowvar colvar freq if tag, noobs
A similar approach, with more bells and whistles, is coded within `groups` (SSC). An even simpler approach in many ways is to `collapse` or `contract` the dataset and then `list` it.
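As a minimal sketch of the `contract` route, assuming Stata's bundled auto dataset as stand-in data (rep78 plays the role of rowvar, and the condition foreign == 0 is purely illustrative), the frequencies for a subset can be reduced to one observation per distinct value and then listed:

```stata
* Illustrative stand-in data: rep78 plays the role of rowvar
sysuse auto, clear

* contract replaces the data in memory with one observation per
* distinct value of rep78 (missing included), among domestic cars only
contract rep78 if foreign == 0, freq(freq)

* Largest frequencies first, then ascending rep78 within ties
gsort -freq rep78
list rep78 freq, noobs
```

Because `contract` replaces the data in memory, wrapping it in `preserve` and `restore` keeps the original dataset available afterwards.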
To flag the general strategy here:
Produce what you want as new variables.
Select just one observation from each group if there are multiple observations.
`list`, not `tabulate`.
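The three steps above can be sketched as follows, with a percentage added as a second new variable. This is only an illustration: the auto dataset and rep78 are stand-ins for your own data and rowvar, and pct is a hypothetical variable name.

```stata
* Illustrative stand-in data: rep78 plays the role of rowvar
sysuse auto, clear

* Step 1: produce what you want as new variables
bysort rep78 : gen freq = _N
* Outside by:, _N is the total number of observations
gen pct = 100 * freq / _N

* Step 2: select just one observation from each group
by rep78 : gen tag = _n == 1

* Step 3: list, not tabulate
gsort -freq rep78
list rep78 freq pct if tag, noobs
```

Note that an `if` qualifier on the `gen` line would not change `_N`, which counts all observations in the by-group; to restrict the counts to a subset, one option is to `keep` or `drop` observations first.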
UPDATE
OP asked
. bysort rowvar : gen freq = _N
OP: This generates the `freq` variable for the last count of every individual value in my `rowvar`
Me: No. The `freq` variable is the count of observations for every distinct value of `rowvar`.
. by rowvar : gen tag = _n == 1
OP: This generates the `tag` variable for the first count of every unique observation in `rowvar`.
Me: Correct, provided you say "distinct", not "unique". Unique values occur once only.
. gsort -freq rowvar
OP: This sorts `freq` and `rowvar` in descending order
Me: It sorts `freq` in descending order and `rowvar` in ascending order within blocks of constant `freq`.
. list rowvar freq if tag, noobs
OP: What does `if` do here?
Me: That one is left as an exercise.
Solution 2
Use the command `bigtab`. (You have to install the package first: run `ssc install bigtab`.) For help type `h bigtab`.
Updated on July 13, 2022
Comments
-
Admin almost 2 years
I am trying to generate frequencies for a variable in Stata conditional on categories of another variable.
This other categorical variable has about 790,000 observations for the category I am interested in.
Stata's limits of 12,000 rows for one-way tables and 1,200 rows for two-way tables make this impossible.
Every time I run
tab x if y==<category of interest>
I get the following error: too many values r(134);
I installed the `bigtab` package and, though it gives me tables, it cannot be used with `by` or run statistical tests. Is there a workaround for this?
It seems silly that Stata should have this arbitrary limit when SAS and even SPSS can run the exact same operation without trouble.
-
Admin over 10 years
Hello, Thanks for your reply. (My apologies, I didn't realize hitting enter would submit the comment) I am pretty much a novice at Stata so bear with me as I run through this code
`bysort rowvar : gen freq = _N`
This generates the freq variable for the last count of every individual value in my rowvar
`by rowvar : gen tag = _n == 1`
This generates the tag variable for the first count of every unique observation in rowvar
`gsort -freq rowvar`
This sorts freq and rowvar in descending order
`list rowvar freq if tag, noobs`
What does if do here?
-
Nick Cox over 9 years
The user reported problems with this command back when the question was posted.