How to generate a frequency table in R with cumulative frequency and relative frequency

Solution 1

You're close! There are a few functions that will make this easy for you, namely cumsum() and prop.table(). Here's how I'd probably put this together. I make some random data, but the point is the same:

#Fake data
x <- sample(10:20, 44, TRUE)
#Your code
factorx <- factor(cut(x, breaks=nclass.Sturges(x)))
#Tabulate and turn into data.frame
xout <- as.data.frame(table(factorx))
#Add cumFreq and proportions
xout <- transform(xout, cumFreq = cumsum(Freq), relative = prop.table(Freq))
#-----
      factorx Freq cumFreq   relative
1 (9.99,11.4]   11      11 0.25000000
2 (11.4,12.9]    3      14 0.06818182
3 (12.9,14.3]   11      25 0.25000000
4 (14.3,15.7]    2      27 0.04545455
5 (15.7,17.1]    6      33 0.13636364
6 (17.1,18.6]    3      36 0.06818182
7   (18.6,20]    8      44 0.18181818
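
Applied to the 44 values from the question instead of random data, the same steps can be sketched as below. The vector x is copied from the question, and the column names cumFreq and relative follow the code above. Dropping the outer factor() call is my own simplification: cut() already returns a factor, and without factor() any empty classes stay in the table with a count of 0.

```r
# Data copied from the question (44 observations)
x <- c(17, 17, 17, 17, 17, 17, 17, 17, 16, 16, 16, 16, 16, 18, 18, 18, 10, 12,
       17, 17, 17, 17, 17, 17, 17, 17, 16, 16, 16, 16, 16, 18, 18, 18, 10, 12,
       15, 19, 20, 22, 20, 19, 19, 19)

# Bin into the number of classes suggested by Sturges' rule
bins <- cut(x, breaks = nclass.Sturges(x))

# Tabulate, then add cumulative and relative frequency columns
xout <- as.data.frame(table(bins))
xout <- transform(xout, cumFreq = cumsum(Freq), relative = prop.table(Freq))
print(xout)
```

With this data the counts per class should come out as 2, 2, 1, 10, 22, 6 and 1, which is the table the question asks for.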

Solution 2

The base functions table, cumsum and prop.table should get you there:

 cbind( Freq=table(x), Cumul=cumsum(table(x)), relative=prop.table(table(x)))
   Freq Cumul   relative
10    2     2 0.04545455
12    2     4 0.04545455
15    1     5 0.02272727
16   10    15 0.22727273
17   16    31 0.36363636
18    6    37 0.13636364
19    4    41 0.09090909
20    2    43 0.04545455
22    1    44 0.02272727

With cbind and column names of your choosing, this should be easy to reuse in the future. cbind() returns a matrix here, so the result is a matrix rather than a data frame. On a large vector it would be more efficient to tabulate once and reuse the result:

tbl <- table(x)
cbind( Freq=tbl, Cumul=cumsum(tbl), relative=prop.table(tbl))
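
If a data frame is more convenient than a matrix, the tabulate-once idea can be wrapped in a small helper. This is only a sketch: the function name freq_table and its column names are hypothetical, chosen for illustration, and the cumulative relative frequency column is an extra that the code above does not compute.

```r
# Hypothetical helper: frequency table as a data frame from a single table() call
freq_table <- function(x) {
  tbl  <- table(x)          # tabulate once
  freq <- as.vector(tbl)    # plain integer counts
  data.frame(
    value    = names(tbl),  # distinct values, as character labels
    freq     = freq,
    cum_freq = cumsum(freq),
    rel_freq = freq / sum(freq),
    cum_rel  = cumsum(freq) / sum(freq)
  )
}

# Small usage example with made-up values
ft <- freq_table(c(10, 12, 12, 15, 15, 15))
print(ft)
```

Note that table() drops NA values by default, so the relative frequencies here are computed over the non-missing observations only.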

Solution 3

If you are looking for something pre-packaged, consider the freq() function from the descr package.

library(descr)
x = c(sample(10:20, 44, TRUE))
freq(x, plot = FALSE)

Or, to get cumulative percents, wrap the vector in ordered():

freq(ordered(x), plot = FALSE)

To add a "cumulative frequencies" column:

tab = as.data.frame(freq(ordered(x), plot = FALSE))
CumFreq = cumsum(tab[-dim(tab)[1],]$Frequency)
tab$CumFreq = c(CumFreq, NA)
tab

If your data has missing values, a valid percent column is added to the table.

x = c(sample(10:20, 44, TRUE), NA, NA)
freq(ordered(x), plot = FALSE)
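
If installing descr is not an option, the difference between a percent and a valid percent can be sketched in base R. This is not descr's implementation, only an illustration of the idea: percent is taken over all observations including NA, valid percent over the non-missing ones. The column names and the small vector x are made up for the example.

```r
# Made-up data with two missing values
x <- c(17, 17, 16, 18, 10, NA, NA)

tbl_all   <- table(x, useNA = "always")  # counts, with an NA row always last
tbl_valid <- table(x)                    # counts of non-missing values only

out <- data.frame(
  value        = names(tbl_all),
  Frequency    = as.vector(tbl_all),
  Percent      = as.vector(100 * prop.table(tbl_all)),
  # The NA row gets no valid percent
  ValidPercent = c(as.vector(100 * prop.table(tbl_valid)), NA)
)
print(out)
```

Using useNA = "always" rather than "ifany" keeps the NA row present even when nothing is missing, so the two columns always have the same length.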

Solution 4

Yet another possibility:

library(SciencesPo)
x = c(sample(10:20, 50, TRUE))
freq(x)
Author by

eloyesp

Updated on February 25, 2020

Comments

  • eloyesp, over 4 years

    I'm new to R. I need to generate a simple frequency table (as in books) with cumulative frequency and relative frequency.

    So I want to generate from some simple data like

    > x
    [1] 17 17 17 17 17 17 17 17 16 16 16 16 16 18 18 18 10 12 17 17 17 17 17 17 17 17 16 16 16 16 16 18 18 18 10
    [36] 12 15 19 20 22 20 19 19 19
    

    a table like:

                frequency  cumulative   relative
    (9.99,11.7]    2            2       0.04545455
    (11.7,13.4]    2            4       0.04545455
    (13.4,15.1]    1            5       0.02272727
    (15.1,16.9]   10           15       0.22727273
    (16.9,18.6]   22           37       0.50000000
    (18.6,20.3]    6           43       0.13636364
    (20.3,22]      1           44       0.02272727
    

    I know it should be simple, but I don't know how.

    I got some results using this code:

    factorx <- factor(cut(x, breaks=nclass.Sturges(x)))
    as.matrix(table(factorx))