Use data.table to count and aggregate / summarize a column

r dataframe data.table aggregate

57,579

The post you are referring to gives a method on how to apply one aggregation method to several columns. If you want to apply different aggregation methods to different columns, you can do:

dat[, .(count = .N, var = sum(VAR)), by = MNTH]

this results in:

     MNTH count var
1: 201501     4   2
2: 201502     3   0
3: 201503     5   2
4: 201504     4   2

You can also add these values to your existing dataset by updating your dataset by reference:

dat[, `:=` (count = .N, var = sum(VAR)), by = MNTH]

this results in:

> dat
      MNTH VAR count var
 1: 201501   1     4   2
 2: 201501   1     4   2
 3: 201501   0     4   2
 4: 201501   0     4   2
 5: 201502   0     3   0
 6: 201502   0     3   0
 7: 201502   0     3   0
 8: 201503   0     5   2
 9: 201503   0     5   2
10: 201503   1     5   2
11: 201503   1     5   2
12: 201503   0     5   2
13: 201504   1     4   2
14: 201504   0     4   2
15: 201504   1     4   2
16: 201504   0     4   2

For further reading about how to use data.table syntax, see the Getting started guides on the GitHub wiki.

57,579

Author by

Whitebeard

Areas of interest: Statistics R Python Alteryx

Updated on June 27, 2020

Comments

Whitebeard almost 4 years

I want to count and aggregate(sum) a column in a data.table, and couldn't find the most efficient way to do this. This seems to be close to what I want R summarizing multiple columns with data.table.

My data:

set.seed(321)
dat <- data.table(MNTH = c(rep(201501,4), rep(201502,3), rep(201503,5), rep(201504,4)), 
                  VAR = sample(c(0,1), 16, replace=T))

> dat
     MNTH VAR
 1: 201501   1
 2: 201501   1
 3: 201501   0
 4: 201501   0
 5: 201502   0
 6: 201502   0
 7: 201502   0
 8: 201503   0
 9: 201503   0
10: 201503   1
11: 201503   1
12: 201503   0
13: 201504   1
14: 201504   0
15: 201504   1
16: 201504   0

I want to both count and sum VAR by MNTH using data.table. The desired result:

    MNTH COUNT VAR
1 201501     4   2
2 201502     3   0
3 201503     5   2
4 201504     4   2

Recents

Why Is PNG file with Drop Shadow in Flutter Web App Grainy?

How to troubleshoot crashes detected by Google Play Store for Flutter app

Cupertino DateTime picker interfering with scroll behaviour

Why does awk -F work for most letters, but not for the letter "t"?

Flutter change focus color and icon color but not works

How to print and connect to printer using flutter desktop via usb?

Critical issues have been reported with the following SDK versions: com.google.android.gms:play-services-safetynet:17.0.0

Flutter Dart - get localized country name from country code

navigatorState is null when using pushNamed Navigation onGenerateRoutes of GetMaterialPage

Android Sdk manager not found- Flutter doctor error

Flutter Laravel Push Notification without using any third party like(firebase,onesignal..etc)

How to change the color of ElevatedButton when entering text in TextField

R data.table: How to sum variables by group based on a condition?

Aggregate / summarize multiple variables per group (e.g. sum, mean)

aggregating multiple columns in data.table

Loop over a data.table rows with condition

Aggregate15 minute data to hourly

Aggregating table() over multiple columns in R without a "by" breakdown

How to aggregate data in data.frame in R

Apply Encoding to Entire Data.Table

Remove columns of dataframe based on conditions in R

Subset by column index in R - Data.Table vs. dataframe

Use data.table to count and aggregate / summarize a column

Whitebeard

Comments

Recents

Related