How can I sum values of columns with dcast()?

10,402

A dcast() version

This works for me:

require("reshape2")
dcast(counts, Year ~ Species, value.var = "Counts", fun.aggregate = sum)

> dcast(counts, Year ~ Species, value.var = "Counts", fun.aggregate = sum)
  Year SP1 SP2 SP3
1 2002   5  10   2
2 2003   0   0   2
3 2004   2   0   1

Check that counts$Counts is numeric; look at the output from str(counts) where counts is your DF. I created counts via:

counts <- read.table(text = "Year    Species Counts
2002    SP1 2
2002    SP1 3
2004    SP1 2
2002    SP2 8
2002    SP2 2
2002    SP3 1
2002    SP3 1
2003    SP3 2
2004    SP3 1", header = TRUE)

This was using

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8      stringr_0.6.2 tools_3.0.1

Alternative base R version using xtabs()

You might also like to try the xtabs() function which comes with base R

xtabs(Counts ~ Year + Species, data = counts)

> xtabs(Counts ~ Year + Species, data = counts)
      Species
Year   SP1 SP2 SP3
  2002   5  10   2
  2003   0   0   2
  2004   2   0   1
Share:
10,402
user2542995
Author by

user2542995

Updated on June 13, 2022

Comments

  • user2542995
    user2542995 over 1 year

    I'm stuck with a dcast function; I'm trying to create a sum table for individuals of many species per counting year.

    I have a data frame with 3 columns: (1) the year (factor), (2) the names of the species (factor), and (3) the counts (numeric).

    Year    Species Counts
    2002    SP1 2
    2002    SP1 3
    2004    SP1 2
    2002    SP2 8
    2002    SP2 2
    2002    SP3 1
    2002    SP3 1
    2003    SP3 2
    2004    SP3 1
    

    I'm trying to get this kind of table with sums:

        2002    2003    2004
    SP1 5   0   2
    SP2 10  0   0
    SP3 2   2   1
    

    Aggregate does not do what I want. I'm using the dcast function like this:

    dcast( DF, Species ~ Year , sum)
    

    Whatever I try, sums do not work with factors. I get this error message:

    (Error in Summary.factor(integer(0), na.rm = FALSE): sum not meaningful for factors 
    

    When I try the default function (length), I got the sum of counts of rows instead of the sums of the individuals. When I try to set the sum to deal with my "Counts" column instead of the other factors, it also doesn't work or I get the same error message.

    How can I get that kind of table with sums of Counts?


    EDIT:

    The function calculate false sums and generate individuals. Here I show you a comparison of the sums made with a cross table in Excel, then in R:

    EXCEL                   2003    2004    2005    2006    2007    2008    2009
    Anthocharis_cardamines  1   0   2   2   0   1   0
    Apatura_ilia            0   0   0   0   1   0   0
    Aporia_crataegi         2   3   29  26  6   4   3
    Brintesia_circe         126 217 199 303 64  99  55
    
    
    DCAST                   2003    2004    2005    2006    2007    2008    2009
    Anthocharis_cardamines  2   0   4   4   0   2   0
    Apatura_ilia            0   0   0   0   2   0   0
    Aporia_crataegi         4   6   258 205 25  8   6
    Brintesia_circe         883 1334    1050    1770    490 848 354
    

    The numbers calculated dont even correspond to the sums of lines i had yesterday. How are these sums working ?

    EDIT 2:

    >dput(head(counts, 10)
    structure(list(year = structure(c(16L, 16L, 16L, 16L, 16L, 16L, 
    16L, 16L, 16L, 15L), .Label = c("1994", "1995", "1996", "1997", 
    "1998", "1999", "2000", "2001", "2002", "2003", "2004", "2005", 
    "2006", "2007", "2008", "2009"), class = "factor"), species = structure(c(146L, 
    146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L), .Label = c("Aglais_urticae", 
    "Anthocharis_cardamines", "Anthocharis_euphenoides", "Apatura_ilia", 
    "Apatura_iris", "Aphantopus_hyperantus", "Aporia_crataegi", "Araschnia_levana", 
    "Arethusana_arethusa", "Argynnis_adippe", "Argynnis_aglaja", 
    "Argynnis_paphia", "Aricia_agestis", "Boloria_dia", "Boloria_euphrosyne", 
    "Boloria_selene", "Brenthis_daphne", "Brenthis_ino", "Brintesia_circe", 
    "Callophrys_rubi", "Carcharodus_alceae", "Carcharodus_floccifera", 
    "Carcharodus_lavatherae", "Carterocephalus_palaemon", "Celastrina_argiolus", 
    "Charaxes_jasius", "Chazara_briseis", "Clossiana_dia", "Coenonympha_arcania", 
    "Coenonympha_dorus", "Coenonympha_glycerion", "Coenonympha_oedippus", 
    "Coenonympha_pamphilus", "Coenonympha_tullia", "Colias_alfacariensis", 
    "Colias_croceus", "Colias_hyale", "Colias_palaeno", "Cupido_alcetas", 
    "Cupido_argiades", "Cupido_minimus", "Cupido_osiris", "Diacrisia_sannio", 
    "Erebia_aethiops", "Erebia_euryale", "Erebia_ligea", "Erebia_medusa", 
    "Erebia_meolans", "Erynnis_tages", "Euchloe_crameri", "Euclidia_glyphica", 
    "Euphydryas_aurinia", "Euplagia_quadripunctaria", "Everes_argiades", 
    "Fabriciana_adippe", "Glaucopsyche_alcon", "Glaucopsyche_alexis", 
    "Glaucopsyche_arion", "Glaucopsyche_melanops", "Glaucopsyche_nausithous", 
    "Glaucopsyche_teleius", "Gonepteryx_cleopatra", "Gonepteryx_rhamni", 
    "Hamearis_lucina", "Hesperia_comma", "Heteropterus_morpheus", 
    "Hipparchia_fidia", "Hipparchia_semele", "Hyles_euphorbiae", 
    "Hyponephele_lupinus", "Inachis_io", "Iphiclides_podalirius", 
    "Issoria_lathonia", "Lampides_boeticus", "Lasiommata_maera", 
    "Lasiommata_megera", "Leptidea_sinapis", "Leptotes_pirithous", 
    "Libelloides_coccajus", "Libelloides_longicornis", "Limenitis_camilla", 
    "Limenitis_populi", "Limenitis_reducta", "Lopinga_achine", "Lycaena_alciphron", 
    "Lycaena_dispar", "Lycaena_helle", "Lycaena_phlaeas", "Lycaena_tityrus", 
    "Macroglossum_stellatarum", "Maculinea_arion", "Maniola_jurtina", 
    "Melanargia_arge", "Melanargia_galathea", "Melanargia_lachesis", 
    "Melanargia_occitanica", "Melitaea_cinxia", "Melitaea_diamina", 
    "Melitaea_didyma", "Melitaea_phoebe", "Mesoacidalia_aglaja", 
    "Minois_dryas", "Neohipparchia_statilinus", "Neozephyrus_quercus", 
    "Nymphalis_antiopa", "Nymphalis_polychloros", "Ochlodes_sylvanus", 
    "Ochlodes_venatus", "Palaeochrysophanus_hippothoe", "Papilio_machaon", 
    "Pararge_aegeria", "Pieris_napi", "Plebeius_agestis", "Plebeius_argyrognomon", 
    "Polygonia_c-album", "Polyommatus_bellargus", "Polyommatus_coridon", 
    "Polyommatus_escheri", "Polyommatus_icarus", "Polyommatus_semiargus", 
    "Polyommatus_thersites", "Pontia_daplidice", "Pseudopanthera_macularia", 
    "Pseudophilotes_baton", "Pseudotergumia_fidia", "Pyrgus_malvae", 
    "Pyronia_bathseba", "Pyronia_cecilia", "Pyronia_tithonus", "Quercusia_quercus", 
    "Satyrium_acaciae", "Satyrium_esculi", "Satyrium_ilicis", "Satyrium_pruni", 
    "Satyrium_spini", "Satyrium_w-album", "Smerinthus_ocellatus", 
    "Speyeria_aglaja", "Spialia_sertorius", "Thecla_betulae", "Thymelicus_acteon", 
    "Thymelicus_lineola", "Thymelicus_sylvestris", "Vanessa_atalanta", 
    "Vanessa_cardui", "Zerynthia_polyxena", "Zygaena_carniolica", 
    "Zygaena_ephialtes", "Zygaena_erythrus", "Zygaena_fausta", "Zygaena_filipendulae", 
    "Zygaena_hilaris", "Zygaena_loti", "Zygaena_occitanica", "Zygaena_purpuralis", 
    "Zygaena_sarpedon", "Zygaena_transalpina"), class = "factor"), 
        Counts = c(2, 2, 2, 2, 2, 17, 52, 2, 2, 17)), .Names = c("year", 
    "species", "Counts"), row.names = 5479:5488, class = "data.frame")
    
    > str(counts)
    'data.frame':   3161 obs. of  3 variables:
     $ year   : Factor w/ 16 levels "1994","1995",..: 16 16 16 16 16 16 16 16 16 15 ...
     $ species: Factor w/ 157 levels "Aglais_urticae",..: 146 146 146 146 146 146 146 146 146 146 ...
     $ Counts : num  2 2 2 2 2 17 52 2 2 17 ...
    

    I hope its helps...

  • user2542995
    user2542995 over 10 years
    Thanks for your help but this is not working either. I had an issue with the type of my "counts" vector that i successfully convert from factor to numeric. The function is now working but shows false sums for each species and each year. I edited my answer to include new tables.
  • Gavin Simpson
    Gavin Simpson over 10 years
    @user2542995 Can you add to your Question the output from dput(head(DF, 10)) and str(DF).
  • user2542995
    user2542995 over 10 years
    I finally found a solution, and the problem was not coming from the dcast function but from the as.numeric function which convert false numbers. It's a common issue with this function that i was not aware of. Anyway, thanks for your help!