How can I sum values of columns with dcast()?
A dcast()
version
This works for me:
require("reshape2")
dcast(counts, Year ~ Species, value.var = "Counts", fun.aggregate = sum)
> dcast(counts, Year ~ Species, value.var = "Counts", fun.aggregate = sum)
Year SP1 SP2 SP3
1 2002 5 10 2
2 2003 0 0 2
3 2004 2 0 1
Check that counts$Counts
is numeric; look at the output from str(counts)
where counts
is your DF
. I created counts
via:
counts <- read.table(text = "Year Species Counts
2002 SP1 2
2002 SP1 3
2004 SP1 2
2002 SP2 8
2002 SP2 2
2002 SP3 1
2002 SP3 1
2003 SP3 2
2004 SP3 1", header = TRUE)
This was using
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reshape2_1.2.2
loaded via a namespace (and not attached):
[1] plyr_1.8 stringr_0.6.2 tools_3.0.1
Alternative base R version using xtabs()
You might also like to try the xtabs()
function which comes with base R
xtabs(Counts ~ Year + Species, data = counts)
> xtabs(Counts ~ Year + Species, data = counts)
Species
Year SP1 SP2 SP3
2002 5 10 2
2003 0 0 2
2004 2 0 1
user2542995
Updated on June 13, 2022Comments
-
user2542995 over 1 year
I'm stuck with a dcast function; I'm trying to create a sum table for individuals of many species per counting year.
I have a data frame with 3 columns: (1) the year (factor), (2) the names of the species (factor), and (3) the counts (numeric).
Year Species Counts 2002 SP1 2 2002 SP1 3 2004 SP1 2 2002 SP2 8 2002 SP2 2 2002 SP3 1 2002 SP3 1 2003 SP3 2 2004 SP3 1
I'm trying to get this kind of table with sums:
2002 2003 2004 SP1 5 0 2 SP2 10 0 0 SP3 2 2 1
Aggregate does not do what I want. I'm using the
dcast
function like this:dcast( DF, Species ~ Year , sum)
Whatever I try, sums do not work with factors. I get this error message:
(Error in Summary.factor(integer(0), na.rm = FALSE): sum not meaningful for factors
When I try the default function (length), I got the sum of counts of rows instead of the sums of the individuals. When I try to set the sum to deal with my "Counts" column instead of the other factors, it also doesn't work or I get the same error message.
How can I get that kind of table with sums of Counts?
EDIT:
The function calculate false sums and generate individuals. Here I show you a comparison of the sums made with a cross table in Excel, then in R:
EXCEL 2003 2004 2005 2006 2007 2008 2009 Anthocharis_cardamines 1 0 2 2 0 1 0 Apatura_ilia 0 0 0 0 1 0 0 Aporia_crataegi 2 3 29 26 6 4 3 Brintesia_circe 126 217 199 303 64 99 55 DCAST 2003 2004 2005 2006 2007 2008 2009 Anthocharis_cardamines 2 0 4 4 0 2 0 Apatura_ilia 0 0 0 0 2 0 0 Aporia_crataegi 4 6 258 205 25 8 6 Brintesia_circe 883 1334 1050 1770 490 848 354
The numbers calculated dont even correspond to the sums of lines i had yesterday. How are these sums working ?
EDIT 2:
>dput(head(counts, 10) structure(list(year = structure(c(16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 15L), .Label = c("1994", "1995", "1996", "1997", "1998", "1999", "2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009"), class = "factor"), species = structure(c(146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L, 146L), .Label = c("Aglais_urticae", "Anthocharis_cardamines", "Anthocharis_euphenoides", "Apatura_ilia", "Apatura_iris", "Aphantopus_hyperantus", "Aporia_crataegi", "Araschnia_levana", "Arethusana_arethusa", "Argynnis_adippe", "Argynnis_aglaja", "Argynnis_paphia", "Aricia_agestis", "Boloria_dia", "Boloria_euphrosyne", "Boloria_selene", "Brenthis_daphne", "Brenthis_ino", "Brintesia_circe", "Callophrys_rubi", "Carcharodus_alceae", "Carcharodus_floccifera", "Carcharodus_lavatherae", "Carterocephalus_palaemon", "Celastrina_argiolus", "Charaxes_jasius", "Chazara_briseis", "Clossiana_dia", "Coenonympha_arcania", "Coenonympha_dorus", "Coenonympha_glycerion", "Coenonympha_oedippus", "Coenonympha_pamphilus", "Coenonympha_tullia", "Colias_alfacariensis", "Colias_croceus", "Colias_hyale", "Colias_palaeno", "Cupido_alcetas", "Cupido_argiades", "Cupido_minimus", "Cupido_osiris", "Diacrisia_sannio", "Erebia_aethiops", "Erebia_euryale", "Erebia_ligea", "Erebia_medusa", "Erebia_meolans", "Erynnis_tages", "Euchloe_crameri", "Euclidia_glyphica", "Euphydryas_aurinia", "Euplagia_quadripunctaria", "Everes_argiades", "Fabriciana_adippe", "Glaucopsyche_alcon", "Glaucopsyche_alexis", "Glaucopsyche_arion", "Glaucopsyche_melanops", "Glaucopsyche_nausithous", "Glaucopsyche_teleius", "Gonepteryx_cleopatra", "Gonepteryx_rhamni", "Hamearis_lucina", "Hesperia_comma", "Heteropterus_morpheus", "Hipparchia_fidia", "Hipparchia_semele", "Hyles_euphorbiae", "Hyponephele_lupinus", "Inachis_io", "Iphiclides_podalirius", "Issoria_lathonia", "Lampides_boeticus", "Lasiommata_maera", "Lasiommata_megera", "Leptidea_sinapis", "Leptotes_pirithous", "Libelloides_coccajus", "Libelloides_longicornis", "Limenitis_camilla", "Limenitis_populi", "Limenitis_reducta", "Lopinga_achine", "Lycaena_alciphron", "Lycaena_dispar", "Lycaena_helle", "Lycaena_phlaeas", "Lycaena_tityrus", "Macroglossum_stellatarum", "Maculinea_arion", "Maniola_jurtina", "Melanargia_arge", "Melanargia_galathea", "Melanargia_lachesis", "Melanargia_occitanica", "Melitaea_cinxia", "Melitaea_diamina", "Melitaea_didyma", "Melitaea_phoebe", "Mesoacidalia_aglaja", "Minois_dryas", "Neohipparchia_statilinus", "Neozephyrus_quercus", "Nymphalis_antiopa", "Nymphalis_polychloros", "Ochlodes_sylvanus", "Ochlodes_venatus", "Palaeochrysophanus_hippothoe", "Papilio_machaon", "Pararge_aegeria", "Pieris_napi", "Plebeius_agestis", "Plebeius_argyrognomon", "Polygonia_c-album", "Polyommatus_bellargus", "Polyommatus_coridon", "Polyommatus_escheri", "Polyommatus_icarus", "Polyommatus_semiargus", "Polyommatus_thersites", "Pontia_daplidice", "Pseudopanthera_macularia", "Pseudophilotes_baton", "Pseudotergumia_fidia", "Pyrgus_malvae", "Pyronia_bathseba", "Pyronia_cecilia", "Pyronia_tithonus", "Quercusia_quercus", "Satyrium_acaciae", "Satyrium_esculi", "Satyrium_ilicis", "Satyrium_pruni", "Satyrium_spini", "Satyrium_w-album", "Smerinthus_ocellatus", "Speyeria_aglaja", "Spialia_sertorius", "Thecla_betulae", "Thymelicus_acteon", "Thymelicus_lineola", "Thymelicus_sylvestris", "Vanessa_atalanta", "Vanessa_cardui", "Zerynthia_polyxena", "Zygaena_carniolica", "Zygaena_ephialtes", "Zygaena_erythrus", "Zygaena_fausta", "Zygaena_filipendulae", "Zygaena_hilaris", "Zygaena_loti", "Zygaena_occitanica", "Zygaena_purpuralis", "Zygaena_sarpedon", "Zygaena_transalpina"), class = "factor"), Counts = c(2, 2, 2, 2, 2, 17, 52, 2, 2, 17)), .Names = c("year", "species", "Counts"), row.names = 5479:5488, class = "data.frame") > str(counts) 'data.frame': 3161 obs. of 3 variables: $ year : Factor w/ 16 levels "1994","1995",..: 16 16 16 16 16 16 16 16 16 15 ... $ species: Factor w/ 157 levels "Aglais_urticae",..: 146 146 146 146 146 146 146 146 146 146 ... $ Counts : num 2 2 2 2 2 17 52 2 2 17 ...
I hope its helps...
-
user2542995 over 10 yearsThanks for your help but this is not working either. I had an issue with the type of my "counts" vector that i successfully convert from factor to numeric. The function is now working but shows false sums for each species and each year. I edited my answer to include new tables.
-
Gavin Simpson over 10 years@user2542995 Can you add to your Question the output from
dput(head(DF, 10))
andstr(DF)
. -
user2542995 over 10 yearsI finally found a solution, and the problem was not coming from the dcast function but from the as.numeric function which convert false numbers. It's a common issue with this function that i was not aware of. Anyway, thanks for your help!