How can I combine two dataframes with different lengths in R?

14,815

Solution 1

Try using left_join in the dplyr package.

library(dplyr)

# make fake data
df1 <- data.frame(id = c("A", "B", "C", "D", "E"), val = rpois(5, 5))
df2 <- data.frame(id = c("A", "B", "C", "E"), val = rpois(4, 20))

# use left_join
df3 <- left_join(df1, df2, by = "id")

# rename and set NAs to 0
names(df3) <- c("id", "val", "val")
df3[is.na(df3)] <- 0

Solution 2

First of all it is not a good practice to have 2 variables with the same name or the same variable in two columns. It is best to have duplicate observations (e.g. Consmos twice in this case).

Based on that it is as simple as row binding or merging the two data frames:

df1 <- data.frame(descripcion_cuenta_N2 = c("Consumos", "Costes Personal", "Fungible Equipamiento", "Servicios"), anualidad = rep(2014, 4), valor = c(10, 11, 12, 13))
df2 <- data.frame(descripcion_cuenta_N2 = c("Consumos", "Costes Personal", "Fungible Equipamiento", "Servicios"), anualidad = rep(2014, 4), valor = c(11, 20, 8, 9))
df <- merge(df1, df2, all = TRUE)

which gives:

  descripcion_cuenta_N2 anualidad valor
1              Consumos      2014    10
2              Consumos      2014    11
3       Costes Personal      2014    11
4       Costes Personal      2014    20
5 Fungible Equipamiento      2014     8
6 Fungible Equipamiento      2014    12
7             Servicios      2014     9
8             Servicios      2014    13

This way is better as stated above.

If you insist on what you're asking you simply specify the variables to use in the merging:

df <- merge(df1, df2, by = c("descripcion_cuenta_N2", "anualidad"))

which results in:

  descripcion_cuenta_N2 anualidad valor.x valor.y
1              Consumos      2014      10      11
2       Costes Personal      2014      11      20
3 Fungible Equipamiento      2014      12       8
4             Servicios      2014      13       9

PS: It would make it easier to answer if your data frames where given in R. That is make your question reproducible and easy to answer. See How to make a great R reproducible example?

Share:
14,815
Theo Sloot
Author by

Theo Sloot

Updated on June 15, 2022

Comments

  • Theo Sloot
    Theo Sloot almost 2 years

    I have two dataframes, like those:

    Year 1

    and

    enter image description here

    I want to get something like that:

    enter image description here

    Here are the data frames for reproducibility:

    df1 <- data.frame(descripcion_cuenta_N2 = c("Consumos", "Costes Personal", "Fungible Equipamiento", "Servicios"), anualidad = rep(2014, 4), valor = c(10, 11, 12, 13))
    df2 <- data.frame(descripcion_cuenta_N2 = c("Consumos", "Costes Personal", "Fungible Equipamiento","Prestaciones", "Servicios"), anualidad = rep(2014, 5), valor = c(11, 20, 8, 9))
    

    A dataframe in which missed positions are filled with 0, because in some cases I am not getting data frames with the same amount of rows, and in those cases rbind fails, and I get an error.

    Which instruccion should use to combine those dataframes?

    Thanks

    PS: I know I can erase the repeated rows once the data frames are together.