Remove rows if the swap also exists in the data frame in R


You can sort the values within each row and then keep only the unique rows:

library(dplyr)

df %>%
 mutate(col1 = pmin(V1, V2), 
        col2 = pmax(V1, V2)) %>%
 distinct(col1, col2)

#  col1 col2
#1    1    2
#2    1    3
#3    1    4
#4    2    4

Using base R:

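# Build order-independent keys: col1 holds the smaller value, col2 the larger
df1 <- transform(df, col1 = pmin(V1, V2), col2 = pmax(V1, V2))
# Keep only the first row for each (col1, col2) key
df[!duplicated(df1[3:4]), ]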

Data:

df <- structure(list(V1 = c(1L, 1L, 1L, 2L, 4L, 2L), V2 = c(2L, 3L, 
4L, 4L, 2L, 1L)), class = "data.frame", row.names = c(NA, -6L))
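
The same idea can be extended to any number of columns by sorting each row as a whole. The following is only a sketch and not part of the original answer: it swaps `pmin`/`pmax` for `apply()` with `sort()`, and the names `sorted_rows` and `result` are made up for illustration.

```r
# Example data, same values as in the answer above
df <- data.frame(V1 = c(1L, 1L, 1L, 2L, 4L, 2L),
                 V2 = c(2L, 3L, 4L, 4L, 2L, 1L))

# Sort the values within each row. apply() returns one column per input row,
# so transpose to get one row per input row again.
sorted_rows <- t(apply(df, 1, sort))

# duplicated() on a matrix compares whole rows; keep the first occurrence
# of each unordered combination
result <- df[!duplicated(sorted_rows), ]
result
# keeps the first four rows: (1,2), (1,3), (1,4), (2,4)
```

Note that `apply()` converts the data frame to a matrix, so this is likely to be slower than the vectorised `pmin`/`pmax` version on large data, and it assumes all columns have the same type.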
Author: DigiPath

Updated on December 22, 2022

Comments

  • DigiPath
    DigiPath over 1 year

    I am trying to remove rows whose swapped version also exists in the data frame.

    For example, if I have a data frame:

    1 2
    1 3
    1 4
    2 4
    4 2
    2 1
    

    Then the rows (1,2) and (2,4) should be removed because (2,1) and (4,2) are also in the data frame. Is there a fast and neat way to do it? Thank you!

    • Ronak Shah
      Ronak Shah almost 4 years
      Can the same row be repeated twice? For example, (1, 2) and (1, 2)?
    • DigiPath
      DigiPath almost 4 years
      No. If (1, 2) is in the list, then (2, 1) cannot be in the list.
  • DigiPath
    DigiPath almost 4 years
    I like the dplyr solution, it is pretty fast! Thank you!
  • Ronak Shah
    Ronak Shah almost 4 years
    No, wait. Do you want to remove both the original and the swap, meaning both (1, 2) and (2, 1)? My answer only removes (2, 1).
  • Ronak Shah
    Ronak Shah almost 4 years
    In case you want to remove both, you can use df[!(duplicated(df1[3:4]) | duplicated(df1[3:4], fromLast = TRUE)), ]
  • DigiPath
    DigiPath almost 4 years
    No, I mean keep either (1,2) or (2,1), so I think your answer meets my needs. Thank you!
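
For readers who do want to drop both the original row and its swap, the one-liner from the comments can be written out as a self-contained sketch. It uses the same sample data as the answer; the names `key`, `in_swapped_pair` and `result` are hypothetical and only there for readability.

```r
# Example data, same values as in the answer
df <- data.frame(V1 = c(1L, 1L, 1L, 2L, 4L, 2L),
                 V2 = c(2L, 3L, 4L, 4L, 2L, 1L))

# Order-independent key: smaller value first, larger value second
df1 <- transform(df, col1 = pmin(V1, V2), col2 = pmax(V1, V2))
key <- df1[c("col1", "col2")]

# A row belongs to a swapped pair if its key occurs again, either later
# in the data (fromLast = TRUE) or earlier (the default direction)
in_swapped_pair <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Drop every row that has a swapped counterpart
result <- df[!in_swapped_pair, ]
result
# only (1,3) and (1,4) remain, with their original row names 2 and 3
```

Scanning with `duplicated()` in both directions is what flags the first occurrence as well as the later ones. A single call only flags the later ones, which is why the original answer keeps one row per pair.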