Remove rows if the swap also exists in the data frame in R
You can sort the two columns within each row and then keep only the unique rows:
library(dplyr)
df %>%
  mutate(col1 = pmin(V1, V2),
         col2 = pmax(V1, V2)) %>%
  distinct(col1, col2)
#   col1 col2
# 1    1    2
# 2    1    3
# 3    1    4
# 4    2    4
Using base R:
df1 <- transform(df, col1 = pmin(V1, V2), col2 = pmax(V1, V2))
df[!duplicated(df1[3:4]), ]
data
df <- structure(list(V1 = c(1L, 1L, 1L, 2L, 4L, 2L), V2 = c(2L, 3L,
4L, 4L, 2L, 1L)), class = "data.frame", row.names = c(NA, -6L))
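As a minimal, self-contained sketch of the base R idea on the sample data above: build an order-independent key with pmin() and pmax(), then keep the first row for each key. The names lo, hi, keep and result are illustrative and not part of the original answer.

```r
# Sample data, same values as the "data" section above.
df <- data.frame(V1 = c(1L, 1L, 1L, 2L, 4L, 2L),
                 V2 = c(2L, 3L, 4L, 4L, 2L, 1L))

# Order-independent key: smaller value first, larger value second,
# so (4, 2) and (2, 4) both map to the key (2, 4).
lo <- pmin(df$V1, df$V2)
hi <- pmax(df$V1, df$V2)

# duplicated() flags the second and later occurrences of each key,
# so negating it keeps the first row of every unordered pair.
keep <- !duplicated(data.frame(lo, hi))
result <- df[keep, ]
print(result)
```

Because the filter is applied to df itself, the kept rows retain their original V1/V2 order, whereas the distinct(col1, col2) version returns the sorted columns.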
Author
DigiPath
Updated on December 22, 2022
Comments
-
DigiPath over 1 year
I am trying to remove rows whose swap also exists in the data frame.
For example, if I have a data frame:
1 2
1 3
1 4
2 4
4 2
2 1
Then the rows (1, 2) and (2, 4) will be removed because (2, 1) and (4, 2) are also in the df. Is there a fast and neat way to do it? Thank you!
-
Ronak Shah almost 4 years
Can the same row be repeated twice? For example, (1, 2) and (1, 2)?
-
DigiPath almost 4 years
No, if (1, 2) is in the list then (2, 1) cannot be in the list.
-
DigiPath almost 4 years
I like the dplyr solution, it is pretty fast! Thank you!
-
Ronak Shah almost 4 years
No, wait. Do you want to remove both the original and the swap, meaning (1, 2) and (2, 1)? My answer only removes (2, 1).
-
Ronak Shah almost 4 years
If you want to remove both, you can use
df[!(duplicated(df1[3:4]) | duplicated(df1[3:4], fromLast = TRUE)), ]
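For reference, a self-contained sketch of that remove-both variant, assuming the sample data from the answer. The names key, has_swap and no_swap are illustrative and do not appear in the original thread.

```r
# Sample data, same values as in the answer.
df <- data.frame(V1 = c(1L, 1L, 1L, 2L, 4L, 2L),
                 V2 = c(2L, 3L, 4L, 4L, 2L, 1L))

# Order-independent key for each row.
key <- data.frame(lo = pmin(df$V1, df$V2),
                  hi = pmax(df$V1, df$V2))

# Scanning from both ends flags every row whose key occurs more than
# once, i.e. both the original row and its swap.
has_swap <- duplicated(key) | duplicated(key, fromLast = TRUE)
no_swap <- df[!has_swap, ]
print(no_swap)
```

On this data only (1, 3) and (1, 4) remain, since those are the only pairs without a swapped counterpart.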
-
DigiPath almost 4 years
No, I mean keep either (1, 2) or (2, 1), so I think your answer meets my need. Thank you!