unique / sort in data.frame

Solution 1

df <- df[order(df$spn), ]  # Sort by the spn column.
df[!duplicated(df), ]      # Keep the first occurrence of each row.
#   spn sex
# 1  01   f
# 6  03   m
# 4  22   m
# 9  35   f

Solution 2

df2 = df[!duplicated(df), ] # Remove duplicated rows.
df3 = df2[order(df2$spn), ] # Sort by the spn column.

df3
#  spn sex
#1  01   f
#6  03   m
#4  22   m
#9  35   f
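
Note that the row names in the output above (1, 6, 4, 9) are carried over from the original data frame, whereas the desired output is numbered 1 to 4. A minimal sketch of renumbering them, assuming the example data from the question and base R only:

```r
# Example data from the question.
x <- c("01","01","01","22","22","03","03","03","35","35")
y <- c("f","f","f","m","m","m","m","m","f","f")
df <- data.frame(spn = x, sex = y)

df2 <- df[!duplicated(df), ]   # Remove duplicated rows.
df3 <- df2[order(df2$spn), ]   # Sort by the spn column.
rownames(df3) <- NULL          # Reset row names to 1..n.
df3
```

Assigning NULL to rownames() restores the default sequential row names.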

Solution 3

Use unique then order:

df <- unique(df)
df[order(df$spn), ]
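
If a single expression is preferred, the two steps can be nested. This is only a sketch using the example data from the question; the name result is introduced here for illustration:

```r
# Example data from the question.
x <- c("01","01","01","22","22","03","03","03","35","35")
y <- c("f","f","f","m","m","m","m","m","f","f")
df <- data.frame(spn = x, sex = y)

# Sort by spn, then drop duplicated rows, in one expression.
result <- unique(df[order(df$spn), ])
result
```

This sorts before removing duplicates, so it orders more rows than strictly necessary; for small data frames the difference should be negligible.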

Using dplyr or data.table:

library(dplyr)
unique(df) %>% arrange(spn)
#   spn sex
# 1  01   f
# 2  03   m
# 3  22   m
# 4  35   f

library(data.table)
unique(setDT(df))[ order(spn), ]
#    spn sex
# 1:  01   f
# 2:  03   m
# 3:  22   m
# 4:  35   f
Author by

yth

Updated on June 04, 2022

Comments

  • yth
    yth about 2 years

    I have a data frame like this:

    x=c("01","01","01","22","22","03","03","03","35","35")
    y=c("f","f","f","m","m","m","m","m","f","f")
    df=data.frame(spn=x, sex=y)
    

    It looks like this:

       spn sex
    1   01   f
    2   01   f
    3   01   f
    4   22   m
    5   22   m
    6   03   m
    7   03   m
    8   03   m
    9   35   f
    10  35   f
    

    What I'd like to do is sort by df$spn and keep each value only once, along with the corresponding df$sex, like this:

       spn sex
    1  01   f
    2  03   m
    3  22   m
    4  35   f
    

    How could I do that? Many thanks!

  • flodel
    flodel over 11 years
    From an efficiency point of view, it is indeed faster to remove the duplicates first.
  • MS Sankararaman
    MS Sankararaman over 5 years
    Is it possible to write this in one line of code? I'm looking to pass the output of df here as myVariable in a ggplot scale_y_continuous(label=c(<myVariable>)). It has to be dynamic, so I've been searching for a one-line solution.
  • zx8754
    zx8754 over 5 years
    @MeenakshiSundharam Please post a new question, with example data, your ggplot code, and expected output.
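
The efficiency remark in the comments can be checked with a rough timing sketch. The data frame big, its size, and the variable names below are made up for illustration, and the timings will vary by machine and R version:

```r
# Hypothetical larger data set with many duplicated rows.
set.seed(1)
n <- 1e5
big <- data.frame(spn = sprintf("%02d", sample(0:99, n, replace = TRUE)),
                  sex = sample(c("f", "m"), n, replace = TRUE))

# Remove duplicates first, then sort the (much smaller) result.
t_dedupe_first <- system.time({
  u <- unique(big)
  a <- u[order(u$spn), ]
})

# Sort all rows first, then remove duplicates.
t_sort_first <- system.time({
  b <- unique(big[order(big$spn), ])
})

# Both orders should select the same rows in the same order.
isTRUE(all.equal(a, b))

t_dedupe_first
t_sort_first
```

Removing duplicates first means order() only has to handle the distinct rows, which is why that order is expected to be cheaper when the data contains many duplicates.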