Remove duplicate rows, and keep newest row based on date column

98,641

The trick is to sort your table before using Remove duplicates. Excel always keeps the first data set of a duplicated row. All consecutive rows are removed.

In your case:

  1. Set up a helper column and fill it with numerical values. Start by 1 and use autofill till the end of our table
    enter image description here

  2. Make sure your date column is formatted as date and Excel recognize them as date. Otherwise your sorting wouldn't work

  3. Choose Custom sort (depends on your Excel version). Sort your whole table by date column from Newest to Oldest. That's the important part
    enter image description here

  4. Use Remove duplicates and select only your Number column which holds your criteria to check for duplicates. Deselect all other columns
    enter image description here

  5. Choose Custom Sort again and sort your table by that Helper column we have added at the beginning to get your original row order back
    enter image description here

Share:
98,641

Related videos on Youtube

Bjarke Mønsted
Author by

Bjarke Mønsted

Updated on September 18, 2022

Comments

  • Bjarke Mønsted
    Bjarke Mønsted almost 2 years

    I have a huge list of data in excel (250.000+ rows) in the following format:

    Number  Value1  Date            Value2
    40325   1       21/01/11 18.10  2
    65485   3       22/01/11 16.47  2
    40325   9       25/01/11 19.00  0
    70912   8       27/01/11 16.43  2
    

    I need to remove duplicate rows based on column 1 (Number), and have no problem doing this using "Data/Remove Duplicates" in Excel, but I need to make sure that I remove the row with the oldest date, and keep the newest, based on column 3 (Date).

    In the example above, I would need to remove row 1 and keep row 3, since row 3 is the newest.

    I have 4.800 rows with duplicates, so a manual sorting/removing would be a very time consuming job.

    Any good suggestions? And tricks to help me out? Thanks a lot in advance :)

  • user1063287
    user1063287 about 2 years
    I am just wondering why the Helper column is needed? I have been googling for a solution and came across this answer, as well as a blog post here (which doesn't include the Helper column approach and still seems to work). Can't you just use an Excel Custom Sort on the Number and Date columns to get the desired result?