Linux command to remove the duplicate lines but keep the first occurrence
Solution 1
If you allow sorting anyway, this will work:
sort file1 | uniq > file2
The -u option was the source of your trouble, because (from man 1 uniq):
-u, --unique
only print unique lines
while by default:
With no options, matching lines are merged to the first occurrence.
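The difference between the two modes is easy to see on the sample input from the question. The following is a minimal sketch; the temporary file and the variable names (input, merged, unique_only) are illustrative assumptions, not part of the original answer.

```shell
# Build the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' line1 line1 line2 line3 line4 line3 line5 > "$input"

# Default uniq: each run of identical adjacent lines collapses to one line.
merged=$(sort "$input" | uniq)

# uniq -u: any line that occurs more than once is dropped entirely.
unique_only=$(sort "$input" | uniq -u)

printf 'sort | uniq:\n%s\n\n' "$merged"
printf 'sort | uniq -u:\n%s\n' "$unique_only"

rm -f "$input"
```

With -u, line1 and line3 disappear completely, which matches the behaviour described in the question. Keep in mind that sort reorders the file, so this approach only fits when the original line order does not matter.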
Solution 2
If you want to dedup while keeping first occurrence, you can do
awk '!visited[$0]++' "$your_hist_file" > "$your_new_hist_file"
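The one-liner works because visited[$0] is empty (treated as 0) the first time a line is seen, so !visited[$0] is true and awk prints the line; the ++ then increments the counter, so every later copy evaluates to false and is skipped. A small sketch on the sample input from the question, where sample and deduped are illustrative names:

```shell
# Sample input from the question, one string per line.
sample=$(printf '%s\n' line1 line1 line2 line3 line4 line3 line5)

# Print a line only the first time it is seen; input order is preserved.
deduped=$(printf '%s\n' "$sample" | awk '!visited[$0]++')

printf '%s\n' "$deduped"
```

No sorting is involved, so the remaining lines stay in their original order.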
If you want to dedup while keeping last occurrence, you can do
tac "$your_hist_file" | awk '!visited[$0]++' | tac > "$your_new_hist_file"
You can also achieve this with a single awk command and no tac, but it's not as straightforward as using two tacs.
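One possible single-awk version is sketched below. It is an assumption about what such a command could look like, not the answer author's own code: it records the last line number at which each distinct line appears and prints a line only when it sits at that position. The names sample, keep_last, last and line are illustrative.

```shell
# Sample input from the question, one string per line.
sample=$(printf '%s\n' line1 line1 line2 line3 line4 line3 line5)

# Remember the last line number of each distinct line, plus every line by
# number; at the end, print only the lines that sit at their last position.
keep_last=$(printf '%s\n' "$sample" | awk '
    { last[$0] = NR; line[NR] = $0 }
    END {
        for (i = 1; i <= NR; i++)
            if (last[line[i]] == i) print line[i]
    }')

printf '%s\n' "$keep_last"
```

This variant holds the whole input in memory until the END block runs, which is one reason the two-tac pipeline can be the simpler choice.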
user9371654
Updated on September 18, 2022
Comments
-
user9371654 almost 2 years
I have a text file. Each line contains a string. Some strings are repeated. I want to remove repetition but I want to keep the first occurrence. For example:
line1
line1
line2
line3
line4
line3
line5
Should be
line1
line2
line3
line4
line5
I tried:
sort file1 | uniq -u > file2
but this did not help. It removed all repeated strings while I want the first occurrence to be present. I do not need to sort. Just remove the exact repetition of a string in a new line while keeping everything else as it is.
-
MMM about 4 years
Welcome to Super User! Generally, answers are much more helpful if they include an explanation of what the code is intended to do, and why that solves the problem without introducing others.
-
DavidPostill about 4 years
Welcome to Super User! Could you please edit your answer to give an explanation of why this code answers the question? Code-only answers are discouraged, because they don't teach the solution.