How can I remove duplicates in my .bash_history, preserving order?
Solution 1
Sorting the history
This command works like sort | uniq, but keeps the lines in their original order:
nl | sort -k 2 | uniq -f 1 | sort -n | cut -f 2
Basically, nl prepends its line number to each line. After the sort | uniq step, the lines are sorted back into their original order using the line-number field, and that field is then removed.
This solution has the flaw that it is undefined which representative of a class of equal lines makes it into the output, and therefore its position in the final output is also undefined. However, if the latest representative should be chosen, you can sort the input by a second key:
nl | sort -k2 -k 1,1nr | uniq -f1 | sort -n | cut -f2
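As a concrete sketch (this wrapper script is illustrative, not from the answer): the pipeline cannot read and write the same file at once, so route it through a temp file. `cut -f2-` is used here instead of `cut -f2` so that commands which themselves contain tabs survive intact.

```shell
#!/bin/sh
# Order-preserving dedupe of a history file, keeping the latest
# occurrence of each line. Pass the file as $1; defaults to ~/.bash_history.
histfile="${1:-$HOME/.bash_history}"
tmp="$(mktemp)"
nl "$histfile" | sort -k2 -k1,1nr | uniq -f1 | sort -n | cut -f2- > "$tmp"
mv "$tmp" "$histfile"
```

On an input of `ls`, `cd /tmp`, `ls`, this keeps the second `ls`, yielding `cd /tmp` followed by `ls`.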
Managing .bash_history
For re-reading the history and writing it back, you can use history -r and history -w respectively.
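In a running session the whole cycle might look like this (a sketch; history is a bash builtin, so these commands operate on the current shell's in-memory list):

```shell
history -w    # write the in-memory history out to $HISTFILE
# ...dedupe $HISTFILE here, e.g. with the nl/sort/uniq pipeline above...
history -c    # clear the in-memory list
history -r    # re-read the cleaned $HISTFILE into the session
```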
Solution 2
So I was looking for exactly the same thing after being annoyed by duplicates, and found that editing my ~/.bash_profile or ~/.bashrc to add:
export HISTCONTROL=ignoreboth:erasedups
It does exactly what you wanted: it keeps only the latest occurrence of any command. ignoreboth is just shorthand for ignorespace:ignoredups, and together with erasedups it gets the job done.
At least on my Mac terminal with bash this works perfectly. Found it here on askubuntu.com.
Solution 3
Found this solution in the wild and tested:
awk '!x[$0]++'
The first time a specific value of a line ($0) is seen, the value of x[$0] is zero.
The value of zero is inverted with ! and becomes one.
A statement that evaluates to one triggers the default action, which is print.
Therefore, the first time a specific $0 is seen, it is printed.
Every subsequent time (the repeats), the value of x[$0] has been incremented, its negated value is zero, and a statement that evaluates to zero doesn't print.
To keep the last repeated value, reverse the history and use the same awk:
awk '!x[$0]++' ~/.bash_history # keep the first value repeated.
tac ~/.bash_history | awk '!x[$0]++' | tac # keep the last.
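The behaviour is easy to verify on a toy input (the commands in this sample are made up):

```shell
# keep the first occurrence: 'ls' stays at its original (first) position
printf 'ls\ncd /tmp\nls\nmake\n' | awk '!x[$0]++'
# prints: ls, cd /tmp, make (one per line)

# keep the last occurrence: reverse, dedupe, reverse back
printf 'ls\ncd /tmp\nls\nmake\n' | tac | awk '!x[$0]++' | tac
# prints: cd /tmp, ls, make (one per line)
```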
Solution 4
Extending Clayton's answer:
tac $HISTFILE | awk '!x[$0]++' | tac | sponge $HISTFILE
tac reverses the file. Make sure you have moreutils installed so that sponge is available; otherwise, use a temp file.
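Without sponge, the same thing can be done with an explicit temp file (a sketch of the temp-file variant the answer alludes to):

```shell
# dedupe $HISTFILE keeping the most recent occurrence of each command,
# without moreutils: write to a temp file, then move it into place
tmp="$(mktemp)"
tac "$HISTFILE" | awk '!x[$0]++' | tac > "$tmp" && mv "$tmp" "$HISTFILE"
```

sponge only removes the need for the temp file: it soaks up all input before writing, so the pipeline can read from and write to the same file.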
Solution 5
This is an old post, but a perpetual issue for users who want to have multiple terminals open, and have the history synched between windows, but not duplicated.
My solution in .bashrc:
shopt -s histappend
export HISTCONTROL=ignoreboth:erasedups
export PROMPT_COMMAND="history -n; history -w; history -c; history -r"
tac "$HISTFILE" | awk '!x[$0]++' > /tmp/tmpfile &&
tac /tmp/tmpfile > "$HISTFILE"
rm /tmp/tmpfile
- The histappend option adds the history of the buffer to the end of the history file ($HISTFILE)
- ignoreboth and erasedups prevent duplicate entries from being saved to $HISTFILE
- The prompt command updates the history cache:
history -n reads all lines from $HISTFILE that may have been added in a different terminal since the last carriage return
history -w writes the updated buffer to $HISTFILE
history -c wipes the buffer so no duplication occurs
history -r re-reads $HISTFILE, appending to the now-blank buffer
- The awk script keeps the first occurrence of each line it encounters; tac reverses the file before and after, so the most recent commands are still most recent in the history
- rm removes the /tmp file
Every time you open a new shell, the history has all dupes wiped, and every time you hit the Enter key in a different shell/terminal window, it updates this history from the file.
cwd
Updated on September 18, 2022

Comments
-
cwd over 1 year
I really enjoy using control+r to reverse-search my command history. I've found a few good options I like to use with it:

# ignore duplicate commands, ignore commands starting with a space
export HISTCONTROL=erasedups:ignorespace
# keep the last 5000 entries
export HISTSIZE=5000
# append to the history instead of overwriting (good for multiple connections)
shopt -s histappend
The only problem for me is that erasedups only erases sequential duplicates, so with this string of commands:

ls
cd ~
ls
the ls command will actually be recorded twice. I've thought about periodically running with cron:

cat .bash_history | sort | uniq > temp.txt
mv temp.txt .bash_history
This would achieve removing the duplicates, but unfortunately the order would not be preserved. If I don't sort the file first, I don't believe uniq can work properly.

How can I remove duplicates in my .bash_history, preserving order?
Extra Credit:
Are there any problems with overwriting the .bash_history file via a script? For example, if you remove an apache log file, I think you need to send a HUP signal with kill to have it flush its handle on the file. If that is the case with the .bash_history file, perhaps I could somehow use ps to check and make sure there are no connected sessions before the filtering script is run?
-
jw013 over 11 years
Try ignoredups instead of erasedups for a while and see how that works for you.
-
Jazz over 11 years
I don't think bash holds an open file handle to the history file - it reads/writes it when it needs to, so it should (note - should - I haven't tested) be safe to overwrite it from elsewhere.
-
Ricardo over 7 years
I just learned something new in the 1st sentence of your question. Good trick!
-
Jonathan Hartley over 4 years
I'm failing to find the man page for all the options to the history command. Where should I be looking?
-
Jonathan Hartley over 2 years
This answer unix.stackexchange.com/a/18443/8650 claims to erase all duplicates, not just sequential ones, using HISTCONTROL in conjunction with a PROMPT_COMMAND which re-reads the whole HISTFILE after every prompt, which gives erasedups a chance to erase older commands.
-
-
wnrph over 11 years
With sort, the -r switch always reverses the sorting order. But this won't yield the result you have in mind. sort regards the two occurrences of ls as identical, with the result that, even when reversed, the eventual order depends on the sorting algorithm. But see my update for another idea.
-
Nathan over 10 years
In case you don't want to modify .bash_history, you could put the following in .bashrc:
alias history='history | sort -k2 -k 1,1nr | uniq -f 1 | sort -n'
-
trss over 9 years
Wow! That just worked. But it removes all but the first occurrence, I guess. I'd reversed the ordering of the lines using Sublime Text before running this. Now I'll reverse it again to get a clean history with only the last occurrence of all duplicates left behind. Thank you.
-
Mohd over 9 years
Check out my answer!
-
A.L over 9 years
What is nl at the beginning of each code line? Shouldn't it be history?
-
tralston almost 9 years
For those on a Mac, use brew install coreutils, and notice that all the GNU utils have a g prepended to avoid confusion with the BSD built-in Mac commands (e.g. gsed is GNU whereas sed is BSD). So use gtac.
-
cbmanica over 8 years
I'm resisting the urge to downvote, but the fact, as you noted, that there is no way to choose which of the equal lines makes it into the output means the awk answer below may be much more helpful for others (including for the case that brought me here).
-
wnrph over 8 years
@cbmanica That was true only for the first command and was meant as a help to understand the second one. The only difference between the first and the second command is that the second one exercises control over output sorting.
-
vaichidrewar about 8 years
I had to clean up my history file to remove invalid characters. I used "iconv -f utf-8 -t utf-8 -c file.txt"
-
Ricardo over 7 years
Tested on Mac OS X Yosemite and on Ubuntu 14.04
-
Georg Jung almost 7 years
Agree with @MitchBroadhead. This solves the problem within bash itself, without an external cron job. Tested it on Ubuntu 17.04 and 16.04 LTS.
-
WeakPointer over 6 years
Works on OpenBSD too. It only removes dups of any command it is appending to the history file, which is fine for me. It has the interesting effect of shortening the history file as I enter commands that had existed as duplicates before. Now I can make my history file's max size smaller.
-
smilingfrog over 6 years
Here is an excellent explanation of this in the comments
-
JepZ over 5 years
Nice, clean and general answer (not restricted to the history use case) without launching a bazillion sub-processes ;-)
-
Dylanthepiguy over 5 years
This only ignores duplicate, consecutive commands. If you alternate repeatedly between two given commands, your bash history will fill up with duplicates.
-
drescherjm over 4 years
I needed history -c and history -r to get it to use the history
-
Jonathan Hartley over 4 years
If "ignoreboth and erasedups prevent dupes from being saved", then why do you also need to do the "awk" command to remove dupes from the file? Is it because "ignoreboth and erasedups" only prevent consecutive dupes being saved? Sorry to be pedantic, I'm just trying to understand.
-
Jonathan Hartley over 4 years
Can you help me understand why, on logout, you need to append unwritten history to the history file before then rewriting the whole history file? Can't you just write the entire file without the 'append'?
-
smilingfrog over 4 years
erasedups only erases consecutive duplicates. And you are correct that the awk command duplicates the erasedups behaviour, making it superfluous.
-
Jonathan Hartley over 4 years
To be explicit, am I understanding right that you've shown two (splendid) solutions here, and a user only needs to execute one of them? Either the ruby one, or the Bash one?
-
laur over 3 years
Wouldn't this sort of break if .bash_history entries are on two lines - timestamp followed by the command itself?
-
anthony over 3 years
Fails with bash timestamps. Most things do!
-
anthony over 3 years
Fails with bash timestamps. Most things don't take timestamps into account. See my solution.
-
VinayChoudhary99 about 3 years
This one works perfectly.
-
anthony about 3 years
The only reason I do something fancy with history during logout is because I merge (with locks) the histories, sorting by timestamps, and removing some 'sensitive' commands. I don't just simply append, which does not work well when you have multiple shell windows on the same machine.
-
Jonathan Hartley over 2 years
This answer contains useful information, but misleadingly claims to "do exactly what you wanted". The question states the "problem for me is that erasedups only erases sequential duplicates". This answer only explains how to use erasedups to erase sequential duplicates. It is not an answer to the actual question of how to erase all duplicates, not just sequential ones.
-
Jonathan Hartley over 2 years
This answer is bash-fu black belt, of which I am in awe. But it cannot handle history files with multi-line commands in them, or with timestamps. (Enabling timestamps in the history file is required for readline to correctly retrieve multi-line commands from the history.)
-
Jonathan Hartley over 2 years
This answer is sublime in its appropriate wielding of awk, at which I'm awestruck. However, as @laur notes, it doesn't work for history files with timestamps in them. Enabling timestamps is important because they form the delimiters in the history file that enable readline to retrieve multi-line commands.
-
Jonathan Hartley over 2 years
Can you explain what sponge is, and why you appended it to Clayton's answer?
-
Jonathan Hartley over 2 years
This is brilliant, but like many other answers here, doesn't handle history files with timestamps enabled, which is required if you want readline to be able to retrieve multi-line commands saved to your history file.
-
Jonathan Hartley over 2 years
$ sponge -h: soak up all input from stdin and write it to <file>. I don't yet understand why it has been appended to Clayton's answer. (Although I suspect it is incidental, and the main value of this answer was using 'tac', which Clayton later incorporated into his answer too.)
-
Jonathan Hartley over 2 years
Aha, from man sponge: Unlike a shell redirect, sponge soaks up all its input before writing the output file. This allows constructing pipelines that read from and write to the same file.
-
alchemy over 2 years
This works, or appears to (I didn't check what it deleted, but the multiple exits are gone, leaving the last one entered, and it removed ~200 of 500 entries). I just had to exit the shell and re-enter (reloading the history file; there is a command for that somewhere). Thanks!
-
BlueC about 2 years
This is really nice, thank you!
-
Admin about 2 years
Found another issue with the awk approach: it works line by line, so it doesn't understand where a history command starts and ends. When a single line of a multi-line command matches another command, it fails.