Prevent duplicates from being saved in bash history

24,569

Solution 1

As far as I know, it is not possible to do what you want. I see this as a bug in bash's history processing that could be improved.

export HISTCONTROL=ignoreboth:erasedups   # no duplicate entries
shopt -s histappend                       # append history file
export PROMPT_COMMAND="history -a"        # update histfile after every command

This keeps the in-memory history unique, but while it does save history from multiple sessions into the same file, it doesn't keep the history in the file itself unique. history -a writes the new command to the file unless it's the same as the one immediately before it. It does not do a full de-duplication like the erasedups setting does in memory.

To see this silliness in action, start a new terminal session and examine the history: you'll see repeated entries, say ls. Now run the ls command, and all the duplicated ls entries are removed from the in-memory history, leaving only the last one. The in-memory history becomes shorter as you run commands that are duplicated in the history file, yet the history file itself continues to grow.
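To get a rough idea of how far the file has drifted from the de-duplicated in-memory view, you can count the lines that repeat an earlier line. This is only a sketch: the name count_history_dups is made up for illustration, and it assumes a history file with one command per line (no timestamps, no multi-line entries).

```shell
# Hypothetical helper: count lines in a history file that repeat an earlier line.
# Assumes one command per line (no HISTTIMEFORMAT timestamps).
count_history_dups() {
   local file="${1:-$HOME/.bash_history}"
   # seen[$0]++ is non-zero from the second occurrence of a line onward
   awk 'seen[$0]++ { dups++ } END { print dups + 0 }' "$file"
}
```

Running count_history_dups with no argument checks ~/.bash_history; a number that keeps climbing between sessions is the growth described above.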

I use my own script to clean up the history file on demand.

# remove duplicates while preserving input order
function dedup {
   awk '! x[$0]++' "$@"
}

# removes $HISTIGNORE commands from input
function remove_histignore {
   if [ -n "$HISTIGNORE" ]; then
      # replace : with |, then * with .*
      local IGNORE_PAT=$(echo "$HISTIGNORE" | sed -e 's/:/|/g' -e 's/\*/.*/g')
      # negated grep removes matches; -E makes | act as alternation
      grep -Evx "$IGNORE_PAT" "$@"
   else
      cat "$@"
   fi
}

# clean up the history file by removing duplicates and commands matching
# $HISTIGNORE entries
function history_cleanup {
   local HISTFILE_SRC=~/.bash_history
   local HISTFILE_DST=/tmp/.$USER.bash_history.clean
   if [ -f "$HISTFILE_SRC" ]; then
      \cp "$HISTFILE_SRC" "$HISTFILE_SRC.backup"
      dedup "$HISTFILE_SRC" | remove_histignore >| "$HISTFILE_DST"
      \mv "$HISTFILE_DST" "$HISTFILE_SRC"
      chmod go-r "$HISTFILE_SRC"
      history -c
      history -r
   fi
}

I'd love to hear more elegant ways to do this.
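One possible variation, offered as a sketch rather than a tested replacement: the dedup function above keeps the first occurrence of each command, whereas the in-memory erasedups behaviour keeps the last. The hypothetical dedup_keep_last below keeps the last occurrence by reading the file twice with awk, which avoids depending on tac (a GNU coreutils tool that is not installed everywhere). It assumes it is given a real file name, not a pipe.

```shell
# Hypothetical variant of dedup: keep the LAST occurrence of each line.
# Reads the file twice, so it needs a real file name, not standard input.
dedup_keep_last() {
   # First pass (NR == FNR): record the line number of each line's final occurrence.
   # Second pass: print a line only when it sits at that recorded position.
   awk 'NR == FNR { last[$0] = FNR; next } last[$0] == FNR' "$1" "$1"
}
```

Inside history_cleanup this would stand in for the dedup call, for example dedup_keep_last "$HISTFILE_SRC" | remove_histignore.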

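Note: the script won't work if you enable timestamps in history via HISTTIMEFORMAT. With timestamps on, bash stores each entry in the file as two lines (a #<epoch seconds> comment line followed by the command), so a line-by-line de-duplication separates commands from their timestamps.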
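One possible workaround for timestamped files, sketched under the assumption that every timestamp is a line of the form #<digits> immediately preceding its command. The function name dedup_timestamped is hypothetical, and multi-line commands or commands that themselves look like #<digits> are not handled.

```shell
# Hypothetical timestamp-aware dedup: keeps the first occurrence of each
# command together with the "#<epoch>" line that precedes it.
dedup_timestamped() {
   awk '
      /^#[0-9]+$/ { ts = $0; next }                   # remember the timestamp line
      !seen[$0]++ { if (ts != "") print ts; print }   # first occurrence: emit both
      { ts = "" }                                     # the timestamp is used up either way
   ' "$@"
}
```

Like dedup, it reads the named files or standard input, so it could be swapped into history_cleanup in place of dedup. Note that remove_histignore would still see the timestamp lines, so that step would need its own adjustment.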

Bash could improve the situation by:

  1. fixing history -a to only write new data if it does not match any history in memory, not just the last entry.
  2. de-duplicating history when files are read if the erasedups setting is set. A simple history -w in a new terminal would then clean up the history file instead of the silly script above.

Solution 2

export HISTCONTROL=ignoreboth

Solution 3

Here is what I use:

[vanuganti@ ~]$ grep HIST .alias*
.alias:HISTCONTROL="erasedups"
.alias:HISTSIZE=20000
.alias:HISTIGNORE=ls:ll:"ls -altr":"ls -alt":la:l:pwd:exit:mc:su:df:clear:ps:h:history:"ls -al"
.alias:export HISTCONTROL HISTSIZE HISTIGNORE
[vanuganti@ ~]$ 

and it works:

[vanuganti@ ~]$ pwd
/Users/XXX
[vanuganti@ ~]$ pwd
/Users/XXX
[vanuganti@ ~]$ history | grep pwd | wc -l
       1

Solution 4

Inside your .bash_profile, add:

alias hist="history -a && hist.py"

Then put this on your path as hist.py and make it executable:

#!/usr/bin/env python

from __future__ import print_function
import os, sys
home = os.getenv("HOME")
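if not home:
    sys.exit(1)
with open(os.path.join(home, ".bash_history")) as f:
    lines = f.readlines()
# walk from newest to oldest so the latest occurrence of each command is kept
seen = set()
history = []
for s in reversed(lines):
    s = s.rstrip()
    if s not in seen:
        seen.add(s)
        history.append(s)
# restore chronological order before printing
print('\n'.join(reversed(history)))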

Now when you want the short list, just type hist.

Author: JPLemme

Marketing Automator

Updated on July 09, 2022

Comments

  • JPLemme
    JPLemme almost 2 years

    I'm trying to prevent bash from saving duplicate commands to my history. Here's what I've got:

    shopt -s histappend
    export HISTIGNORE='&:ls:cd ~:cd ..:[bf]g:exit:h:history'
    export HISTCONTROL=erasedups
    export PROMPT_COMMAND='history -a'
    

    This works fine while I'm logged in and .bash_history is in memory. For example:

    $ history
        1 vi .bashrc
        2 vi .alias
        3 cd /cygdrive
        4 cd ~jplemme
        5 vi .bashrc
        6 vi .alias
    
    $ vi .bashrc
    
    $ history
        1 vi .alias
        2 cd /cygdrive
        3 cd ~jplemme
        4 vi .alias
        5 vi .bashrc
    
    $ vi .alias
    
    $ history
        1 cd /cygdrive
        2 cd ~jplemme
        3 vi .bashrc
        4 vi .alias
    
    $ exit
    

    But when I log back in, my history file looks like this:

    $ history
        1 vi .bashrc
        2 vi .alias
        3 cd /cygdrive
        4 cd ~jplemme
        5 vi .bashrc
        6 vi .alias
        7 vi .bashrc
        8 vi .alias
    

    What am I doing wrong?

    EDIT: Removing the shopt and PROMPT_COMMAND lines from .bashrc does not fix the problem.

  • joedevon
    joedevon over 13 years
    Thanks Venu. The problem I'm having is a little different. Let's say there are 20 "ls" in the history. By typing "ls", it removes duplicates so your history is shorter....at least during the session. But exit and start a new session and it saves the new stuff and retains the old duplicates. Undoing most of the utility of erasedups. #Facepalm.
  • trusktr
    trusktr over 11 years
    Excellent answer. If you would rather preserve the chronological order (instead of the input order) for your commands, modify dedup() by replacing awk '! x[$0]++' $@ with tac $@ | awk '! x[$0]++' | tac.
  • tommy.carstensen
    tommy.carstensen over 10 years
    @raychi Just checking to see, if this is still the best solution as we approach 2014?
  • humbolight
    humbolight about 10 years
    Appended something very similar to my .bash_profile: export HISTIGNORE=ls:"ls -la":"cd ..":"cd ~":pwd:exit:su:"sudo -i":clear:ps:"ps$ export HISTSIZE=20000 export HISTCONTROL="erasedups"
  • trss
    trss almost 10 years
    Extending @trusktr's comment, fixing history -a to only write new data wouldn't work either, since it should in fact remove the previous occurrence and add the latest one. I suspect it is due to this complexity that they've settled for removing only consecutive duplicates.
  • Alex Hall
    Alex Hall about 7 years
    If this is intended as a self-contained script the only way it actually did anything for me is adding a call to the history_cleanup function on the last line of the script :)
  • anthony
    anthony almost 5 years
    I would make one small change to the dedup function. Reverse the order of the history file (perhaps using "tac") before, and again after, the awk de-dup. That way the latest duplicate command is preserved instead of the oldest, as awk sees it first.
  • anthony
    anthony almost 5 years
    Also, I find I prefer two levels of HISTIGNORE. The first is the shell one: ignore the given command and do not save it to history at all. The second is for commands I like to have in the working history but don't want saved out to the history file (between sessions), commands that just don't matter normally. For example, all ls and cd commands: they become irrelevant beyond a specific shell session.
  • zyy
    zyy over 4 years
    It does the job, thanks! But the history is still memorizing duplicate commands that are not consecutive, is there a way to improve it?
  • x-yuri
    x-yuri over 3 years
    What job it does? Or rather what does it have to do with the question?