How to find file/directory names that are the same, but with different capitalization/case?

Solution 1

If you have GNU uniq, you can sort case-insensitively (sort -f), then have uniq ignore case (-i) and print only one line per group of duplicates (-d):

find . | sort -f | uniq -di

As @StephaneChazelas mentioned in his answer, this might not do what you expect if whole paths can differ only in the case of a directory component (like a/b/foo and A/b/foo): those are reported as duplicates too.

Solution 2

Assuming file names don't contain newline characters, you could do something like:

find . | tr '[:upper:]' '[:lower:]' | sort | uniq -d

Note that some tr implementations like GNU tr don't change the case of multi-byte characters.

Also note that the path it reports may not be the path of any actual file. For instance, if there are files ./a/b/fOo and ./A/b/fOo, it will report ./a/b/foo. If that's not what you want, you may want to refine your requirements.

Author: gasko peter

Updated on September 18, 2022

Comments

  • gasko peter
    gasko peter almost 2 years

    How can I recursively list the file/directory names in a directory that are the same except for capitalization/case? For example:

    INPUT (not the ls command, the directories):

    [user@localhost ~/a] ls -R
    .:
    b
    
    ./b:
    ize  Ize
    
    ./b/ize:
    
    ./b/Ize:
    [user@localhost ~/a] 
    

    OUTPUT:

    /b/ize
    
    • Admin
      Admin almost 11 years
      Duh, capitalization, I couldn't figure out what he was asking.
    • Admin
      Admin almost 11 years
      @gasko-peter are you looking for files with similar names because you're trying to identify the same file under different names?
  • Stéphane Chazelas
    Stéphane Chazelas almost 11 years
    You probably want sort -f here. Also note that GNU uniq has the same limitation as GNU tr, in that it doesn't match case for multi-byte characters.
  • terdon
    terdon almost 11 years
    @StephaneChazelas why do I want sort -f? If uniq can deal with the case, why would I also need to make sort case insensitive? And what do you mean by multi-byte characters? Things like \n,\r etc? How can they have different cases?
  • Stéphane Chazelas
    Stéphane Chazelas almost 11 years
    Try export LC_ALL=C; printf '%s\n' a A b B | sort | uniq -di. Some locales sort case-insensitively, some others (like C) don't. uniq needs a sorted input, its duplicate lines must be adjacent.
  • Brandon Condrey
    Brandon Condrey almost 11 years
    His first example said different font size, suffice it to assume he doesn't have an idea of what he wants.
  • terdon
    terdon almost 11 years
    Suffice it to say that English is not his native language, hardly the OP's fault that. However, the example clearly shows that he is not comparing the files, just looking for files of the same name in a case-insensitive manner. All I'm saying is that you might want to read a question more closely before deciding which ideas are "bad".
  • Jeff Hewitt
    Jeff Hewitt almost 11 years
    Agreed. This doesn't address the OP's concern. I also find it strange that you labeled an answer accepted by the OP as a bad idea because it's not what the OP wants!