Find duplicate strings in a text file and print only the duplicated strings to another text file

Are you required to use batch? If you're willing to use PowerShell, which has been part of the Windows OS for many years now, this is not hard to do.
To see only the unique items:

get-content .\input.txt | select -unique | out-file unique.txt

Are you also trying to say you want to know which words are duplicated?
If so, this groups identical lines and lists every line that occurs more than once, together with its count.

get-content .\input.txt | group-object | where { $_.count -ne 1 } | format-table -auto -prop name,count

Name                Count
----                -----
Root_Controller         2
Instance_controller     4
Path_finder             2
size_manager            3
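
The table above reports counts, whereas the question asks for the duplicated strings alone in a second file. For readers working in a Unix-like shell, as the question's own attempt suggests (on Windows that would presumably mean something like WSL, Git Bash, or Cygwin, which is an assumption here), a minimal sketch could look like this. The function name `print_duplicates` and the output file name `duplicates.txt` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print each line that occurs more than once in the
# given file, one time each. Assumes one string per line, as in the
# question's sample input.
print_duplicates() {
    # tr -d removes every double quote and any carriage return (Windows
    # line endings); sort makes equal lines adjacent, which uniq -d needs
    # in order to detect them.
    tr -d '"\r' < "$1" | sort | uniq -d
}

# Usage (file names are placeholders):
#   print_duplicates input.txt > duplicates.txt
```

Redirecting the output, as in the usage comment, puts the result in a second file. The `sort` step matters because `uniq -d` only compares adjacent lines, so an unsorted pipeline like the one in the question misses duplicates that are not next to each other.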

EDIT per comment
Merge the text files you want to scan into a single file, and then run the command I posted earlier.

get-content file1.txt > input.txt
get-content file2.txt >> input.txt
get-content file3.txt >> input.txt

Author by

S6633d

Updated on September 18, 2022

Comments

  • S6633d
    S6633d over 1 year

    I am trying to find the duplicates in my huge text file and write them to another text file, but I cannot get the output into the second file.

    Here's what I have got so far:

     for dup in $(cut -d " " -f1 input.txt | uniq -d); do grep -n -- "$dup" input.txt; done
    

    The input.txt contains:

     "Root_Controller"
     "Instance_controller"
     "Path_finder"
     "size_manager"
     "Instance_controller"
     "text_controller"
     "file_processor"
     "string_processor"
     "size_manager"
     ".......
      .......
    

    I need to find the duplicates in this file and print them to another text file.

    Output something like:

     Instance_controller
     size_manager
    

    Please help me with this. The file has nearly 1000 lines. Please also let me know how to do the same when I have a number of text files (checking each text file for duplicates within itself, not comparing the contents of one text file against all the others).

    • guest
      guest almost 8 years
      sort input.txt | uniq -d
    • Ƭᴇcʜιᴇ007
      Ƭᴇcʜιᴇ007 almost 8 years
      Neither Cut, nor Grep, nor Uniq are Windows batch commands, so you're obviously leaving some information out.
  • S6633d
    S6633d almost 8 years
    That works for one file. If I have a folder called Root that contains a few text files and many sub-folders, I want to find duplicates not only within a single file but also across the other files. How can I do that?
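
The two multi-file cases raised in this thread (duplicates pooled across every file under a folder, and duplicates within each file on its own) might be sketched in shell as follows. This assumes a Unix-like environment, files ending in `.txt`, one string per line, and file names without embedded newlines; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list every line that appears more than once anywhere
# under a directory tree, pooling all *.txt files together.
duplicates_in_tree() {
    # find descends into sub-folders; -exec cat {} + concatenates all matches.
    # A file lacking a final newline would have its last line joined to the
    # next file's first line, so this assumes newline-terminated files.
    find "$1" -type f -name '*.txt' -exec cat {} + | tr -d '"\r' | sort | uniq -d
}

# Hypothetical helper: check each file only against itself and print
# "path: duplicated line" for every duplicate found.
duplicates_per_file() {
    find "$1" -type f -name '*.txt' | while IFS= read -r f; do
        tr -d '"\r' < "$f" | sort | uniq -d |
            while IFS= read -r line; do
                printf '%s: %s\n' "$f" "$line"
            done
    done
}

# Usage (paths are placeholders):
#   duplicates_in_tree Root > duplicates.txt
#   duplicates_per_file Root > duplicates_by_file.txt
```

The pooled variant treats a string as duplicated whether the repeats sit in one file or in different files, which matches the follow-up comment; the per-file variant matches the original question.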