How can I sort alphanumeric strings in Unix?

22,988

Solution 1

You need to tell sort where the sorting key starts:

sort -n -k1.4 list.txt

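Otherwise the key starts at the beginning of the line. With -n, "TAB1" has no leading number, so every key evaluates to 0 and sort falls back to comparing the whole lines alphabetically, which is why TAB10 lands before TAB2. In -k1.4, the 1 is the field number and the 4 is the character position within that field.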
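To make the effect of the key offset concrete, here is a rough Python sketch of the two cases. It is only a model of the idea, not how sort is implemented: the function names are made up for illustration, and it assumes every line has the form TAB followed by digits.

```python
# Sample lines taken from the question's list.txt.
lines = ["TAB1", "TAB13", "TAB11", "TAB2", "TAB10", "TAB9"]

def key_from_char_4(line):
    """Numeric key starting at the 4th character, in the spirit of -n -k1.4."""
    digits = line[3:]  # 1-based character 4 is 0-based index 3
    return int(digits) if digits.isdigit() else 0  # simplification: non-numeric counts as 0

def key_whole_line(line):
    """Model of plain -n: no leading number, so the numeric key is 0 for every line.

    The line itself is used as the tie-breaker, which gives alphabetical order.
    """
    return (0, line)

by_key = sorted(lines, key=key_from_char_4)
fallback = sorted(lines, key=key_whole_line)

print(by_key)    # numeric order
print(fallback)  # alphabetical order, TAB10 before TAB2
```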

Solution 2

You can use sort with the -V (--version-sort) option to sort alphanumeric strings. Note that -V is not part of POSIX sort, so it may not be available on every system.

$ sort -V inputfile > outputfile

$ cat inputfile  
TAB1  
TAB13  
TAB11  
TAB19  
TAB2  
TAB3  
TAB16  
TAB17  
TAB18  
TAB9  
TAB10  
TAB8  
TAB12  
TAB20  

$ cat outputfile  
TAB1  
TAB2  
TAB3  
TAB8  
TAB9  
TAB10  
TAB11  
TAB12  
TAB13  
TAB16  
TAB17  
TAB18  
TAB19  
TAB20  
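If -V is not available, a "natural sort" key is a common substitute. The sketch below is an approximation for names like TAB<number> only; it does not reproduce every rule of version sort, and natural_key is a name chosen here for illustration.

```python
import re

def natural_key(text):
    """Split text into alternating non-digit and digit runs.

    re.split with a capturing group puts the digit runs at odd indexes,
    so those are converted to integers and compared numerically.
    """
    parts = re.split(r"(\d+)", text)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

# Same contents as inputfile above.
names = ["TAB1", "TAB13", "TAB11", "TAB19", "TAB2", "TAB3", "TAB16",
         "TAB17", "TAB18", "TAB9", "TAB10", "TAB8", "TAB12", "TAB20"]

ordered = sorted(names, key=natural_key)
print("\n".join(ordered))
```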

Solution 3

Since this is tagged as a Vim question, I figured it might be worth mentioning the Vim option (even though I would personally use sort since the data's already in a file). It's simply

:sort n

Vim's numeric sort uses the first decimal number on each line as the key and ignores whatever precedes it, so there is no need to skip the "TAB" explicitly. (:sort can also take a pattern to skip; :sort n /TAB/ would work as well, for example.) As usual, see :h :sort for more information.
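For readers who want the same "first number on the line" idea outside Vim, here is a small Python sketch. It is not Vim's implementation; first_number_key is a hypothetical helper, and placing lines without any number first is an assumption of this sketch.

```python
import re

def first_number_key(line):
    """Key on the first decimal number in the line, ignoring what precedes it."""
    match = re.search(r"-?\d+", line)
    # Assumption of this sketch: lines with no number sort before all others.
    return (0, 0) if match is None else (1, int(match.group()))

lines = ["TAB13", "TAB2", "no number here", "TAB10"]
result = sorted(lines, key=first_number_key)
print(result)
```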

Solution 4

You can do this in Perl or any language where sort lets you specify a comparison operator:

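use strict;
use warnings;

# Compare $a and $b by their trailing run of digits (no digits counts as 0).
sub numcomp {
    my ($aa) = $a =~ /([0-9]+)$/;
    my ($bb) = $b =~ /([0-9]+)$/;
    return ($aa // 0) <=> ($bb // 0);
}

# Sample data from the question; replace with your own list.
my @mylist = qw(TAB1 TAB13 TAB11 TAB2 TAB10);
print "$_\n" for sort numcomp @mylist;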

(Don't bother telling me it's baby Perl. I... um, I wrote it that way on purpose so it would be easy to understand.) (Don't bother telling me it's wrong. I... um, I wrote it that way on purpose as an exercise for the reader.)
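The same comparator approach carries over to other languages. As one possible illustration, here is a Python sketch using functools.cmp_to_key. The helper names are made up, and treating strings without trailing digits as 0 is an assumption of the sketch.

```python
import re
from functools import cmp_to_key

def trailing_number(text):
    """Return the trailing run of digits as an int, or 0 if there is none."""
    match = re.search(r"(\d+)$", text)
    return int(match.group(1)) if match else 0

def numcomp(a, b):
    """Three-way comparison on the trailing numbers, like Perl's <=>."""
    x, y = trailing_number(a), trailing_number(b)
    return (x > y) - (x < y)

mylist = ["TAB1", "TAB13", "TAB2", "TAB10"]
in_order = sorted(mylist, key=cmp_to_key(numcomp))
print(in_order)
```

In Python a plain key function (key=trailing_number) would be the more usual choice; the comparator form is shown only because it mirrors the Perl version.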


Author: Lazer

Updated on March 02, 2020

Comments

  • Lazer
    Lazer about 4 years

    I have a list of table names, which are out of order. How can I get them in the correct logical order?

    $ cat list.txt

    TAB1
    TAB13
    TAB11
    TAB19
    TAB2
    TAB3
    TAB16
    TAB17
    TAB18
    TAB9
    TAB10
    TAB8
    TAB12
    TAB20
    

    $ cat list.txt | sort -n

    TAB1
    TAB10
    TAB11
    TAB12
    TAB13
    TAB16
    TAB17
    TAB18
    TAB19
    TAB2
    TAB20
    TAB3
    TAB8
    TAB9
    

    Expected order:

    TAB1
    TAB2
    TAB3
    TAB8
    TAB9
    TAB10
    TAB11
    TAB12
    TAB13
    TAB16
    TAB17
    TAB18
    TAB19
    TAB20
    

    Any Vim shortcuts will also do; I do not necessarily need a separate utility for this.

    • Jeffrey Jose
      Jeffrey Jose over 13 years
      Bookmarking because it's such a fine question (with some fine answers)
  • idbrii
    idbrii over 13 years
    It took me a bit to figure out how -k works: 1 is the field number and 4 is the character. If there was a space (the default field separator) after the T in TAB, then you'd use sort -n -k2.4 list.txt. You can use -t to specify custom field separators. sort -n -tB -k2.1 list.txt would also work because the B in TAB would divide the fields and we'd sort on the first character of the second field to the end of line.
  • Jeffrey Jose
    Jeffrey Jose over 13 years
    Thanks. Didn't know this. I had to go through a lot of pain deleting the string, sorting, and putting the string back.
  • tchrist
    tchrist over 13 years
    @jeffjose: That only works easily if you have the same string in all cases; it fails when you don't. The worst is when you can't pass the sort -kX.Yn shell command a fixed position. I've often had to write a Perl script to deal with those situations, and wish I didn't. Similarly for sorting by the last field when there is a variable number of fields, although there you can just reverse the field order, sort it, then reverse it back again.
  • Till Kolditz
    Till Kolditz almost 4 years
    You should mention that, according to the manual, '-V' stands for "--version-sort" and is described as "natural sort of (version) numbers within text" (taken from the Ubuntu 20.04 standard repository). So it may not behave exactly as desired.