Sorting multiple keys with Unix sort

144,972

Solution 1

Use the -k option (or --key=POS1[,POS2]). It can appear multiple times, and each key can carry its own ordering options (such as n for numeric sort).
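As a minimal sketch of repeated keys, assume a hypothetical whitespace-separated sample (name, department, age) that is not taken from the question. The first key sorts field 2 as text; the second sorts field 3 numerically, with the n option attached to that key only.

```shell
# Hypothetical sample data: name, department, age (whitespace-separated).
data='carol ops 41
alice dev 9
bob dev 30
dave ops 7'

# Key 1: field 2 only, compared as text.
# Key 2: field 3 only, compared numerically (the n applies to this key alone).
# LC_ALL=C keeps the comparison byte-based and reproducible across locales.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2,2 -k 3,3n)
printf '%s\n' "$sorted"
```

Within each department the ages come out in numeric order (9 before 30), which a plain text comparison would not give.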

Solution 2

Take care though:

If you want to sort the file primarily by field 3 and secondarily by field 2, you want this:

sort -k 3,3 -k 2,2 < inputfile

Not this: sort -k 3 -k 2 < inputfile, which sorts the file by the string from the beginning of field 3 to the end of the line (which is potentially unique), so the second key may never be consulted.

-k, --key=POS1[,POS2]     start a key at POS1 (origin 1), end it at POS2
                          (default end of line)
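The difference can be seen on a tiny input. This is only a sketch with made-up two-line data (four whitespace-separated fields), chosen so that field 3 ties and field 4 pulls the open-ended key the other way.

```shell
# Hypothetical data: field 3 is identical on both lines, field 4 differs.
data='y b 1 a
x a 1 z'

# Bounded keys: field 3 ties, so field 2 decides (a before b).
right=$(printf '%s\n' "$data" | LC_ALL=C sort -k 3,3 -k 2,2)

# Open-ended keys: the first key runs from field 3 to end of line,
# so field 4 ("a" vs "z") decides before field 2 is ever looked at.
wrong=$(printf '%s\n' "$data" | LC_ALL=C sort -k 3 -k 2)

printf 'bounded:\n%s\nopen-ended:\n%s\n' "$right" "$wrong"
```

The bounded form puts the "x a" line first, while the open-ended form puts the "y b" line first.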

Solution 3

The -k option is what you want.

-k 1.4,1.5n -k 1.14,1.15n

This would use character positions 4-5 in the first field (for fixed-width data without blanks it is all one field) and sort numerically as the first key.

The second key would be characters 14-15 in the first field also.

(edit)

Example (all I have is DOS/cygwin handy):

dir | \cygwin\bin\sort.exe -k 1.4,1.5n -k 1.40,1.60r

for the data:

12/10/2008  01:10 PM         1,564,990 outfile.txt

Sorts the directory listing by month number (pos 4-5) numerically, and then by filename (pos 40-60) in reverse. All character offsets are counted from the start of field 1.
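The same idea on a smaller scale, as a sketch only: the records below are hypothetical 7-character fixed-width lines that contain no blanks, so the whole line is field 1. Characters 3-4 form a numeric key and characters 5-7 a reversed text key.

```shell
# Hypothetical fixed-width records with no blanks: AB | 2-digit number | 3-letter tag.
data='AB10foo
AB02bar
AB10zed
AB02abc'

# Key 1: characters 3-4 of field 1, numeric.
# Key 2: characters 5-7 of field 1, reversed.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -k 1.3,1.4n -k 1.5,1.7r)
printf '%s\n' "$sorted"
```

The 02 records come before the 10 records, and within each group the tags run in reverse alphabetical order.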

Solution 4

Here is one that sorts the columns of a CSV file in a mix of numeric and dictionary order, with column 5 and everything after it in dictionary order:

~/test>sort -t, -k1,1n -k2,2n -k3,3d -k4,4n -k5d  sort.csv
1,10,b,22,Ga
2,2,b,20,F
2,2,b,22,Ga
2,2,c,19,Ga
2,2,c,19,Gb,hi
2,2,c,19,Gb,hj
2,3,a,9,C

~/test>cat sort.csv
2,3,a,9,C
2,2,b,20,F
2,2,c,19,Gb,hj
2,2,c,19,Gb,hi
2,2,c,19,Ga
2,2,b,22,Ga
1,10,b,22,Ga

Note that -k1,1n means a numeric key that starts at column 1 and ends at column 1. Had I written -k1,2n as below, the key would have spanned columns 1 and 2, so 1,10 would have been compared as 110 (the comma is read as a thousands separator in my locale):

~/test>sort -t, -k1,2n -k3,3 -k4,4n -k5d  sort.csv
2,2,b,20,F
2,2,b,22,Ga
2,2,c,19,Ga
2,2,c,19,Gb,hi
2,2,c,19,Gb,hj
2,3,a,9,C
1,10,b,22,Ga
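Whether the comma is absorbed into the number depends on the locale. As a hedged sketch, assuming GNU sort, forcing the C locale (which has no thousands separator) makes the numeric comparison stop at the comma, so the same -k1,2n key reads 1,10 as 1:

```shell
# Two lines taken from sort.csv above.
data='2,3,a,9,C
1,10,b,22,Ga'

# Assumption: in the C locale "1,10" is parsed as 1 and "2,3" as 2,
# because the comma is not a thousands separator there.
c_locale=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k1,2n)
printf '%s\n' "$c_locale"
```

Ending the key at the field where it starts (-k1,1n) avoids depending on this behavior at all.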

Solution 5

I believe in your case something like

sort -t@ -k1.1,1.4 -k1.5,1.7 ... <inputfile

will work better. @ is the field separator; make sure it is a character that appears nowhere in the input. Your input is then treated as a single column.

Edit: apparently clintp already gave a similar answer, sorry. As he points out, the flags 'n' and 'r' can be added to every -k.... option.
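A minimal sketch of this approach, using hypothetical fixed-width records that do contain blanks (columns 1-4 hold a padded name, columns 5-7 a right-aligned number). The @ separator is assumed not to occur anywhere in the data.

```shell
# Hypothetical fixed-width records: name in columns 1-4, number in columns 5-7.
data='bob  12
al    7
bob   3'

# -t@ makes the whole line a single field because @ never appears in it.
# Key 1: columns 1-4 as text. Key 2: columns 5-7, numeric.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t@ -k1.1,1.4 -k1.5,1.7n)
printf '%s\n' "$sorted"
```

The two bob records end up ordered by their numeric column (3 before 12), even though the blanks would otherwise split the line into several fields.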

Author by

Chris Kloberdanz

Updated on January 21, 2020

Comments

  • Chris Kloberdanz
    Chris Kloberdanz over 4 years

    I have potentially large files that need to be sorted by 1-n keys. Some of these keys might be numeric and some of them might not be. This is a fixed-width columnar file so there are no delimiters.

    Is there a good way to do this with Unix sort? With one key it is as simple as using '-n'. I have read the man page and searched Google briefly, but didn't find a good example. How would I go about accomplishing this?

    Note: I have ruled out Perl because of the file size potential. It would be a last resort.

    • Ken Gentle
      Ken Gentle over 15 years
      One or two lines of example data would be really helpful for creating an example command line. Also, does "1-n" keys mean that you need to sort by a variable number of keys? Doing that without scripting is gonna be fun...
    • Chris Kloberdanz
      Chris Kloberdanz over 15 years
      I have a PHP wrapper around the sort command to enable the 1-n feature.
  • Adam Rosenfield
    Adam Rosenfield over 15 years
    From the sort man page: "POS is F[.C][OPTS], where F is the field number and C the character position in the field; both are origin 1." See man page for full documentation.
  • Jonathan Leffler
    Jonathan Leffler over 15 years
    It is only one field if there are no blanks in the input data. Nevertheless, your example is useful.
  • Clinton Pierce
    Clinton Pierce over 15 years
    Correction: if there are no /tabs/ in the input data. In DOS's 'dir' command output, there are no tabs.
  • ron
    ron almost 13 years
    Also see andras's answer if you don't want to get insane.
  • Ken Gentle
    Ken Gentle over 11 years
    Both comments above are accurate and additive. Thanks, gentlemen.
  • mat kelcey
    mat kelcey over 11 years
    LC_ALL=C can also result in quite a speedup!
  • msb
    msb over 10 years
    The examples on how to use the options (numeric, reverse) are extremely helpful, as it's nearly impossible to find out how to use just from the man page and the other answers didn't mention it. I wish I could +2 for this. ;)
  • davidtbernal
    davidtbernal almost 10 years
    Life changing. Thanks.
  • Wildcard
    Wildcard over 8 years
    Whoops! Now I have to fix a script because earlier I only saw the first answer above...good thing I haven't depended on the script output yet....
  • xaxa
    xaxa over 8 years
    This is the best answer because it shows how to use different switches for different columns
  • Arun
    Arun over 7 years
    Nice! Now, what if I want field 3 to be numerically and reverse sorted whereas field 2 to be non-numerically and normal (ascending) sorted? :)
  • andras
    andras almost 7 years
    @Arun POS is explained at the end of the man page. You just append the ordering options to the field number like this: sort -k 3,3nr -k 2,2
  • android.weasel
    android.weasel over 6 years
    Aargh. What a counterintuitive interface: -k2 should be -k2,2 and a trailing comma -k2, should be 'magical default end of line or whatever'.
  • BaseZen
    BaseZen about 6 years
    My heavens. The man page writer won a contest for the least helpful way to document this. I've been reading Unix man pages for 28 years. Nowhere do they mention the -k field can be repeated.
  • Brad Dre
    Brad Dre over 4 years
    Even though the default separator according to the docs gnu.org/software/coreutils/manual/html_node/… is space, sometimes the field count is not what you'd expect. Perhaps, as others have said here, because of the LC_CTYPE locale setting. When in doubt, count from the beginning of the line!
  • HongboZhu
    HongboZhu almost 4 years
    why the angle bracket <? Should sort -k3,3 -k2,2 inputfile not do the job?
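For the mixed ordering discussed in the comments above (field 3 numeric and reversed, field 2 plain ascending), here is a minimal sketch using the sort -k 3,3nr -k 2,2 form on hypothetical three-field data:

```shell
# Hypothetical data: id, name, count (whitespace-separated).
data='a pear 5
b apple 10
c fig 10
d kiwi 5'

# Key 1: field 3, numeric and reversed (largest first).
# Key 2: field 2, plain ascending text, used to break ties on field 3.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -k 3,3nr -k 2,2)
printf '%s\n' "$sorted"
```

The nr suffix affects only the first key, so the names within each count still sort in ascending order.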