Sort unix alphabetically then numerically, not working as I intended


sort -k1,1 -nk2 is the same as sort -k1,1 -n -k2, which is the same as sort -n -k1,1 -k2: the -n option turns on numerical sorting globally, for every key that carries no ordering flags of its own.
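As a minimal sketch of that effect, the snippet below runs both spellings on the sample rows from the question (fields separated by single spaces here, which is an assumption about the original whitespace). LC_ALL=C is set only to make tie-breaking predictable. Because the first field is not a number, every line compares equal on the first key under -n, so the rows end up ordered by column 2.

```shell
# Sample rows from the question; single-space separators are an assumption.
export LC_ALL=C
data='chr2_oligo1234 700 750
chr2_oligo1236 750 800
chr1_oligo1 50 100
chr1_oligo256 150 200
chr1_oligo6 3500 3550
chr4_oligo95 50 100
chr5_oligo1 50 100
chr4_oligo4 150 200'

# -n given as a separate option is global: it applies to -k1,1 as well as -k2.
a=$(printf '%s\n' "$data" | sort -k1,1 -nk2)
b=$(printf '%s\n' "$data" | sort -n -k1,1 -k2)

# Both spellings print the same thing: rows ordered by column 2,
# with the chr groups interleaved.
printf '%s\n' "$a"
```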

To sort the 2nd key only numerically, you need to add n to that sort key description as in:

sort -k1,1 -k2n

Or:

sort -k1,1 -k2,2n

With n and the default field separator, 2 is effectively the same as 2,2. A key of 2 covers the part of the line starting at the second field, but when that text is interpreted as a number, only the leading number is read, which is the second field alone (2,2).
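A small sketch of the per-key flag, using made-up rows (not the question's data) in which the first field repeats so that the second key actually decides the order. LC_ALL=C is set only to keep the plain string comparison predictable.

```shell
# Hypothetical rows, chosen so that several lines share the same first field.
export LC_ALL=C
data='chr2 700 750
chr1 3500 3550
chr1 50 100
chr1 150 200
chr2 75 80'

# n attached to the second key only: field 1 is compared as text, field 2 as a number.
open_key=$(printf '%s\n' "$data" | sort -k1,1 -k2n)
closed_key=$(printf '%s\n' "$data" | sort -k1,1 -k2,2n)

# No n anywhere: field 2 is compared as text, so 150 sorts before 50.
lexical=$(printf '%s\n' "$data" | sort -k1,1 -k2)

printf '%s\n' "$open_key"
# chr1 50 100
# chr1 150 200
# chr1 3500 3550
# chr2 75 80
# chr2 700 750
```

Here -k2n and -k2,2n give the same result, while the version without n puts "chr1 150 200" ahead of "chr1 50 100".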

Here, you could also sort numerically on the number that follows chr, then alphabetically on the whole first field, and then numerically on the second field with:

sort -k1.4n -k1,1 -k2n
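The sketch below tries that command on the sample rows from the question (single-space separators are an assumption). Because every first field in the sample is unique, the middle key -k1,1 settles the order inside each chr group before column 2 is ever consulted. The second command, which drops that middle key, is not part of the answer above; it is a guess at what the edited question is after, where only the chr number and column 2 should matter.

```shell
# Sample rows from the question; single-space separators are an assumption.
export LC_ALL=C
data='chr2_oligo1234 700 750
chr2_oligo1236 750 800
chr1_oligo1 50 100
chr1_oligo256 150 200
chr1_oligo6 3500 3550
chr4_oligo95 50 100
chr5_oligo1 50 100
chr4_oligo4 150 200'

# Command from the answer: chr number, then whole first field as text, then column 2.
by_name=$(printf '%s\n' "$data" | sort -k1.4n -k1,1 -k2n)

# Variant (assumption, not from the answer): chr number, then column 2 only.
by_col2=$(printf '%s\n' "$data" | sort -k1.4n -k2,2n)

printf '%s\n' "$by_col2"
# chr1_oligo1 50 100
# chr1_oligo256 150 200
# chr1_oligo6 3500 3550
# chr2_oligo1234 700 750
# chr2_oligo1236 750 800
# chr4_oligo95 50 100
# chr4_oligo4 150 200
# chr5_oligo1 50 100
```

With the first command the two chr4 rows come out in name order (chr4_oligo4 before chr4_oligo95); with the variant they come out in column 2 order, as in the question's desired output.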

Author by

implication

Updated on September 18, 2022

Comments

  • implication
    implication over 1 year

    Sorry if this is a duplicate question, but I could not find the answer that I am looking for here or in the documentation.

    I have a file that looks like the following:

    chr2_oligo1234  700   750
    chr2_oligo1236  750   800
    chr1_oligo1     50    100
    chr1_oligo256   150   200
    chr1_oligo6     3500  3550
    chr4_oligo95    50    100
    chr5_oligo1     50    100
    chr4_oligo4     150   200
    

    The desired output looks like:

    chr1_oligo1     50    100
    chr1_oligo256   150   200
    chr1_oligo6     3500  3550
    chr2_oligo1234  700   750
    chr2_oligo1236  750   800
    chr4_oligo95    50    100
    chr4_oligo4     150   200
    chr5_oligo1     50    100
    

    The pattern at the start (e.g. chr#_oligo#) only matters in terms of the chr#, meaning that all chr1 should be first, then chr2, then chr3, etc., but I would like to sort those substrings numerically in groups as shown by the desired output above. So, I would like to know how to sort alphabetically in the case of the first column, and then keeping that order (chr1->chrN), sort each chunk of data numerically.

    I apologize if my wording is not the best for this issue or if it is a duplicate. Trying

    sort -k1,1 -nk2
    

    does properly sort numerically, but does not keep the first sort intact: it jumbles up the first column and places together all lines where columns 2 and 3 look like:

    50   100
    

    I am using Mac OS X.

    EDIT: I changed some of the examples in the first column to show more of what I'm looking for. gsort -V worked great if the names in the first column are in numerical order, but in my data set that isn't always the case.

    I would like to essentially sort each subgroup (in this case, chr1, chr2, etc) by column 2 iteratively. I realize this can be easily done by doing a grep for each and then sorting it on column 2, but I would like to know if sort or another unix command could accomplish this alone.

    • don_crissti
      don_crissti over 7 years
      sort -V -k1,1 -k2 file with gsort (part of coreutils)
    • Jeff Schaller
      Jeff Schaller over 7 years
    • implication
      implication over 7 years
      @JeffSchaller not what I was looking for. Moving the n around does not give the output I would like. Thank you though.
    • implication
      implication over 7 years
      @don_crissti I ran "brew install coreutils" on my terminal, and had to install Xcode. This works great! Is there a way to call gsort without typing it? I read that the sort command should be changed to call gsort instead but that does not work for me (invalid option error, meaning it is still calling the OS X sort and not coreutils version.) Thank you! I am changing the question a bit because it seems -V only works if the first column is in a numerical order. This is not always the case. I will change around my question.