Bash - sort starting from a character other than the first


Solution 1

tl;dr

sort -k1.5 file | uniq -s 6 -w 5


Explanation

My sort is GNU coreutils 8.22. The manpage for my sort shows:

KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is a field number and C a character position in the field; both are origin 1, and the stop position defaults to the line's end.

So your current command, sort -k1,1 file, uses the whole first field (from the start of field 1 to the end of field 1) as the sort key.
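To see the difference the stop position makes, here is a minimal sketch on made-up data (the three sample lines and the variable names are my own, not from the question). It assumes a sort that supports -s (GNU sort does), which turns off the last-resort whole-line comparison so that only the key decides the order.

```shell
# Hypothetical three-line sample; the second column is out of order on purpose.
sample=$(printf '%s\n' 'b x' 'a z' 'a y')

# -k1,1 limits the key to field 1. With -s, lines whose keys are equal
# keep their input order, so 'a z' stays ahead of 'a y'.
first_field_only=$(printf '%s\n' "$sample" | LC_ALL=C sort -s -k1,1)

# -k1 has no stop position, so the key runs to the end of the line
# and the second column takes part in the comparison.
to_line_end=$(printf '%s\n' "$sample" | LC_ALL=C sort -s -k1)

printf '%s\n' "$first_field_only"
printf '%s\n' "$to_line_end"
```

LC_ALL=C is only there to make the byte-wise ordering predictable; with another locale the order of ties may differ.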

What you want is (for the sort command anyway):

sort -k1.5 file | uniq -s 6 -w 5

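This starts the sort key at the fifth character of the first field, which is what you wanted. Note that the opening double quote counts as character 1, and that with no stop position the key runs to the end of the line; use -k1.5,1 if the key should stop at the end of the first field.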
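A small sketch of the whole pipeline on made-up data may help; the sample lines, variable names, and the uniq offsets (-s 4 -w 1 instead of -s 6 -w 5) are assumptions chosen to fit this toy input, not the file from the question. The -w option is a GNU uniq extension.

```shell
# Hypothetical sample: the first four characters are deliberately out of order
# so the effect of the character offset is visible.
sample=$(printf '%s\n' 'zzzzB 1' 'aaaaC 2' 'mmmmA 3' 'qqqqB 9')

# Key starts at character 5 of field 1 and stops at the end of field 1.
# Lines with equal keys fall back to a whole-line comparison.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -k1.5,1)

# Skip the first 4 characters, then compare at most 1 character,
# so adjacent lines that share character 5 collapse into the first one.
deduped=$(printf '%s\n' "$sorted" | LC_ALL=C uniq -s 4 -w 1)

printf '%s\n' "$sorted"
printf '%s\n' "$deduped"
```

Because uniq only compares adjacent lines, the sort key and the uniq comparison window should cover the same characters; otherwise duplicates may not end up next to each other.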

Solution 2

$ sort -k2 file

"TTTTCTTCTA"                            1
"TTTTCTTCCT"                            2
"TTTTCTTACC"                    1
"TTTTCTTATT"                    2
"TTTTCTTCGG"                    2       2
"TTTTCTTCTG"            1
"TTTTCTTGAA"            1
"TTTTCTTACA"            1       1
"TTTTCTTTAG"            1       1
"TTTTCTTTGG"            1       1
"TTTTCTTCAT"            1       2       2
"TTTTCTTAGC"    1
"TTTTCTTTAA"    1
"TTTTCTTTCT"    1
"TTTTCTTTGC"    1
"TTTTCTTTTA"    1
"TTTTCTTCTT"    1                       2
"TTTTCTTCAA"    1               1       1
"TTTTCTTGCT"    1               1       1
"TTTTCTTCAG"    1               2       1
"TTTTCTTACT"    1       1
"TTTTCTTTGT"    1       1       2       1

$ sort -k2 file | uniq -f 1

"TTTTCTTCTA"                            1
"TTTTCTTCCT"                            2
"TTTTCTTACC"                    1
"TTTTCTTATT"                    2
"TTTTCTTCGG"                    2       2
"TTTTCTTCTG"            1
"TTTTCTTACA"            1       1
"TTTTCTTCAT"            1       2       2
"TTTTCTTAGC"    1
"TTTTCTTCTT"    1                       2
"TTTTCTTCAA"    1               1       1
"TTTTCTTCAG"    1               2       1
"TTTTCTTACT"    1       1
"TTTTCTTTGT"    1       1       2       1
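
For reference, a minimal sketch of what uniq -f 1 does, using made-up lines rather than the file above (the sample data and variable names are assumptions for illustration):

```shell
# Hypothetical sample: the first field differs on every line, the rest repeats.
sample=$(printf '%s\n' 'aaa 1' 'bbb 1' 'ccc 2' 'ddd 2' 'eee 1')

# -f 1 skips the first blank-separated field before comparing adjacent lines,
# so 'aaa 1' and 'bbb 1' count as duplicates and only the first is kept.
deduped=$(printf '%s\n' "$sample" | uniq -f 1)

printf '%s\n' "$deduped"
```

The last line, 'eee 1', survives because uniq only compares neighbouring lines, which is why the input is sorted on the same columns first.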

Author by

diego9403

Updated on September 18, 2022

Comments

  • diego9403
    diego9403 over 1 year

    I want to sort my file by the first column, but the sort has to start from the 5th character. How can I do that?

    My file:

    "TTTTCTTACA"            1       1
    "TTTTCTTACC"                    1
    "TTTTCTTACT"    1       1
    "TTTTCTTAGC"    1
    "TTTTCTTATT"                    2
    "TTTTCTTCAA"    1               1       1
    "TTTTCTTCAG"    1               2       1
    "TTTTCTTCAT"            1       2       2
    "TTTTCTTCCT"                            2
    "TTTTCTTCGG"                    2       2
    "TTTTCTTCTA"                            1
    "TTTTCTTCTG"            1
    "TTTTCTTCTT"    1                       2
    "TTTTCTTGAA"            1
    "TTTTCTTGCT"    1               1       1
    "TTTTCTTTAA"    1
    "TTTTCTTTAG"            1       1
    "TTTTCTTTCT"    1
    "TTTTCTTTGC"    1
    "TTTTCTTTGG"            1       1
    "TTTTCTTTGT"    1       1       2       1
    "TTTTCTTTTA"    1
    

    I was trying:

    sort -k1,1 file | uniq -s 6 -w 5 
    

    Of course, it doesn't work. Maybe sort has some flags for this, but I didn't find them. Do you have any ideas?

    • DavidPostill
      DavidPostill almost 8 years
      "I want to sort my file by first column" - Your data is already sorted by the first column. Please explain what you are really trying to do.
  • Buffalo Rabor
    Buffalo Rabor almost 8 years
    The literal first column is already sorted, with no duplicates in your sample data, so I provided a sort on the first numeric column.