Sorting a file based on one column using Unix and Awk
A possible solution is to join each pair of lines, sort the joined lines on the even line's score (field 15) with ties broken by the odd line's score (field 6), then split the joined lines again:
awk '{ getline line; print $0, line }' input_file |
sort -k15,15nr -k6,6nr |
awk '{ $10 = "\n" $10; print }'
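The same join, sort, and split idea can be wrapped in a small shell function. This is only a sketch under a few assumptions: every record has exactly 9 whitespace-separated fields, the even line's score is the primary key (as the rule in the question states), and `sort_pairs` is a hypothetical name chosen for illustration. It also drops an unpaired last line and avoids the trailing space that assigning to `$10` leaves on the first line of each pair.

```shell
# Hypothetical helper: reads paired lines on stdin, writes the sorted pairs to stdout.
sort_pairs() {
  # Join each odd line with the even line that follows it (18 fields per row).
  # An unpaired last line is dropped because getline fails at end of input.
  awk '{ odd = $0; if ((getline even) > 0) print odd, even }' |
  # Primary key: even-line score (field 15), descending.
  # Tie-break: odd-line score (field 6), descending.
  sort -k15,15nr -k6,6nr |
  # Split each joined row back into two 9-field lines (assumes 9 fields per record).
  awk '{
    for (i = 1; i <= NF; i++)
      printf "%s%s", $i, (i == 9 || i == NF ? "\n" : " ")
  }'
}
```

It can be called as `sort_pairs < input_file`.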
Namrata
I am currently at the Swiss Institute of Bioinformatics (SIB) and UNIL, Switzerland, after completing a Master's in Bioinformatics at King's College London, UK. I work on Next Generation Sequencing data analysis and genome assemblies.
Updated on September 18, 2022
Comments
-
Namrata over 1 year
I need to sort the input file according to the 6th column, which is the score.
Input File:
Sc2/80 20 . A T 86 Pass N=2 F=5;U=4
Sc2/80 20 . A C 80 Pass N=2 F=5;U=4
Sc2/60 55 . G T 90 Pass N=2 F=5;U=4
Sc2/60 55 . G C 99 Pass N=2 F=5;U=4
Sc2/20 39 . C T 97 Pass N=2 F=5;U=4
Sc2/20 39 . C A 99 Pass N=2 F=5;U=4
Expected Output:
Sc2/20 39 . C T 97 Pass N=2 F=5;U=4
Sc2/20 39 . C A 99 Pass N=2 F=5;U=4
Sc2/60 55 . G T 90 Pass N=2 F=5;U=4
Sc2/60 55 . G C 99 Pass N=2 F=5;U=4
Sc2/80 20 . A T 86 Pass N=2 F=5;U=4
Sc2/80 20 . A C 80 Pass N=2 F=5;U=4
Logic: The even lines of the input file should be compared and ranked by score in descending order, and each even line should be printed together with its corresponding odd line. If two even lines have equal scores, the scores of their corresponding odd lines are compared, and the pair with the higher odd-line score is printed first.
-
Kartik almost 11 years
This seems related to some DNA (AGCT). Is it really related to DNA of some kind?
-
Namrata almost 11 years
@Kartik: Yes, you are right. This work is a part of genome data analysis.
-