Sort words in file
Solution 1
Your use of tr is clever. But you need to sort before you use uniq, because uniq only looks at adjacent lines. So we have:
cat file.txt | sort | uniq -c | sort -r | awk '{print $2, $1}' | head -n 10
Also, as you can see, the use of -k and -n for sort is unnecessary in this case (though not wrong).
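To illustrate why the sort has to come before uniq, here is a minimal sketch on a made-up five-word input (the sample words and variable names are assumptions, not part of the original question). Without sorting, uniq -c counts each run of adjacent duplicates separately; with sorting, each word is counted once.

```shell
# Hypothetical sample input: one word per line, deliberately unsorted.
words=$(printf '%s\n' b a b a a)

# Without sorting first, uniq -c only merges adjacent duplicates,
# so "b" and "a" each show up more than once.
unsorted_counts=$(printf '%s\n' "$words" | uniq -c | awk '{print $2, $1}')

# Sorting first puts equal words next to each other,
# so every word gets exactly one count.
sorted_counts=$(printf '%s\n' "$words" | sort | uniq -c | awk '{print $2, $1}')

printf 'without sort:\n%s\n' "$unsorted_counts"
printf 'with sort:\n%s\n' "$sorted_counts"
```

The unsorted run reports b and a several times each, which is the same symptom as counts being split or lost in the original script.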
Solution 2
The answer to the first question would be (if anyone is interested):
tr -s '[:space:]' '\n' < "$1" | sort | uniq -c | sort -k1,1rn -k2,2 | awk '{print $2, $1}' | head -n 12
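As a sanity check, the following sketch runs a pipeline of this shape against the second example from the question (10 × "la", 5 × "hi", 5 × "zzz", 5 × "arr", n = 3). The helper name top_words and the sample text are assumptions made for illustration; the sort keys order by count descending and then by word ascending, which is one way to get the alphabetical tie-break the question asks for.

```shell
# top_words N (hypothetical helper): read text on stdin and print the
# N most frequent words as "word count", ties broken alphabetically.
top_words() {
    # Split on whitespace, drop any empty line left by leading blanks,
    # count each word, then sort by count (descending) and word (ascending).
    tr -s '[:space:]' '\n' |
        sed '/^$/d' |
        sort |
        uniq -c |
        sort -k1,1nr -k2,2 |
        awk '{print $2, $1}' |
        head -n "$1"
}

# Hypothetical sample text: "la" appears 10 times, the other words 5 times each.
sample="la hi zzz arr la
la hi zzz arr la
la hi zzz arr la
la hi zzz arr la
la hi zzz arr la"

result=$(printf '%s\n' "$sample" | top_words 3)
printf '%s\n' "$result"
```

With n = 3 this should list la first, followed by arr and hi, leaving zzz out as in the question.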
I still don't know how to do this part:
As an extra feature, I'd like the script to count the occurrences of words in only the first m lines of the file.
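One possible approach to the "first m lines" part, offered as a sketch rather than a definitive answer, is to cut the input down with head -n before splitting it into words. The helper name words_in_first_lines and the sample file below are hypothetical.

```shell
# words_in_first_lines M FILE (hypothetical helper):
# count word occurrences in the first M lines of FILE only.
words_in_first_lines() {
    # head -n keeps the first M lines; the rest is the usual counting pipeline.
    head -n "$1" "$2" |
        tr -s '[:space:]' '\n' |
        sed '/^$/d' |
        sort |
        uniq -c |
        sort -k1,1nr -k2,2 |
        awk '{print $2, $1}'
}

# Hypothetical sample file: the third line must be ignored when M is 2.
sample_file=$(mktemp)
printf '%s\n' 'a b a' 'b a' 'c c c c' > "$sample_file"

first_two=$(words_in_first_lines 2 "$sample_file")
rm -f "$sample_file"

printf '%s\n' "$first_two"
```

In a script, m could be taken from a second positional parameter, for example words_in_first_lines "$2" "$1".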
gigiman
Updated on September 18, 2022
Comments
-
gigiman over 1 year
I’ve got some problems I’m not capable of overcoming. I need to count the words in a text file and print the N most frequent ones in decreasing order of frequency, each followed by its number of occurrences. Words with the same number of occurrences must be sorted alphabetically.
As an example, if I have 6 occurrences of word "a", 5 of word "b", 5 of word "c" and n is given as 2, I’ll print:
a 6
b 5
If I have 10 occurrences of word "la", 5 of word "hi", 5 of "zzz" and 5 of "arr", and n is given as 3, I’ll print:
la 10
arr 5
hi 5
(the zzz is omitted intentionally).
The problem is that my script (below) prints only one word for each distinct number of occurrences.
tr [:space:] '\n' <$1| uniq -c|sort -rnuk1,1|awk '{print $2,$1}'|head -n
As an extra feature, I’d like the script to search number of occurrences of words in the first m lines of file.
-
gigiman over 8 years
But what if, let's say, I need to print the words that have more than x appearances?
-
gigiman over 8 years
@JeffSchaller OK, but what if X is a parameter given as input?
-
Jeff Schaller over 8 years
Search this site for ways that people pass variables into awk. Look for -v
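A minimal sketch of the -v hint, assuming the threshold X arrives as the first argument of a helper (the name words_above, the awk variable min, and the sample text are all hypothetical): awk -v copies the shell value into an awk variable, which the pattern can then compare against the count column.

```shell
# words_above X (hypothetical helper): read text on stdin and print
# the words that occur more than X times, as "word count".
words_above() {
    # -v min="$1" passes the shell parameter into awk;
    # "min + 0" forces a numeric comparison against the count in $1.
    tr -s '[:space:]' '\n' |
        sed '/^$/d' |
        sort |
        uniq -c |
        sort -k1,1nr -k2,2 |
        awk -v min="$1" '$1 > min + 0 {print $2, $1}'
}

# Hypothetical sample text: a occurs 3 times, b twice, c once.
above_two=$(printf '%s\n' 'a a a b b c' | words_above 2)
printf '%s\n' "$above_two"
```

Passing the value with -v avoids splicing a shell variable into the awk program text, so quoting inside the awk script stays simple.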