Counting occurrences in first column of a file

linux bash perl awk

20,080

If the input is sorted, you can use uniq:

<infile cut -d' ' -f1 | uniq -c

If not, sort it first, because uniq only counts adjacent duplicate lines:

<infile cut -d' ' -f1 | sort -n | uniq -c

Output:

  3 1
  1 3
  2 52

The columns are swapped compared to what you asked for; pipe the result through awk '{ print $2, $1 }' to reorder them:

1 3
3 1
52 2
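Put together, the whole pipeline might look like the sketch below. The function name count_first_column is hypothetical, and the sample rows are taken from the question; it reads from standard input so no file name has to be assumed.

```shell
# Hypothetical helper: count first-column values read from standard input.
count_first_column() {
  # cut keeps column 1, sort -n groups equal values together,
  # uniq -c prefixes each value with its count, awk swaps the two fields.
  cut -d' ' -f1 | sort -n | uniq -c | awk '{ print $2, $1 }'
}

# Sample rows from the question.
printf '%s\n' '1 2' '1 3' '1 2' '3 3' '52 1' '52 300' | count_first_column
```

For a real file, redirect it in: count_first_column <infile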

There's also the awk idiom, which does not require sorted input:

awk '{h[$1]++}; END { for(k in h) print k, h[k] }'

Output:

1 3
52 2
3 1

Because the output comes from a hash, its order is not guaranteed; pipe it to sort -n if you need it sorted:

awk '{h[$1]++} END { for(k in h) print k, h[k] }' | sort -n

If you're using GNU awk (4.0 or later, which accepts the sort-order argument to asorti), you can do the sorting from within awk:

awk '{h[$1]++} END { n = asorti(h, d, "@ind_num_asc"); for(i=1; i<=n; i++) print d[i], h[d[i]] }'

In the last two cases the output is:

1 3
3 1
52 2
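Since the question also mentions Bash, here is a rough pure-Bash sketch of the same tallying idea. It is not part of the approaches above: it assumes Bash 4 or later (for associative arrays), and the function name count_first_column_bash is made up for illustration. It is likely to be slower than awk on large files.

```shell
# Hypothetical pure-Bash variant; assumes Bash 4+ for associative arrays.
count_first_column_bash() {
  local -A count=()   # key: first-column value, value: number of times seen
  local key rest
  # read splits each line on whitespace: key gets column 1, rest gets the remainder.
  while read -r key rest; do
    [ -n "$key" ] && count[$key]=$(( ${count[$key]:-0} + 1 ))
  done
  # Key order of an associative array is unspecified, so sort the result.
  for key in "${!count[@]}"; do
    printf '%s %s\n' "$key" "${count[$key]}"
  done | sort -n
}

# Sample rows from the question.
printf '%s\n' '1 2' '1 3' '1 2' '3 3' '52 1' '52 300' | count_first_column_bash
```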



Updated on September 18, 2022

Arash over 1 year
We have this file:
```
1 2 
1 3
1 2
3 3
52 1
52 300
```
and 1000 more.

I want to count the number of times each value occurs in the first column.
```
1  3 
3  1
52 2
```
This means we saw 1 three times.

How can I do that, in Perl, AWK or Bash?
- slhck over 11 years
  
  Hi arashams! I saw you recently asked very similar questions that all revolve around the same topic. I'm sure the community would like to help you, but maybe you could show us what you've already tried and where exactly you got stuck? We require people to show a little effort before asking their questions – there isn't any learning involved from simply asking others to give you the code for a specific thing. Why not tell us what exactly the background of this is? Maybe there is an easier way to accomplish what you want, and we don't need to resort to dummy examples with some abstract numbers?
- Arash over 11 years
  
  Thanks for your help. I'm working with bgpdump data and parsing it.
Arash over 11 years

Could you please explain the code? awk '{h[$1]++} END { for(k in h) print k, h[k] }' | sort -n
Thor over 11 years

@arashams: The {h[$1]++} block is evaluated for each line. h is a hash, and $1 is the first column, used as the key into h. So this tallies how often each distinct value of $1 is seen. The END block is executed at the end of input and prints the keys and their tallies. sort -n sorts the output numerically.
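The same explanation can be written as comments inside the program. This is only an annotated sketch: the wrapper function tally_first_column is a hypothetical name, and the sample rows are the ones from the question.

```shell
# Hypothetical wrapper around the awk idiom, reading from standard input.
tally_first_column() {
  awk '
    # Runs once per input line: $1 (the first column) is the key into hash h.
    { h[$1]++ }
    # Runs once after the last line: print every key with its tally.
    END { for (k in h) print k, h[k] }
  ' | sort -n   # hash order is unspecified, so sort numerically
}

# Sample rows from the question.
printf '%s\n' '1 2' '1 3' '1 2' '3 3' '52 1' '52 300' | tally_first_column
```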