Count total number of lines before/after a pattern match


Solution 1

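Count the lines up to and including the match, and the lines from the match to the end of the file. Both counts include the matching line, so subtract 1 from each result if you want to exclude it. Note that the 0,/pattern/ address form is a GNU sed extension: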

sed -n '0,/pattern/p' file | wc -l
sed -n '/pattern/,$p' file | wc -l

Note that nothing here is specific to IP addresses; it works for any pattern.
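
As a minimal sketch of the "subtract 1" step, the helper below wraps both counts in a function. The name count_around is hypothetical, and it uses sed '/pattern/q' as a portable stand-in for the GNU-only 0,/pattern/ address; it assumes the pattern matches at least once.

```shell
# Hypothetical helper: lines strictly before and strictly after the first match.
# Assumes the pattern occurs at least once in the file.
count_around() {
    pattern=$1
    file=$2
    # Lines from the start through the first match (sed quits after printing it).
    upto=$(sed "/$pattern/q" "$file" | wc -l)
    # Lines from the first match through the end of the file.
    from=$(sed -n "/$pattern/,\$p" "$file" | wc -l)
    # Both counts include the matching line, so subtract 1 from each.
    echo "before=$((upto - 1)) after=$((from - 1))"
}
```

It would be called as count_around '192\.168\.1\.1' file, with the dots escaped so that they match literally.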

Solution 2

Perhaps the easiest way is to print the line number of the first match and quit:

sed -n '/pattern/{=; q;}' file

Thanks @JosephR for pointing out the error.
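
Since that command only reports a line number, here is a rough sketch of turning it into the two counts by combining it with wc -l. The function name lines_around_first_match is made up for illustration, and it assumes the pattern matches at least once.

```shell
# Hypothetical helper: derive before/after counts from the first match's line number.
# Assumes the pattern occurs at least once in the file.
lines_around_first_match() {
    pattern=$1
    file=$2
    match=$(sed -n "/$pattern/{=;q;}" "$file")   # line number of the first match
    total=$(wc -l < "$file")                      # total number of lines
    echo "before=$((match - 1)) after=$((total - match))"
}
```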

Solution 3

I did this two ways, though I think I like this best:

: $(( afterl=( lastl=$(wc -l <~/file) ) - 1 -
  $(( beforel=( matchl=$(sed -n "/$IP/{=;q;}" <~/file) ) - 1
)) ))
for n in last match afters befores
do  printf '%s line%s :\t%d\n' \
        "${n%s}" "${n##*[!s]}" $((${n%s}l))
done

That saves all of those values as current shell variables and evaluates them afterwards in the for loop for output. It counts the total lines in the file with wc and then gets the first matched line number with sed. The number of lines after the match is the last line number minus the match line number.

Its output:

last line :     1000
match line :    200
after lines :   800
before lines :  199
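
The parameter expansions in that loop are terse, so here is a small sketch that only prints what each one expands to. The function name show_expansions is hypothetical; the expansions themselves are the ones used in the loop.

```shell
# Hypothetical demo: show how each loop word is split into a label,
# an optional plural "s", and the name of the variable that holds the count.
show_expansions() {
    for n in last match afters befores
    do  # ${n%s}       strips one trailing "s", if present
        # ${n##*[!s]}  strips everything up to the last non-"s" character,
        #              leaving either "s" or an empty string
        printf 'label=%s plural=%s var=%s\n' \
            "${n%s}" "${n##*[!s]}" "${n%s}l"
    done
}
```

For example, afters yields the label after, the plural suffix s, and the variable name afterl, which the original loop then evaluates inside $(( )).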

I also did:

sed -n "/$IP/=;\$=" ~/file |
tr \\n \  | {
IFS=' ' read ml ll
printf '%s line%s :\t%d\n' \
    last '' "$((ll=${ll##* }))" \
    match '' "$ml" \
    after s "$((al=ll-ml))" \
    before s "$((bl=ml-1))"
}

sed prints only the matching and last line numbers, then tr translates the intervening \newlines to spaces, and read reads the first of sed's results into $ml and all others into $ll. Possible multiple-match cases are handled by stripping all but the last result out of $ll's expansion when setting it again later.

Its output:

last line :     1000
match line :    200
after lines :   800
before lines :  199
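
To see what that pipeline hands to read, the sketch below runs the same sed and tr stages on standard input and echoes the resulting variables. The function name split_line_numbers and the variable names first and rest are illustrative only.

```shell
# Hypothetical demo: show how read splits sed's line numbers.
# Reads text on stdin; $1 is the pattern to look for.
split_line_numbers() {
    sed -n "/$1/=;\$=" | tr '\n' ' ' | {
        # tr removed the final newline, so read reports end-of-file with a
        # non-zero status even though it still assigns both variables.
        read -r first rest || true
        # ${rest##* } keeps only the text after the last space: the last line number.
        echo "first=$first rest=$rest last=${rest##* }"
    }
}
```

With matches on lines 2 and 4 of a five-line input, first holds 2, rest holds "4 5", and the final expansion keeps only 5.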

Both methods were tested on the file generated in the following way:

IP='some string for which I seek' 
for count in 1 2 3 4 5 
do  printf '%.199d%s\n' 0 "$IP" 
done | tr 0 \\n >~/file 

Line by line, it does the following:

  1. sets the search string
  2. loops five times to ensure there will be multiple matches
  3. prints 199 zeroes then "$IP" then a \newline
  4. pipes output to tr - which translates zeroes to \newlines then into ~/file
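
The steps above can be wrapped in a function and checked, as in the rough sketch below. The name make_test_file and its two parameters are hypothetical. One assumption worth stating: the search string must not contain the digit 0, because tr would turn that into a newline as well.

```shell
# Hypothetical helper: build the test file described above.
# $1 is the search string (must not contain "0"), $2 is the output path.
make_test_file() {
    needle=$1
    out=$2
    for count in 1 2 3 4 5
    do  # %.199d prints 0 padded to 199 digits, i.e. 199 zeroes
        printf '%.199d%s\n' 0 "$needle"
    done | tr 0 '\n' > "$out"   # every zero becomes an empty line
}
```

Each pass contributes 199 empty lines plus one line holding the search string, so the file ends up with 1000 lines and matches on lines 200, 400, 600, 800 and 1000.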

Solution 4

Here's a little bit of Perl code that does it:

perl -ne '
     if(1 .. /192\.168\.1\.1/) { $before++ }
     else                      { $after++  }
     END{
         $before--; # The matching line was counted
         printf "Before: %d, After: %d\n", $before, $after
     }' your_file

This counts the total number of lines before and after the line containing the IP 192.168.1.1. Replace with your desired IP.

Using nothing but Bash:

before=0
match=0
after=0
while read line;do
    if [ "$line" = 192.168.1.1 ];then
        match=1
    elif [ $match -eq 0 ];then
        before=$(($before+1))
    else
        after=$(($after + 1))
    fi
done < your_file
printf "Before: %d, After: %d\n" "$before" "$after"

Solution 5

An awk solution reporting the number of lines before and after the last match:

awk '/192\.168\.1\.1/{x=NR};{y=NR} END{printf "before-%d, after-%d\n" , x-1, y-x}'  file
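
If the first match is wanted instead of the last, a variant along the following lines should work. It is only a sketch: the function name first_match_counts is made up, and it compares whole lines with == rather than using a regex, assuming each line holds exactly one address.

```shell
# Hypothetical helper: before/after counts around the FIRST line equal to $1.
# Prints nothing if the address is not found.
first_match_counts() {
    awk -v ip="$1" '
        $0 == ip && !first { first = NR }   # remember only the first match
        END { if (first) printf "before-%d, after-%d\n", first - 1, NR - first }
    ' "$2"
}
```

Comparing whole lines also avoids escaping the dots and keeps 192.168.1.1 from matching 192.168.1.10.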

Author: Mandar Shinde

Updated on September 18, 2022

Comments

  • Mandar Shinde (over 1 year)

    I am having a long list of IP addresses, which are not in sequence. I need to find how many IP addresses are there before/after a particular IP address. How can I achieve this?

    • cuonglm (almost 10 years)
      Do you have duplicated IP?
    • Mandar Shinde (almost 10 years)
      No. All IP addresses are unique.
    • vinc17 (almost 10 years)
      What does before/after mean for an IP address? In particular, do you have both IPv4 and IPv6 addresses? How do they compare?
    • cuonglm (almost 10 years)
      Do you need the file sorted?
    • Mandar Shinde (almost 10 years)
      @vinc17- Number of IP addresses present before/after the match is found. Only IPv4 addresses are there.
    • Mandar Shinde (almost 10 years)
      @Gnouc- File contents must not be changed, so sorting is not advisable in this case.
    • vinc17 (almost 10 years)
      This is not clear. Does the file contain other data? etc. You should give an example with the expected result.
    • Mandar Shinde (almost 10 years)
      @vinc17 - File only contains IP addresses (IPv4), no other data is included. If there are 1000 IP addresses in total, and match is found at 300th location, means there are 299 lines before the match and 700 lines after the match.
    • mikeserv (almost 10 years)
      @MandarShinde - please see juampa's answer for the most straightforward way to do this.
    • Jeff Hewitt (almost 10 years)
      @MandarShinde You mentioned that you prefer a Bash solution; however, the answer you accepted uses sed and that's perfectly fine of course. I just wanted to bring to your attention that sed is not Bash. It is a Turing complete scripting language on its own. Similarly wc is not a part of Bash but a stand-alone tool.
  • Jeff Hewitt (almost 10 years)
    This counts the number of occurrences of the target IP. This is not what the OP asked for.
  • Matej Vrzala M4 (almost 10 years)
    I just edited it, if he is asking to count all other IPs before and after a particular IP, the edit should work for him.
  • Mandar Shinde (almost 10 years)
    BASH is preferred.
  • cuonglm (almost 10 years)
    @Joseph R.: Why don't you use $. instead of a counter?
  • Jeff Hewitt (almost 10 years)
    @Gnouc I could of course. I just think this is more readable than setting $after to $. - $before.
  • Jeff Hewitt (almost 10 years)
    @MandarShinde Please see the edit. I added a pure Bash answer.
  • Jeff Hewitt (almost 10 years)
    This just prints the line number on which the pattern occurred.
  • mikeserv (almost 10 years)
    @JosephR. - no, it prints every line number on which every match occurs.
  • Jeff Hewitt (almost 10 years)
    @mikeserv I know but the OP specified that IP addresses are unique. The OP also doesn't want the line number where the match(es) occurred; they want the number of lines before the pattern occurred and the number of lines after it.
  • Dani_l (almost 10 years)
    You know, for pure readability, you could replace the before=$(($ and after=$(($ with just ((before++)) and ((after++)). no "=" or "$" required.
  • Jeff Hewitt (almost 10 years)
    @Dani_l True, but I was once told that the $(($ forms are more portable.
  • Jeff Hewitt (almost 10 years)
    @mikeserv I'm not arguing that the information from this answer isn't useful, I'm just saying that this code on its own doesn't do what the OP wants.
  • Nico (almost 10 years)
    yes, you're right. Fixed it