URL Filtering with iptables


I found the reason why this is not working.

You cannot rely on the entire HTTP request being in the single packet that netfilter is inspecting. The string match is evaluated one packet at a time, so the same packet will not match both 'GET /' and 'Host:' when that payload is spread across several packets.

Consider the following list of rules:

-A OUTPUT -p tcp -m tcp --dport 80 -j URLFILTER # would be FORWARD in your case
-A URLFILTER -m string --string "Host: www.kernel.org" --algo bm --from 1 --to 500 --icase -j LOG --log-prefix UF_MATCHHOST
-A URLFILTER -m string --string "GET /" --algo bm --from 1 --to 500 --icase -j LOG --log-prefix UF_MATCHGET

An HTTP request to www.kernel.org like this:

GET / HTTP/1.0
Host: www.kernel.org

matches both rules, but in the reverse of the order in which the rules are listed. This shows that more than one packet traversed the URLFILTER chain: the first carried the GET string and the second carried the Host string. Therefore you cannot match GET and Host in the same packet without further work.

[471493.767020] UF_MATCHGETIN= OUT=enp0s31f6 SRC=192.168.20.204 DST=147.75.205.195 LEN=67 TOS=0x00 PREC=0x00 TTL=64 ID=65494 DF PROTO=TCP SPT=51624 DPT=80 WINDOW=229 RES=0x00 ACK PSH URGP=0 
[471499.761216] UF_MATCHHOSTIN= OUT=enp0s31f6 SRC=192.168.20.204 DST=147.75.205.195 LEN=73 TOS=0x00 PREC=0x00 TTL=64 ID=65495 DF PROTO=TCP SPT=51624 DPT=80 WINDOW=229 RES=0x00 ACK PSH URGP=0 
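To make the two log lines concrete, here is a minimal Python sketch of per-packet matching. It is a model, not netfilter code. The segmentation (one request line per TCP segment) is an assumption based on the two separate log entries, and the name `packet_matches` is hypothetical.

```python
# Illustrative model only: a per-packet string match can find a pattern
# only if the pattern lies inside that one packet.

def packet_matches(payload: bytes, pattern: bytes) -> bool:
    """Case-insensitive substring test on a single packet payload."""
    return pattern.lower() in payload.lower()

# Assumed segmentation: one request line per TCP segment.
segments = [
    b"GET / HTTP/1.0\r\n",
    b"Host: www.kernel.org\r\n",
    b"\r\n",
]

GET = b"GET /"
HOST = b"Host: www.kernel.org"

# Per-packet view: (matches GET, matches Host) for each segment.
per_packet = [(packet_matches(s, GET), packet_matches(s, HOST)) for s in segments]

# Does any single packet satisfy both conditions at once?
both_in_one_packet = any(g and h for g, h in per_packet)

# Reassembled stream view: both patterns are present.
stream = b"".join(segments)
both_in_stream = packet_matches(stream, GET) and packet_matches(stream, HOST)

print(per_packet)
print(both_in_one_packet, both_in_stream)
```

Under this assumed segmentation, each pattern matches in a different packet, so a chain that requires both in the same packet never fires even though the reassembled request contains both.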

You could perhaps track each connection that matches GET / and then match the Host string in the packets that follow on that connection. I reckon that would be possible.

Netfilter may be able to do this for you, but it is far from the best tool for the job.
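The connection-tracking idea can be sketched in Python as follows. This is only a model of the logic, not a netfilter implementation, and every name in it (`Conn`, `UrlFilter`, `verdict`) is hypothetical. In iptables the per-connection state might be carried with a connection mark (the `connmark` match and `CONNMARK` target), but that mapping is an assumption and untested here.

```python
# Illustrative model only, not netfilter code. All names are hypothetical.
from typing import NamedTuple, Set


class Conn(NamedTuple):
    """Identifies one TCP connection by its address/port 4-tuple."""
    src: str
    sport: int
    dst: str
    dport: int


class UrlFilter:
    def __init__(self, blocked_host: bytes) -> None:
        self.blocked = b"host: " + blocked_host.lower()
        self.seen_get: Set[Conn] = set()  # connections that sent "GET /"

    def verdict(self, conn: Conn, payload: bytes) -> str:
        data = payload.lower()
        # Step 1: remember connections whose packet starts a GET request.
        if data.startswith(b"get /"):
            self.seen_get.add(conn)
        # Step 2: on any packet of a remembered connection, check the Host.
        if conn in self.seen_get and self.blocked in data:
            self.seen_get.discard(conn)
            return "REJECT"
        return "ACCEPT"


flt = UrlFilter(b"www.kernel.org")
conn = Conn("192.168.20.204", 51624, "147.75.205.195", 80)

# Request split over two packets, as in the log above.
split_verdicts = [
    flt.verdict(conn, b"GET / HTTP/1.0\r\n"),
    flt.verdict(conn, b"Host: www.kernel.org\r\n"),
]

# The same request arriving in one packet is caught as well.
other = Conn("192.168.20.204", 51625, "147.75.205.195", 80)
single_verdict = flt.verdict(other, b"GET / HTTP/1.0\r\nHost: www.kernel.org\r\n")

print(split_verdicts, single_verdict)
```

The sketch still has gaps: a Host header that is itself split across two packets slips through, and by the time the second packet is rejected the GET line has already been forwarded.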

Original answer:

Your

-A TCPFILTER -m string --string "GET /" --algo bm --from 1 --to 70 -j URLFILTER

entry is not matching. Are you sure you can match strings against raw HTTP traffic? Can you see the string using tcpdump -vv? Can you try a simpler match and see whether that works?

The Wireshark output you are showing is the parsed packet, which is not necessarily what iptables sees. Look at the hex/ASCII payload of the packet to double-check.
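As a rough illustration of what a hex/ASCII view shows, here is a small Python sketch that dumps a payload with byte offsets. The helper name `hexdump` and the sample payload are assumptions for illustration. On a live system, tcpdump's -X option prints packets in a comparable hex and ASCII layout.

```python
# Illustrative helper: print bytes as offset, hex, and printable ASCII.

def hexdump(data: bytes, width: int = 16) -> str:
    pad = width * 3 - 1  # width of a full hex column, including spaces
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes (such as CR and LF) are shown as dots.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:04x}  {hex_part.ljust(pad)}  {ascii_part}")
    return "\n".join(lines)


# Assumed sample payload: the request from the answer above.
dump = hexdump(b"GET / HTTP/1.0\r\nHost: www.kernel.org\r\n")
print(dump)
```

A view like this makes it easy to check that the string you are matching really appears in the payload, and at which byte offset within it.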

Author by

Mustafa Mujahid

Updated on September 18, 2022

Comments

  • Mustafa Mujahid over 1 year

    I've been trying to set up URL filtering with iptables. I have set up two interfaces. Traffic flows in from one interface and out from the other.

    Below are the iptables rules I have configured:

    *filter
    :INPUT ACCEPT [0:0]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    :TCPFILTER - [0:0]
    :URLFILTER - [0:0]
    #######################
    -A INPUT -i lo -j ACCEPT
    -A INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
    -A INPUT -m state --state ESTABLISHED,RELATED,NEW -j ACCEPT
    
    #BGP
    -A INPUT -p tcp -m state --state NEW -m tcp --dport 179 -j ACCEPT
    
    # Pass traffic to filters which have TCP Flags PSH,ACK and DST Port 80
    
    -A FORWARD -p tcp --tcp-flags PSH,ACK PSH,ACK --dport 80 -j TCPFILTER
    -A FORWARD -p tcp --dport 80 -j TCPFILTER
    -A FORWARD -j ACCEPT
    
    # Further process only packets with HTTP Get Request
    
    -A TCPFILTER -m string --string "GET /" --algo bm --from 1 --to 70 -j URLFILTER
    -A TCPFILTER -j ACCEPT
    -A URLFILTER -m string --algo bm --to 500 --string "Host: www.tutorialspoint.com" -p tcp -j REJECT --reject-with tcp-reset
    -A URLFILTER -j ACCEPT
    COMMIT
    

    The Wireshark capture of the received packets was attached as a screenshot (image not reproduced here).

    I believe it should work but I'm not getting any hits on the URLFILTER chain as shown below:

    # iptables -L -v -n
    Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
     pkts bytes target     prot opt in     out     source               destination
        0     0 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0
        7   352 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0           icmp type 255
      108  6621 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW,RELATED,ESTABLISHED
        0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:2002
        0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:179
    
    Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
     pkts bytes target     prot opt in     out     source               destination
       39 34674 TCPFILTER  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:80 flags:0x18/0x18
      253 13392 TCPFILTER  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:80
        0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0
    
    Chain OUTPUT (policy ACCEPT 75 packets, 13141 bytes)
     pkts bytes target     prot opt in     out     source               destination
    
    Chain TCPFILTER (2 references)
     pkts bytes target     prot opt in     out     source               destination
        0     0 URLFILTER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           STRING match "GET /" ALGO name bm FROM 1 TO 70
      292 48066 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0
    
    Chain URLFILTER (1 references)
     pkts bytes target     prot opt in     out     source               destination
        0     0 REJECT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           STRING match "Host: www.tutorialspoint.com" ALGO name bm TO 500 reject-with tcp-reset
        0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0
    

    The traffic is accepted and passed to the TCPFILTER chain, but it never matches the GET / string.

    Any assistance would be greatly appreciated.

    • Alessio about 6 years
      do yourself a favour and use a proxy like squid instead. force client machines to use it by either blocking or transproxying ports 80 & 443.
    • Alessio about 6 years
      That's weird. I'm running squid on an AMD Phenom II 1090T 6-core/6-thread CPU, with my own custom squid-redir perl script with hundreds of regex rules AND 18337 acl rules (built from various regularly updated ad-blocking rules) and have no problems at all with throughput or CPU consumption. Squid barely even registers as any kind of consumer of system resources. BTW, this machine is also running postfix w/ spamassassin and hundreds of custom header/body check rules, apache, gitlab, bind9 for both authoritative zones and recursive caching, iptables + fail2ban, and lots of other stuff.
    • Alessio about 6 years
      what kind of acl rules are you using? complex regex or simple things like dstdomain?
    • Alessio about 6 years
      BTW, years ago (around 1997, working for an ISP providing internet services for schools) I used to run squid for entire secondary schools with hundreds of simultaneous users and enormous acl rulesets on 486 DX-40 machines (years before pentium) with 64MB RAM or less. I suspect that if you're having load or throughput problems, squid probably isn't the cause.
    • Mustafa Mujahid about 6 years
      @cas, hmm, I was surprised myself, I would like to discuss this with you over chat if that may be possible. Please let me know.
  • Mustafa Mujahid about 6 years
    Yes, tcpdump -vv shows the string GET /. Besides, I have tested it and iptables does in fact match against raw HTTP traffic. I can match Referer or Host, but the above issue persists.
  • Pedro about 6 years
    OK, it needs a little more research, but I think I found the reason why it is not working. When you are matching packets in netfilter you can't rely on having the entire HTTP request in the first packet. While it is reasonable to expect the first 5 bytes or so to be there (hence GET / matches reliably), the rest of the request is less likely to be in that packet.
  • Mustafa Mujahid about 6 years
    Hmm, this seems to be a logical explanation. But now there is the question of keeping track of the packets!