URL filtering with iptables
I found the reason why this is not working.
You cannot rely on the entire HTTP request being in the single packet that netfilter is inspecting. The same packet will not necessarily match both 'GET /' and 'Host:', since that payload may be spread across several packets.
Consider the following list of rules:
-A OUTPUT -p tcp -m tcp --dport 80 -j URLFILTER # would be FORWARD in your case
-A URLFILTER -m string --string "Host: www.kernel.org" --algo bm --from 1 --to 500 --icase -j LOG --log-prefix UF_MATCHHOST
-A URLFILTER -m string --string "GET /" --algo bm --from 1 --to 500 --icase -j LOG --log-prefix UF_MATCHGET
An HTTP call to www.kernel.org like so
GET / HTTP/1.0
Host: www.kernel.org
will match both rules, but in the opposite order to the one in which they are listed. This shows that the URLFILTER chain was traversed by more than one packet: the first carrying the GET string and the second carrying the Host string. Therefore you cannot match GET and Host in the same packet without further work.
[471493.767020] UF_MATCHGETIN= OUT=enp0s31f6 SRC=192.168.20.204 DST=147.75.205.195 LEN=67 TOS=0x00 PREC=0x00 TTL=64 ID=65494 DF PROTO=TCP SPT=51624 DPT=80 WINDOW=229 RES=0x00 ACK PSH URGP=0
[471499.761216] UF_MATCHHOSTIN= OUT=enp0s31f6 SRC=192.168.20.204 DST=147.75.205.195 LEN=73 TOS=0x00 PREC=0x00 TTL=64 ID=65495 DF PROTO=TCP SPT=51624 DPT=80 WINDOW=229 RES=0x00 ACK PSH URGP=0
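The effect can be reproduced outside netfilter with a small simulation. This is only a sketch: it assumes each request line travelled in its own TCP segment, as the two log entries suggest, and it uses a plain case-insensitive substring test in place of the string match (the --from/--to offsets are ignored). The names segments, matches and packet_matches_all are made up for the illustration.

```python
# Hypothetical payloads: the example request, one line per TCP segment.
segments = [
    b"GET / HTTP/1.0\n",
    b"Host: www.kernel.org\n",
]

needles = [b"GET /", b"Host: www.kernel.org"]


def matches(payload: bytes, needle: bytes) -> bool:
    """Case-insensitive substring test on one payload (roughly -m string --icase)."""
    return needle.lower() in payload.lower()


def packet_matches_all(payload: bytes, wanted: list) -> bool:
    """True only if every needle is present in this single payload."""
    return all(matches(payload, n) for n in wanted)


# Each packet is inspected on its own, as a per-packet filter would do.
per_packet = [packet_matches_all(s, needles) for s in segments]

# Only the reassembled stream contains both strings.
reassembled = packet_matches_all(b"".join(segments), needles)

print(per_packet)   # [False, False]
print(reassembled)  # True
```

No single segment satisfies both conditions, which is why a rule chain that requires GET and Host in the same packet never fires for this request.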
Maybe you could track each connection that matches GET / and then look for the Host string in the packets that follow on the same connection. I reckon that would be possible.
Netfilter may be able to do this for you, but it is far from the best tool for the job.
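The tracking idea can be sketched in a few lines of Python. This is an illustration of the logic only, not netfilter code: the Packet and HostFilter names are hypothetical, connections are keyed by their address/port 4-tuple, and state is never expired. In netfilter terms the closest analogue would probably be a connection mark set on the GET packet and tested on later packets, but that is an assumption I have not tested.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified view of one TCP segment."""
    src: str
    sport: int
    dst: str
    dport: int
    payload: bytes


class HostFilter:
    """Remember connections that sent 'GET /' and reject a later Host match."""

    def __init__(self, blocked_host: str):
        self.needle = b"host: " + blocked_host.lower().encode()
        self.seen_get = set()  # 4-tuples of connections that carried GET /

    def verdict(self, pkt: Packet) -> str:
        key = (pkt.src, pkt.sport, pkt.dst, pkt.dport)
        data = pkt.payload.lower()
        if b"get /" in data:
            self.seen_get.add(key)
        # Host may arrive in the same packet as GET or in a later one.
        if key in self.seen_get and self.needle in data:
            return "REJECT"
        return "ACCEPT"


f = HostFilter("www.kernel.org")
packets = [
    Packet("192.168.20.204", 51624, "147.75.205.195", 80, b"GET / HTTP/1.0\n"),
    Packet("192.168.20.204", 51624, "147.75.205.195", 80, b"Host: www.kernel.org\n"),
    # Different connection, no GET seen on it: not rejected.
    Packet("192.168.20.204", 51625, "147.75.205.195", 80, b"Host: www.kernel.org\n"),
]
verdicts = [f.verdict(p) for p in packets]
print(verdicts)  # ['ACCEPT', 'REJECT', 'ACCEPT']
```

The sketch still misses a string that is itself split across two segments, which is one more reason a proxy that reassembles the stream is a better fit for this job.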
Original answer:
Your
-A TCPFILTER -m string --string "GET /" --algo bm --from 1 --to 70 -j URLFILTER
entry is not matching. Are you sure you can match strings against raw HTTP traffic? Can you see the string using tcpdump -vv? Can you try a simpler match and see if that works?
The Wireshark output you are showing is the parsed packet, which is not necessarily what iptables sees. Look at the hex/ASCII payload of the packet to double-check.
Mustafa Mujahid
Updated on September 18, 2022
Comments
-
Mustafa Mujahid over 1 year
I've been trying to set up URL filtering with iptables. I have set up two interfaces. Traffic flows in from one interface and flows out from the other. Below are the iptables rules I have configured:
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:TCPFILTER - [0:0]
:URLFILTER - [0:0]
#######################
-A INPUT -i lo -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED,NEW -j ACCEPT
#BGP
-A INPUT -p tcp -m state --state NEW -m tcp --dport 179 -j ACCEPT
# Pass traffic to filters which have TCP Flags PSH,ACK and DST Port 80
-A FORWARD -p tcp --tcp-flags PSH,ACK PSH,ACK --dport 80 -j TCPFILTER
-A FORWARD -p tcp --dport 80 -j TCPFILTER
-A FORWARD -j ACCEPT
# Further process only packets with HTTP Get Request
-A TCPFILTER -m string --string "GET /" --algo bm --from 1 --to 70 -j URLFILTER
-A TCPFILTER -j ACCEPT
-A URLFILTER -m string --algo bm --to 500 --string "Host: www.tutorialspoint.com" -p tcp -j REJECT --reject-with tcp-reset
-A URLFILTER -j ACCEPT
COMMIT
The Wireshark output of the packets received is shown below.
I believe it should work, but I'm not getting any hits on the URLFILTER chain, as shown below:
# iptables -L -v -n
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0
    7   352 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmp type 255
  108  6621 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW,RELATED,ESTABLISHED
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:2002
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:179
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
   39 34674 TCPFILTER  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80 flags:0x18/0x18
  253 13392 TCPFILTER  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain OUTPUT (policy ACCEPT 75 packets, 13141 bytes)
 pkts bytes target     prot opt in     out     source               destination
Chain TCPFILTER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 URLFILTER  all  --  *      *       0.0.0.0/0            0.0.0.0/0            STRING match "GET /" ALGO name bm FROM 1 TO 70
  292 48066 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain URLFILTER (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 REJECT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            STRING match "Host: www.tutorialspoint.com" ALGO name bm TO 500 reject-with tcp-reset
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0
The traffic is accepted and passed to the TCPFILTER chain, but it fails to match against the GET / string. Any assistance would be greatly appreciated.
-
Alessio about 6 years
Do yourself a favour and use a proxy like squid instead. Force client machines to use it by either blocking or transproxying ports 80 & 443.
-
Alessio about 6 years
That's weird. I'm running squid on an AMD Phenom II 1090T 6-core/6-thread CPU, with my own custom squid-redir perl script with hundreds of regex rules AND 18337 acl rules (built from various regularly updated ad-blocking rules) and have no problems at all with throughput or CPU consumption. Squid barely even registers as any kind of consumer of system resources. BTW, this machine is also running postfix w/ spamassassin and hundreds of custom header/body check rules, apache, gitlab, bind9 for both authoritative zones and recursive caching, iptables + fail2ban, and lots of other stuff.
-
Alessio about 6 years
What kind of acl rules are you using? Complex regex or simple things like dstdomain?
-
Alessio about 6 years
BTW, years ago (around 1997, working for an ISP providing internet services for schools) I used to run squid for entire secondary schools with hundreds of simultaneous users and enormous acl rulesets on 486 DX-40 machines (years before pentium) with 64MB RAM or less. I suspect that if you're having load or throughput problems, squid probably isn't the cause.
-
Mustafa Mujahid about 6 years
@cas, hmm, I was surprised myself. I would like to discuss this with you over chat, if that is possible. Please let me know.
-
-
Mustafa Mujahid about 6 years
Yes, tcpdump -vv shows the string GET /. Besides, I have tested it and iptables does in fact match against raw HTTP traffic. I can match Referer or Host, but the above issue persists.
-
Pedro about 6 years
OK, it needs a little more research, but I think I found the reason why it is not working. When you are matching packets in netfilter, you can't rely on having the entire HTTP request in the first packet. Whilst it is reasonable to expect the first 5 bytes or so to be there (hence reliably matching GET /), the rest of the request is less likely to be in the same packet.
-
Mustafa Mujahid about 6 years
Hmm, this seems to be a logical explanation. But now there is the question of keeping track of the packets!