Block 1.4 million IP addresses on VPS


Solution 1

You should have a look into ipset.

From the official website:

Ipset may be the proper tool for you [...] to store multiple IP addresses or port numbers and match against the collection by iptables.

[...] (Ipset) may store IP addresses, networks, (TCP/UDP) port numbers, MAC addresses, interface names or combinations of them in a way, which ensures lightning speed when matching an entry against a set.

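To use it, create a set, add the IPs, and create an iptables rule that matches against the set. A hash set holds at most 65536 entries by default, so raise maxelem to fit the whole list (the value below leaves some headroom):

ipset create blacklist hash:ip hashsize 1400000 maxelem 2000000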
ipset add blacklist <IP-ADDRESS>
iptables -I INPUT -m set --match-set blacklist src -j DROP

A real-life usage example can be found here. Note that it uses ipset restore instead of adding each IP in a loop, because that is much faster.
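As a rough illustration of the ipset restore approach, the sketch below turns a plain list of addresses (one per line) into a restore file. The set name, the maxelem value, the function name make_restore_file and the file names in the comments are assumptions made for this example; the addresses are documentation placeholders.

```shell
#!/bin/sh
# Sketch: build an "ipset restore" file from a list of IPv4 addresses.
# SET_NAME and MAXELEM are illustrative assumptions, not required values.
SET_NAME="blacklist"
MAXELEM=2000000

# Reads addresses on stdin, writes restore-format lines on stdout.
make_restore_file() {
    # Restore lines are ipset commands without the leading "ipset".
    printf 'create %s hash:ip maxelem %s\n' "$SET_NAME" "$MAXELEM"
    # Drop blank lines and comments, de-duplicate, emit one "add" per IP.
    grep -Ev '^[[:space:]]*(#|$)' | sort -u | sed "s/^/add $SET_NAME /"
}

# Demo only: prints the file contents, nothing is loaded into the kernel.
printf '%s\n' 203.0.113.7 198.51.100.4 203.0.113.7 | make_restore_file

# To load for real (requires root; ips.txt is a hypothetical input file):
#   make_restore_file < ips.txt > blacklist.restore
#   ipset restore < blacklist.restore
```

Loading the whole file in one ipset restore call avoids starting a new ipset process for every address, which is where the loop approach loses most of its time.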

If your list of IPs has overlaps, you may want to preprocess it to convert it to IP ranges where possible. Here is an example of a tool to do it. It won't get you better performance with ipset, but it will reduce the size of your list.


On a side note, in terms of performance, ipset is very fast and scales without penalty. As Cloudflare's blog mentions, there are faster low-level approaches, but they are much more complex and only add a few bytes per second, which, unless you have the scale and ambition of a cloud provider, is not worth the effort.

Solution 2

Frame challenge - what's the shorter list, authorised or blocked addresses?

Rather than denying 1.4 million, simply allow the perhaps ~dozen IPs you want to permit, and default-deny everything.
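A minimal sketch of the allowlist idea, assuming the permitted addresses are supplied one per line. It only prints the iptables commands so they can be reviewed before anything is applied; the function name allowlist_rules and the sample addresses are placeholders.

```shell
#!/bin/sh
# Sketch: print (not run) iptables commands for a default-deny allowlist.
# Reads allowed source addresses, one per line, on stdin.
allowlist_rules() {
    # Keep loopback and already-established traffic working.
    echo 'iptables -A INPUT -i lo -j ACCEPT'
    echo 'iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT'
    while IFS= read -r ip; do
        [ -n "$ip" ] || continue
        printf 'iptables -A INPUT -s %s -j ACCEPT\n' "$ip"
    done
    # Set the default policy last so the ACCEPT rules are in place first.
    echo 'iptables -P INPUT DROP'
}

# Demo with placeholder addresses; make sure your own admin IP is on the list.
printf '%s\n' 203.0.113.10 198.51.100.25 | allowlist_rules
```

Setting the DROP policy as the final step means an interrupted run leaves the host open rather than locked out.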

Solution 3

If the IP addresses operate in a well-defined range, then you can use ufw like this to block traffic:

sudo ufw deny from 192.0.0.0/8 to any

The example above blocks all traffic from 192.0.0.0 to 192.255.255.255, which works out to 16,777,216 addresses, and it has no noticeable effect on network throughput.

So long as your IP list can be condensed into IP ranges, this may work for you.
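To check how many addresses a given range covers: a firewall rule matches every address inside the prefix, so a prefix of length n covers 2^(32 - n) IPv4 addresses. The helper name cidr_size below is made up for this sketch.

```shell
#!/bin/sh
# Number of IPv4 addresses covered by a prefix length: 2^(32 - prefix).
cidr_size() {
    echo $(( 1 << (32 - $1) ))
}

cidr_size 8    # 16777216
cidr_size 24   # 256
cidr_size 32   # 1
```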

Solution 4

You can minimize lookups and gain speed by tree-structuring your rules. For example, you can branch on the first octet of the IP (a /8), like so:

iptables -N rule8_192_0_0_0
iptables -N rule8_172_0_0_0
iptables -N rule8_10_0_0_0

iptables -A INPUT -s 192.0.0.0/8 -j rule8_192_0_0_0
iptables -A INPUT -s 172.0.0.0/8 -j rule8_172_0_0_0
iptables -A INPUT -s 10.0.0.0/8 -j rule8_10_0_0_0

iptables -A rule8_192_0_0_0 -s 192.168.2.3 -j DROP
iptables -A rule8_172_0_0_0 -s 172.16.2.3 -j DROP
iptables -A rule8_10_0_0_0 -s 10.10.2.3 -j DROP
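Writing these chains by hand does not scale to a long list, so the commands would normally be generated. The sketch below prints (rather than runs) the same three kinds of command for any list of addresses; the function name tree_rules is an assumption, and the chain naming copies the example above.

```shell
#!/bin/sh
# Sketch: print (not run) iptables commands that bucket blocked IPs by their
# first octet, following the rule8_<octet>_0_0_0 naming used above.
tree_rules() {
    sort -u | awk -F. '
        NF == 4 {
            chain = "rule8_" $1 "_0_0_0"
            if (!(chain in seen)) {
                seen[chain] = 1
                # One chain and one dispatch rule per /8.
                print "iptables -N " chain
                print "iptables -A INPUT -s " $1 ".0.0.0/8 -j " chain
            }
            # The per-address rule lives inside the /8 chain.
            print "iptables -A " chain " -s " $0 " -j DROP"
        }'
}

# Demo: two addresses share the 192 bucket, one lands in the 10 bucket.
printf '%s\n' 192.168.2.3 10.10.2.3 192.168.7.9 | tree_rules
```

With this layout a packet is compared against at most one dispatch rule per /8 plus the rules inside its own bucket, instead of the whole list.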

Solution 5

There's another improvement that directly solves your 3 Mb/s problem:

iptables -I INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

This allows established connections to traverse as few iptables rules as possible, although using ipset to improve the IP address lookup speed is still necessary for new connections to establish faster.

No matter how many other rules you have, this is a good one to deploy as the first rule.
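One way to keep that ordering explicit is to write the ruleset in iptables-restore format, with the conntrack rule ahead of the set match. This is only a sketch: the function name print_ruleset is made up, the set name blacklist is assumed to exist already, and the ACCEPT policies are placeholders.

```shell
#!/bin/sh
# Sketch: print a minimal filter-table ruleset in iptables-restore format,
# conntrack rule first, ipset match second. "blacklist" is an assumed set name.
print_ruleset() {
    cat <<'EOF'
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
-A INPUT -m set --match-set blacklist src -j DROP
COMMIT
EOF
}

# Demo only: prints the ruleset for review.
print_ruleset

# To apply (requires root, replaces the current filter table, set must exist):
#   print_ruleset | iptables-restore
```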


Author: Kamil Skwirut

Updated on September 18, 2022

Comments

  • Kamil Skwirut
    Kamil Skwirut over 1 year

    How can I block a list of about 1.4 million IP addresses? I've already tried to do it with iptables PREROUTING, like:

    -A PREROUTING -d IP_HERE/32 -j DROP

    But with this many records, my bandwidth goes down like crazy when I do a speedtest.

    Without blocked IPs in iptables:

    1 Gb/s

    With blocked IPs in iptables:

    3 Mb/s at peak.

    I want to use XDP_DROP like here (last step): https://blog.cloudflare.com/how-to-drop-10-million-packets/

    But I have no idea how to use this. :/ (I'm really bad at programming.)

    Are there alternatives to this approach?

    • user253751
      user253751 over 2 years
      Can we ask why you want to block 1.4 million IPs? That's a lot of IPs. Might be easier to make sure your server is secure instead.
    • peterh
      peterh over 2 years
      There is a new thing named ipset. I do not know it, but it might be worth a try. It is the new firewall framework in Linux; actually, iptables today is only a compat layer over ipset.
    • rtaft
      rtaft over 2 years
      If you are trying to block IPs based on location/country, please say so, there are solutions to this that don't involve millions of iptable entries.
    • user253751
      user253751 over 2 years
      also please don't block IPs based on location/country without a very good reason. Not just "oh there are hackers in that country"
    • ilkkachu
      ilkkachu over 2 years
      @peterh, do you mean nftables? I think ipset has existed for a while, and AFAIK it's only about rules that involve, well, a set of addresses
    • TooTea
      TooTea over 2 years
      @peterh ipsets are included in the kernel since 2.6.39, released ten years ago. They already existed before that as an external patch.
    • user26742873
      user26742873 over 2 years
      @user253751 Maybe the op is blocking the whole EU for the cookies law? :P
    • peterh
      peterh over 2 years
      @TooTea Ok, thanks. What is interesting to me, how the ipset matches an ip to an ipset (which is likely a set of ips with mask). Does it use a tree or hash internally? If yes, it will be very fast. With iptables, the only way to match an ip to a set of rules is linear search because Turing.
    • jcaron
      jcaron over 2 years
      Are the IP addresses really individual IP addresses, or are they part of a limited number of ranges?
    • Kamil Skwirut
      Kamil Skwirut over 2 years
      The IPs are individual and most are proxies.
    • mckenzm
      mckenzm over 2 years
      Of course it should be a hashed index (otherwise it would be pretty unbalanced). "lightning speed" according to the reference in @Cyrbil's answer.
  • Hacky
    Hacky over 2 years
    Let us all assume that OP wants to block addresses that are not in a range.
  • iBug
    iBug over 2 years
    UFW is, as it describes itself, a frontend for iptables. This makes its performance even worse than manually maintaining iptables chains.
  • iBug
    iBug over 2 years
    Processing single IPs into ranges is definitely a must. Then you can use hash:net for the set and even better performance.
  • Useless
    Useless over 2 years
    Why worse? It's not like the iptables rules are calling out to ufw, it's just a frontend for configuring them in the first place. Obviously it won't be better either, though.
  • Luc H
    Luc H over 2 years
    This sounds more like he wants to block a predefined set of "Bad IPs". For most applications a whitelist system will probably not be useful.
  • iBug
    iBug over 2 years
    @Useless UFW creates more chains for every packet to traverse, whereas manually maintained rules can be much simpler and thus more performant.
  • Useless
    Useless over 2 years
    So say "ufw generates overly-simple rules and you can hand-craft better ones" or something. UFW's own performance isn't an issue, and the fact that it's a frontend doesn't automatically make its rules bad. It's no worse than hand-written rules that don't make clever use of chains.
  • Nate T
    Nate T over 2 years
    @useless Since it is a frontend, every time a request shows up, it is processed by ufw, which then calls on iptables behind the scenes. Once iptables has matched the IP against the listed rules, it has to pass this info back to ufw, which would allow or deny. This is the typical FE/BE flow. Cutting out ufw eliminates half the steps. That said, the performance increase or decrease would depend not on how many IPs are being blocked, but on how many requests are actually coming in.
  • Useless
    Useless over 2 years
    @NateT I already pointed out that that's absolutely not true. ufw just provides a simple frontend for iptables, which itself just configures rules in the ip_tables netfilter module. Packet filtering activity never flows from the ip_tables kernel module out to the iptables userspace component, much less the ufw frontend for that.
  • Nate T
    Nate T over 2 years
    Then it is NOT A FRONTEND. App 1 updating the app 2 data store does not make app 1 a "front end" for app 2. If both apps are not being called on in the order I described, you can call it whatever else you want (call it "Nancy in her red dress" for all I care), but calling it a front end in a discussion about speed-of-algorithm is never a good idea. Users of this network are always catching flak for being too picky about terms, but ^this^ is what a slightly misused term can do. @useless
  • Nate T
    Nate T over 2 years
    If I am misunderstanding, please describe the flow of events. If both apps are being used, they are both taking up memory and temporal resources, even if they are only printing hello world to the console. You know what? I'll look it up so you don't have to type it. I'm curious now anyway. The only exception I can think of is the case where iptables is not called at all and only its data is used / updated. In that case, ^^^