How to keep load-balanced servers synced even with deleted files?


Solution 1

I'm using OCFS2 with DRBD.

The DRBD resource is defined in /etc/drbd.d/r0.res:

resource r0 {
    syncer { rate 1000M; }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    startup { become-primary-on both; }

    on s1 {
        device      /dev/drbd1;
        disk        /dev/sdc;
        address     ip1:7789;
        meta-disk   internal;
    }
    on s2 {
        device      /dev/drbd1;
        disk        /dev/xvdb2;
        address     ip2:7789;
        meta-disk   internal;
    }
}

/dev/drbd1 is formatted as an OCFS2 filesystem and mounted at /data/webroot:

/dev/drbd1   ocfs2   100660180   7427076  93233104   8% /data/webroot

The OCFS2 configuration (without Pacemaker) lives in /etc/ocfs2/cluster.conf:

node:
    ip_port = 7777
    ip_address = ip1
    number = 0
    name = s1
    cluster = ocfs2

node:
    ip_port = 7777
    ip_address = ip2
    number = 1
    name = s2
    cluster = ocfs2

cluster:
    node_count = 2
    name = ocfs2

DRBD status can be checked with the drbd-overview utility:

# drbd-overview 
  1:r0  Connected Primary/Primary UpToDate/UpToDate C r---- /data/webroot ocfs2 96G 9.8G 87G 11% 

or from /proc/drbd:

# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by [email protected], 2010-06-04 08:04:09

 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
    ns:953133955 nr:42207234 dw:1185526354 dr:62396241 al:230084 bm:5853 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
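The status output above can also be checked from a script. The following is a minimal sketch, assuming the 8.3-style /proc/drbd format shown here; the function name drbd_healthy is hypothetical and not part of DRBD. It only reports whether every resource line is Connected, Primary/Primary and UpToDate/UpToDate.

```sh
#!/bin/sh
# Hypothetical helper (not part of DRBD): reads /proc/drbd-style text on stdin
# and succeeds only if every resource line shows the dual-primary, fully
# synced state seen in the sample output above.
drbd_healthy() {
    awk '
        /cs:/ {
            seen = 1
            if ($0 !~ /cs:Connected/ ||
                $0 !~ /ro:Primary\/Primary/ ||
                $0 !~ /ds:UpToDate\/UpToDate/)
                bad = 1
        }
        # Fail when no resource line was found or any line looked degraded.
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}

# Example usage on a DRBD node:
#   drbd_healthy < /proc/drbd && echo "ok" || echo "degraded"
```

A check like this could feed a monitoring system, so that a split brain or a disconnected peer is noticed before the two web roots drift apart.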

Solution 2

We currently use rsync as well, but I'm not crazy about it.

We have been experimenting with fileconveyor, which not only syncs between two servers but can also sync to S3, Cloudfiles, or other cloud storage. That gives us a lot more flexibility.

I don't have any config setups to share at this moment, but we are liking what we see.

Solution 3

I have not used it in a server setup, but you might try Unison. It deals with changes on either side and will automatically sync things that aren't conflicting. I believe it is limited to 2 hosts, so it wouldn't scale past your current solution.

The only way I know to scale past 2 hosts would be to set up NFS or some other shared/distributed filesystem.

Solution 4

Another option would be to build an "authoritative" replica of the content apart from the front-facing webservers and make sure all updates and changes are made on that replica.

Then, you deploy from that server to any number of front-facing servers on a set schedule.

Yes, it's an extra copy of the content but it does give you some potential benefits:

1) Control of when the updates go live

2) Less complexity in handling multi-direction sync between any number of servers

3) The ability to make changes and preview them without impacting your front-facing production.
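The deploy step described above could be sketched roughly as follows. This is only an illustration: the host names, the SRC path and the DRY_RUN switch are assumptions, and the script is meant to run on the authoritative server. By default it only prints the rsync commands it would run.

```sh
#!/bin/sh
# Sketch: push content from the authoritative replica to each front-end.
# FRONTENDS, SRC and DRY_RUN are assumptions made for this example.
FRONTENDS="${FRONTENDS:-apache1 apache2}"
SRC="${SRC:-/var/www/html/}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

deploy_all() {
    for host in $FRONTENDS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "rsync -a --delete $SRC $host:$SRC"
        else
            # One-way push: the front-end ends up as an exact copy of SRC.
            rsync -a --delete "$SRC" "$host:$SRC" ||
                echo "deploy to $host failed" >&2
        fi
    done
}

# Example usage, e.g. from a cron job on the authoritative server:
#   DRY_RUN=0 deploy_all
```

Because every transfer is one-way, adding a front-end only means adding a host name to the list. The trade-off is that uploads must land on the authoritative server, not on the front-ends.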

Other options are some type of shared storage spread across as much hardware as you need for reliability, performance, and scalability.


Author: Derek Downey

As a technologist excited about open source database systems and the businesses that they power, I am enthusiastic about efficiency through automation, Operational Visibility, the adoption of Cloud Technologies, and Virtual/Augmented reality enabling #RemoteWork by default. I work for #Google as a Developer Advocate for their Database Cloud services

Updated on September 18, 2022

Comments

  • Derek Downey
    Derek Downey over 1 year

    I've recently set up a load-balanced solution for our websites. We host about 200 sites; most run on our custom application, but some are WordPress blogs (in which files can be uploaded/deleted). The setup is basic:

              |-------------------> Apache1
              |
     HAProxy -|
              |
              |-------------------> Apache2
    

    I've set up Apache1 as a 'master', so that most of the changes made on it are rsync'd over to Apache2 every minute using the following command:

    rsync -av --delete apache1:/var/www/html/ /var/www/html/
    

    The problem is, as mentioned earlier, in some cases files are added/removed on Apache2. The only solution I've come up with so far is to have Apache1 rsync all files in certain directories (wp-content, for instance) from Apache2 to itself (without --delete), then push everything back to Apache2.

    This has its flaws, the main ones being:

    • The two servers will eventually get extra files that have been deleted on Apache2
    • As I add more servers, the rsync script will take longer to complete.
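For illustration, the two-step workaround described above might look like the sketch below when run on Apache1. DOCROOT, UPLOAD_DIRS and the function name are assumptions, and the function only prints the commands. It also makes the first flaw visible: step 1 never deletes, so a file removed on Apache2 is pushed straight back in step 2.

```sh
#!/bin/sh
# Sketch of the two-step workaround, run on Apache1. Prints commands only.
# DOCROOT and UPLOAD_DIRS are assumed names, not taken from the real setup.
DOCROOT="/var/www/html"
UPLOAD_DIRS="wp-content"

two_step_sync() {
    peer="$1"
    for dir in $UPLOAD_DIRS; do
        # Step 1: pull new uploads from the peer without deleting anything locally.
        echo "rsync -av $peer:$DOCROOT/$dir/ $DOCROOT/$dir/"
    done
    # Step 2: push the merged tree back; --delete removes anything on the
    # peer that Apache1 does not have.
    echo "rsync -av --delete $DOCROOT/ $peer:$DOCROOT/"
}

# Example usage:
#   two_step_sync apache2
```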

    Are there any ways to keep 2+ web servers synced, taking into account that both servers can have files added, updated and deleted?

    • pfo
      pfo over 12 years
      This setup cries out for some shared storage.
    • Antoine Benkemoun
      Antoine Benkemoun over 12 years
      Or for storage in a Git repository!
  • user1364702
    user1364702 over 12 years
    +1 since this sounds like the path I'd look at. If you want it scripted, use something like Unison. A better solution, depending on network interconnect and speed, would be something like a DRBD-backed filesystem that syncs the two filesystems automatically. The popular way to deal with it is a shared-storage SAN with RAID, but that can introduce a single point of failure.
  • phemmer
    phemmer over 12 years
    You can use unison on more than 2 hosts. Mesh topology; have hostA sync with hostB, then hostC, then have hostB sync with hostC. Not the prettiest, but it works just fine.
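The mesh ordering described in this comment can be generated for any number of hosts. The sketch below is an assumption-laden illustration: mesh_pairs is a hypothetical helper, and the unison invocation in the usage comment assumes each host can reach the others over SSH and that the web root is /var/www/html.

```sh
#!/bin/sh
# Hypothetical helper: print every unordered pair of hosts, one per line,
# in the order hostA-hostB, hostA-hostC, hostB-hostC.
mesh_pairs() {
    for a in "$@"; do
        shift                 # drop $a so the inner loop sees only later hosts
        for b in "$@"; do
            echo "$a $b"
        done
    done
}

# Example usage: print the sync command to run on the first host of each pair.
#   mesh_pairs hostA hostB hostC | while read -r a b; do
#       echo "on $a: unison -batch /var/www/html ssh://$b//var/www/html"
#   done
```

The number of pairs grows quadratically with the host count (3 hosts need 3 syncs, 5 hosts need 10), which is one reason shared storage tends to be suggested once there are more than a few servers.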