Load balancing a Windows File Share using HA-Proxy


File replication is a much more difficult problem than you might first envision.

File replication typically does not scale well. You'll start to see problems once you're handling half a million files or more: the copy can take longer than the interval between syncs, so you'll either need to keep sessions sticky for a longer period and copy less frequently, or copy fewer files.
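As a rough sketch of what "sticky" could look like, the config below pins each client IP to one backend for a fixed window, so a client is less likely to flip between servers while a copy is still running. It reuses the two backend addresses from the question's test config; the 30 minute expiry, the table size, and the timeout values are assumptions for illustration, not tuned recommendations, and `check` only performs a plain TCP connect test, not an SMB-level one.

```haproxy
listen SMBTest
  bind *:445
  mode tcp

  # Remember which server each client IP was sent to.
  # Expiry (30m) and size (200k entries) are assumed values; the expiry
  # would need to be at least as long as a full copy run.
  stick-table type ip size 200k expire 30m
  stick on src

  # SMB sessions are long-lived, so idle timeouts are generous (assumed values).
  timeout connect 5s
  timeout client  1h
  timeout server  1h

  # 'check' takes a server out of rotation if TCP 445 stops answering.
  server Smb1 172.16.61.201:445 check
  server Smb2 172.16.61.202:445 check
```

Note that stickiness only hides inconsistency from a client that stays on one server; it does nothing to make the copy itself finish sooner.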

From the little I know about your specific workload, this might still be OK for you. You said the file share is read-only, which leads me to believe you update the data in large batches. Robocopy might be slow under these circumstances, but since the interval between changes is so long, this might be an acceptable risk.

Seeing as HAProxy offers intelligence comparable to a layer 4 load balancer in this setup, it might be more beneficial to use a dedicated layer 4 load balancer instead, as they typically handle more throughput with less latency under high load. That might not apply to your problem, but it's food for thought.

If you require features and performance (like r/w shares that need to be closely synced) then this won't work. If you think you'll need this with this dataset in the future, consider your solution carefully: your dataset might be terabytes in size by then, and you wouldn't want to be in a situation where you have to scrap it and re-upload everything to a new solution.

Author: nbevans

Updated on September 17, 2022

Comments

  • nbevans
    nbevans almost 2 years

    After pulling my hair out over DFS I just had this weird and potentially dangerous idea come into my head whereby, just possibly, I might be able to use HA-Proxy to load balance a file share between servers.

    I've done some rudimentary packet traces and it does appear that TCP port 445 is the only thing involved in using Windows file sharing. For many years I've thought that UDP 139, 135 etc. were also involved, at least in establishing the connection, but apparently not!

    So I set up a basic test:

    listen SMBTest *:445
      mode tcp
      server Smb1 172.16.61.201:445
      server Smb2 172.16.61.202:445
    

    And you'll never guess what... it works??? (!)

    Now obviously there is the whole concern about synchronisation between the file servers (of course). That could easily be taken care of with a little bit of Robocopy script.

    And considering I only need a HA read-only file share there wouldn't be any issues with regard to file locking etc.

    • Can anyone tell me if what I'm playing with here is fire? I really didn't think it would work at all and now I'm a little shocked.
    • What would be the downsides?
    • Could this be relied upon for a production environment?
    • joeqwerty
      joeqwerty over 13 years
      Instead of reinventing the wheel, maybe you could tell us what kind of trouble you're having with DFS and we could assist with that.
    • nbevans
      nbevans over 13 years
      My problem with DFS is that it won't install. Fresh VM running 2008R2 Standard Edition with SP1 (not slipstream). I can install everything except for the DFS Namespaces component. This fails to install. Doesn't give any error messages - just code 0x80070643. And tells me to reboot before trying again. Screenshot @ i.imgur.com/kld33.png
    • tsykoduk
      tsykoduk over 13 years
      I've had tons of problems with DFS. But that's another story. HAProxy is just directing traffic to ip addresses - so the only concern that I would have is keeping the two shares updated. The replication part of DFS worked well for me, so that might be a good solution to keeping the folders synced.
    • tony roth
      tony roth over 13 years
      0x80070643 is a CBS error. I'd download the CheckSUR utility from MS; it will help you track down the issue.