AWS Elastic Load Balancer basic issues
Solution 1
Micro instances are not designed for sustained load. They allow CPU bursting, but after a short period (think 15-30 seconds) of heavy load they will be severely capped.
Try it with at least a small instance if you want any sort of useful benchmark.
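To illustrate why short benchmarks flatter micro instances, here is a toy model of burst-then-throttle behaviour. The window length and request rates are illustrative assumptions, not AWS's published credit algorithm:

```python
# Toy model of a micro-style instance under sustained load: full speed
# for a short burst window, then severely capped.
# burst_window/full_rps/capped_rps are made-up illustrative numbers.
def throughput_over_time(seconds, burst_window=20, full_rps=400, capped_rps=60):
    """Requests served in each second of a sustained-load benchmark."""
    return [full_rps if t < burst_window else capped_rps
            for t in range(seconds)]

served = throughput_over_time(60)
average_rps = sum(served) / len(served)
# A 10-second benchmark sees only the burst phase; a 60-second run
# averages far less because most of it happens under the cap.
```

Under this model a quick blitz.io run and a multi-minute run give very different numbers, which is one reason longer tests are suggested below.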
Solution 2
Check the load on the individual servers. ELB does not balance traffic equally across all instances when it comes from a single IP (as in the ab test case): it simply switches from one instance to another. The total throughput therefore cannot be double that of a single instance, but on average it is still better than directing all the traffic to one instance (due to reduced load and faster responses).
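A toy simulation of that behaviour makes the effect visible. The switching rule below (stay on one backend, switch only rarely) is an illustrative assumption based on the description above, not AWS's documented algorithm:

```python
import random

# Sketch: single-IP traffic pins to one backend at a time and switches
# only occasionally; many-IP traffic spreads roughly evenly.
def route_requests(n_requests, backends, single_ip=True, switch_every=1000):
    counts = {b: 0 for b in backends}
    current = backends[0]
    for i in range(n_requests):
        if single_ip:
            if i % switch_every == 0:  # rare switch between backends
                current = random.choice(backends)
            counts[current] += 1
        else:
            counts[random.choice(backends)] += 1  # spread across backends
    return counts
```

With a large `switch_every`, a single-IP benchmark lands almost entirely on one backend, which is consistent with seeing single-instance numbers from a multi-instance array.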
Solution 3
Make sure you haven't accidentally selected sticky load balancing. This would cause the same user to be directed to the same instance.
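One way to check is to list the load balancer's policies, e.g. with the AWS CLI's `aws elb describe-load-balancer-policies --load-balancer-name my-lb`. The sketch below scans a sample of that JSON for the two classic-ELB cookie-stickiness policy types; the policy name and sample data are made up for illustration:

```python
import json

# Made-up sample in the shape returned by describe-load-balancer-policies.
sample = json.loads("""
{
  "PolicyDescriptions": [
    {
      "PolicyName": "my-sticky-policy",
      "PolicyTypeName": "LBCookieStickinessPolicyType",
      "PolicyAttributeDescriptions": [
        {"AttributeName": "CookieExpirationPeriod", "AttributeValue": "60"}
      ]
    }
  ]
}
""")

# The two stickiness policy types defined for classic ELB.
STICKY_TYPES = {"LBCookieStickinessPolicyType", "AppCookieStickinessPolicyType"}

def sticky_policies(description):
    """Return the names of any cookie-stickiness policies."""
    return [p["PolicyName"] for p in description.get("PolicyDescriptions", [])
            if p.get("PolicyTypeName") in STICKY_TYPES]

print(sticky_policies(sample))  # -> ['my-sticky-policy']
```

If this returns a non-empty list, sticky sessions are configured and a single-user benchmark will hammer one instance.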
Micro instances weren't designed to sustain heavy load; they are for CPU bursting. I can assure you, though, that micro instances work fine with Elastic Load Balancers.
Don't forget there may be other ways to increase the traffic your website can cope with, e.g. a caching reverse proxy such as Varnish.
Chris J
iOS Engineer (Obj-C & Swift) - Mobile Apps, Startup Veteran, and Cyber Security.
Updated on September 18, 2022

Comments
-
Chris J over 1 year
I have an array of EC2 t1.micro instances behind a load balancer, and each node can manage ~100 concurrent users before it starts to get wonky.
I would think that with 2 such instances my network could manage 200 concurrent users... apparently not. When I really slam the server (blitz.io) with a full 275 concurrent users, it behaves the same as if there were just one node: it goes from a 400 ms response time to 1.6 seconds (which would be expected for a single t1.micro, but not for 6).
So the question is: am I simply not doing something right, or is ELB effectively worthless? Does anyone have some wisdom on this?
AB logs:

Load balancer (3x m1.medium)
Document Path:          /ping/index.html
Document Length:        185 bytes
Concurrency Level:      100
Time taken for tests:   11.668 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Non-2xx responses:      50001
Total transferred:      19850397 bytes
HTML transferred:       9250185 bytes
Requests per second:    4285.10 [#/sec] (mean)
Time per request:       23.337 [ms] (mean)
Time per request:       0.233 [ms] (mean, across all concurrent requests)
Transfer rate:          1661.35 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    2   4.3      2      63
Processing:     2   21  15.1     19     302
Waiting:        2   21  15.0     19     261
Total:          3   23  15.7     21     304

Single instance (1x m1.medium, direct connection)
Document Path:          /ping/index.html
Document Length:        185 bytes
Concurrency Level:      100
Time taken for tests:   9.597 seconds
Complete requests:      50000
Failed requests:        0
Write errors:           0
Non-2xx responses:      50001
Total transferred:      19850397 bytes
HTML transferred:       9250185 bytes
Requests per second:    5210.19 [#/sec] (mean)
Time per request:       19.193 [ms] (mean)
Time per request:       0.192 [ms] (mean, across all concurrent requests)
Transfer rate:          2020.01 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    9 128.9      3    3010
Processing:     1   10   8.7      9     141
Waiting:        1    9   8.7      8     140
Total:          2   19 129.0     12    3020
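As a quick sanity check on the ab summaries above, ab's derived figures follow directly from the completed requests, wall-clock time, and concurrency:

```python
# How ab computes its summary lines from the raw totals.
def ab_summary(requests, seconds, concurrency):
    rps = requests / seconds               # "Requests per second (mean)"
    ms_per_req = concurrency * 1000 / rps  # "Time per request (mean)"
    ms_across = 1000 / rps                 # "(mean, across all concurrent requests)"
    return rps, ms_per_req, ms_across

elb = ab_summary(50000, 11.668, 100)    # ~4285 req/s, ~23.3 ms per request
single = ab_summary(50000, 9.597, 100)  # ~5210 req/s, ~19.2 ms per request
```

Both runs reproduce the reported numbers, so the logs are internally consistent; the balanced setup really is slower here than the direct connection.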
-
khoxsey over 11 years: "Worthless" is a bit strong, don't you think? Plenty of firms run sites with hundreds or thousands of instances fronted by ELB.
-
Chris J over 11 years: I know! That's why I'm saying I'm clearly doing something wrong here. I can't figure out why the numbers aren't supporting the concept. I have 6 instances behind a load balancer, but the numbers show the same data as 1 instance alone.
-
khoxsey over 11 years: Yes, you are not reading the documentation. Micros are not intended for significant load; Amazon is quite clear about this. You are consuming your "fair use" budget and getting clamped. Your 6 micros are costing you 12c/hr; replace them with a c1.medium at 16.5/hr and retest.
-
Chris J over 11 years: I have read the documentation, and I don't believe I'm getting hit with CPU throttling.
-
khoxsey over 11 years: Understood. An easy way to verify that you are having ELB problems is to replace the ELB with a suitable instance (let's say an m1.small for now) and start up Apache in reverse-proxy mode. If you can get the desired throughput from your micro array, then it is clearly ELB, and I would be very interested to see the ab output posted here; it would be incredibly helpful.
-
Chris J over 11 years: I did a standard 100-concurrent ab test, and it sort of looks like the single instance is doing a bit better, no?
-
ceejayoz over 11 years: What does a longer test than 10 seconds show (i.e. several minutes)? ELB takes a while to ramp up depending on traffic history.
-
khoxsey over 11 years: +1 for identifying the oxymoron: load-balanced micros are the jumbo shrimp of the cloud world. For cheap capability, switch to load-balanced c1.mediums (5 ECU each).
-
Chris J over 11 years: My understanding is that an array of 6 t1.micro instances SHOULD handle 6 times the load a single instance can, but this isn't happening in practice. Isn't that what's supposed to happen?
-
ceejayoz over 11 years: No, they will not necessarily handle six times the load. Try it with a set of small/medium instances and see.
-
Chris J over 11 years: Yeah, that's basically what I'm seeing too. Not sure why. I know that Netflix uses AWS, so it can't be a joke service, but it doesn't seem to be working here.
-
Logic Wreck over 11 years: Actually, I worked up from around 20 requests/second, ran many tests until I got to 250 requests/second, and hit this high latency. So in general you're very wrong here. Also, this was done with m1.large instances, not t1.micro in my case. Please read carefully next time.
-
Logic Wreck over 11 years: As already said, at 250 requests/second it's more likely the ELB than the instance, so upgrading the instance from micro to small will probably not help. It can be tried, but I have serious doubts.
-
ceejayoz over 11 years: While you're instructing me to read carefully, please read my posts on your answer and tell me where I said you were using a micro instance?
-
bwight over 11 years: The problem is that load balancers need time to scale up; not every load balancer is equal. If you go from 0 to 250 requests/second instantly, the load balancer will not be configured to handle that load. It usually takes 5-10 minutes for the load balancer to complete the upgrade. However, as ceejayoz said, micro instances are horrible; you're not going to get very far stress-testing them. I can easily handle 3000+ requests/second with one load balancer without any special configuration.
-
ceejayoz over 11 years: I believe blitz.io is supposed to use many IPs.
-
WooDzu about 10 years: This answer should be the winner, I think. If you run an ab test from a single IP address/range, ELB will stick the connections to a single EC2 instance regardless of whether you're using sticky sessions or not. I would like to know how to work around this.