In our current setup at Zomato, user traffic lands on HAProxy. HAProxy has an ELB configured as a backend, and the ELB in turn distributes the traffic to our Apache servers.
Why do we have ELB in our setup? Why don't we distribute traffic directly to Apache servers?
Our Apache instances are autoscaled. ELB makes it easier to add or remove Apache instances when you are using autoscaling.
How does ELB work?
ELB is rumoured to be a set of autoscaled HAProxy instances. You may have noticed that ELB doesn't provide you with a fixed IP address. Instead, it provides you with a DNS name.
So, when the load increases, the ELB's domain starts pointing to multiple IP addresses.
If you do a dig query, you can see it returning multiple A records. If traffic keeps on increasing, the number of A records returned will also increase.
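The same check can be scripted instead of run by hand with dig. The following is a minimal sketch, assuming Python's standard `socket` module; the function name `resolve_a_records` and its `resolver` parameter are made up for illustration, and the parameter exists only so the lookup can be stubbed out in tests.

```python
import socket


def resolve_a_records(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses for hostname."""
    # AF_INET limits the lookup to IPv4 (A records); SOCK_STREAM avoids
    # getting one duplicate entry per socket type.
    results = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr),
    # and for IPv4 the sockaddr is an (ip, port) tuple.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Calling `resolve_a_records` with your ELB's DNS name at different times of day should show the list of addresses growing and shrinking with load, provided your local resolver passes all the A records through.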
This setup seems logical, so what's wrong?
HAProxy resolves the ELB domain and uses just one of the A records returned by the DNS query. This means that even if your ELB has scaled out to 3 instances, HAProxy will still send traffic to only one of them, and the other two will sit idle.
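To make the effect concrete, here is a toy model of the two behaviours. It is not HAProxy's actual code, only a sketch that assumes simple round-robin over whatever addresses the proxy decides to use; `spread_requests` and the IP addresses are hypothetical.

```python
from collections import Counter
from itertools import cycle


def spread_requests(ips, n_requests, use_all_records):
    """Count how many requests each IP receives.

    use_all_records=False models a proxy that keeps only the first
    A record; True models one that rotates across every record.
    """
    targets = cycle(ips if use_all_records else ips[:1])
    return Counter(next(targets) for _ in range(n_requests))


elb_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical ELB nodes
print(spread_requests(elb_ips, 9, use_all_records=False))
print(spread_requests(elb_ips, 9, use_all_records=True))
```

With only the first record in use, all nine requests land on one node; with all records in use, each node receives three.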
What is the workaround for this?
The HAProxy developers have already acknowledged that they missed this aspect, and it will probably be fixed in a future release.
For now, we add multiple ELBs as backends in HAProxy so that the traffic never saturates a single ELB. A better approach is to use Consul to populate HAProxy with the Apache servers directly. However, this adds an extra layer of complexity.
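The multiple-ELB workaround amounts to listing each ELB as its own `server` line in one HAProxy backend. The sketch below generates such a stanza from a list of hostnames. It is an illustration and not our production config: the backend name, the ELB hostnames, the choice of `balance roundrobin`, and the helper `render_backend` are all assumptions.

```python
def render_backend(name, elb_hosts, port=80):
    """Render an HAProxy backend stanza with one server line per ELB."""
    lines = [f"backend {name}", "    balance roundrobin"]
    for i, host in enumerate(elb_hosts, start=1):
        # "check" enables HAProxy's health checks for this server.
        lines.append(f"    server elb{i} {host}:{port} check")
    return "\n".join(lines) + "\n"


# Hypothetical ELB DNS names, for illustration only.
print(render_backend("apache_via_elb", [
    "elb-a.example.com",
    "elb-b.example.com",
]))
```

Each `server` line still resolves to a single address, so this spreads load across ELBs without making HAProxy use every node of any one ELB.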