Problems with previous load balancing:
Hardware load balancers: don't know enough about the backend web nodes' state. They can adapt over time, but not quickly enough, especially when you have heavy apps, low MaxClients limits per node, and wildly distributed page generation times... milliseconds vs. multiple seconds.
I told F5 they should make an Apache module that keeps the BIG-IP informed about backend states, and they liked the idea and (perhaps jokingly) offered me a job. They haven't gotten around to it yet, though, so we did it ourselves, minus the BIG-IP.
Our existing load balancing system: the mod_perl apps broadcast their child free/busy stats. Three mod_proxy machines listen to those broadcasts and, via an external rewrite engine (written in C) hooked into mod_rewrite, ProxyPass requests to backend nodes. (This is all in our CVS... go steal.) Then we hardware load balance round-robin across whichever mod_proxies are alive.
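The core idea of that rewrite daemon can be sketched in a few lines. This is a toy Python illustration, not the actual C code: the wire format ("host free_count"), the staleness window, and the class names are all my invention here. Nodes broadcast how many free children they have; the proxy side keeps a table and picks the least-busy node that's been heard from recently.

```python
# Toy sketch of the broadcast-listener idea (not the real C daemon).
# Web nodes broadcast datagrams like b"web1 3" (hostname, free child
# count); the proxy keeps a table and picks the best alive backend.
import time

STALE_AFTER = 5.0  # invented: seconds of silence before a node is presumed dead


class BackendTable:
    def __init__(self):
        self.nodes = {}  # host -> (free_children, last_heard_timestamp)

    def update(self, datagram: bytes):
        # assumed wire format: "host free_count"
        host, free = datagram.decode().split()
        self.nodes[host] = (int(free), time.time())

    def pick(self):
        """Return the alive host with the most free children, or None."""
        now = time.time()
        alive = [(free, host) for host, (free, heard) in self.nodes.items()
                 if now - heard < STALE_AFTER and free > 0]
        if not alive:
            return None
        alive.sort(reverse=True)  # most free children first
        return alive[0][1]
```

In the real system a UDP socket would feed `update()` and mod_rewrite's external program interface would call `pick()` per request; those plumbing details are omitted.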
Works well, except: sometimes mod_proxy and/or the rewrite daemon totally bites it and the machine dies. Can't figure out why.
Also, mod_proxy doesn't buffer enough. We want to be able to say: this proxy machine has 2GB of memory, so buffer up to 2GB of content for slow clients (= everybody slower than 100Mbps) before you have to start bucket-brigading from the backend. (Goal: keep backends free and not at the wrath of slow clients!)
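The buffering policy I want is roughly this (a toy Python illustration under made-up names, not how mod_proxy or my server actually work): slurp the backend's reply into proxy memory at full speed, freeing the backend child immediately, and only fall back to bucket-brigading (read a bit, write a bit, at the slow client's pace) once the memory budget runs out.

```python
# Toy illustration of the buffering goal. backend_read/client_write
# stand in for the real sockets; the budget would be the proxy
# machine's memory (e.g. 2GB), shrunk here for illustration.
BUFFER_BUDGET = 2 * 1024**3  # bytes


def proxy_response(backend_read, client_write, budget=BUFFER_BUDGET):
    chunks, buffered = [], 0
    while True:
        chunk = backend_read(65536)
        if not chunk:
            break  # backend finished; its child is now free
        chunks.append(chunk)
        buffered += len(chunk)
        if buffered >= budget:
            # Memory budget exhausted: drain to the (possibly slow)
            # client, i.e. degrade to bucket-brigading.
            for c in chunks:
                client_write(c)
            chunks, buffered = [], 0
    for c in chunks:  # drain whatever's left
        client_write(c)
```

In the common case the whole reply fits in the budget, so the backend never waits on a slow client at all.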
Also, if mod_proxy connects to a backend node that's down, it returns an error to the remote user instead of asking the rewrite engine for another host. With both pieces living together, I'd be able to solve that.
I hear mod_accel is better, but it's all in Russian, and there's no mod_rewrite support, as far as I know.
So fuck it, I'm making my own. So far I've hacked apart the Mono XSP web server (which hosts .NET web apps) and taken the parts I needed: the HTTP parser and the threading system.
Doesn't seem like it'll be too hard. The remaining questions are:
1) Memory fragmentation over time. Should be okay... we're only allocating big chunks and then releasing them. We could also allocate only fixed-size objects (64k or so) and chain them together. (This is for buffering replies to users.)
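The fixed-size-chunk idea looks something like this (a Python sketch of the concept; the real thing would be C#, and the names here are invented): every allocation is exactly 64k, chunks are chained in order, and freeing a reply returns whole uniform blocks to a pool instead of scattering odd-sized holes through the heap.

```python
# Sketch of chained fixed-size buffers for reply buffering.
CHUNK = 64 * 1024


class ChunkChain:
    pool = []  # recycled 64k blocks, shared across chains

    def __init__(self):
        self.chunks = []  # 64k bytearrays, in order
        self.used = 0     # bytes used in the last chunk

    def append(self, data):
        while data:
            if not self.chunks or self.used == CHUNK:
                # Grab a recycled block if available, else allocate one.
                self.chunks.append(ChunkChain.pool.pop() if ChunkChain.pool
                                   else bytearray(CHUNK))
                self.used = 0
            n = min(len(data), CHUNK - self.used)
            self.chunks[-1][self.used:self.used + n] = data[:n]
            self.used += n
            data = data[n:]

    def content(self):
        if not self.chunks:
            return b""
        total = (len(self.chunks) - 1) * CHUNK + self.used
        return b"".join(bytes(c) for c in self.chunks)[:total]

    def free(self):
        # Return all blocks to the pool; nothing fragments because
        # every block is the same size.
        ChunkChain.pool.extend(self.chunks)
        self.chunks, self.used = [], 0
```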
2) Will the C# (Mono) thread listening for UDP broadcasts be able to keep up? We had that problem with the initial Perl version and could never resolve it; perhaps we just have too much broadcast traffic. Alternative: have the web nodes send their state info along in an HTTP header. The server won't learn as much, but that's not necessary now that it's all integrated (the load balancing and the learner). If the load balancer tries to connect and it takes more than some threshold amount of time, it can give up on that host for 'n' seconds and try another instead. Another disadvantage: if the backend is passing state info back to the load balancer in HTTP headers, we can't just pipe the backend's reply straight through to the client. Or we could... and just send the client the backend's state info too, but that seems like it's just asking for DoS attacks.
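The give-up-and-retry-elsewhere fallback is simple to sketch. Again a hypothetical Python illustration (the timeouts, penalty duration, and function names are all made up): connect with a short timeout, and on failure put that host in a penalty box for a while before moving down the candidate list.

```python
# Sketch of the "give up for n seconds and try another host" idea.
import socket
import time

CONNECT_TIMEOUT = 0.25  # invented threshold, seconds
PENALTY = 10.0          # invented: how long to shun a dead/slow backend

dead_until = {}  # host -> timestamp when it may be retried


def connect_to_backend(candidates, port=80):
    """Try candidates in order; return (host, socket) or (None, None)."""
    for host in candidates:
        if time.time() < dead_until.get(host, 0):
            continue  # still in the penalty box
        try:
            sock = socket.create_connection((host, port), CONNECT_TIMEOUT)
            return host, sock
        except OSError:
            dead_until[host] = time.time() + PENALTY
    return None, None
```

The point is that the user never sees an error just because the first backend picked happened to be down, which is exactly what mod_proxy can't do for us today.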
In any case, I'm excited. I need more control over load balancing... that's LJ's weak area I think.
Oh, and I got to submit a bug report to the XSP maintainer. Go open source, rah rah rah!