Reverse proxies, multiple backends, access controls, ...
[Sep. 10th, 2003|10:49 pm]
evan, avva, and I have been having fun with mod_rewrite's external rewrite engine support to do some wacky stuff. Using mod_rewrite's [P] flag to hand requests to mod_proxy, we're going to be switching to a new type of load balancing for LiveJournal that's much smarter than our existing stuff. We'll still use the BIG-IP in front of it all to hit a random alive proxy, but we won't use the BIG-IP for internal load balancing, where load and free connections fluctuate too quickly for the BIG-IP to make a good decision.
That'll all be open source in a couple days, but I've been thinking about something else:
Say we were to serve large files requiring complex access control. We'd need mod_perl to do authentication and authorization, but after that point, we don't want mod_perl wasting its resources feeding a slow client. Of course, we normally put mod_proxy in front of it to buffer and close the backend connection, but if the content's over a couple hundred kb, we'll end up wasting a lot of memory on the front-ends, or locking up the backends.
Now, imagine the large file's on some filesystem already, accessible by a lean, mean webserver like thttpd or TUX, if only it had the correct path (which is invariably going to be incredibly ugly due to load balancing and filesystem directory hashing to avoid large directories).
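To make "incredibly ugly path" concrete, here is a minimal sketch of one common directory hashing scheme. It is only an illustration: the `hashed_path` helper, the choice of MD5, the two-level fan-out, and the `/data` root are all assumptions for the example, not a description of any real layout.

```python
import hashlib
import posixpath


def hashed_path(root: str, filename: str) -> str:
    """Spread files over subdirectories so no single directory gets huge.

    Hypothetical scheme: hash the filename and use the first two pairs of
    hex digits as two directory levels (256 * 256 possible buckets).
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return posixpath.join(root, digest[:2], digest[2:4], filename)


if __name__ == "__main__":
    # The same name always maps to the same bucket, so the path can be
    # recomputed by whoever does the access check.
    print(hashed_path("/data", "video.mpg"))
```

The point is that the final path is derivable on the server side but not something you'd want to show to (or accept from) a client.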
What I'd like to do is get the request with mod_proxy, send it to the heavy backend (mod_perl), have that backend do the authentication/authorization checks, then have it tell mod_proxy, not the client, to do a redirect and get the resource from another backend server, the fast one (thttpd/TUX). Because the fast one won't have connection limit problems (or rather, its limits will be hundreds of times higher), the proxy won't need a lot of memory, and mod_perl won't be locked up.
So, the question is: is there an existing way to get mod_proxy (or Pound? or Squid? or any reverse proxy?) to follow a redirect itself, instead of passing it along to the client? mod_proxy's ProxyPassReverse just rewrites the Location header that gets sent to the client. That's not a solution. We can't ever expose a URL going directly to the fast backend, bypassing access controls.
I imagine we could just hack mod_proxy/Pound/Squid, adding a new internal HTTP status code, but I'd like to think somebody else has done this.
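For the sake of discussion, the hack might look roughly like the sketch below. It models the backends as plain Python callables rather than real HTTP servers, and everything named here is hypothetical: the status code 399, the `Response` type, `proxy_request`, and the demo backends are invented for illustration and are not part of mod_proxy, Pound, or Squid.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical, made-up status code meaning "proxy: fetch this yourself".
INTERNAL_REDIRECT_STATUS = 399


@dataclass
class Response:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""


# A backend takes (path, request headers) and returns a Response.
Backend = Callable[[str, Dict[str, str]], Response]


def proxy_request(path: str, headers: Dict[str, str],
                  auth_backend: Backend, file_backend: Backend) -> Response:
    """Ask the heavy backend first; follow its internal redirect ourselves."""
    resp = auth_backend(path, headers)
    if resp.status != INTERNAL_REDIRECT_STATUS:
        # Denials, errors, and ordinary pages are relayed unchanged.
        return resp
    target = resp.headers.get("Location")
    if target is None:
        return Response(502, {}, b"bad gateway")
    # The heavy backend is already free at this point. The client only
    # ever sees the fast backend's response, never the internal path.
    return file_backend(target, {})


def demo_auth_backend(path: str, headers: Dict[str, str]) -> Response:
    """Stand-in for mod_perl: check access, then hand back the real path."""
    if headers.get("Cookie") != "session=ok":
        return Response(403, {}, b"forbidden")
    return Response(INTERNAL_REDIRECT_STATUS,
                    {"Location": "/a3/f9" + path})


def demo_file_backend(path: str, headers: Dict[str, str]) -> Response:
    """Stand-in for thttpd/TUX: serve whatever path it is given."""
    return Response(200, {}, b"contents of " + path.encode("utf-8"))


if __name__ == "__main__":
    ok = proxy_request("/big.iso", {"Cookie": "session=ok"},
                       demo_auth_backend, demo_file_backend)
    print(ok.status, ok.body)
```

The important property is that the internal status and the Location header stop at the proxy, so the ugly path on the fast backend is never exposed to the client.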
Jumping a little ahead, something akin to LVS direct routing would be the best. Not sure how that'd interact, though: go right to mod_perl, and then have mod_perl tinker with some routing/headers and change the path through the system? I'd have to think about that longer... it's definitely more challenging than just hacking a reverse proxy.