Apache configuration challenge/question - brad's life
Brad Fitzpatrick

Apache configuration challenge/question [Dec. 21st, 2005|05:58 pm]
Brad Fitzpatrick

Dear Lazyweb,

Anybody know Apache well enough to help with this? I'll have to skip the background and rationale for now, but basically I need:

#1: Some backend HTTP server serving static files:
-- Apache 1.3, whatever.
-- notably, these include HTML files containing SSI <!--#include virtual="..." --> tags w/ relative paths
-- but this server doesn't do SSI expansion
-- don't need help with this part.

#2: Apache 2.x front-end that does:
-- mod_rewrite external "prg:" program map to map path to full URL of server #1, above
-- result of that prg: map goes to [P] (to mod_proxy)
-- SSI expansion of the result, using, say, "SetOutputFilter INCLUDES"
-- those virtual includes then need to start back over at mod_rewrite/map/proxy/SSI.

So far I have two solutions, neither of which totally works. The basic one which got me hopeful was:

ProxyPass / http://127.0.0.1:81/
ProxyPassReverse / http://127.0.0.1:81/
<Proxy http://127.0.0.1:81/*>
Order deny,allow
Allow from all
SetOutputFilter INCLUDES
</Proxy>

That properly does the SSI expansion, and fetches the virtual includes as well from the remote proxy server.

So I thought: Just add the rewrite map!

So I removed the ProxyPass/ProxyPassReverse lines and added:

RewriteEngine on
RewriteMap extmap prg:/var/www/extmap.pl
RewriteRule ^(.*) ${extmap:$1} [P]

Where extmap.pl is:

#!/usr/bin/perl
$| = 1;
while (my $in = <STDIN>) {
    chomp $in;
    $in .= "l" if $in =~ /\.htm$/;
    print "http://127.0.0.1:81$in\n";
}


But that only half-works. If I then hit /index.htm, mod_rewrite changes it to http://127.0.0.1:81/index.html and expands the static SSI tags (like inserting current date, etc) but the #include tags aren't sent back through mod_rewrite, or at least not the map.... (yes, I have RewriteLogLevel 5). Instead, Apache is looking for the included files locally, in /var/www (the base directory which is actually empty, except for my map function).
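
For anyone reproducing this, the rewrite logging mentioned above can be switched on roughly as follows on Apache 2.0/2.2. This is only a sketch: the log path is an assumption, not my actual setup. If no entries appear for the included paths, the subrequests are not reaching the map.

```apache
# Assumed log location; any path the server can write to will do.
RewriteLog /var/log/apache2/rewrite.log
# Level 5 is the verbosity mentioned in the post.
RewriteLogLevel 5
```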

So basically I can get proxying working (with virtual includes), or external map working, but not both.

I've tried various combinations of [P,PT], [P,N], etc (the mod_rewrite flags for control flow) but I'm running out of ideas.

Any winners get the only prizes I can give out easily: LJ paid/perm accounts. :-) (although this isn't for LJ at all.....)

Thanks!

Comments:
From: jgrafton
2005-12-22 02:21 am (UTC)
Maybe you need to specify in the <proxy> block
AcceptPathInfo on

as given in this example, though that really doesn't make a whole lot of sense.

Perhaps you found a bug in apache?

From: lithiana
2005-12-22 02:59 am (UTC)
this is a slightly different solution from what you have, so it may not be useful, but if you use an extmap like:

    $in .= "l" if $in =~ /\.htm$/;
    print "$in\n";


and then:

ProxyPass / http://real.host/
<Proxy http://real.host/*>
Order deny,allow
Allow from all
SetOutputFilter INCLUDES
</Proxy>

RewriteEngine on
RewriteMap extmap prg:/tmp/extmap.pl
RewriteRule ^(.*)$ ${extmap:$1} [L,PT]


it works for me.

From: brad
2005-12-22 05:10 pm (UTC)
Interesting. Unfortunately I need to dynamically select the real.host in the rewrite script.... trying to think how I can utilize your method with [PT] to achieve that. But I think I'm stuck, unless I add another apache2 proxy in between to do another rewrite pass, but that's super-lame.
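
For reference, the extra-proxy variant would look roughly like the following. It is an untested sketch that simply combines the two half-working configs from the post: the front-end keeps the ProxyPass + INCLUDES setup (which does fetch virtual includes through the proxy), and a middle instance runs the map. Port 82 for the middle instance is an assumption.

```apache
# --- Front-end instance: proxy everything to the middle tier and expand SSI ---
ProxyPass / http://127.0.0.1:82/
<Proxy http://127.0.0.1:82/*>
    Order deny,allow
    Allow from all
    SetOutputFilter INCLUDES
</Proxy>

# --- Middle instance (assumed to listen on 127.0.0.1:82) ---
# The external map picks the real backend URL for every request,
# including the ones the front-end makes for virtual includes.
RewriteEngine on
RewriteMap extmap prg:/var/www/extmap.pl
RewriteRule ^(.*)$ ${extmap:$1} [P]
```

The cost is one more hop and one more server to run for every request and every include.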

From: lithiana
2005-12-22 06:36 pm (UTC)
okay, what about this?
RewriteMap extmap prg:/tmp/extmap.pl
RewriteCond %{HTTP_HOST} localhost
RewriteRule ^(.*)$ ${extmap:$1} [P]

"localhost" is the front-end host in my configuration (i.e. the URL in the browser).

From: brad
2005-12-22 07:28 pm (UTC)
Doesn't work for me. How's that any different from the example I posted in my original post? What's the RewriteCond supposed to add/change to the flow?

From: lithiana
2005-12-22 07:47 pm (UTC)
hmm, this is odd. i thought that made it work, but now it seems to work even if i remove the rewritecond (it definitely didn't before).

this is how i'm setting it up, in case i missed something..

at http://tools.wikimedia.de/~kate/brad/ there are two files, "test.html" and "inc". you can see these are definitely not processed server-side.

i then started apache2 on my workstation with this configuration:

ProxyPass / http://tools.wikimedia.de/~kate/brad/
<Proxy http://tools.wikimedia.de/*>
Order deny,allow
Allow from all
SetOutputFilter INCLUDES
</Proxy>

RewriteEngine on
RewriteMap extmap prg:/tmp/extmap.pl
RewriteRule ^(.*)$ ${extmap:$1} [P]


/tmp/extmap.pl looks like this:
#!/usr/bin/perl
$| = 1;
while (my $in = <STDIN>) {
    chomp $in;
    $in .= "l" if $in =~ /\.htm$/;
    print "http://tools.wikimedia.de/~kate/brad$in\n";
}

and then:
0/root@hyacinth:~>wget -qO- http://localhost/test.htm
this was included!


so, uh... i'm at a loss to explain what i changed to make it work.

From: brad
2005-12-22 08:15 pm (UTC)
Works for me now too.... after I rearranged the ProxyPass and the RewriteRule block.... order matters I guess?

From: brad
2005-12-22 08:18 pm (UTC)
Nevermind.... it doesn't work for me. I wasn't requesting my "test.htm" ... just my implied "index.html". Fixed my test script.

Still can't get it to work.

My config is identical to yours except:

ProxyPass / http://127.0.0.1:81/

And mine's in a VirtualHost block. What about yours?

From: lithiana
2005-12-22 08:27 pm (UTC)
it wasn't in a virtualhost when i tested it, but i moved the whole thing inside one and it's still working.

did you remove the ProxyPassReverse? i notice it doesn't work if i include that (although i suppose that might be needed, depending on what you're trying to do).

From: scosol
2005-12-22 03:07 am (UTC)
is that all the ext map is ever gonna do?
if so, just issue a 302 to URI+l and make the client come back in correctly and get picked up by proxypass :)

if not... it seems proxypass handles the ssi expansion correctly, and rewrite [P] doesn't-
so i would turn them both on, triggering off the same pattern, and see which one actually gets used (one? which? both? in which order?)
you might be able to wank something up with their interactions, setting state in one and triggering off state in the other?

or if resources are abundant, run another apache2 in the middle- rewrite [P] on the front, proxy and SSI-expand on the middle, nothing changes on the back?
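
a rough sketch of the 302 idea from the top of this comment, assuming the map really only appends the "l". the backend address is the one from the post; the rest is untested:

```apache
RewriteEngine on
# Bounce /foo.htm back to the client as a 302 to /foo.html ...
RewriteRule ^(.*\.htm)$ $1l [R=302,L]

# ... so the follow-up request is handled by the plain proxy setup.
ProxyPass / http://127.0.0.1:81/
```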

From: brad
2005-12-22 07:44 am (UTC)
The ext map was a cooked example.

From: scosol
2005-12-22 09:22 am (UTC)
yeah i figured...

well if you don't get it sorted and don't want to run 3 inlined apaches and want to have some fun with stuff using your own memcached, check out mod_cml for lighttpd:

http://www.lighttpd.net/documentation/cml.html

the docs are in the context of proxy/caching, but you have the full power of lua to do whatever you need to do, and simply need to return the right exit code-

From: mart
2005-12-22 12:06 pm (UTC)

Lua is everywhere! Heh.

From: (Anonymous)
2005-12-22 03:13 am (UTC)
Nope. This won't work.

looking at mod_include:
https://svn.apache.org/repos/asf/httpd/httpd/trunk/modules/filters/mod_include.c

It calls 'rr = ap_sub_req_lookup_file(newpath, r, f->next);' for <!--#include file="..."> and 'rr = ap_sub_req_lookup_uri(parsed_string, r, f->next);' for <!--#include virtual="...">.

So, assuming you are using virtual, and calling ap_sub_req_lookup_uri, which is in request.c:
https://svn.apache.org/repos/asf/httpd/httpd/trunk/server/request.c

ap_sub_req_lookup_uri calls ap_sub_req_method_uri, which does the actual work. Assuming your content can't be satisfied by a quick handler, like mod_cache, it continues on to ap_process_request_internal (also in request.c).

And this is where the problem is. Trace where 'int file_req' is used. We skip lots of important hooks, like translate_name, where things like mod_rewrite are run.

This could be a bug. Worthy of discussion on the dev@httpd mailing list.

Hope this helps.

-Paul Querna.

From: brad
2005-12-22 07:49 am (UTC)
Paul,

I'd prefer not to join yet another mailing list, if possible. :-) But if the use case interests you, perhaps you could raise it on the list?

Thanks!
Brad