Moved the reverse proxy code from reqs.c into it's own files
(reverse_proxy.c). The code in reqs.c is way too complicated, so I
want to move unrelated code into their own files to simplify the main
concepts in reqs.c.
I re-indented the source code using indent with the following options:
indent -kr -bad -bap -nut -i8 -l80 -psl -sob -ss -ncs
There are now _no_ tabs in the source files, and all indentation is
eight spaces. Lines are 80 characters long, and the procedure type is
on it's own line. Read the indent manual for more information about
what each option means.
This allows tinyproxy to respond to a request bound to the same
interface that the request came in on. As Oswald explains:
"attached is a patch that adds the BindSame option. it causes
binding an outgoing connection to the ip address of the respective
incoming connection. that way one can simulate an entire proxy farm
with a single instance of tinyproxy on a multi-homed machine."
Cool.
this addition follow:
The patch implements a simple reverse proxy (with one funky extra
feature). It has all the regular features: mapping remote servers to local
namespace (ReversePath), disabling forward proxying (ReverseOnly) and HTTP
redirect rewriting (ReverseBaseURL).
The funky feature is this: You map Google to /google/ and the Google front
page opens up fine. Type in stuff and click "Google Search" and you'll get
an error from tinyproxy. Reason for this is that Google's form submits to
"/search" which unfortunately bypasses our /google/ mapping (if they'd
submit to "search" without the slash it would have worked ok). Turn on
ReverseMagic and it starts working....
ReverseMagic "hijacks" one cookie which it sends to the client browser.
This cookie contains the current reverse proxy path mapping (in the above
case /google/) so that even if the site uses absolute links the reverse
proxy still knows where to map the request.
And yes, it works. No, I've never seen this done before - I couldn't find
_any_ working OSS reverse proxies, and the commercial ones I've seen try
to parse the page and fix all links (in the above case changing "/search"
to "/google/search"). The problem with modifying the html is that it might
not be parsable (very common) or it might be encoded so that the proxy
can't read it (mod_gzip or likes).
Hope you like that patch. One caveat - I haven't coded with C in like
three years so my code might be a bit messy.... There shouldn't be any
security problems thou, but you never know. I did all the stuff out of my
memory without reading any RFC's, but I tested everything with Moz, Konq,
IE6, Links and Lynx and they all worked fine.
manage the HTML error pages. It simplifies the source, and also make
the object file smaller. Nice. Also added any casting from (void*)
to ensure that the code compiles using a C++ compiler.
"ViaProxyName" directive. The "Via" HTTP header is _required_ by the
HTTP spec, so the code has been changed to always send the header.
However, including the proxy's host name could be considered a
security threat, so the "ViaProxyName" directive is used to set the
token sent in the "Via" header. If the directive is not enabled the
proxy's host name will be used.
tinyproxy. There is really no need for this code, since there are
perfectly good programs out there (like rinetd) which are designed for
TCP tunnelling. tinyproxy should be a good HTTP proxy, nothing more,
and nothing less; therefore, the tunnelling code is gone.
function. The signal handler now simply sets a flag which is monitored
inside the thread_main_loop() function. The log rotation code has also
been tightened to handle any error conditions better. Credit to Petr
Lampa for suggesting that system functions inside of a signal handler is
bad magic.
Log when tinyproxy is using default values rather than specific ones.
Cleaned up the command line arguments since tinyproxy now uses a
configuration file.
Removed the USR1 signal and added the thread creation code.