also fixes a bug where the ErrorFile directive would create a
new hashmap on every added item, effectively allowing only
the use of the last specified errornumber, and producing memory
leaks on each config reload.
due to the usage of a hashmap to store headers, when relaying them
to the other side the order was not prevented.
even though correct from a standards point-of-view, this caused
issues with various programs, and it allows to fingerprint the use
of tinyproxy.
to implement this, i imported the MIT-licensed hsearch.[ch] from
https://github.com/rofl0r/htab which was originally taken from
musl libc. it's a simple and efficient hashtable implementation
with far better performance characteristic than the one previously
used by tinyproxy. additionally it has an API much more well-suited
for this purpose.
orderedmap.[ch] was implemented from scratch to address this issue.
behind the scenes it uses an sblist to store string values, and a htab
to store keys and the indices into the sblist.
this allows us to iterate linearly over the sblist and then find the
corresponding key in the hash table, so the headers can be reproduced
in the order they were received.
closes#73
it is quite easy to bring down a proxy server by forcing it to make
connections to one of its own ports, because this will result in an endless
loop spawning more and more connections, until all available fds are exhausted.
since there's a potentially infinite number of potential DNS/ip addresses
resolving to the proxy, it is impossible to detect an endless loop by simply
looking at the destination ip address and port.
what *is* possible though is to record the ip/port tuples assigned to outgoing
connections, and then compare them against new incoming connections. if they
match, the sender was the proxy itself and therefore needs to reject that
connection.
fixes#199.
the existing codebase used an elaborate and complex approach for
its parallelism:
5 different config file options, namely
- MaxClients
- MinSpareServers
- MaxSpareServers
- StartServers
- MaxRequestsPerChild
were used to steer how (and how many) parallel processes tinyproxy
would spin up at start, how many processes at each point needed to
be idle, etc.
it seems all preforked processes would listen on the server port
and compete with each other about who would get assigned the new
incoming connections.
since some data needs to be shared across those processes, a half-
baked "shared memory" implementation was provided for this purpose.
that implementation used to use files in the filesystem, and since
it had a big FIXME comment, the author was well aware of how hackish
that approach was.
this entire complexity is now removed. the main thread enters
a loop which polls on the listening fds, then spins up a new
thread per connection, until the maximum number of connections
(MaxClients) is hit. this is the only of the 5 config options
left after this cleanup. since threads share the same address space,
the code necessary for shared memory access has been removed.
this means that the other 4 mentioned config option will now
produce a parse error, when encountered.
currently each thread uses a hardcoded default of 256KB per thread
for the thread stack size, which is quite lavish and should be
sufficient for even the worst C libraries, but people may want
to tweak this value to the bare minimum, thus we may provide a new
config option for this purpose in the future.
i suspect that on heavily optimized C libraries such a musl, a
stack size of 8-16 KB per thread could be sufficient.
since the existing list implementation in vector.c did not provide
a way to remove a single item from an existing list, i added my
own list implementation from my libulz library which offers this
functionality, rather than trying to add an ad-hoc, and perhaps
buggy implementation to the vector_t list code. the sblist
code is contained in an 80 line C file and as simple as it can get,
while offering good performance and is proven bugfree due to years
of use in other projects.
sbin/ is meant for programs only usable by root, but in tinyproxy's
case, regular users can and *should* use tinyproxy; meaning it is
preferable from a security PoV to use tinyproxy as regular user.
using the "BasicAuth" keyword in tinyproxy.conf.
base64 code was written by myself and taken from my own library "libulz".
for this purpose it is relicensed under the usual terms of the tinyproxy
license.
this causes a build failure on several platforms using older versions
of autotools or GNU make.
make[2]: Entering directory `src'
Makefile:670: *** missing separator (did you mean TAB instead of 8 spaces?). Stop.
make[2]: Leaving directory `src'
fixes#72
This feature will only confuse us during support, if users come to
us with a Tinyproxy build which has a differently named default config
file. This feature is not that useful anyway.
Moved the reverse proxy code from reqs.c into it's own files
(reverse_proxy.c). The code in reqs.c is way too complicated, so I
want to move unrelated code into their own files to simplify the main
concepts in reqs.c.
variable in conjunction with _DEPENDENCIES and _LDADD. The change here
makes filter a "required" module in the sense that it will always be
compiled (to make sure it doesn't get out of date), but it will
conditionally included in the object file.