https://talosintelligence.com/vulnerability_reports/TALOS-2023-1889
this bug was brought to my attention today by the debian tinyproxy
package maintainer. the above link states that the issue was known
since last year and that maintainers have been contacted; if that is
even true, it was probably done via a private email to a potentially
outdated email address of one of the maintainers, not through the
channels clearly described on the tinyproxy homepage:
> Feel free to report a new bug or suggest features via github issues.
> Tinyproxy developers hang out in #tinyproxy on irc.libera.chat.
no github issue was filed, and nobody mentioned a vulnerability in the
mentioned IRC channel. had the issue been reported on github or IRC,
the bug would have been fixed within a day.
since accept() uses the socklen parameter as in/out, after processing
an IPv4 connection the socklen passed to the next accept() call was
only the length of sockaddr_in, so when a connection from an IPv6
client came in, the client sockaddr was only partially filled in.
this caused ipv6 addresses to be printed incorrectly in the log, and
made them fail to match correctly against the acl.
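a minimal sketch of the corrected accept loop (hypothetical names, not
the actual tinyproxy code; assumes an already bound, listening listenfd):

#include <sys/socket.h>

void accept_loop(int listenfd)
{
        struct sockaddr_storage addr;
        socklen_t addrlen;
        for (;;) {
                /* accept() treats addrlen as in/out and shrinks it to the
                   peer's address size, so it must be reset every round -
                   forgetting this is exactly the bug described above */
                addrlen = sizeof(addr);
                int fd = accept(listenfd, (struct sockaddr *)&addr, &addrlen);
                if (fd < 0) continue;
                /* addr is now fully filled in for AF_INET and AF_INET6 alike */
        }
}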
closes #495
* Added support for configuring IPv6 upstream proxy servers using bracket syntax.
* Extended the regular expression for IPv6 addresses to also accept a scope identifier.
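a hypothetical config excerpt illustrating the bracket syntax (addresses,
ports and scope id made up):

upstream http [2001:db8::1]:8080
upstream socks5 [fe80::1%eth0]:1080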
it's not possible to use an https url in a ReversePath directive
without losing the security provided by https, and supporting it would
require adding a dependency on a TLS library like openssl as well as a
lot of code complexity to fetch the requested resource via https and
relay it back to the client.
in case the reversepath directive kicked in, but the protocol wasn't
recognized, and support for transparent proxying was built in, the code
wrongly tried to turn the request into a transparent-proxy request,
leading to a bogus rewritten url like http://localhost:8888https://www.endpoint.com
and an error message claiming we're trying to connect to the machine
the proxy runs on.
now we instead use the generic code path that signals an invalid
protocol/url was used.
closes #419
while at it, the function doing this was renamed from its misleading
ssl name to one reflecting what it actually does.
also inlined the strings that were previously defined as macros.
addressing #152
read_request_line() is exercised on the client's fd, and it fails
when the client has closed the connection. it is therefore wrong
to send an error message to the client in this situation.
additionally, the error message wrongly claimed that the server
closed the connection.
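roughly, the change amounts to the following (sketch with hypothetical
names, not the literal diff):

if (read_request_line(connptr) < 0) {
        /* the client already closed its end of the connection, so there
           is nobody left to send an error page to; just tear down */
        goto done;
}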
might fix #383
as suggested in #212, it seems the majority of people didn't understand
that input was expected to be in regex format, and were using filter
lists containing plain hostnames, e.g. `www.google.com`.
apart from that, using fnmatch() for matching is actually a lot less
computationally expensive and allows the use of big blacklists without
incurring a huge performance hit.
the config file now understands a new option `FilterType` which can
be one of `bre`, `ere` and `fnmatch`.
The `FilterExtended` option was deprecated in favor of it.
It still works, but will be removed in the release after the next.
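a hypothetical config excerpt showing the new option:

FilterType fnmatch
Filter "/etc/tinyproxy/filter"

with a filter file that can now simply contain plain hostnames and
shell-style wildcards, e.g.:

www.google.com
*.doubleclick.net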
Currently a 404 is returned for a misconfigured or unavailable upstream
server. Since that's a server error, it should be a 5xx instead; a 404
is confusing when tinyproxy is used as a forward proxy and might even be
harmful when used as a reverse proxy.
It is debatable whether another 5xx code might be better; the
misconfiguration case might better be a 500, whereas the connection
failure could be a 503 instead (as used e.g. in haproxy).
this fixes OPTIONS requests sent by the apache SVN client via its
native HTTP proxy support.
closes #421
tested with `svn info http://svnmir.bme.freebsd.org/ports/`
the fix in 0b9a74c290 was incomplete, as it
applied the socket timeout only to the socket received from accept(), but
not to sockets created for outgoing connections.
introduced in 979c737f9b.
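conceptually, the complete fix applies the timeout to both kinds of
sockets, roughly like this (sketch; names hypothetical):

#include <sys/socket.h>
#include <sys/time.h>

/* called both on fds returned by accept() and on fds created via
   socket()/connect() for upstream connections */
static void set_socket_timeout(int fd, int seconds)
{
        struct timeval tv = { .tv_sec = seconds };
        setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
        setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}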
when refactoring the "site-spec" parsing code, i failed to realize that
the code dealing with acl allow/deny directives didn't provide the
option to specify netmasks in dotted ipv4 notation, unlike the code
in the upstream parser. since both scenarios now use the same parsing
code, both dotted notation and CIDR slash-notation are possible.
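e.g. both of the following are now accepted (hypothetical addresses):

Allow 192.168.1.0/255.255.255.0
Allow 192.168.1.0/24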
while at it, removed the len parameter from fill_netmask_array(), which
gave the illusion that the array length could be variable.
fixes #394
using a socks4 tor upstream with a .onion url resulted in
gethostbyname() returning NULL and a subsequent segfault.
not only did the code fail to check the return value of gethostbyname(),
that resolver API isn't threadsafe in the first place.
as pure SOCKS4 supports only IPv4 addresses, and the main SOCKS4
user to this date is tor, we now use SOCKS4a unconditionally and
pass the hostname to the proxy without attempting any local name
resolution.
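for reference, a minimal sketch of a SOCKS4a CONNECT request as we now
send it (hand-rolled, hypothetical names, no error handling):

#include <string.h>
#include <unistd.h>

static void socks4a_connect(int fd, const char *host, unsigned short port)
{
        /* a destination ip of 0.0.0.x (x != 0) is the SOCKS4a marker
           telling the proxy that a NUL-terminated hostname follows the
           userid and that it should do the name resolution itself */
        unsigned char buf[512];
        size_t n = 0, hl = strlen(host) + 1; /* assumes host fits in buf */
        buf[n++] = 4;               /* VN: protocol version 4 */
        buf[n++] = 1;               /* CD: CONNECT */
        buf[n++] = port >> 8;       /* DSTPORT in network byte order */
        buf[n++] = port & 0xff;
        buf[n++] = 0; buf[n++] = 0; /* DSTIP = 0.0.0.1 */
        buf[n++] = 0; buf[n++] = 1;
        buf[n++] = 0;               /* empty, NUL-terminated USERID */
        memcpy(buf + n, host, hl);  /* hostname, NUL-terminated */
        write(fd, buf, n + hl);
}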
i suspect that in 2021 almost all SOCKS4 proxy servers in existence
support the SOCKS4a extension, but should i be wrong on this, i prefer
issue reports to show up, and will implement a plain SOCKS4 fallback
only once i see it actually being used in practice.
for some reason, getting this macro is really hard across platforms,
requiring either different feature test macros or even a specific order
of included headers, and its usage has caused several build failures in
the past. fix it once and for all by simply using 1024 as the maximum
line length whenever the macro can't be retrieved.
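the workaround boils down to something like this (assuming the macro in
question is LINE_MAX):

#include <limits.h>

#ifndef LINE_MAX
#define LINE_MAX 1024 /* fallback when the platform headers don't expose it */
#endif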
closes #382
since there are numerous changes in HTTP/1.1, the proxy server will
stick to using HTTP/1.0 for internal usage; however, when a connection
is requested with HTTP/1.x, from now on we duplicate the minor revision
the client requested, because apparently some servers refuse to accept
HTTP/1.0.
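i.e. roughly the following, where request->minor stands for the parsed
minor version (hypothetical names):

/* the request line we send upstream mirrors the client's minor revision */
snprintf(protocol, sizeof(protocol), "HTTP/1.%u", request->minor);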
addresses #152.
the acl.c code for parsing a site-spec has been factored out into a
new TU: hostspec. it was superior to the parsing code in
upstream.c in that it properly deals with both ipv4 and ipv6.
both upstream and acl now use the new code for parsing, and upstream
also uses it for checking for a match.
acl.c still uses the old matching code, as it has a lot of special-case
code for specifications containing a hostname; in case such
a spec is encountered, it tries a reverse name lookup to see whether
a numeric ip matches that spec.
removing that code could break existing usecases, however since
it was never implemented for upstream, nobody will miss it there.
if for example:
ReversePath = "/foo/"
and a user requests "http://tinyproxy/foo", the common behaviour for
HTTP servers is to send an http 301 redirect to the correct url.
we now do the same.
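i.e. the client now gets a response along the lines of (sketch; the
exact headers may differ):

HTTP/1.0 301 Moved Permanently
Location: /foo/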
as pointed out by @craigbarnes [0], even with the latest fix for
the tombstone issue in place, it's possible to provoke a situation
that causes an endless loop: when all free slots in the table
are filled up with tombstones and htab_find() is called.
therefore we need to account for tombstones as well when deciding
whether to call resize(), so that no more than 75% of the table is
ever occupied by either dead or live items.
the resize() serves as a rehash which gets rid of all deleted
entries, and it might cause the table size to shrink if
htab_insert() is called after a lot of items have been removed.
[0]: https://github.com/rofl0r/htab/issues/1#issuecomment-800094442
testcase:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "hsearch.h"
#define HTAB_OOM_TEST
#include "hsearch.c"

static char *xstrdup(const char *str)
{
        char *dup = strdup(str);
        assert(dup);
        return dup;
}

/* minimal unsigned-to-ascii helper used to stringify rand() values */
static void utoa(unsigned number, char *buffer)
{
        int lentest, len = 0, i, start = 0;
        lentest = number;
        do {
                len++;
                lentest /= 10;
        } while (lentest);
        buffer[start + len] = 0;
        do {
                i = number % 10;
                buffer[start + len - 1] = '0' + i;
                number -= i;
                len -= 1;
                number /= 10;
        } while (number);
}

#define TESTSIZE 8
#define KEEP 1

static char *notorious[TESTSIZE];

/* pick TESTSIZE keys that each hash to a distinct slot, so that
   inserting them all and deleting most of them leaves every slot
   either live or a tombstone */
static void prep(void)
{
        char buf[16];
        size_t filled = 0;
        srand(0);
        while (filled < TESTSIZE) {
                utoa(rand(), buf);
                size_t idx = keyhash(buf) & (TESTSIZE - 1);
                if (!notorious[idx]) {
                        notorious[idx] = xstrdup(buf);
                        ++filled;
                }
        }
}

int main(void)
{
        struct htab *h = htab_create(TESTSIZE);
        size_t i;
        assert(h);
        prep();
        for (i = 0; i < TESTSIZE; ++i) {
                char *key = notorious[i];
                printf("[%zu] = \"%s\"\n", i, key);
                int r = htab_insert(h, key, HTV_N(42));
                if (r != 1) {
                        printf("element %zu couldn't be inserted\n", i);
                        break;
                }
                assert(r == 1);
                /* ensure the newly inserted entry can be found */
                assert(htab_find(h, key));
                /* delete all but the first KEEP items, leaving tombstones */
                if (i >= KEEP) htab_delete(h, key);
        }
        /* before the fix this looped forever: every slot is either live
           or a tombstone, so the probe never reaches an empty slot */
        htab_find(h, "looooop");
        return 0;
}
we already required an extra argument inside the headers sent
for 401 and 407 error responses; move those to send_http_error_message()
and refactor send_http_headers() to always take the extra argument.
in call sites where the extra arg isn't needed, use "".
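a call site that doesn't need an extra header then looks roughly like
this (sketch, not the literal code):

send_http_headers(connptr, 200, "Connection established", "");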
we can't just set an item's key to zero and be done with a deletion,
because this would break the item search chain.
a deleted item requires a special marker, also known as a tombstone.
when searching for an item, all slots with a tombstone need to be
treated as if they were in use, but when inserting an item such a slot
can be filled with the new item.
a common procedure is to rehash the table when the number of deleted
items crosses a certain threshold; for simplicity, we leave this
task to the resize() function, which does the same thing anyway when
the hashtable grows.
this allows us to fix the issue quite elegantly and with almost no
additional overhead, so we don't penalize applications that do very
few deletions.
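a minimal self-contained sketch of the idea (a simplified linear-probing
table with hypothetical names, not the actual hsearch.c code):

#include <string.h>

#define SIZE 8
#define TOMBSTONE ((char *)-1)

static char *keys[SIZE];

static size_t hash(const char *s)
{
        size_t h = 5381;
        while (*s) h = h * 33 + (unsigned char)*s++;
        return h;
}

/* search: probe past tombstones as if the slot were occupied, otherwise
   items inserted after a deletion become unreachable. note this relies
   on at least one empty slot existing - which is why tombstones must be
   counted towards the resize() threshold, as described in the
   endless-loop commit above */
static long find_slot(const char *k)
{
        size_t i = hash(k) & (SIZE - 1);
        while (keys[i]) {
                if (keys[i] != TOMBSTONE && !strcmp(keys[i], k))
                        return (long)i;
                i = (i + 1) & (SIZE - 1);
        }
        return -1;
}

/* insert: a tombstone slot may be recycled with a new item, but only
   after a search has established the key isn't already present further
   down the chain */
static size_t insert_slot(const char *k)
{
        size_t i = hash(k) & (SIZE - 1);
        while (keys[i] && keys[i] != TOMBSTONE)
                i = (i + 1) & (SIZE - 1);
        return i;
}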
Try all the addresses specified with Bind in order. This is necessary
e.g. for maintaining IPv4+6 connectivity while still being restricted to
one interface.
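a hypothetical example with one address per family on the same interface:

Bind 192.0.2.10
Bind 2001:db8::10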