Commit Graph

1690 Commits

Author SHA1 Message Date
rofl0r
3a920b7163 conf: add tool to print regex name/regex pairs as re2r input
this is currently not included in the build system and needs to be
compiled by hand.
2020-10-16 12:03:28 +01:00
rofl0r
42bb446c96 conf: shrink back RE_MAX_MATCHES to 16
with the IPv4 regex simplification from 22f059dc5e
we're back to max 15 match groups according to re2r analysis
(the most elaborate regex is the upstream one).
2020-10-16 11:58:48 +01:00
rofl0r
dabfd1ad6c conf: remove pointless assert() statement 2020-10-15 22:39:46 +01:00
rofl0r
ae4cbcabd1 conf: remove trailing whitespace via C code, not regex 2020-10-15 22:36:10 +01:00
rofl0r
22f059dc5e conf: simplify ipv4 regex
use one matching group rather than 3.
2020-10-12 20:05:06 +01:00
rofl0r
86379b4b66 conf: parse regexes case-sensitive
rather than treating everything as case insensitive, we explicitly
allow upper/lowercase where it makes sense.
2020-10-09 01:43:46 +01:00
rofl0r
57f932a33b conf: skip leading whitespace instead of adding it to each regex 2020-10-09 01:26:50 +01:00
rofl0r
173c5b66a7 conf: remove obsolete whitespace from regex start
we already deal with leading whitespace before a command in a manual
way before comparing keywords.
2020-10-09 01:04:44 +01:00
rofl0r
393e51ba45 conf: remove second instance of empty parens ERE group
likewise
2020-10-09 01:00:56 +01:00
rofl0r
b07f7a8422 conf: remove empty parens group from regex
using an empty group () is not defined in the posix spec, and as such
"undefined behaviour", even though it happened to work with both GLIBC
and MUSL libc, as well as with oniguruma's POSIX compatibility API.

we used this idiom as a trick when refactoring the regex parsing,
in order not to change the match indices of all the handler functions,
ignorant that this is not explicitly allowed by the spec.

to make future refactoring easier, we introduce a MGROUP1 macro that's
added to each match group index, so we have only a single knob to turn
in case a similar change becomes necessary again.
2020-10-09 00:38:13 +01:00
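
The "single knob" idea from b07f7a8422 can be sketched roughly as below. This log does not show the real value of MGROUP1 or the shape of tinyproxy's handlers, so the offset value, the helper get_group(), the handler parse_port_line() and the "port" regex are all assumptions made for illustration.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical offset: the number of match groups that a shared regex
 * prefix contributes before a directive's own groups. The pattern used
 * below has no such prefix group, so the offset is 0 here. */
#define MGROUP1 0
#define RE_MAX_MATCHES 16

/* Copy match group `idx` (1-based, counted among the directive's own
 * groups) into `out`. Returns 0 on success, -1 if the group did not
 * participate in the match or does not fit into `out`. */
static int get_group(const char *line, const regmatch_t *m, int idx,
                     char *out, size_t outlen)
{
    const regmatch_t *g = &m[MGROUP1 + idx];
    size_t len;

    if (g->rm_so < 0)
        return -1;
    len = (size_t)(g->rm_eo - g->rm_so);
    if (len >= outlen)
        return -1;
    memcpy(out, line + g->rm_so, len);
    out[len] = '\0';
    return 0;
}

/* Illustrative handler: extract the numeric argument of a "port" line.
 * The "\t" is a C escape, so the bracket expression holds a real tab. */
static int parse_port_line(const char *line, char *out, size_t outlen)
{
    regex_t re;
    regmatch_t m[RE_MAX_MATCHES];
    int rc;

    if (regcomp(&re, "^port[ \t]+([0-9]+)[ \t]*$", REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, line, RE_MAX_MATCHES, m, 0) == 0)
        rc = get_group(line, m, 1, out, outlen);
    else
        rc = -1;
    regfree(&re);
    return rc;
}
```

if a group is later added to or removed from the shared part of every regex, only MGROUP1 changes and the handlers keep using the same relative indices.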
rofl0r
3eb238634a conf: properly escape tab in whitespace class 2020-10-09 00:23:47 +01:00
rofl0r
f1f3994d09 conf: factor out list of regex into separate header
this allows including the regexes in another file to apply
transformations and experiments.
2020-10-09 00:22:14 +01:00
rofl0r
e20aa221ff conf: move inclusion of common.h back to the start
otherwise the feature-test-macros won't kick in as they should.

should fix #329
2020-10-01 15:25:35 +01:00
rofl0r
8d27503cc3 acl: fix regression using ipv6 with netmask
introduced in 0ad8904b40

closes #327
2020-09-30 19:23:34 +01:00
rofl0r
3950a606a4 conf: only treat space and tab as whitespace
other characters in the [[:space:]] set can't possibly be encountered,
and this speeds up parsing by approximately 10%.
2020-09-30 05:31:56 +01:00
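
The whitespace handling described in the conf commits above (only space and tab count, leading and trailing whitespace dealt with in C rather than in each regex) could look roughly like this. The function names are hypothetical and not taken from conf.c.

```c
#include <string.h>

/* Only space and tab are treated as whitespace, as in 3950a606a4.
 * Other members of [[:space:]], such as newline, are left alone. */
static int is_conf_space(char c)
{
    return c == ' ' || c == '\t';
}

/* Strip trailing whitespace in place and skip leading whitespace.
 * Returns a pointer to the first non-whitespace character of `line`. */
static char *trim_conf_line(char *line)
{
    size_t len;

    while (is_conf_space(*line))
        line++;
    len = strlen(line);
    while (len > 0 && is_conf_space(line[len - 1]))
        line[--len] = '\0';
    return line;
}
```

with the line trimmed up front, the individual regexes no longer need a whitespace prefix or suffix of their own.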
rofl0r
a8944b93e7 conf: use [0-9] instead of [[:digit:]] for shorter re strings 2020-09-30 05:28:00 +01:00
rofl0r
960972865c print linenumber from all conf-emitted warnings 2020-09-30 05:21:26 +01:00
rofl0r
f55c46eb39 log: print timestamps with millisecond precision
this allows easier time measurements for benchmarks.
2020-09-30 05:20:09 +01:00
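
A minimal sketch of a millisecond-precision timestamp, assuming a "Mon DD HH:MM:SS.mmm" layout. The layout and the function name format_timestamp_ms() are assumptions, not taken from log.c, and the caller is expected to obtain the seconds and milliseconds itself (for example from gettimeofday() or clock_gettime()).

```c
#include <stdio.h>
#include <time.h>

/* Write "Mon DD HH:MM:SS.mmm" (local time) into `buf`.
 * `msec` is the millisecond part, 0..999.
 * Returns the number of characters written, or 0 on failure. */
static size_t format_timestamp_ms(time_t sec, long msec,
                                  char *buf, size_t buflen)
{
    struct tm *tm;
    size_t n;
    int m;

    if (msec < 0 || msec > 999)
        return 0;
    /* localtime() is not thread-safe; good enough for a sketch. */
    tm = localtime(&sec);
    if (!tm)
        return 0;
    n = strftime(buf, buflen, "%b %d %H:%M:%S", tm);
    if (n == 0)
        return 0;
    /* Append the zero-padded millisecond part. */
    m = snprintf(buf + n, buflen - n, ".%03ld", msec);
    if (m < 0 || (size_t)m >= buflen - n)
        return 0;
    return n + (size_t)m;
}
```

subtracting two such timestamps by eye is enough to time a config load in a benchmark run.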
rofl0r
10494cab8c change loglevel of "Not running as root" message to INFO
there's no reason to display this as warning.
2020-09-30 05:19:16 +01:00
rofl0r
4f1a1663ff conf: remove bogus support for hex literals
the INT regex macro supported a 0x prefix (used e.g. for port numbers),
however following that, only digits were accepted, and not the full
range of hexdigits. it's unlikely this was used, so remove it.

note that the () expression is kept, so we don't have to adjust match
number indices all over the place.
2020-09-30 05:14:57 +01:00
rofl0r
35c8edcf73 speed up build by only including regex.h where needed 2020-09-30 05:13:45 +01:00
rofl0r
7c664ad0b2 Release 1.11.0-rc1 2020-09-27 16:22:21 +01:00
rofl0r
8594e9b8cc add conf-tokens.gperf to EXTRA_DIST
otherwise it will be missing in `make dist`-generated tarballs.
2020-09-27 15:55:23 +01:00
rofl0r
094db9d670 version.sh: relax regex for release tag detection
this allows using tag names with a custom suffix too.
2020-09-27 15:44:50 +01:00
rofl0r
4dfac863a5 version.sh: replace -g with -git-
git describe prefixes the sha1 commit hash with -g, which is exactly what
we're after. this change gets rid of the confusing "g" in the commit hash
and allows tag names that include "-".
2020-09-27 15:41:54 +01:00
rofl0r
c74fe57262 transparent: workaround old glibc bug on RHEL7
it's been reported[0] that RHEL7 fails to properly set the length
parameter of the getsockname() call to the length of the required
struct sockaddr type, and always returns the length passed if it
is big enough.

the SOCKADDR_UNION_* macros originate from my microsocks[1] project,
and facilitate handling of the sockaddr mess without nasty casts.

[0]: https://github.com/tinyproxy/tinyproxy/issues/45#issuecomment-694594990
[1]: https://github.com/rofl0r/microsocks
2020-09-18 12:12:14 +01:00
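
One way to sidestep an unreliable length from getsockname() is to derive the length from the address family instead. The sketch below assumes that approach. The union and both function names are hypothetical, and the real SOCKADDR_UNION_* macros from microsocks are not reproduced here.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical union covering the address types of interest, so that
 * no casts between struct sockaddr variants are needed. */
union sockaddr_union {
    struct sockaddr sa;
    struct sockaddr_in v4;
    struct sockaddr_in6 v6;
};

/* Length implied by the address family, independent of whatever length
 * the system call reported. Returns 0 for unsupported families. */
static socklen_t sockaddr_union_length(const union sockaddr_union *u)
{
    switch (u->sa.sa_family) {
    case AF_INET:
        return sizeof(u->v4);
    case AF_INET6:
        return sizeof(u->v6);
    default:
        return 0;
    }
}

/* Fetch the local address of `fd` into `u`. The reported length is
 * ignored; `*len` is set from the address family instead.
 * Returns 0 on success, -1 on error or unsupported family. */
static int get_local_addr(int fd, union sockaddr_union *u, socklen_t *len)
{
    socklen_t n = sizeof(*u);

    if (getsockname(fd, &u->sa, &n) != 0)
        return -1;
    *len = sockaddr_union_length(u);
    return *len ? 0 : -1;
}
```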
rofl0r
d4ef2cfa62 child_kill_children(): use method that actually works
it turned out that close()ing an fd behind the back of a thread
doesn't actually cause blocking operations to get a read/write event,
because the fd will stay valid to in-progress operations.
2020-09-17 21:24:45 +01:00
rofl0r
da1bc1425d tune error messages to show select or poll depending on what is used 2020-09-17 21:03:51 +01:00
rofl0r
22e4898519 add autoconf test and fallback code for systems without gperf 2020-09-16 23:04:12 +01:00
rofl0r
45b238fc6f main: print error when config_init() fails 2020-09-16 21:01:02 +01:00
rofl0r
45323584a0 speed up big config parsing by 2x using gperf 2020-09-16 21:01:02 +01:00
rofl0r
caeab31fca conf.c: simplify the huge IPV6 regex
even though the existing IPV6 regex caught (almost?) all invalid
ipv6 addresses, it did so with a huge performance penalty.
parsing a file with 32K allow or deny statement took 30 secs in
a test setup, after this change less than 3.

the new regex is sufficient to recognize all valid ipv6 addresses,
and hands down the responsibility to detect corner cases to the
system's inet_pton() function, which is e.g. called from insert_acl(),
which now causes a warning to be printed in the log if a seemingly
valid address is in fact invalid.

the new regex has been tested with 486 testcases from
http://download.dartware.com/thirdparty/test-ipv6-regex.pl
and accepts all valid ones and rejects most of the invalid ones.

note that the IPV4 regex already did a similar thing and checked only
whether the ip looks like [0-9]+.[0-9]+.[0-9]+.[0-9]+ without pedantry.
2020-09-16 21:01:02 +01:00
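
The division of labour described in caeab31fca (a cheap regex that accepts everything that looks like an address, with inet_pton() as the final judge) can be sketched as follows. The pattern LOOSE_IPV6 is a deliberately crude stand-in and not the regex from conf.c, and check_ipv6() is a hypothetical name.

```c
#include <arpa/inet.h>
#include <regex.h>
#include <stddef.h>
#include <sys/socket.h>

/* Assumed, deliberately loose pattern: hex digits and colons, plus dots
 * so that an embedded dotted-quad tail is not rejected up front. */
#define LOOSE_IPV6 "^[0-9a-fA-F:.]+$"

/* Returns  1 if `s` is a valid IPv6 address,
 *          0 if it passes the loose regex but inet_pton() rejects it
 *            (the case where a warning would be logged),
 *         -1 if it does not even look like an address, or on error. */
static int check_ipv6(const char *s)
{
    regex_t re;
    struct in6_addr addr;
    int looks_ok;

    if (regcomp(&re, LOOSE_IPV6, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    looks_ok = regexec(&re, s, 0, NULL, 0) == 0;
    regfree(&re);
    if (!looks_ok)
        return -1;
    /* Corner cases are left to the system's parser. */
    return inet_pton(AF_INET6, s, &addr) == 1 ? 1 : 0;
}
```

a real parser would compile the regex once rather than on every call; it is compiled inline here only to keep the example short.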
rofl0r
0ad8904b40 acl.c: detect invalid ipv6 string 2020-09-16 21:00:50 +01:00
rofl0r
99ed66cbc4 conf.c: warn when encountering invalid address 2020-09-16 21:00:50 +01:00
rofl0r
880a8b0ab6 conf: use cpp stringification for STDCONF macro 2020-09-16 21:00:04 +01:00
rofl0r
551e914d24 conf: merge upstream/upstream_none into single regex/handler 2020-09-16 21:00:04 +01:00
rofl0r
bad36cd9cd move config reload message to reload_config()
move it to before disabling logging, so a message with the correct
timestamp is printed if logging was already enabled.
also add a message when loading finished, so one can see from the
timestamp how long it took.

note that this only works on a real config reload triggered by
SIGHUP/SIGUSR1, because on startup we don't know yet where to log to.
2020-09-16 21:00:04 +01:00
rofl0r
683a354196 remove vector remains 2020-09-16 02:39:09 +01:00
rofl0r
06c96761d5 log_message_storage: use sblist 2020-09-16 02:39:09 +01:00
rofl0r
54ae2d2a19 tests: add some AddHeader directives 2020-09-16 02:39:09 +01:00
rofl0r
e843519fb8 listen_addrs: use sblist 2020-09-16 02:39:09 +01:00
rofl0r
a5381223df basicauth: use sblist 2020-09-16 02:39:09 +01:00
rofl0r
487f2aba47 connect_ports: use sblist 2020-09-16 02:39:09 +01:00
rofl0r
e929e81a55 add_header: use sblist
note that the old code inserted added headers at the beginning of the
list, reasoning unknown. this seems counter-intuitive as the headers
would end up in the request in the reverse order they were added,
but this was irrelevant, as the headers were originally first put
into the hashmap hashofheaders before sending it to the client.
since the hashmap didn't preserve ordering, the headers would appear
in random order anyway.
2020-09-16 02:39:09 +01:00
rofl0r
7d33fc8e8a listen_fds: use sblist 2020-09-16 01:05:58 +01:00
rofl0r
a5890b621b run_tests_valgrind: use tougher valgrind settings 2020-09-15 23:39:04 +01:00
rofl0r
2037bc64f5 free a mem leak by statically allocating global statsbuf 2020-09-15 23:28:33 +01:00
rofl0r
d453a4c2a4 main: include loop header 2020-09-15 23:20:14 +01:00
rofl0r
192f8194e1 free() loop records too 2020-09-15 23:12:00 +01:00
rofl0r
bd92446184 use poll() where available 2020-09-15 23:12:00 +01:00