Commit Graph

13 Commits

Author SHA1 Message Date
rofl0r
235b1c10a7 implement filtertype keyword and fnmatch-based filtering
as suggested in #212, it seems the majority of people don't understand
that input was expected to be in regex format and people were using
filter lists containing plain hostnames, e.g. `www.google.com`.

apart from that, using fnmatch() for matching is actually a lot less
computationally expensive and allows to use big blacklists without
incurring a huge performance hit.

the config file now understands a new option `FilterType` which can
be one of `bre`, `ere` and `fnmatch`.
The `FilterExtended` option was deprecated in favor of it.
It still works, but will be removed in the release after the next.
2022-05-02 13:13:40 +00:00
rofl0r
233ce6de3b filter: reduce memory usage, fix OOM crashes
* check return values of memory allocation and abort gracefully
  in out-of-memory situations

* use sblist (linear dynamic array) instead of linked list
  - this removes one pointer per filter rule
  - removes need to manually allocate/free every single list item
    (instead block allocation is used)
  - simplifies code

* remove storage of (unused) input rule
  - removes one char* pointer per filter rule
  - removes storage of the raw bytes of each filter rule

* add line number to display on out-of-memory/invalid regex situation

* replace duplicate filter_domain()/filter_host() code with a single
  function filter_run()
  - reduces code size and management effort

with these improvements, >1 million regex rules can be loaded with
4 GB of RAM, whereas previously it crashed with about 950K.

the list for testing was assembled from
http://www.shallalist.de/Downloads/shallalist.tar.gz

closes #20
2020-09-05 19:42:34 +01:00
Michael Adam
83987babd3 filter: add function filter_reload() 2009-10-25 23:33:37 +01:00
Mukund Sivaraman
eccc765057 Remove trailing comma from filter_policy_t 2009-09-21 07:32:38 +05:30
Mukund Sivaraman
7b9234f394 Indent code to Tinyproxy coding style
The modified files were indented with GNU indent using the
following command:

indent -npro -kr -i8 -ts8 -sob -l80 -ss -cs -cp1 -bs -nlps -nprs -pcs \
    -saf -sai -saw -sc -cdw -ce -nut -il0

No other changes of any sort were made.
2009-09-15 01:11:25 +05:30
Mukund Sivaraman
a257703e59 Reformat code to GNU coding style
This is a commit which simply ran all C source code files
through GNU indent. No other modifications were made.
2008-12-01 15:01:11 +00:00
Mukund Sivaraman
249d4b7f33 Updated copyright, license notices in source code
The notices have been changed to a more GNU look. Documentation
comments have been separated from the copyright header. I've tried to
keep all copyright notices intact. Some author contact details have
been updated.
2008-05-24 13:35:49 +05:30
Robert James Kaes
c0299e1868 * [Indent] Ran Source Through indent
I re-indented the source code using indent with the following options:

indent -kr -bad -bap -nut -i8 -l80 -psl -sob -ss -ncs

There are now _no_ tabs in the source files, and all indentation is
eight spaces.  Lines are 80 characters long, and the procedure type is
on it's own line.  Read the indent manual for more information about
what each option means.
2005-08-15 03:54:31 +00:00
Robert James Kaes
7e1de2012c Added code to handle the "FilterDefaultDeny" directive. The filter_set_default_policy() function is used to select the default policy (either default allow or default deny) for the filtering code. Also, the two filtering functions now support the policy code. 2002-06-07 18:36:22 +00:00
Robert James Kaes
b11015c2e1 Added a copyright for James E. Flemer since these are his changes.
(filter_init): Added code to handle both host and URLs.  Also include code to use extended regular expressions.
(filter_domain): The old filter_url function has been renamed filter_domain().
(filter_url): This function now actually filters complete URLs.
2002-05-27 01:56:22 +00:00
Robert James Kaes
b023ff577f Changed the filter_host command to filter_url. 2000-11-23 04:46:25 +00:00
Robert James Kaes
f807f4b96c Just using standard malloc() since the xmalloc() didn't really add
anything useful to the command.
2000-09-11 23:43:59 +00:00
Steven Young
37e63909c0 This commit was generated by cvs2svn to compensate for changes in r2,
which included commits to RCS files with non-trunk default branches.
2000-02-16 17:32:49 +00:00