For now, the language locales are not available, as the text on
the page needs to stabilize before translation volunteers are
asked to contribute their time to working on the new text.
By default uBO assumed the Shortcut pane was needed,
unless it found the current version of Firefox was higher
than 73. This commit reverses the test: it assumes
the Shortcut pane is not needed, unless the current
version is lower than 74.
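The reversed test can be sketched as below. The name `needsShortcutPane` and its integer `majorVersion` parameter are hypothetical, chosen for illustration only; the actual code in uBO may be shaped differently.

```javascript
// Hypothetical helper, not uBO's actual code.
// The default answer is now "not needed": only a known Firefox major
// version lower than 74 causes the Shortcut pane to be shown.
function needsShortcutPane(majorVersion) {
    // An unknown or invalid version falls back to the new default.
    if ( Number.isInteger(majorVersion) === false ) { return false; }
    return majorVersion < 74;
}
```

With the previous logic, an unknown version would have resulted in the pane being shown; with this sketch it is hidden.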
isBlockImportant() was relying strictly on the hash bits
to detect whether a matching filter was `important`, but
this approach regressed following changes to how `important`
filters are compiled. This commit fixes the issue by no longer
relying on the hash bits but rather on an internal
register variable being set by `important` filters when
they match.
I couldn't find any actual cases in default filter lists
(including a couple of default regional lists) where the
regression has any effect, due to the limited cases
for which isBlockImportant() is called.
A test was added in a previous commit to detect such
regressions in the future:
- a76935b232
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1732
The regression affected filters with the `important` option when
all of the following conditions were fulfilled:
- The filter pattern is a pure hostname
- The filter has none of the following options:
  - domain
  - denyallow
  - header
  - strict1p, strict3p
  - csp
  - removeparam
- There is a matching exception filter
Related commit:
- a2a8ef7e85
A related mocha test has been added in order to detect this
specific regression in the future through `make test`.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1730
A new filter unit, FilterNotType, is introduced to enforce
negated filter type options.
Before this commit, there were no actual negated types in the
static network filtering engine, as a negated type was internally
converted to non-negated types at compile time. As a result,
the logger would never output a matching filter with its original
negated type options.
This commit no longer causes an internal conversion to take place
at compile time, but explicitly enforces negated types at match time,
and as a result the logger will from now on output matching filters
with their original negated type options.
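A minimal sketch of the idea follows. The `typeBits` table, the `NotTypeUnit` class and its methods are hypothetical names made up for illustration; they are not the actual FilterNotType implementation.

```javascript
// Hypothetical type bits, for illustration only.
const typeBits = { script: 1 << 0, image: 1 << 1, stylesheet: 1 << 2 };

// Hypothetical filter unit: the negated types are kept as-is rather than
// converted to the complementary set of non-negated types.
class NotTypeUnit {
    constructor(notTypes) {
        this.notTypes = notTypes.slice();
        this.notTypeBits = notTypes.reduce(
            (bits, type) => bits | typeBits[type],
            0
        );
    }
    // Enforced at match time: match only if the request type is not one
    // of the negated types.
    match(requestType) {
        return (typeBits[requestType] & this.notTypeBits) === 0;
    }
    // Since the original negated types were preserved, they can be
    // reported verbatim.
    logData() {
        return this.notTypes.map(type => `~${type}`).join(',');
    }
}
```

For example, `new NotTypeUnit(['script', 'image'])` would not match a `script` request, would match a `stylesheet` request, and would report its options as `~script,~image`.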
Name: modifyWebextFlavor
Value: A list of space-separated tokens to be added/removed from the
computed default webext flavor.
The primary purpose is to give filter list authors the ability to
test mobile flavor on desktop computers. Though mobile versions of
web pages can be emulated using browser dev tools, it's not
possible to do so for uBO itself.
Using `+mobile` as the value for this setting will force uBO
to act as if it's being executed on a mobile device.
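The way such a list of tokens could be applied is sketched below. The function `applyFlavorTokens` is hypothetical, and the `-` prefix for removal is an assumption inferred from "added/removed"; only `+mobile` is confirmed above.

```javascript
// Hypothetical helper, not uBO's actual code.
// Applies a space-separated list of tokens to the computed default
// flavors. A `+` prefix adds a flavor; a `-` prefix is ASSUMED here to
// remove one.
function applyFlavorTokens(defaultFlavors, setting) {
    const flavors = new Set(defaultFlavors);
    for ( const token of setting.split(/\s+/) ) {
        if ( token.length < 2 ) { continue; }
        const name = token.slice(1);
        if ( token.startsWith('+') ) {
            flavors.add(name);
        } else if ( token.startsWith('-') ) {
            flavors.delete(name);
        }
    }
    return flavors;
}
```

Under these assumptions, `applyFlavorTokens(['firefox'], '+mobile')` yields a set containing both `firefox` and `mobile`.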
Important: this setting is best used in a dedicated browser
profile, as this affects how filter lists are compiled. So best
to set it in a new browser profile, then force all filter lists
to be recompiled, and use the profile in the future when there
is a need to test the specific webext flavor.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1692
The ids/classes from html/body elements are now left out
when looking up lowly generic cosmetic filters made of a
single identifier.
This does not absolutely guarantee that html/body elements
will never be targeted, but it should greatly mitigate the
probability that this erroneously happens.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1690
New procedural operator: `:matches-path(...)`
Description: this is an all-or-nothing passthrough operator, whose
on/off behavior is dictated by whether the argument matches the
path of the current location. The argument can be either plain
text to be found at any position in the path, or a literal regex
against which the path is tested.
Whereas cosmetic filters can be made specific to a whole domain,
the new `:matches-path()` operator allows further narrowing of
the specificity according to the path of the current document
location.
Typically this procedural operator is used as the first operator in
a procedural cosmetic filter, so as to ensure that no further
matching work is performed should there be no match against the
path of the current document location.
Example of usage:
example.com##:matches-path(/shop) p
Will hide all `p` elements when visiting `https://example.com/shop/stuff`,
but not when visiting `https://example.com/` or any other page
on `example.com` which has no instance of `/shop` in the path part
of the URL.
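The argument handling described above can be sketched as follows. The function `matchesPath` is hypothetical, and two details are assumptions: the path is taken from `URL.pathname`, and an argument is treated as a literal regex only when it is wrapped in slashes (optionally followed by `i`).

```javascript
// Hypothetical helper, not uBO's actual code.
// Returns true if `arg` matches the path of `url`.
function matchesPath(arg, url) {
    // ASSUMPTION: only the pathname is tested.
    const path = new URL(url).pathname;
    // ASSUMPTION: `/.../` or `/.../i` denotes a literal regex.
    const reMatch = /^\/(.+)\/(i?)$/.exec(arg);
    if ( reMatch !== null ) {
        return new RegExp(reMatch[1], reMatch[2]).test(path);
    }
    // Plain text: to be found at any position in the path.
    return path.includes(arg);
}
```

Note that `/shop` has no closing slash, so in this sketch it is treated as plain text, consistent with the example filter above.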
So as to allow nodejs usage to better deal with
out of date serialization/compilation.
Additionally, use FilterImportant() only when a
"block-important" filter is stored in the "block" realm.
When matching a network request in the static network filtering
engine ("snfe"), these are the possible outcomes, from most
to least likely:
- No block
- Block
- Unblock ("exception" filter overriding the block)
- Block-important ("important" filter overriding the unblock)
This is why matching in the snfe always checks for a match in
the "block" realm first, and the "unblock" realm is checked
if and only if there was a match in the "block" realm.
However the "block-important" realm was always matched against
first, and when a match in that realm was found, there would
be no need to check in other realms since nothing can override
the "important" option. The problem with this approach though
is that matches in the "block-important" realm are most
unlikely, which means pointless work was being done for the
vast majority of network requests.
This commit makes it so that the "block-important" realm is
matched against ONLY when there is a matched "unblock" filter.
The result is a measurable improvement in the snfe-related
benchmarks (though given the numbers involved, end users won't
perceive a difference).
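The new lookup order can be sketched as below. This is a simplified model, not the snfe code: each realm is represented by a hypothetical predicate function, and it is assumed that a request matched by an "important" filter is also matched through the "block" realm.

```javascript
// Hypothetical sketch of the lookup order, not the actual snfe code.
// `realms` holds one predicate per realm; each returns true on a match.
function matchRequest(realms, request) {
    // Most likely outcome: nothing matches in the "block" realm.
    if ( realms.block(request) === false ) { return 'no block'; }
    // Checked if and only if there was a match in the "block" realm.
    if ( realms.unblock(request) === false ) { return 'block'; }
    // Checked ONLY when there is a matched "unblock" filter.
    if ( realms.blockImportant(request) === false ) { return 'unblock'; }
    return 'block-important';
}
```

In this model, the "block-important" predicate is never called for requests which are not blocked or which are blocked without a matching exception, which covers the vast majority of requests.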
Somewhat related discussion which was the motivation to look
more into this:
https://github.com/cliqz-oss/adblocker/discussions/2170#discussioncomment-1168125
Whereas before the string segment was encoded as:
LL OOOOOOOOOOOO
where L are the upper 8 bits and used to encode the length
of the segment, and O are the lower 24 bits and used to
encode the offset of the string data in the character
buffer, the new code encodes it as follows:
OOOOOOOOOOOO LL
And furthermore the most significant bit of the length
LL is now used to mark whether the current string segment
is a label boundary.
This means a cell can't reference a segment longer than
127 characters. To work around this limitation for when a
segment is longer than 127 characters (a rare occurrence),
the algorithm will simply split the segment into multiple
adjacent cells.
As a result, there is no longer a need to encode
"boundariness" into special cells, which simplifies
both the storing and matching algorithms.
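The new layout can be sketched with plain bit operations, as below. All function and constant names are hypothetical, and which cell carries the boundary flag when a long segment is split is an assumption (here, the last one); the actual HNTrieContainer code may differ.

```javascript
// Hypothetical sketch of the new cell layout, not the actual code.
// Upper 24 bits: offset; lower 8 bits: boundary flag (MSB) + 7-bit length.
const LEN_MASK = 0x7F;
const BOUNDARY_BIT = 0x80;

function encodeSegment(offset, length, isBoundary) {
    if ( length > LEN_MASK ) { throw new RangeError('Segment too long'); }
    const flag = isBoundary ? BOUNDARY_BIT : 0;
    // `>>> 0` keeps the result an unsigned 32-bit integer.
    return ((offset << 8) | flag | length) >>> 0;
}

function decodeSegment(cell) {
    return {
        offset: cell >>> 8,
        length: cell & LEN_MASK,
        isBoundary: (cell & BOUNDARY_BIT) !== 0,
    };
}

// Splits a segment longer than 127 characters into adjacent cells.
// ASSUMPTION: only the last cell carries the boundary flag.
function encodeLongSegment(offset, length, isBoundary) {
    const cells = [];
    while ( length > LEN_MASK ) {
        cells.push(encodeSegment(offset, LEN_MASK, false));
        offset += LEN_MASK;
        length -= LEN_MASK;
    }
    cells.push(encodeSegment(offset, length, isBoundary));
    return cells;
}
```

Under these assumptions, a 300-character segment would be stored as three adjacent cells of lengths 127, 127 and 46.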
Additionally, added minimal documentation for the NPM
package on how to import and use HNTrieContainer as a
standalone API.
The erroneous test does not seem to interfere
with the proper functioning of the trie, due
to the fact that nodes are never split without
an OR node or boundary node being present.
The issue was found when undertaking a rewrite
of the algorithm to avoid having to create
boundary nodes.
In the static network filtering engine (snfe), the
compiling-related code was spread across two classes.
This commit makes it so that all the compiling-related
code is in the FilterCompiler class, whose clear purpose is
to compile raw filters into a form which can be persisted
and later fed to the snfe with no parsing overhead.
To compile raw static network filters, the new approach is:
const compiler = snfe.createCompiler(parser);
Then for each single raw filter to compile:
compiler.compile(parser, writer);
The caller is responsible for keeping a reference to the
compiler instance for as long as it is needed. This removes
the need for the clunky code used to keep an instance of
the compiler alive in the snfe.
Additionally, snfe.tokenHistograms() has been moved to
benchmarks.js, as it has no dependency on the snfe; it's
just a utility function.