Related issue:
- https://github.com/AdguardTeam/Scriptlets/issues/332
Additionally, uBO's own scriplet syntax now also accept quoting
the parameters with either `'` or `"`. This can be used to avoid
having to escape commas when they are present in a parameter.
Caused by the fact that external filter lists do not have an
`off` property even when they are not enabled.
Additionally, set the default update cycle check period to 2hr.
This commit fix properly handling toggling off the default
status of a list such that the list will be automatically
turned off when its status change from default to non-default.
Additionally, imported lists which become stock lists will
be properly migrated from imported lists section.
Related discussion:
- https://github.com/uBlockOrigin/uBlock-issues/discussions/2234
Example of usage:
@@*$ghide,domain=/img[a-z]{3,5}\.buzz/
Regex-based domain values can be negated just like plain or
entity-based values:
*$domain=~/regex.../
This new syntax does not apply to static extended filters.
This commit is a rewrite of the static filtering parser into a
tree-based data structure, for easier maintenance and better
abstraction of parsed filters.
This simplifies greatly syntax coloring of filters and also
simplify extending filter syntax.
The minimum version of Chromium-based browsers has been raised
to version 73 because of usage of String.matchAll().
Related discussion:
- https://github.com/uBlockOrigin/uBlock-issues/discussions/2412#discussioncomment-4421741
The new option is `to=` and the value is a list of domain list with
similar syntax as `domain=` option. Entity-based syntax is supported,
and also negated hostname.
The main motivation is to give uBO's static network filtering engine
with an equivalent of DNR's `requestDomains` and `excludedRequestDomains`.
Essentially `to=` is a superset of `denyallow=`, but for now I decided
against deprecating `denyallow=`, which still does not support entity-
based syntax and for which negated domains are not allowed.
This commit also introduces the `from=` option, which is just an alias
for the `domain=` option. The logger will render network filters using
the `from=` version.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1861
The "exceptor" feature has been rewritten, with the following
changes as a result:
- The excepted filters cease to exist when closing the logger
- It's now possible to temporary except network filters
When toggling on/off a temporary exception, filter lists are now
fully reloaded. This simplified managing temporary exceptions, and
made it easy to implement temporary exception for network filters,
but this also means there might be a perceptible delay when
adding/removing temporary exceptions. At this point I consider
this an acceptable side-effect just to bring the ability to easily
create temporary exception for network filters, while this
simplified the existing temporary exception code throughout.
Bring latest changes to procedural cosmetic filtering to uBOL.
Fix procedural filtering used in HTML filters.
Standardize quick hash algorithm used throughout to DJB2
(except that initialization step is skipped):
- http://www.cse.yorku.ca/~oz/hash.html#djb2
These two new pseudo selectors are _action_ operators, and thus can
only be used at the end of a selector. They both take as argument
a string or regex literal.
For `:remove-class()`, when the argument matches a class name, that
class name is removed.
For `:remove-attr()`, when the argument matches an attribute name,
that attribute is removed.
These operators are meant to replace `+js(remove-attr, ...)` and
`+js(remove-class, ...)`, which from now on are candidate for
deprecation in some future.
Once the next stable release is widespread, filter authors must use
these two new operators instead of their `+js()` counterparts.
This commit make it so scriptlet injections will occur
at the earliest possible time on all platform.
This should also fix the case reported at:
- https://www.reddit.com/r/uBlockOrigin/comments/ye6abt/
Which is caused by the fact that there is no webNavigation
events being fired by the browser. In such case, the changes
here will make it so that uBO will detect that the scriptlet
were not injected and will inject them at main content script
injection time.
Too many changes to list here, essentially there is now a
user interface setting to enable/disable dark theme, and
I've rearranged a bit the Settings pane as a result and
also altered other visuals in various places.
There are places which I know have not been thoroughly
tested (i.e. logger inspector).
Will fine-tune as per feedback.
Issues with the classic popup panel will not be addressed,
and if feedback is that it has become unusuable, it will be
outright removed.
Related discussion:
- a0a9497b4a (commitcomment-62560291)
The new setting, when disabled (enabled by default), allows a user
to prevent uBO from waiting for all filter lists to be loaded
before allowing network activity at launch. The setting is enabled
by default, meaning uBO waits for all filter lists to be loaded in
memory before unsuspending network activity. Some users may find
this behavior undesirable, hence the new setting.
This gives the option to potentially speed up page load at launch,
at the cost of potentially not properly filtering network requests
as per filter lists/rules.
For platforms not supporting the suspension of network activity,
the setting will merely prevent whatever mechanism exists on the
platform to mitigate improper filtering of network requests at
launch. For example, in Chromium-based browsers, unchecking the
new setting will prevent the browser from re-loading tabs for
which there was network activity while in "suspended" state at
launch.
As the trie is not immediately created, in order to speed up
launch time, the `domain=` option was stored in the filterRefs
array until it was moved to the trie.
This commit instead stores the `domain=` option into the trie
container's character buffer.
Refactored heuristics to collate set of origin-related
filter units are collated into a hostname trie, and
for better reuse of existing classes.
Generalized pre-test idea for bucket of filters, such
that in addition to origin-related filter units, there is
now a class to collate regex-based pattern-related units
into a new pre-test bucket class, FilterBucketIfRegexHits,
in order to test with a single regex test whether there is
a chance of a hit in the underlying bucket of filters.
Instances of these are rare, but at time of commit I found
this occurs with AdGuard France filter list.
Fine-tuned the "SNFE: Dump" output -- this new ability to
see the internal details of the SNFE has been really key
into finding/fixing issues during refactoring.
Turns out the various benchmarks show no benefits when compiling
filters whose pattern contains a single wildcard character into
specialized classes which threat the pattern as two sub-patterns,
and actually there is a slight improvement in performance as per
benchamrks when treating these patterns as generic ones.
This also fixes the following related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1207
The original motivation is to further speed up launch time
for either non-selfie-based and selfie-based initialization
of the static network filtering engine (SNFE).
As a result of the refactoring:
Filters are no longer instance-based, they are sequence-of-
integer-based. This eliminates the need to create instances
of filters at launch, and consequently eliminates all the
calls to class constructors, the resulting churning of memory,
and so forth.
All the properties defining filter instances are now as much
as possible 32-bit integer-based, and these are allocated in a
single module-scoped typed array -- this eliminates the need
to allocate memory for every filter being instantiated.
Not all filter properties can be represented as a 32-bit
integer, and in this case a filter class can allocate slots
into another module-scoped array of references.
As a result, this eliminates a lot of memory allocations when
the SNFE is populated with filters, and this makes the saving
and loading of selfie more straightforward, as the operation
is reduced to saving/loading two arrays, one of 32-bit
integers, and the other, much smaller, an array JSON-able
values.
All filter classes now only contain static methods, and all
of these methods are called with an index to the specific
filter data in the module-scoped array of 32-bit integers.
The filter sequences (used to avoid the use of JS arrays) are
also allocated in the single module-scoped array of 32-bit
integers -- they used to be stored in their own dedicated
array.
Additionally, some filters are now loaded more in a deferred
way, so as reduce uBO's time-to-readiness -- the outcome of
this still needs to be evaluated, time-to-readiness is
especially a concern in Firefox for Android or less powerful
computers.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1730
A new filter unit, FilterNotType, is introduced to enforce
negated filter type options.
Before this commit, there was no actual negated types in the
static network filtering engine, as a negated type was internally
converted to non-negated types at compile time. As a result,
the logger would never output a matching filter with its original
negated type options.
This commit no longer causes an internal conversion to take place
at compile time, but explicitly enforce negated types at match time,
and as a result the logger will from now on output matching filter
with their original negated type options.
Name: modifyWebextFlavor
Value: A list of space-separated tokens to be added/removed from the
computed default webext flavor.
The primary purpose is to give filter list authors the ability to
test mobile flavor on desktop computers. Though mobile versions of
web pages can be emulated using browser dev tools, it's not
possible to do so for uBO itself.
By using `+mobile` as a value for this setting will force uBO
to act as if it's being executed on a mobile device.
Important: this setting is best used in a dedicated browser
profile, as this affects how filter lists are compiled. So best
to set it in a new browser profile, then force all filter lists
to be recompiled, and use the profile in the future when there
is a need to test the specific webext flavor.
When matching a network request in the static network filtering
engine ("snfe"), these are the possible outcomes, from most
to least likely:
- No block
- Block
- Unblock ("exception" filter overriding the block)
- Block-important ("important" filter override the unblock)
Hence why the matching in the snfe always check for a match in
the "block" realm, and the "unblock" realm would be checked
if and only if there was a match in the "block" realm.
However the "block-important" realm was always matched against
first, and when a match in that realm was found, there would
be no need to check in other realms since nothing can override
the "important" option. The problem with this approach though
is that matches in the "block-important" realm are most
unlikely, which means pointless work being done for vast
majority of network requests.
This commit makes it so that the "block-important" realm is
matched against ONLY when there is a matched "unblock" filter.
The result is a measurable improvement in the snfe-related
benchmarks (though given the numbers involved, end users won't
perceive a difference).
Somewhat related discussion which was the motivation to look
more into this:
https://github.com/cliqz-oss/adblocker/discussions/2170#discussioncomment-1168125
Whereas before the string segment was encoded as:
LL OOOOOOOOOOOO
where L are the upper 8 bits and used to encode the length
of the segment, and O are the lower 24 bits and used to
encode the offset of the string data in the character
buffer, the new code encode as follow:
OOOOOOOOOOOO LL
And furthermore the most significant bit of the length
LL is now used to mark whether the current string segment
is a label boundary.
This means a cell can't reference a segment longer then
127 characters. To work around this limitation for when a
segment is longer than 127 characters (a rare occurrence),
the algorithm will simply split the segment into multiple
adjacent cells.
As a result, there is no longer a need to encode
"boundariness" into special cells, which simplifies
both the storing and matching algorithms.
Additionally, added minimal documentation for the NPM
package on how to import and use HNTrieContainer as a
standalone API.