There are currently over 160 patterns with such pointless
trailing `*^` in uBO's filter lists, which ended up being
compiled as generic pattern filters (i.e. regex-based
internally), while the trailing `*^` accomplishes nothing
since it will always match the end of a URL (`^` can
also match the end of a URL).
This commit discards pointless trailing `*^` in patterns,
thus allowing most of those filters to be compiled as
plain pattern filters.
The syntax highlighter will reflect that a trailing
`*^` is pointless.
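
As a rough illustration of the idea only (the helper name below is
hypothetical and this is not uBO's actual parser code), discarding
a pointless trailing `*^` can be sketched as:

```javascript
// Hypothetical helper, for illustration only.
// A trailing `*^` always matches: `*` can consume the rest of the URL
// and `^` can match the end of the URL, so dropping both characters
// does not change which URLs the pattern matches.
function discardPointlessTrailingWildcardCaret(pattern) {
    return pattern.endsWith('*^') ? pattern.slice(0, -2) : pattern;
}
```

With the trailing `*^` gone, a pattern such as `/ads/banner*^`
contains no wildcard and no longer needs the generic, regex-based
code path.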
Rearrange logic to instantiate and add `important` filters
to the block realm when compiled lists are loaded instead
of when lists are compiled.
Additionally, removed now unused properties following
commit 68e14793cc.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1863
As per internal discussion with the team, it is best to have a simpler
scriptlet, one which is hard-coded to work only on a specific
set of domains -- only those seen used by BAB.
It turns out the various benchmarks show no benefit when compiling
filters whose pattern contains a single wildcard character into
specialized classes which treat the pattern as two sub-patterns,
and there is actually a slight improvement in performance as per
benchmarks when treating these patterns as generic ones.
This also fixes the following related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1207
Fixed serious regression in previous dev build in applying
`csp=` filters. Reported internally by uBO team.
Promote usage of `removeparam` in code instead of `queryprune`,
which is to be deprecated.
Removed test against previously tested hostname in
FilterHostnameDict since, as per various benchmarks, the
test does not really help.
Remove serialization API in Node.js code as the API is now
present in SNFE itself.
All the auxiliary data structures must be fully loaded before
the data structure used as entry point is populated. The race
condition could lead to a case of the entry point data structure
being populated while the auxiliary data structures are still
unpopulated, potentially causing exceptions to be thrown at
launch when the static network filtering engine is queried.
I haven't been able to reproduce such exceptions -- but it
could happen on browsers which do not support being suspended
at launch time (i.e. Chromium-based browsers).
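
The required ordering can be sketched as follows. This is only an
illustration under assumed names (`engine`, `loadEngine`, `isReady`
and the loader callbacks are all hypothetical), not the actual
loading code:

```javascript
// Hypothetical sketch: none of these names are uBO's actual API.
const engine = { auxiliary: [], entryPoint: null };

// `auxLoaders` is an array of caller-supplied async functions, and
// `entryLoader` is a caller-supplied async function.
async function loadEngine(auxLoaders, entryLoader) {
    // Wait until every auxiliary data structure is fully loaded.
    engine.auxiliary = await Promise.all(auxLoaders.map(load => load()));
    // Only then populate the entry point data structure.
    engine.entryPoint = await entryLoader();
}

// The engine is treated as queryable only once the entry point exists,
// which by construction implies the auxiliary data is already in place.
function isReady() {
    return engine.entryPoint !== null;
}
```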
Additionally, added convenience methods to easily
serialize/unserialize when SNFE is used as an npm package.
Related feedback:
- https://github.com/orgs/uBlockOrigin/teams/ublock-issues-volunteers/discussions/293
Related commit:
- 725e6931f5
Through all the changes, forgot to pay attention to scenarios
where the `filterData` needs to grow -- the buffer's default
size is set to accommodate default filter lists, and subscribing
to more lists would cause the static network filtering engine
to fail because the buffer was not resized when needed.
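
A minimal sketch of growing such a buffer on demand is shown below.
Only the name `filterData` comes from this commit; the helper names,
the initial size and the doubling strategy are assumptions made for
illustration:

```javascript
// Hypothetical sketch: sizes and helper names are illustrative only.
let filterData = new Int32Array(4);
let filterDataWritePtr = 0;

// Ensure the buffer can hold at least `needed` 32-bit integers.
function filterDataGrow(needed) {
    if (needed <= filterData.length) { return; }
    // At least double the size, to limit the number of reallocations.
    const size = Math.max(needed, filterData.length * 2);
    const grown = new Int32Array(size);
    grown.set(filterData); // copy existing content into the new buffer
    filterData = grown;
}

// Reserve `n` slots and return the index of the first one.
function filterDataAlloc(n) {
    const idata = filterDataWritePtr;
    filterDataGrow(idata + n);
    filterDataWritePtr += n;
    return idata;
}
```

Because growing replaces the underlying array, callers in this sketch
must always go through the `filterData` variable rather than keep a
stale reference to an older buffer.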
The original motivation is to further speed up launch time
for both non-selfie-based and selfie-based initialization
of the static network filtering engine (SNFE).
As a result of the refactoring:
Filters are no longer instance-based, they are sequence-of-
integer-based. This eliminates the need to create instances
of filters at launch, and consequently eliminates all the
calls to class constructors, the resulting churning of memory,
and so forth.
All the properties defining filter instances are now as much
as possible 32-bit integer-based, and these are allocated in a
single module-scoped typed array -- this eliminates the need
to allocate memory for every filter being instantiated.
Not all filter properties can be represented as a 32-bit
integer, and in this case a filter class can allocate slots
into another module-scoped array of references.
As a result, this eliminates a lot of memory allocations when
the SNFE is populated with filters, and this makes the saving
and loading of selfie more straightforward, as the operation
is reduced to saving/loading two arrays, one of 32-bit
integers, and the other, much smaller, an array of JSON-able
values.
All filter classes now only contain static methods, and all
of these methods are called with an index to the specific
filter data in the module-scoped array of 32-bit integers.
The filter sequences (used to avoid the use of JS arrays) are
also allocated in the single module-scoped array of 32-bit
integers -- they used to be stored in their own dedicated
array.
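
The overall approach described above can be sketched as follows.
Everything in this example is hypothetical (class name, data layout,
array names and sizes); it only illustrates filters stored as integers
in one typed array, a side array of references, and static methods
called with an index:

```javascript
// Hypothetical sketch: not uBO's actual classes or data layout.
const data = new Int32Array(1024); // all integer filter properties
let dataPtr = 1;                   // index 0 is reserved to mean "no filter"
const refs = [];                   // properties which are not 32-bit integers

function alloc(n) {
    const idata = dataPtr;
    dataPtr += n;
    return idata;
}

class FilterPlainPattern {
    // Layout: [ idata+0: index into `refs`, idata+1: pattern length ]
    static create(pattern) {
        const idata = alloc(2);
        data[idata + 0] = refs.push(pattern) - 1;
        data[idata + 1] = pattern.length;
        return idata; // the "filter" is just this integer
    }
    static match(idata, url) {
        return url.includes(refs[data[idata + 0]]);
    }
}

// Saving a selfie reduces to saving the two arrays.
function toSelfie() {
    return JSON.stringify({
        data: Array.from(data.subarray(0, dataPtr)),
        refs,
    });
}
```

No filter object is ever constructed in this sketch: creating a filter
only writes a few integers, and matching only reads them back.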
Additionally, some filters are now loaded in a more deferred
way, so as to reduce uBO's time-to-readiness -- the outcome of
this still needs to be evaluated. Time-to-readiness is
especially a concern in Firefox for Android or on less powerful
computers.