mirror of https://github.com/gorhill/uBlock.git synced 2024-10-04 08:37:11 +02:00
11 Commits

Author SHA1 Message Date
Raymond Hill
0e6d607484
Add checksum validation when loading trie buffers in selfie
Related issue:
https://github.com/uBlockOrigin/uBlock-issues/issues/3217#issuecomment-2103048654
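
The commit message does not say which checksum is used. As a minimal
sketch of the idea only, assuming an FNV-1a-like mix applied per 32-bit
word and hypothetical helper names (`computeChecksum`, `toSelfie`,
`fromSelfie`), validation on load could look like this:

```javascript
// Hypothetical sketch: the names and the checksum algorithm are assumptions,
// not uBO's actual implementation.

// FNV-1a-like 32-bit hash computed over the 32-bit words of a buffer.
function computeChecksum(buf32) {
    let hash = 0x811C9DC5;
    for ( let i = 0; i < buf32.length; i++ ) {
        hash ^= buf32[i];
        hash = Math.imul(hash, 0x01000193);
    }
    return hash >>> 0;
}

// Bundle the buffer content with its checksum before persisting.
function toSelfie(buf32) {
    return { data: Array.from(buf32), checksum: computeChecksum(buf32) };
}

// Return the restored buffer, or null when the stored data does not match
// its checksum, in which case the caller would rebuild from filter lists.
function fromSelfie(selfie) {
    const buf32 = Uint32Array.from(selfie.data);
    if ( computeChecksum(buf32) !== selfie.checksum ) { return null; }
    return buf32;
}
```

A null result signals that the snapshot should be discarded rather than
loaded into the trie.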
2024-05-09 21:29:24 -04:00
Raymond Hill
086766a924
Redesign cache storage
In uBO, the "cache storage" is used to save resources which can
be safely discarded, though at the cost of having to fetch or
recompute them again.

Extension storage (browser.storage.local) is now always used as
cache storage backend. This has always been the default for
Chromium-based browsers.

For Firefox-based browsers, IndexedDB was used as backend for
cache storage, with fallback to extension storage when Firefox
is configured to always use private mode.

Extension storage is reliable since it works in all contexts,
though it may not be the most performant one.

To speed up loading of resources from extension storage, uBO will
now make use of Cache API storage, which will mirror content of
key assets saved to extension storage. Typically loading resources
from Cache API is faster than loading the same resources from
the extension storage.

Only resources which must be loaded in memory as fast as possible
will make use of the Cache API storage layered on top of the
extension storage.

Compiled filter lists and memory snapshot of filtering engines
(aka "selfies") will be mirrored to the Cache API storage, since
these must be loaded into memory as fast as possible, and reloading
filter lists from their compiled counterpart is a common
operation.
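
The layering described above can be sketched as follows. This is an
illustration under stated assumptions, not uBO's code: `MapLayer` and
`LayeredCacheStorage` are hypothetical names, and plain Maps stand in
for the Cache API (fast layer) and browser.storage.local (reliable
layer).

```javascript
// Hypothetical sketch: class and method names are illustrative only.
// Each layer is any object with async get/set.
class MapLayer {
    constructor() { this.map = new Map(); }
    async get(key) { return this.map.get(key); }
    async set(key, value) { this.map.set(key, value); }
}

class LayeredCacheStorage {
    constructor(fastLayer, reliableLayer) {
        this.fast = fastLayer;
        this.reliable = reliableLayer;
    }
    // Always write to the reliable layer; mirror to the fast layer only
    // for assets which must be loaded as quickly as possible.
    async set(key, value, { mirror = false } = {}) {
        await this.reliable.set(key, value);
        if ( mirror ) { await this.fast.set(key, value); }
    }
    // Try the fast layer first, then fall back to the reliable layer.
    async get(key) {
        const fromFast = await this.fast.get(key);
        if ( fromFast !== undefined ) { return fromFast; }
        return this.reliable.get(key);
    }
}
```

Because every write lands in the reliable layer, losing the fast layer
costs only speed, not data.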

This new design makes it now seamless to work in permanent private
mode for Firefox-based browsers, since extension storage now
always contains cache-related assets.

Support for IndexedDB is removed for the time being, except to
support migration of cached assets the first time uBO runs with
the new cache storage design.

In order to easily support all choices of storage, a new serializer
has been introduced, which is capable of serializing/deserializing
structure-cloneable data to/from a JS string.

Because of this new serializer, JS data structures can be stored
directly from their native representation, and deserialized
directly to their native representation from uBO's point of view,
since the serialization occurs (if needed) only at the storage
interface level.

This new serializer simplifies many code paths where data
structures such as Set, Map, TypedArray, RegExp, etc. had to be
converted in a disparate manner to be able to persist them to
extension storage.
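
To illustrate the principle only: uBO's actual serializer and its
output format are not described here and differ from what follows. This
sketch assumes a hypothetical tagged-object encoding layered on JSON,
and handles just Set, Map, RegExp and Uint32Array.

```javascript
// Hypothetical sketch: not uBO's serializer. Non-JSON types are wrapped in
// a tagged object, then everything goes through JSON.stringify/JSON.parse.
// Limitation: plain objects using the '__type' key themselves would collide.
const TAG = '__type';

function encode(value) {
    if ( value instanceof Set ) {
        return { [TAG]: 'Set', v: Array.from(value, v => encode(v)) };
    }
    if ( value instanceof Map ) {
        return { [TAG]: 'Map', v: Array.from(value, ([ k, v ]) => [ encode(k), encode(v) ]) };
    }
    if ( value instanceof RegExp ) {
        return { [TAG]: 'RegExp', v: [ value.source, value.flags ] };
    }
    if ( value instanceof Uint32Array ) {
        return { [TAG]: 'Uint32Array', v: Array.from(value) };
    }
    if ( Array.isArray(value) ) { return value.map(v => encode(v)); }
    if ( value !== null && typeof value === 'object' ) {
        return Object.fromEntries(
            Object.entries(value).map(([ k, v ]) => [ k, encode(v) ])
        );
    }
    return value;
}

function decode(value) {
    if ( Array.isArray(value) ) { return value.map(v => decode(v)); }
    if ( value === null || typeof value !== 'object' ) { return value; }
    switch ( value[TAG] ) {
    case 'Set': return new Set(value.v.map(v => decode(v)));
    case 'Map': return new Map(value.v.map(([ k, v ]) => [ decode(k), decode(v) ]));
    case 'RegExp': return new RegExp(value.v[0], value.v[1]);
    case 'Uint32Array': return Uint32Array.from(value.v);
    }
    return Object.fromEntries(
        Object.entries(value).map(([ k, v ]) => [ k, decode(v) ])
    );
}

const serialize = data => JSON.stringify(encode(data));
const deserialize = text => decode(JSON.parse(text));
```

Callers pass native structures to `serialize()` and get native
structures back from `deserialize()`, so the conversion stays confined
to the storage boundary.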

The new serializer supports workers and LZ4 compression. These
can be configured through advanced settings.

With this new layered design, it's possible to introduce more
storage layers if they are measured to be beneficial (e.g.
browser.storage.session).

References:
- https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/storage/local
- https://developer.mozilla.org/en-US/docs/Web/API/Cache
- https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm
2024-02-26 16:50:11 -05:00
Raymond Hill
a969a672e0
Change official description in source code top comment 2023-12-04 12:10:34 -05:00
Raymond Hill
a559f5f271
Add experimental mv3 version
This creates a separate Chromium extension, named
"uBO Minus (MV3)".

This experimental mv3 version supports only the blocking of
network requests through the declarativeNetRequest API, so as
to abide by the stated MV3 philosophy of not requiring broad
"read/modify data" permission. Accordingly, the extension
should not trigger the warning at installation time:

    Read and change all your data on all websites

The consequences of being permission-less are the following:

- No cosmetic filtering (##)
- No scriptlet injection (##+js)
- No redirect= filters
- No csp= filters
- No removeparam= filters

At this point there is no popup panel or options pages.

The default filterset corresponds to the default filterset of
uBO proper:

Listset for 'default':
  https://ublockorigin.github.io/uAssets/filters/badware.txt
  https://ublockorigin.github.io/uAssets/filters/filters.txt
  https://ublockorigin.github.io/uAssets/filters/filters-2020.txt
  https://ublockorigin.github.io/uAssets/filters/filters-2021.txt
  https://ublockorigin.github.io/uAssets/filters/filters-2022.txt
  https://ublockorigin.github.io/uAssets/filters/privacy.txt
  https://ublockorigin.github.io/uAssets/filters/quick-fixes.txt
  https://ublockorigin.github.io/uAssets/filters/resource-abuse.txt
  https://ublockorigin.github.io/uAssets/filters/unbreak.txt
  https://easylist.to/easylist/easylist.txt
  https://easylist.to/easylist/easyprivacy.txt
  https://malware-filter.gitlab.io/malware-filter/urlhaus-filter-online.txt
  https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&showintro=1&mimetype=plaintext

The result of the conversion of the filters in all these
filter lists is as follows:

Ruleset size for 'default': 22245
  Good: 21408
  Maybe good (regexes): 127
  redirect-rule= (discarded): 458
  csp= (discarded): 85
  removeparams= (discarded): 22
  Unsupported: 145

The number of DNR rules is far lower than the number of
network filters reported in uBO because the lists-to-rulesets
converter does its best to coalesce filters into a minimal
set of rules. Notably, the DNR's requestDomains condition
property makes it possible to create a single DNR rule out of
all pure hostname-based filters.
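
As a minimal sketch of that coalescing step: the function name
`hostnamesToRule` is hypothetical and the real converter does much
more, but the rule shape follows the declarativeNetRequest rule format
(id, action, condition).

```javascript
// Hypothetical sketch: the function name is illustrative only.
// Collapses any number of pure hostname filters into a single DNR rule.
function hostnamesToRule(hostnames, id = 1) {
    // Deduplicate and sort so the output is stable across runs.
    const requestDomains = Array.from(new Set(hostnames)).sort();
    return {
        id,
        action: { type: 'block' },
        condition: { requestDomains },
    };
}

const rule = hostnamesToRule([
    'ads.example',
    'tracker.example',
    'ads.example',
]);
console.log(JSON.stringify(rule, null, 2));
```

Thousands of hostname filters thus count as one rule against the
ruleset size.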

Regex-based rules are dynamically added at launch time since
they must be validated as valid DNR regexes through
isRegexSupported() API call.
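
A simplified sketch of that launch-time validation, not the
extension's actual code: `addSupportedRegexRules` is a hypothetical
name, and `dnr` stands for chrome.declarativeNetRequest, passed in as a
parameter so the logic can be exercised outside a browser.

```javascript
// Hypothetical sketch: keeps only the regex-based rules which the browser
// reports as supported, then registers them as dynamic rules.
async function addSupportedRegexRules(dnr, rules) {
    const addRules = [];
    for ( const rule of rules ) {
        const result = await dnr.isRegexSupported({
            regex: rule.condition.regexFilter,
        });
        if ( result.isSupported ) { addRules.push(rule); }
    }
    if ( addRules.length !== 0 ) {
        await dnr.updateDynamicRules({ addRules });
    }
    return addRules;
}
```

In an extension this would be called as
`addSupportedRegexRules(chrome.declarativeNetRequest, regexRules)`.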

At this point I consider being permission-less the limiting
factor: if broad "read/modify data" permission is to be used,
then there is not much point for an MV3 version over MV2. Just
use the MV2 version if you want to benefit from all the features
which can't be implemented without broad "read/modify data"
permission.

To locally build the MV3 extension:

    make mv3

Then load the resulting extension directory in the browser
using the "Load unpacked" button.

From now on there will be a uBlock0.mv3.zip package available
in each release.
2022-09-06 13:47:52 -04:00
Raymond Hill
4d482f9133
Store regex filter pattern into bidi-trie buffer
As was done with generic pattern-based filters, the source
string of regex-based filters is now stored into the
bidi-trie (pattern) buffer.

Additionally, added a new "dev tools" page to more
conveniently peer into uBO's internals at run time, without
having to do so from the browser's dev console -- something
which has become more difficult with the use of JS modules.

The new page can be launched from the Support pane through
the "More" button in the troubleshooting section.

The benchmark button in the About pane has been moved to this
new "dev tools" page.

The new "dev tools" page is for development purpose only,
do not open issues about it.
2021-12-12 10:32:49 -05:00
Raymond Hill
725e6931f5
Refactoring work in static network filtering engine
The original motivation is to further speed up launch time
for both non-selfie-based and selfie-based initialization
of the static network filtering engine (SNFE).

As a result of the refactoring:

Filters are no longer instance-based, they are sequence-of-
integer-based. This eliminates the need to create instances
of filters at launch, and consequently eliminates all the
calls to class constructors, the resulting churning of memory,
and so forth.

All the properties defining filter instances are now as much
as possible 32-bit integer-based, and these are allocated in a
single module-scoped typed array -- this eliminates the need
to allocate memory for every filter being instantiated.

Not all filter properties can be represented as a 32-bit
integer, and in this case a filter class can allocate slots
into another module-scoped array of references.

As a result, this eliminates a lot of memory allocations when
the SNFE is populated with filters, and this makes the saving
and loading of selfie more straightforward, as the operation
is reduced to saving/loading two arrays, one of 32-bit
integers, and the other, much smaller, an array of JSON-able
values.

All filter classes now only contain static methods, and all
of these methods are called with an index to the specific
filter data in the module-scoped array of 32-bit integers.
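
The design above can be illustrated with a minimal sketch. All names
and the slot layout here (`filterData`, `filterRefs`, `allocate`,
`FilterPlainPattern`) are assumptions made for illustration, not the
SNFE's actual code.

```javascript
// Hypothetical sketch: names and layout are illustrative only.
// All filter data lives in one module-scoped Int32Array; a "filter" is
// just an index into that array.
let filterData = new Int32Array(1024);
let filterDataWritePtr = 1; // index 0 is reserved to mean "no filter"

// Properties which can't be 32-bit integers go into an array of references.
const filterRefs = [ null ];

function allocate(slotCount) {
    const idata = filterDataWritePtr;
    filterDataWritePtr += slotCount;
    if ( filterDataWritePtr > filterData.length ) {
        const grown = new Int32Array(filterDataWritePtr * 2);
        grown.set(filterData);
        filterData = grown;
    }
    return idata;
}

// A filter class with only static methods: slot 0 holds an index into
// filterRefs (the pattern string), slot 1 holds an "anchored" flag.
class FilterPlainPattern {
    static create(pattern, anchored) {
        const idata = allocate(2);
        filterData[idata + 0] = filterRefs.push(pattern) - 1;
        filterData[idata + 1] = anchored ? 1 : 0;
        return idata;
    }
    static match(idata, url) {
        const pattern = filterRefs[filterData[idata + 0]];
        return filterData[idata + 1] === 1
            ? url.startsWith(pattern)
            : url.includes(pattern);
    }
}
```

With this layout, a snapshot amounts to the used portion of
`filterData` plus `filterRefs`, and no filter object is ever
constructed.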

The filter sequences (used to avoid the use of JS arrays) are
also allocated in the single module-scoped array of 32-bit
integers -- they used to be stored in their own dedicated
array.

Additionally, some filters are now loaded in a more deferred
way, so as to reduce uBO's time-to-readiness. The outcome of
this still needs to be evaluated; time-to-readiness is
especially a concern in Firefox for Android or on less powerful
computers.
2021-12-04 11:16:44 -05:00
Manish Jethani
925c01dc14
Fix ESLint globals error in biditrie.js (#3850) 2021-08-23 11:10:49 -04:00
Manish Jethani
d959c7aabe
Remove globals.js (#3849) 2021-08-23 10:54:16 -04:00
Manish Jethani
ad69c760fb
Run ESLint during Node.js package generation (#3798) 2021-08-02 16:55:03 -04:00
Raymond Hill
98fc66bb1b
Add support for enabling WASM code paths in NodeJS package
See `test.js` for reference on how to enable WASM code
paths (which are disabled by default).
2021-07-29 16:54:51 -04:00
Raymond Hill
22022f636f
Modularize codebase with export/import
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1664

The changes are enough to fulfill the related issue.

A new platform has been added in order to allow for building
a NodeJS package. From the root of the project:

    ./tools/make-nodejs

This will create a new uBlock0.nodejs directory in the
./dist/build directory, which is a valid NodeJS package.

From the root of the package, you can try:

    node test

This will instantiate a static network filtering engine,
populated by easylist and easyprivacy, which can be used
to match network requests by filling the appropriate
filtering context object.

The test.js file contains code which is a typical example
of usage of the package.

Limitations: the NodeJS package can't execute the WASM
versions of the code since the WASM module requires the
use of fetch(), which is not available in NodeJS.

This is a first pass at modularizing the codebase, and
while at it a number of opportunistic small rewrites
have also been made.

This commit requires the minimum supported version for
Chromium and Firefox be raised to 61 and 60 respectively.
2021-07-27 17:26:04 -04:00