In uBO, the "cache storage" is used to save resources which can
be safely discarded, though at the cost of having to fetch or
recompute them again.
Extension storage (browser.storage.local) is now always used as
cache storage backend. This has always been the default for
Chromium-based browsers.
For Firefox-based browsers, IndexedDB was used as backend for
cache storage, with fallback to extension storage when using
Firefox in private mode by default.
Extension storage is reliable since it works in all contexts,
though it may not be the most performant one.
To speed-up loading of resources from extension storage, uBO will
now make use of Cache API storage, which will mirror content of
key assets saved to extension storage. Typically loading resources
from Cache API is faster than loading the same resources from
the extension storage.
Only resources which must be loaded in memory as fast as possible
will make use of the Cache API storage layered on top of the
extension storage.
Compiled filter lists and memory snapshot of filtering engines
(aka "selfies") will be mirrored to the Cache API storage, since
these must be loaded into memory as fast as possible, and reloading
filter lists from their compiled counterpart is a common
operation.
This new design makes it now seamless to work in permanent private
mode for Firefox-based browsers, since extension storage now
always contains cache-related assets.
Support for IndexedDB is removed for the time being, except to
support migration of cached assets the first time uBO runs with
the new cache storage design.
In order to easily support all choices of storage, a new serializer
has been introduced, which is capable of serializing/deserializing
structure-cloneable data to/from a JS string.
Because of this new serializer, JS data structures can be stored
directly from their native representation, and deserialized
directly to their native representation from uBO's point of view,
since the serialization occurs (if needed) only at the storage
interface level.
This new serializer simplifies many code paths where data
structures such as Set, Map, TypedArray, RegExp, etc. had to be
converted in a disparate manner to be able to persist them to
extension storage.
The new serializer supports workers and LZ4 compression. These
can be configured through advanced settings.
With this new layered design, it's possible to introduce more
storage layers if measured as beneficial (i.e. maybe
browser.storage.session)
References:
- https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/storage/local
- https://developer.mozilla.org/en-US/docs/Web/API/Cache
- https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm
This is to reduce the diff size of rulesets in new
releases. Beside smaller diff size, this also makes it
easier to investigate rule changes across releases.
This commit brings the following changes to the logger:
All logging output generated by injected scriptlets are now sent to
the logger, the developer console will no longer be used to log
scriptlet logging information.
When the logger is not opened, the scriplets will not output any
logging information.
The goal with this new approach is to allow filter authors to
more easily assess the working of scriptlets without having to
go through scriptlet parameters to enable logging.
Consequently all the previous ways to tell scriptlets to log
information are now obsolete: if the logger is opened, the
scriptlets will log information to the logger.
Another benefit of this approach is that the dev tools do not
need to be open to obtain scriptlets logging information.
Accordingly, new filter expressions have been added to the logger:
"info" and "error". Selecting the "scriptlet" expression will also
keep the logging information from scriptlets.
A new button has been added to the logger (not yet i18n-ed): a
"volume" icon, which allows to enable verbose mode. When verbose
mode is enabled, the scriptlets may choose to output more
information regarding their inner working.
The entries in the logger will automatically expand on mouse hover.
This allows to scroll through entries which text does not fit into
a single row.
Clicking anywhere on an entry in the logger will open the detailed
view when applicable.
Generic information/errors will now be rendered regardless of which
tab is currently selected in the logger (similar to how tabless
entries are already being rendered).
Procedural filters with `:xpath` operator were silently rejected
at conversion time because the parser was failing to evaluate the
xpath expression due to the absence of a `document` object in
nodejs.
If `document` object is not present, the parser will assume the
xpath expression is valid.
The idea is to remove as many dependencies as possible for
low-level ScriptletFilteringEngine in order to make it easier
to reuse the module outside uBO itself.
The high-level derived class takes care of caching and
injection of scriptlets into documents, which requires
more knowledge about the environment in which scriptlets
are to be used.
Also improve scriptlet cache usage to minimize overhead of
retrieving scriptlets.
Related issue:
https://github.com/uBlockOrigin/uBlock-issues/issues/2969
Changes:
Use browser.alarms to trigger selfie creation. Presence of a selfie
improve markedly time to readiness when uBO is unsuspended.
Mirror content of storage.local to (in-memory) storage.session for
faster load to readiness when uBO is ususpended.
Broadcast channels are more suited to uBO than DOM events to dispatch
notifications to different parts of uBO.
DOM events can only be dispatched to local context, broadcast channels
dispatch to all contexts (i.e. background process, workers, auxiliary
pages) -- this last behavior is better suited to uBO to communicate
internal changes to all potential listeners, not just those in the local
context.
Additionally, broadcasting to content scripts is now done through
tabs.sendMessage() instead of through potentially opened message
ports, this simplifies broadcasting to content scripts, and this
doesn't require to have long-lived message ports in content
scripts.
When the target world of a scriptlet is the ISOLATED one,
skip Blob-based injection in Firefox, as the current world
is always the ISOLATED one. This should make ISOLATED
world-based scriptlets more reliable (i.e. execute sooner)
in Firefox.
The `urltransform` option allows to redirect a non-blocked network
request to another URL. There are restrictions on its usage:
- require a trusted source -- thus uBO-maintained lists or user
filters
- the `urltransform` value must start with a `/`
If at least one of these conditions is not fulfilled, the filter
will be invalid and rejected.
The requirement to start with `/` is to enforce that only the path
part of a URL can be modified, thus ensuring the network request
is redirected to the same scheme and authority (as defined at
https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Syntax).
Usage example (redirect requests for CSS resources to a non-existing
resource, for demonstration purpose):
||iana.org^$css,urltransform=/notfound.css
Name of this option is inspired from DNR API:
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/declarativeNetRequest/URLTransform
This commit required to bring the concept of "trusted source" to
the static network filtering engine.
As per discussion with uBO volunteers.
Volunteers offering support for uBO will be able to craft links with
specially formed URLs, which once clicked will cause uBO to automatically
force an update of specified filter lists.
The URL must be crafted as shown in the example below:
https://ublockorigin.github.io/uAssets/update-lists.html?listkeys=ublock-filters,easylist
Where the `listkeys` parameter is a comma-separated list of tokens
corresponding to filter lists. If a token does not match an enabled
filter list, it will be ignored.
The ability to update filter lists through a specially crafted link
is available only on uBO's own support sites:
- https://github.com/uBlockOrigin/
- https://reddit.com/r/uBlockOrigin/
- https://ublockorigin.github.io/
Additionally, a visual cue has been added in the "Filter lists" pane
to easily spot the filter lists which have been recently updated, where
"recently" is currently defined as less than an hour ago.
Additionally, finalize versioning scheme for uBOL. Since most updates
will be simply related to update rulesets, the version will from now
on reflects the date at which the extension package was created:
year.month.day.minutes
So for example:
2023.8.19.690
Related issue:
https://github.com/uBlockOrigin/uBlock-issues/issues/2773
The `randomize` paramater introduced in https://github.com/gorhill/uBlock/commit/418087de9c
is now named `directive`, and beside the `true` value which is meant
to respond with a random 10-character string, it can now take the
following value:
war:[web_accessible_resource name]
In order to mock the XHR response with a web accessible resource. For
example:
piquark6046.github.io##+js(no-xhr-if, adsbygoogle.js, war:googlesyndication_adsbygoogle.js)
Will cause the XHR performed by the webpage to resolve to the content
of `/web_accessible_resources/googlesyndication_adsbygoogle.js`.
Should the resource not exist, the empty string will be returned.
Additionally:
Use `export UBO_VERSION=local` at the console to build MV3 extension using
current version of uBO code base. By default, the version is taken from
`./platform/mv3/ubo-version' and usually set to last stable release.
Since in uBOL filter lists from various sources are combined into
a single list, there must be a way to turn on/off trust level
inside the resulting combined filter list so as to be able to
validate the trust level of filters requiring trust.
This commit adds new parser directives understood only by MV3
compiler to turn on/off trust flag internally.
See `managed_storage.json` for available settings. Currently
only `noFiltering` setting is availale.
`noFiltering` is an array of strings, each being a domain for
which no filtering should occur.
Related discussion:
- https://github.com/uBlockOrigin/uBOL-issues/discussions/35
Avoid using updateContentScripts() as it suffers from an unexpected
behavior, causing injected content scripts to lose proper order
at injection time. The order in which content scripts are injected
is key for uBOL content scripts. Potential out of order injection
was causing cosmetic filtering to be broken.
Use actual storage API to persist data across service worker
wake-ups and browser launches. uBOL was trying to avoid using
storage API, at the cost of somewhat hacky code (using DNR API
to persist settings).
Make use of session storage if available, to speed up
initialization of waking up the service worker (which at this
point is necessary to properly implement cosmetic filtering).
Related issues:
- https://github.com/uBlockOrigin/uBOL-issues/issues/5#issuecomment-1575425913
- https://github.com/w3c/webextensions/issues/403
Currently, there is no other way to inject CSS user styles than to
wake up the service worker, so that it can inject the CSS styles
itself using the `scripting.insertCSS()` method.
If ever the MV3 API supports injecting CSS user styles directly
from a content script, uBOL will be back to be fully declarative.
At this point the service worker is very lightweight since the
filtering is completely declarative, so this is not too much of
an issue performance-wise except for the fact that waking up the
service worker for the sole purpose of injecting CSS user styles
and nothing else introduces a pointless overhead.
Hopefully the MV3 API will mature to address such inefficiency.
Specifically, avoid long list of hostnames for the `matches`
property[1] when registering the content scripts, as this was causing
whole browser freeze for long seconds in Chromium-based browsers
(reason unknown).
The content scripts themselves will sort out which cosmetic filters to
apply on which websites.
This change makes it now possible to support annoyances-related lists,
and thus two lists have been added:
- EasyList -- Annoyances
- EasyList -- Cookies
Related issue:
- https://github.com/uBlockOrigin/uBOL-issues/issues/5
These annoyances-related lists contains many thousands of specific
cosmetic filters and as a result, before the above change this was
causing long seconds of whole browser freeze when simply modifying
the blocking mode of a specific site via the slider in the popup
panel.
It is now virtually instantaneous, at the cost of injecting larger
cosmetic filtering-related content scripts (which typically should
be garbage-collected within single-digit milliseconds).
Also, added support for entity-based cosmetic filters. (They were
previously discarded).
---
[1] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/scripting/RegisteredContentScript
Source code of scriplets is now fetched directly from uBO
project, so there is no longer the need to keep duplicate
versions of scriplet code.
All scriplet filters are now supported.
Some filters with entity-based domain option can be salvaged
when there are non-entity-based domain option, but since we are
throwing away the entity-based entries, we are only partially
converting to DNR. This commit will log a warning about this
in log.txt. Before this commit, only non-salvageable filters
were logged.
This reflects the _world_ of the MV3 scripting API:
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/scripting/ExecutionWorld
MAIN: page's world
ISOLATED: extension's content script world
Some scriptlets are best executed in either world, so this
commit allows to pick in which world a scriptlet should execute
(default to MAIN).
For instance, the new sed.js scriptlet will now execute in
the ISOLATED world.