Related feedback:
- https://ilakovac.com/teespring-ublock-issue/
The surrogate script googletagmanager_gtm.js was essentially a
subset of surrogate script google-analytics_analytics.js. This
commit makes it a plain alias so that the whole GA API -- often
expected by clients of GTM -- is properly stubbed.
Reported internally:
> STR --
>
> Import https://cdn.statically.io/gh/uBlockOrigin/uAssets/master/filters/filters.txt
> as a Custom filter list.
>
> Observe the filter count at 24K instead of true count
> being 29K.
>
> Force updating is sometimes compiling 24K filters and
> on subsequent updates 29K, so I narrowed it down to
> this host.
Findings:
One of the sublists was erroring when being fetched on
this particular CDN server (reason unknown).
uBO was not properly handling network errors when
fetching a sublist. This commit make it so that if
a sublist can't be fetched, then the error is propagated
as if it affected the whole list, in which case uBO will
use an alternative URL if any.
This advanced setting is not really needed, as the
same can be accomplished with a broad exception
filter such as `#@#+js()`.
Related feedback:
- f5b453fae3 (commitcomment-49499082)
Ever since the `redirect` code was refactored:
157cef6034
This advanced setting is no longer needed, as the same
can be accomplished with a plain network filter:
@@*$redirect-rule
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1553
This commit ensures FLoC is opt-in. The generic filter
`*##+js(no-floc)` in "uBlock filters -- Privacy" ensures
the feature is disabled when using default settings/lists.
Users can opt-in to FLoC by adding a generic exception
filter to their custom filters, `#@#+js(no-floc)`; or they
can opt-in only for a specific set of websites through a
more specific exception filter:
example.com,shopping.example#@#+js(no-floc)
The syntax to remove response header is a special case
of HTML filtering, whereas the response headers are
targeted, rather than the response body:
example.com##^responseheader(header-name)
Where `header-name` is the name of the header to
remove, and must always be lowercase.
The removal of response headers can only be applied to
document resources, i.e. main- or sub-frames.
Only a limited set of headers can be targeted for
removal:
location
refresh
report-to
set-cookie
This limitation is to ensure that uBO never lowers the
security profile of web pages, i.e. we wouldn't want to
remove `content-security-policy`.
Given that the header removal occurs at onHeaderReceived
time, this new ability works for all browsers.
The motivation for this new filtering ability is instance
of website using a `refresh` header to redirect a visitor
to an undesirable destination after a few seconds.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1513
Prior to this commit, the ability to enable/disable the
uncloaking of canonical names was only available to advanced
users. This commit make it so that the setting can be
toggled from the _Settings_ pane.
The setting is enabled by default. The documentation should
be clear that the setting should not be disabled unless it
actually solves serious network issues, for example:
https://bugzilla.mozilla.org/show_bug.cgi?id=1694404
Also, as a result, the advanced setting `cnameUncloak` is no
longer available from within the advanced settings editor.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1501
Exception filters for `document` option are complying with
uBO's own semantic for `document` option, i.e. an exception
filter for `document` option will only allow to bypass a
block filter for `document` (either explicit or implicit)
and nothing else.
Exception filters using `document` option are *not*
compatible with ABP's interpretation of these filters.
Whereas in ABP the purpose of a `document` exception filter
is to wholly disable content blocking, in uBO the same
filter will just cause strict-blocking to be disabled while
leaving content blocking intact.
Additionally, the logger was fixed to properly report pages
which are being strict-blocked.
The procedural cosmetic filtering code has been split from
the content script code injected unconditionally and will
from now on be injected only when it is needed, i.e. when
there are procedural cosmetic filters to enforce.
The motivation for this is:
https://www.debugbear.com/blog/2020-chrome-extension-performance-report#what-can-extension-developers-do-to-keep-their-extensions-fast
Though uBO's content script injected unconditionally in all
pages/frames is relatively small, I still wanted to further
reduce the amount of content script code injected
unconditionally: The procedural cosmetic filtering code
represents roughly 14KB of code the browser won't have to
parse/execute unconditionally unless there exists procedural
cosmetic filters to enforce for a page or frame.
At the time the above article was published, the total
size of unconditional content scripts injected by uBO was
~101 KB, while after this commit, the total size will be
~57 KB (keeping in mind uBO does not minify and does not
remove comments from its JavaScript code).
Additionally, some refactoring on how user stylesheets are
injected so as to ensure that `:style`-based procedural
filters which are essentially declarative are injected
earlier along with plain, non-procedural cosmetic filters.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/210
Additionally, a small (experimental) widget has been added
to emphasize/de-emphasize rows which have 3rd-party
scripts/frames, so as to more easily identify which rows
are "affected" by 3rd-party scripts and/or frames.
Tooltip localization for the new widget is not available
yet as I want wait for the feature to be fully settled.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1480
Forward compatiblity was broken due to `externalLists`
being converted into an Array from a string, i.e.
downgrading to uBO 1.32.4 was completely breaking uBO.
This commit restores `externalLists` as a string which
is what older versions of uBO expect.
A new property `importedLists` has been created to
hold the imported lists as an array, while
`externalLists` will be kept around for a while until
it is completely removed in some future.
The managed `userSettings` entry is an array of entries,
where each entry is a name/value pair encoded into an array
of strings.
The first item in the entry array is the name of a setting,
and the second item is the stringified value for the
setting.
This is a more convenient way for administrators to set
specific user settings. The settings set through
`userSettings` policy will always be set at uBO launch
time.
The new entry is an array of strings, each representing a
distinct line, and all entries are used to populate the
"My filters" pane.
This offers an more straightforward way for administrators
to specify a list of custom filters to use for all
installations.
Content scripts can't properly look up effective context
for sandboxed frames. This commit add ability to extract
effective context from already existing store of frames
used for each tab.
The entry `toOverwrite.filterLists` is an array of
string, where each string is a token identifying a
stock filter list, or a URL for an external filter
list.
This new entry is to make it easier for an
administrator to centrally configure uBO with a
custom set of filter lists.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1433
The new "extraTrustedSiteDirectives" policy is an array
of strings, each of which is parsed as a trusted-site
directive to append to a user's own set of trusted-site
directives at launch time.
The added trusted-site directives will be considered as
part of the default set of directives by uBO.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1241
uBO will not discard secondary requests fired before a root
frame is committed, by ensuring that if newly uncommitted
root frames are of the same origin as previous one(s), the
uncommited journal slot pointer is not updated.
Related commit:
- 6ac09a2856
Patternless `queryprune` ar enow preserved as being
pattern-less while still attempting to extract a token
from the `queryprune` value. This allows to report the
filter in the logger same as its original form.
The following regex are not rejected as invalid when
using built-in regex objects:
/abc]/
/a7,18}/
/a{7,18/
However, as per documentation, they are not supposed to
be valid, as `{` and `}` are special characters and as
such should be escaped:
/abc\]/
/a7,18\}/
/a\{7,18/
With this commit, the regexes will additionally be
validated using the regex analyzer library in the editor
to ensure strict regex syntax compliance so as to avoid
what are likely mistakes in regex crafting by authors.
This commit fixes mouse double-click-and-drag operations,
which was broken due to the implementation of a custom
word selection in the filter list editor/viewer.
Regex-based static network filters are those most likely to
cause performance degradation, and as such the best guard
against undue performance degradation caused by regex-based
filters is the ability to extract valid and good tokens
from regex patterns.
This commit introduces a complete regex parser so that the
static network filtering engine can now safely extract
tokens regardless of the complexity of the regex pattern.
The regex parser is a library imported from:
https://github.com/foo123/RegexAnalyzer
The syntax highlighter adds an underline to regex-based
filters as a visual aid to filter authors so as to avoid
mistakenly creating regex-based filters. This commit
further colors the underline as a warning when a regex-based
filter is found to be untokenizable.
Filter list authors are invited to spot these untokenizable
regex-based filters in their lists to verify that no
mistake were made for those filters, causing them to be
untokenizabke. For example, what appears to be a mistake:
/^https?:\/\/.*\/sw.js?.[a-zA-Z0-9%]{50,}/
Though the mistake is minor, the regex-based filter above
is untokenizable as a result, and become tokenizable when
the `.` is properly escaped:
/^https?:\/\/.*\/sw\.js?.[a-zA-Z0-9%]{50,}/
Filter list authors can use this search expression in the
asset viewer to find instances of regex-based filters:
/^(@@)?\/[^\n]+\/(\$|$)/
This should improve usability of uBO's hard-mode
and "relax blocking mode" operations. This is the
new default behavior.
The previous behavior of forcing a reload of the
page can be re-enabled by simply setting the `3p`
bit of the advanced setting `blockingProfiles`
to 1.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1204
Not much can be done beside reporting to tabless network
requests to all tabs for which the context is a match.
A short term local cache is used to avoid having to iterate
through all existing tabs for each tabless network request
just to find and report to the matching ones -- users
reporting having a lot of opened tabs at once is not so
uncommon.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/857
The recognized resources are:
- abp-resource:blank-mp3
- abp-resource:blank-js
ABP's tokens are excluded from auto-complete so as to not
get in the way of uBO's filter list maintainers.
Reported internally by @uBlock-user.
Also, fixed broken caching of `cname` exception, which forced
uBO to repeatedly evaluate whether a `cname` exception exists
when a block `cname`-cloaked request is encountered.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1365
This commit adds the compiled magic version number to the
compiled data itself, and consequently this allows uBO
to no longer require that any given compiled list with a
mismatched format to be detected and discarded at launch
time.
Given this change, uBO no longer needs to rely on the
deletion of cached data at launch time to ensure it
won't use no longer valid compiled lists.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1388
Fixed the special `none` redirect resource no longer being
enforced.
Fixed the enforcement of `important` redirect rules over
exceptions and non-important ones.
Related issue:
- https://github.com/gorhill/uBlock/issues/1744
A new context menu entry, "Block element in frame...", will
be present when right-clicking on a frame element. When
this entry is clicked, uBO's element picker will be
launched from within the embedded frame and function the
same way as when launched from within the page.
This is particularly helpful for static network filters
used with filter options causing the same pattern to be
reused across multiple filter instances, i.e. `all` or
`~css`, etc.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1365
When compiled data format changes, do not rely on order
of operations at launch to assume deletion of storage
occurs before attempts to access it. It's unclear this
commit will fix the reported issue, as I could not
reproduce it except when outright commenting out the code
to prevent the storage deletion from occurring.
All matching `redirect-rule` directives will now be reported
in the logger, instead of just the effective one.
The highest-ranked redirect directive will be the one
effectively used for redirection. This way filter list
authors can see whether a lower priority redirect is
being overriden by a higher priority one.
The default priority has been changed to 10, so as to allow
more leeway to create lower ranked redirect directives.
Additonally, rendering of redirect directives with explicit
priority has been fixed in the logger, they will no longer
be rendered as unknown redirect tokens.
The header value is no longer implicitly a regex-based literal, but
a plain string against which the header name is compared. The value can
be set to a regex literal by bracing the header value with the usual
forward slashes, `/.../`.
Examples:
*$1p,strict3p,script,header=via:1.1 google
*$1p,strict3p,script,header=via:/1\.1\s+google/
The first form will cause a strict comparison with the value of the header
named `via` against the string `1.1 google`.
The second form will cause a regex-based test with the value of the header
named `via` against the regex `/1\.1\s+google/`.
The header value can be prepended with `~` to reverse the comparison:
*$1p,strict3p,script,header=via:~1.1 google
The header value is optional and may be ommitted to test only for the
presence of a specific header:
*$1p,strict3p,script,header=via
Related discussions:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1356#issuecomment-732411286
- https://github.com/AdguardTeam/CoreLibs/issues/1384
Changes:
Negation character is `~` (instead of `!`).
Drop special anchor character `|` -- leading `|`
will be supported until no such filter is present
in uBO's own filter lists. For example, instance
of `queryprune=|ad` will have to be replaced with
`queryprune=/^ad/` (or `queryprune=ad` if the name
of the parameter to remove is exactly `ad`).
Align semantic with that of AdGuard's `removeparam=`,
except that specifying multiple `|`-separated names
is not supported.
`match-case`
------------
Related issue:
- https://github.com/uBlockOrigin/uAssets/issues/8280#issuecomment-735245452
The new filter option `match-case` can be used only for
regex-based filters. Using `match-case` with any other
sort of filters will cause uBO to discard the filter.
`redirect=`
-----------
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1366
`redirect=` filters with unresolvable resource token at
runtime will be discarded.
Additionally, the implicit priority is now set to 1
(was 0). The idea is to allow custom `redirect=` filters
to be used strictly as fallback `redirect=` filters in case
another `redirect=` filter is not picked up.
For example, one might create a `redirect=click2load.html:0`
filter, to be taken if and only if the blocked resource is
not already being redirected by another "official" filter
in one of the enabled filter lists.
Related issue:
- https://github.com/gorhill/uBlock/issues/3590
Since the `redirect=` option was refactored into a modifier
filter, presence of a type (`script`, `xhr`, etc.) is no
longer a requirement.
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1356
Related commit:
- bde3164eb4
It is not possible to achieve perfect compatiblity at this
point, but reasonable compatibility should be achieved for
a majority of instances of `removeparam=`.
Notable differences:
--------------------
uBO always matches in a case insensitive manner, there is
no need to ask for case-insensitivity, and no need to use
uppercase characters in `queryprune=` values.
uBO does not escape special regex characters since the
`queryprune=` values are always assumed to be literal
regex expression (leaving out the documented special
characters). This means `removeparam=` with characters
which are special regex characters won't be properly
translated and are unlikely to work properly in uBO.
For example, the `queryprune` value of a filter such as
`$removeparam=__xts__[0]` internally become the literal
regex `/__xts__[0]/`, and consequently would not match
a query parameter such as `...?__xts__[0]=...`.
Notes:
------
Additionally, for performance reason, when uBO encounter
a pattern-less `queryprune=` (or `removeparam=`) filter,
it will try to extract a valid pattern from the
`queryprune=` value. For instance, the following filter:
$queryprune=utm_campaign
Will be translated internally into:
utm_campaign$queryprune=utm_campaign
The logger will reflect this internal translation.
New filter options
==================
Strict partyness: `1P`, `3P`
----------------------------
The current options 1p/3p are meant to "weakly" match partyness, i.e. a
network request is considered 1st-party to its context as long as both the
context and the request share the same base domain.
The new partyness options are meant to check for strict partyness, i.e. a
network request will be considered 1st-party if and only if both the context
and the request share the same hostname.
For examples:
- context: `www.example.org`
- request: `www.example.org`
- `1p`: yes, `1P`: yes
- `3p`: no, `3P`: no
- context: `www.example.org`
- request: `subdomain.example.org`
- `1p`: yes, `1P`: no
- `3p`: no, `3P`: yes
- context: `www.example.org`
- request: `www.example.com`
- `1p`: no, `1P`: no
- `3p`: yes, `3P`: yes
The strict partyness options will be visually emphasized in the editor so as
to prevent mistakenly using `1P` or `3P` where weak partyness is meant to be
used.
Filter on response headers: `header=`
-------------------------------------
Currently experimental and under evaluation. Disabled by default, enable by
toggling `filterOnHeaders` to `true` in advanced settings.
Ability to filter network requests according to whether a specific response
header is present and whether it matches or does not match a specific value.
For example:
*$1p,3P,script,header=via:1\.1\s+google
The above filter is meant to block network requests which fullfill all the
following conditions:
- is weakly 1st-party to the context
- is not strictly 1st-party to the context
- is of type `script`
- has a response HTTP header named `via`, which value matches the regular
expression `1\.1\s+google`.
The matches are always performed in a case-insensitive manner.
The header value is assumed to be a literal regular expression, except for
the following special characters:
- to anchor to start of string, use leading `|`, not `^`
- to anchor to end of string, use trailing `|`, not `$`
- to invert the test, use a leading `!`
To block a network request if it merely contains a specific HTTP header is
just a matter of specifying the header name without a header value:
*$1p,3P,script,header=via
Generic exception filters can be used to disable specific block `header=`
filters, i.e. `@@*$1p,3P,script,header` will override the block `header=`
filters given as example above.
Dynamic filtering's `allow` rules override block `headers=` filters.
Important: It is key that filter authors use as many narrowing filter options
as possible when using the `header=` option, and the `header=` option should
be used ONLY when other filter options are not sufficient.
More documentation justifying the purpose of `header=` option will be
provided eventually if ever it is decided to move it from experimental to
stable status.
To be decided: to restrict usage of this filter option to only uBO's own
filter lists or "My filters".
Changes
=======
Fine tuning `queryprune=`
-------------------------
The following changes have been implemented:
The special value `*` (i.e. `queryprune=*`) means "remove all query
parameters".
If the `queryprune=` value is made only of alphanumeric characters
(including `_`), the value will be internally converted to regex equivalent
`^value=`. This ensures a better future compatibility with AdGuard's
`removeparam=`.
If the `queryprune=` value starts with `!`, the test will be inverted. This
can be used to remove all query parameters EXCEPT those who match the
specified value.
Other
-----
The legacy code to test for spurious CSP reports has been removed. This
is no longer an issue ever since uBO redirects to local resources through
web accessible resources.
Notes
=====
The following new and recently added filter options are not compatible with
Chromium's manifest v3 changes:
- `queryprune=`
- `1P`
- `3P`
- `header=`
The auto-complete feature in the _"My filters"_ pane will
use hostname/domain from the set of opened tabs to assist
in entering values for `domain=` option. This also works
for the implict `domain=` option ṗrepending static extended
filters.
`about:srcdoc` frames are their own origin, trying to
use the origin of the parent context causes an
exception to be thrown when accessing location.href.
Notably, add clickable link to open the widget
in its own tab. Also, allows the URL to be text-
selected so that it becomes possible to use the
selection in a browser contextual menu's "Open
in a new tab" option.
Notably, make `queryprune` option available only
to filter list authors, until there are guards
against bad filters in some future and until the
option syntax and behavior is fully settled.
Instances of `queryprune` in filter lists will be
compiled, however instances of `queryprune` in
_"My filters"_ will be ignored unless users
indicated they are a filter list author.
The important bit is now considered an action bit
so that there is no more a need for the `important`
property in the parser. The modify bit is now
considered a realm bit.
When the modify bit is set, the action bits become
available to be used to further narrow the realm.
This could be useful in the future if we want to
spread the population of modifier filters across
different buckets.