mirror of https://github.com/gorhill/uBlock.git synced 2024-11-17 16:02:33 +01:00
265 Commits

Author SHA1 Message Date
Raymond Hill
a71b71e4c8
New cosmetic filter parser using CSSTree library
The new parser no longer uses the browser DOM to validate
cosmetic filters; validation is now done through a JS
library, CSSTree.

This means filter list authors will have to be more careful
to ensure that a cosmetic filter is really valid, as there is
no more guarantee that a cosmetic filter which works for a
given browser/version will still work properly on another
browser, or different version of the same browser.

This change has become necessary for many reasons, one of
them being the flakiness of the previous parser as exposed
by many issues lately:

- https://github.com/uBlockOrigin/uBlock-issues/issues/2262
- https://github.com/uBlockOrigin/uBlock-issues/issues/2228

The new parser introduces breaking changes; there was no way
to do otherwise. Some current procedural cosmetic filters will
be shown as invalid with this change. This occurs because the
CSSTree library gets confused by some syntax which the
previous, more permissive parser allowed.

Mainly the issue is with the arguments passed to some procedural
cosmetic filters, and these issues can be solved as follows:

Use quotes around the argument. You can use either single or
double-quotes, whichever is most convenient. If your argument
contains a single quote, use double-quotes, and vice versa.

Additionally, try to escape a quote inside an argument using
a backslash. This may work, but if not, use quotes around the
argument.

When the parser encounters quotes around an argument, it will
discard them before trying to process the argument, same with
escaped quotes inside the argument. Examples:

Breakage:

    ...##^script:has-text(toscr')

Fix:

    ...##^script:has-text(toscr\')

Breakage:

    ...##:xpath(//*[contains(text(),"VPN")]):upward(2)

Fix:

    ...##:xpath('//*[contains(text(),"VPN")]'):upward(2)
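
The quote handling described above can be sketched as follows. This is
only an illustration, not the parser's actual code: the function name
`unquoteArgument` is hypothetical, and the sketch assumes that at most
one pair of matching outer quotes is discarded.

```javascript
// Hypothetical helper, not uBO's actual implementation.
// Discards one pair of matching outer quotes, then unescapes
// backslash-escaped quotes left inside the argument.
function unquoteArgument(raw) {
    let arg = raw;
    const first = arg.charAt(0);
    const isQuoted = arg.length >= 2 &&
        (first === '"' || first === "'") &&
        arg.charAt(arg.length - 1) === first;
    if (isQuoted) {
        arg = arg.slice(1, -1);
    }
    // Turn \' and \" into plain quotes.
    return arg.replace(/\\(['"])/g, '$1');
}

// The two fixed filters above would yield these arguments:
console.log(unquoteArgument("toscr\\'"));
console.log(unquoteArgument(`'//*[contains(text(),"VPN")]'`));
```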

There are not many filters which break in the default set of
filter lists, so this should be workable for default lists.

Unfortunately those fixes will break the filter for previous
versions of uBO since these do not deal with quoted arguments.
In such cases, it may be necessary to keep the previous filter,
which will be discarded as broken on newer versions of uBO.

This was a necessary change as the old parser was becoming
more and more flaky after being constantly patched for new
cases arising. The new parser should be far more robust and
stay robust through expanding procedural cosmetic filter
syntax.

Additionally, in the MV3 version, filters are pre-compiled
using a Nodejs script, i.e. outside the browser, so validating
cosmetic filters using a live DOM no longer made sense.

This new parser will have to be tested thoroughly before stable
release.
2022-09-23 16:03:13 -04:00
Raymond Hill
e31637af78
[mv3] Add ability to enable/disable filter lists 2022-09-13 17:44:24 -04:00
Raymond Hill
a559f5f271
Add experimental mv3 version
This creates a separate Chromium extension, named
"uBO Minus (MV3)".

This experimental mv3 version supports only the blocking of
network requests through the declarativeNetRequest API, so as
to abide by the stated MV3 philosophy of not requiring broad
"read/modify data" permission. Accordingly, the extension
should not trigger the warning at installation time:

    Read and change all your data on all websites

The consequences of being permission-less are the following:

- No cosmetic filtering (##)
- No scriptlet injection (##+js)
- No redirect= filters
- No csp= filters
- No removeparam= filters

At this point there is no popup panel or options pages.

The default filterset corresponds to the default filterset of
uBO proper:

Listset for 'default':
  https://ublockorigin.github.io/uAssets/filters/badware.txt
  https://ublockorigin.github.io/uAssets/filters/filters.txt
  https://ublockorigin.github.io/uAssets/filters/filters-2020.txt
  https://ublockorigin.github.io/uAssets/filters/filters-2021.txt
  https://ublockorigin.github.io/uAssets/filters/filters-2022.txt
  https://ublockorigin.github.io/uAssets/filters/privacy.txt
  https://ublockorigin.github.io/uAssets/filters/quick-fixes.txt
  https://ublockorigin.github.io/uAssets/filters/resource-abuse.txt
  https://ublockorigin.github.io/uAssets/filters/unbreak.txt
  https://easylist.to/easylist/easylist.txt
  https://easylist.to/easylist/easyprivacy.txt
  https://malware-filter.gitlab.io/malware-filter/urlhaus-filter-online.txt
  https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&showintro=1&mimetype=plaintext

The result of the conversion of the filters in all these
filter lists is as follows:

Ruleset size for 'default': 22245
  Good: 21408
  Maybe good (regexes): 127
  redirect-rule= (discarded): 458
  csp= (discarded): 85
  removeparams= (discarded): 22
  Unsupported: 145

The number of DNR rules is far lower than the number of
network filters reported in uBO because the lists-to-rulesets
converter does its best to coalesce filters into a minimal set
of rules. Notably, the DNR's requestDomains condition property
makes it possible to create a single DNR rule out of all pure
hostname-based filters.
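
A minimal sketch of that coalescing idea, not the actual converter.
It assumes a pure hostname-based filter has the form `||example.com^`;
the function name `coalesceHostnameFilters` and the sample hostnames
are made up for illustration.

```javascript
// Sketch only, not the actual lists-to-rulesets converter.
// Assumption: a pure hostname-based filter looks like `||example.com^`.
const reHostnameFilter = /^\|\|([a-z0-9-]+(?:\.[a-z0-9-]+)+)\^$/;

function coalesceHostnameFilters(filters, ruleId) {
    const domains = new Set();
    const leftovers = [];
    for (const filter of filters) {
        const match = reHostnameFilter.exec(filter);
        if (match !== null) {
            domains.add(match[1]);   // duplicates collapse in the Set
        } else {
            leftovers.push(filter);  // needs its own rule(s)
        }
    }
    // DNR does not allow an empty requestDomains list, so no rule is
    // emitted when nothing matched.
    const rule = domains.size === 0 ? null : {
        id: ruleId,
        action: { type: 'block' },
        condition: { requestDomains: Array.from(domains) },
    };
    return { rule, leftovers };
}

const { rule, leftovers } = coalesceHostnameFilters([
    '||ads.example.com^',
    '||tracker.example.net^',
    '||ads.example.com^',
    '/banner/*',
], 1);
console.log(JSON.stringify(rule), leftovers);
```

Three hostname filters (one of them a duplicate) end up in a single
rule, while the path-based filter is left for separate conversion.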

Regex-based rules are dynamically added at launch time since
they must be validated as valid DNR regexes through
isRegexSupported() API call.

At this point I consider being permission-less the limiting
factor: if broad "read/modify data" permission is to be used,
then there is not much point in an MV3 version over MV2. Just
use the MV2 version if you want to benefit from all the
features which can't be implemented without broad
"read/modify data" permission.

To locally build the MV3 extension:

    make mv3

Then load the resulting extension directory in the browser
using the "Load unpacked" button.

From now on there will be a uBlock0.mv3.zip package available
in each release.
2022-09-06 13:47:52 -04:00
Raymond Hill
18bc4dd8b8
Lower allowed minimum Expires directive to "12 hours" (from "1 day") 2022-03-24 13:23:24 -04:00
Raymond Hill
34cca8349b
Do not always convert removed stock list into imported list
If the removed stock list is labelled a "bad list", do not
convert it into an imported list.

This will allow seamlessly merging the resource-abuse stock
list with the privacy stock list when 1.42.0 is widespread.
2022-03-18 13:27:07 -04:00
myersg86
6573a59a59
Fix typos in README, docs, and JS comments 2022-03-13 08:56:26 -04:00
Raymond Hill
2933016d4b
Rework behavior of "Suspend network activity until ..."
The setting will default to the natural capability of the browser:

- Checked for Firefox
- Unchecked for Chromium-based browsers

For Chromium-based browsers, if checked, network requests will
be redirected to an empty resource instead of blocking the
connection.

Related feedback:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1973
- https://www.reddit.com/r/uBlockOrigin/comments/squo8n/latest_update_blocks_network_connections_at/
2022-02-13 09:24:57 -05:00
Raymond Hill
9b22961291
Properly report user-filters in troubleshooting information
User filters are enabled by default, they should be reported
under the `listset` section, along with how many filters are
being enforced.
2022-01-11 07:55:37 -05:00
Raymond Hill
925c8d5d0c
Add setting to control suspension on network activity at launch
Related discussion:
- a0a9497b4a (commitcomment-62560291)

The new setting, when disabled (enabled by default), allows a user
to prevent uBO from waiting for all filter lists to be loaded
before allowing network activity at launch. The setting is enabled
by default, meaning uBO waits for all filter lists to be loaded in
memory before unsuspending network activity. Some users may find
this behavior undesirable, hence the new setting.

This gives the option to potentially speed up page load at launch,
at the cost of potentially not properly filtering network requests
as per filter lists/rules.

For platforms not supporting the suspension of network activity,
the setting will merely prevent whatever mechanism exists on the
platform to mitigate improper filtering of network requests at
launch. For example, in Chromium-based browsers, unchecking the
new setting will prevent the browser from re-loading tabs for
which there was network activity while in "suspended" state at
launch.
2021-12-30 09:24:38 -05:00
Raymond Hill
72bb89495b
Change compiled list format to a saner block id management
Just use self-described readable section identifiers instead
of difficult-to-manage arbitrary integers.
2021-12-07 11:15:14 -05:00
Raymond Hill
4efa6be96b
Fix sticky imported list after removal
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1803
2021-11-08 12:49:03 -05:00
Raymond Hill
5daa6a7ff3
Get current language using extensions API (instead of navigator.language)
Related feedback:
- https://github.com/gorhill/uBlock/pull/3860
2021-11-06 12:49:27 -04:00
Manish Jethani
9761b02c79
Convert publicsuffixlist.js into an ES module (#3846) 2021-08-23 09:42:27 -04:00
Manish Jethani
9ddbb293c0
Convert punycode.js into an ES module (#3845) 2021-08-22 12:03:59 -04:00
Raymond Hill
a33f70cf20
Provide compiler/selfie versions for snfe
So as to allow nodejs usage to better deal with
out of date serialization/compilation.

Additionally, use FilterImportant() only when a
"block-important" filter is stored in the "block" realm.
2021-08-16 12:15:30 -04:00
Raymond Hill
22768ddcd0
Remove undue dependencies on vAPI
Whether WebAssembly can be enabled or not should be
decided at a higher level.
2021-08-08 11:41:05 -04:00
Raymond Hill
4818405cf6
Remove need to pass parser at every compile() call
The compiler instance is already initialized with a
reference to the parser, no need to keep passing the
reference at each call to compile().
2021-08-05 13:30:20 -04:00
Raymond Hill
85c68116bd
Group all compiling-related code into FilterCompiler() class
In the static network filtering engine (snfe), the
compiling-related code was spread across two classes.
This commit makes it so that all the compiling-related
code is in the FilterCompiler class, whose clear purpose is
to compile raw filters into a form which can be persisted
and later fed to the snfe with no parsing overhead.

To compile raw static network filters, the new approach is:

    const compiler = snfe.createCompiler(parser);

Then for each single raw filter to compile:

    compiler.compile(parser, writer);

The caller is responsible for keeping a reference to the
compiler instance for as long as it is needed. This removes
the need for the clunky code used to keep an instance of
compiler alive in the snfe.

Additionally, snfe.tokenHistograms() has been moved to
benchmarks.js, as it has no dependency on the snfe, it's
just a utility function.
2021-08-04 15:14:48 -04:00
Raymond Hill
89c5653bc6
Export the rule-based filtering engines to the nodejs package
The code exported to nodejs package was revised to use modern
JavaScript syntax. A few issues were fixed at the same time.

The exported classes are:
- DynamicHostRuleFiltering
- DynamicURLRuleFiltering
- DynamicSwitchRuleFiltering

These relate to the content of the "My rules" pane in the
uBlock Origin extension.
2021-08-03 12:19:25 -04:00
Raymond Hill
f8daea085b
Remove assets dependency from redirect engine
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1664

This change allows adding the redirect engine to the
nodejs package. The purpose of the redirect engine is to
resolve a redirect token into a path to a local resource,
to be used by the caller as wished.
2021-08-02 09:23:48 -04:00
Raymond Hill
cb72211795
Move orphanizeString() into text-utils module
Another small step toward the goal of reducing dependency
on `µb`.

Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1664

text-iterators module has been renamed text-utils to better
reflect its content.
2021-07-31 08:38:33 -04:00
Raymond Hill
98fc66bb1b
Add support for enabling WASM code paths in NodeJS package
See `test.js` for reference on how to enable WASM code
paths (which are disabled by default).
2021-07-29 16:54:51 -04:00
Raymond Hill
62b6826dd5
Further modularize uBO's codebase
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1664

Modularization is a necessary step toward possibly publishing
a more complete nodejs package to allow using uBO's filtering
capabilities outside of the uBO extension.

Additionally, remove undue usage of console output as per
feedback:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1664#issuecomment-888451032
2021-07-28 19:48:38 -04:00
Raymond Hill
22022f636f
Modularize codebase with export/import
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1664

The changes are enough to fulfill the related issue.

A new platform has been added in order to allow for building
a NodeJS package. From the root of the project:

    ./tools/make-nodejs

This will create a new uBlock0.nodejs directory in the
./dist/build directory, which is a valid NodeJS package.

From the root of the package, you can try:

    node test

This will instantiate a static network filtering engine,
populated by easylist and easyprivacy, which can be used
to match network requests by filling the appropriate
filtering context object.

The test.js file contains code which is a typical example
of usage of the package.

Limitations: the NodeJS package can't execute the WASM
versions of the code since the WASM module requires the
use of fetch(), which is not available in NodeJS.

This is a first pass at modularizing the codebase, and
while at it a number of opportunistic small rewrites
have also been made.

This commit requires the minimum supported version for
Chromium and Firefox be raised to 61 and 60 respectively.
2021-07-27 17:26:04 -04:00
Raymond Hill
e85c6f2d3e
Merge background changes to user filters in "My filters" pane
Related issue:
- https://github.com/gorhill/uBlock/issues/3704
2021-07-17 12:03:56 -04:00
Raymond Hill
1c3b45f75d
Expose ability to toggle on/off cname-uncloaking to all users
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1513

Prior to this commit, the ability to enable/disable the
uncloaking of canonical names was only available to advanced
users. This commit makes it so that the setting can be
toggled from the _Settings_ pane.

The setting is enabled by default. The documentation should
be clear that the setting should not be disabled unless it
actually solves serious network issues, for example:

https://bugzilla.mozilla.org/show_bug.cgi?id=1694404

Also, as a result, the advanced setting `cnameUncloak` is no
longer available from within the advanced settings editor.
2021-03-02 13:00:56 -05:00
Raymond Hill
3bb73065e3
Fix broken forward compatibility re. imported lists
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1480

Forward compatibility was broken due to `externalLists`
being converted into an Array from a string, i.e.
downgrading to uBO 1.32.4 was completely breaking uBO.

This commit restores `externalLists` as a string which
is what older versions of uBO expect.

A new property `importedLists` has been created to
hold the imported lists as an array, while
`externalLists` will be kept around for a while until
it is completely removed at some point in the future.
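
The dual-property approach can be sketched as below. This is an
illustration only: the helper names are hypothetical, and the sketch
assumes the legacy `externalLists` string holds one URL per line,
which the commit message does not state.

```javascript
// Sketch under assumptions: `externalLists` is taken to be a
// newline-separated string of URLs (not confirmed by the commit).

// Write both properties so that older versions still find the string
// they expect, while newer versions use the array.
function toStoredSettings(importedLists) {
    return {
        importedLists: importedLists.slice(),
        externalLists: importedLists.join('\n'),
    };
}

// Prefer the array; fall back to the legacy string when the settings
// were written by an older version.
function fromStoredSettings(stored) {
    if (Array.isArray(stored.importedLists)) {
        return stored.importedLists.slice();
    }
    if (typeof stored.externalLists === 'string') {
        return stored.externalLists
            .split('\n')
            .map(s => s.trim())
            .filter(s => s !== '');
    }
    return [];
}

const stored = toStoredSettings([
    'https://example.com/a.txt',
    'https://example.com/b.txt',
]);
console.log(typeof stored.externalLists, fromStoredSettings(stored));
```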
2021-01-31 10:30:12 -05:00
Raymond Hill
6eb1246508
Add userSettings entry to managed storage
The managed `userSettings` entry is an array of entries,
where each entry is a name/value pair encoded into an array
of strings.

The first item in the entry array is the name of a setting,
and the second item is the stringified value for the
setting.

This is a more convenient way for administrators to set
specific user settings. The settings set through
`userSettings` policy will always be set at uBO launch
time.
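
A minimal sketch of how such name/value pairs could be applied at
launch. The setting names and the conversion rules below are
assumptions made for illustration; they are not taken from uBO's code.

```javascript
// Hypothetical default settings, for illustration only.
const exampleDefaults = {
    exampleToggle: true,
    exampleCount: 5,
    exampleLabel: 'none',
};

// Apply managed [name, stringifiedValue] pairs on top of the current
// settings. Assumption: the stringified value is converted according
// to the type of the existing value; unknown names are ignored.
function applyManagedUserSettings(settings, entries) {
    const out = Object.assign({}, settings);
    for (const entry of entries) {
        if (Array.isArray(entry) === false || entry.length !== 2) { continue; }
        const [name, value] = entry;
        if (Object.prototype.hasOwnProperty.call(out, name) === false) { continue; }
        switch (typeof out[name]) {
        case 'boolean':
            out[name] = value === 'true';
            break;
        case 'number': {
            const n = Number(value);
            if (Number.isNaN(n) === false) { out[name] = n; }
            break;
        }
        default:
            out[name] = value;
            break;
        }
    }
    return out;
}

// Shape of the managed `userSettings` entry as described above.
const managedUserSettings = [
    ['exampleToggle', 'false'],
    ['exampleCount', '12'],
];
console.log(applyManagedUserSettings(exampleDefaults, managedUserSettings));
```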
2021-01-16 10:35:56 -05:00
Raymond Hill
649b3480e0
Add "toOverwrite.filters" entry as managed storage property
The new entry is an array of strings, each representing a
distinct line, and all entries are used to populate the
"My filters" pane.

This offers a more straightforward way for administrators
to specify a list of custom filters to use for all
installations.
2021-01-10 12:31:31 -05:00
Raymond Hill
0e3071dd50
Add filterLists property to managed storage
The entry `toOverwrite.filterLists` is an array of
strings, where each string is a token identifying a
stock filter list, or a URL for an external filter
list.

This new entry is to make it easier for an
administrator to centrally configure uBO with a
custom set of filter lists.
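
As a rough sketch, the entries of such an array could be told apart
as follows. The function name is hypothetical, the sample values are
examples only, and treating anything starting with `http://` or
`https://` as a URL is an assumption of this sketch.

```javascript
// Sketch only: split `toOverwrite.filterLists` entries into stock-list
// tokens and external-list URLs.
function classifyFilterLists(entries) {
    const tokens = [];
    const urls = [];
    for (const entry of entries) {
        if (/^https?:\/\//.test(entry)) {
            urls.push(entry);
        } else {
            tokens.push(entry);
        }
    }
    return { tokens, urls };
}

// Example policy fragment; the values are illustrative.
const policy = {
    toOverwrite: {
        filterLists: [
            'easylist',
            'https://example.com/custom-list.txt',
        ],
    },
};
console.log(classifyFilterLists(policy.toOverwrite.filterLists));
```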
2021-01-08 09:18:26 -05:00
Raymond Hill
e4e7cbc78f
Use better identifying name for overview panel 2021-01-07 08:19:47 -05:00
Raymond Hill
cc9c45f1e4
Adding to and further reviewing admin-managed settings 2021-01-06 11:39:24 -05:00
Raymond Hill
c1130ec843
Add support for admin-managed hidden settings
Related discussion:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1437#issuecomment-754127066
2021-01-05 12:16:50 -05:00
Raymond Hill
b28acfccbc
Add "extraTrustedSiteDirectives" as new admin policy
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1433

The new "extraTrustedSiteDirectives" policy is an array
of strings, each of which is parsed as a trusted-site
directive to append to a user's own set of trusted-site
directives at launch time.

The added trusted-site directives will be considered as
part of the default set of directives by uBO.
2021-01-04 07:54:24 -05:00
Raymond Hill
5d7b2918ef
Harden processing of changes in compiled list format
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1365

This commit adds the compiled magic version number to the
compiled data itself. Consequently, uBO no longer requires
that compiled lists with a mismatched format be detected and
discarded at launch time.

Given this change, uBO no longer needs to rely on the
deletion of cached data at launch time to ensure it
won't use no longer valid compiled lists.
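
The idea can be sketched as below. The magic value, the function
names, and the choice of storing the version on the first line of the
compiled data are all assumptions made for illustration.

```javascript
// Hypothetical magic version number for the compiled format.
const COMPILED_MAGIC = 42;

// Embed the magic version in the compiled data itself.
function serializeCompiled(content) {
    return `${COMPILED_MAGIC}\n${content}`;
}

// Return the compiled content, or null when the data was compiled with
// another format version, in which case the caller should recompile
// from the raw list rather than rely on cache deletion at launch.
function deserializeCompiled(stored) {
    const pos = stored.indexOf('\n');
    if (pos === -1) { return null; }
    const magic = parseInt(stored.slice(0, pos), 10);
    if (magic !== COMPILED_MAGIC) { return null; }
    return stored.slice(pos + 1);
}

const stored = serializeCompiled('compiled filter data');
console.log(deserializeCompiled(stored));
console.log(deserializeCompiled('41\nstale compiled data'));
```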
2020-12-08 10:00:47 -05:00
Raymond Hill
e8e4a1ac74
Wait for removal of storage entries to be completed
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1365

When compiled data format changes, do not rely on order
of operations at launch to assume deletion of storage
occurs before attempts to access it. It's unclear this
commit will fix the reported issue, as I could not
reproduce it except when outright commenting out the code
to prevent the storage deletion from occurring.
2020-12-04 06:17:18 -05:00
Raymond Hill
ee2fd45f00
Ensure we do not extract truncated URL for Homepage directive
Related feedback:
- b12e0e05ea (commitcomment-44309540)
2020-11-18 12:14:23 -05:00
Raymond Hill
b12e0e05ea
Extract Homepage URL from a list when present
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1346

Additionally, fixed a case of filter list being compiled
twice at subscription time.
2020-11-18 10:02:22 -05:00
Raymond Hill
2cfeaddbed
Fine tune various static filtering code
Notably, make `queryprune` option available only
to filter list authors, until there are guards
against bad filters in some future and until the
option syntax and behavior is fully settled.

Instances of `queryprune` in filter lists will be
compiled, however instances of `queryprune` in
_"My filters"_ will be ignored unless users
indicated they are a filter list author.
2020-11-13 09:23:25 -05:00
Raymond Hill
0196993828
Use buffer-like approach for filterUnits array
filterUnits is now treated as a buffer which is
pre-allocated and which will grow in chunks so as
to minimize memory allocations. Entries are never
released, just null-ed.
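
A minimal sketch of such a buffer-like array. The class name, the
chunk size, and the method names are hypothetical; only the overall
behavior (pre-allocation, growth in chunks, null-ing instead of
releasing) follows the description above.

```javascript
// Sketch only, not the snfe's actual code.
class UnitBuffer {
    constructor(chunkSize = 1024) {
        this.chunkSize = chunkSize;
        // Pre-allocate the first chunk.
        this.units = new Array(chunkSize).fill(null);
        this.writePtr = 0;
    }
    // Store a unit and return its index in the buffer.
    add(unit) {
        if (this.writePtr === this.units.length) {
            this.grow();
        }
        const index = this.writePtr++;
        this.units[index] = unit;
        return index;
    }
    // Grow by one whole chunk rather than one slot at a time.
    grow() {
        const newSize = this.units.length + this.chunkSize;
        for (let i = this.units.length; i < newSize; i++) {
            this.units[i] = null;
        }
    }
    // Entries are never released, just null-ed.
    release(index) {
        this.units[index] = null;
    }
    get(index) {
        return this.units[index];
    }
    get capacity() {
        return this.units.length;
    }
}

const buffer = new UnitBuffer(4);
for (let i = 0; i < 5; i++) { buffer.add({ id: i }); }
buffer.release(2);
console.log(buffer.capacity, buffer.get(2));
```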

Additionally, move urlTokenizer into the static
network filtering engine, since it's not used
anywhere else.
2020-11-09 06:54:51 -05:00
Raymond Hill
4f00c08f6b
Fix detection of already present comment
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1281

Related feedback:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1281#issuecomment-705081202
2020-10-07 14:23:57 -04:00
Raymond Hill
46ec969411
Add ability to use full URL in auto-generated comment
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1281

New supported placeholder: `{{url}}`, which will be
replaced by the full URL of the page for which a filter
is created.
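
A small sketch of the substitution, assuming a plain textual
replacement. The function name and the sample template are
hypothetical; only the `{{url}}` placeholder comes from this commit.

```javascript
// Sketch only: replace every `{{url}}` placeholder with the page URL.
// A replacer function is used so that `$` sequences in the URL are
// not interpreted as replacement patterns.
function renderAutoComment(template, pageURL) {
    return template.replace(/\{\{url\}\}/g, () => pageURL);
}

console.log(renderAutoComment('Added for {{url}}', 'https://example.com/page?a=1'));
```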
2020-10-07 11:52:38 -04:00
Raymond Hill
f4aebc9390
Backup/restore only modified advanced settings
This reduces the size of the backup file and also
ensures that default values can be changed.
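
The approach can be sketched as below; the setting names and function
names are hypothetical. Because unmodified settings are left out of
the backup, a restore picks up whatever the defaults are at that time.

```javascript
// Sketch only: keep in the backup just the settings which differ from
// their default value.
function modifiedSettings(current, defaults) {
    const out = {};
    for (const name of Object.keys(defaults)) {
        if (current[name] !== defaults[name]) {
            out[name] = current[name];
        }
    }
    return out;
}

// Restore on top of whatever the defaults are at restore time.
function restoreSettings(backup, defaults) {
    return Object.assign({}, defaults, backup);
}

// Hypothetical advanced settings.
const oldDefaults = { exampleTimeout: 30, exampleDebug: false };
const current = { exampleTimeout: 30, exampleDebug: true };
const backup = modifiedSettings(current, oldDefaults);

// A later version changes a default; the restored configuration
// follows it since the user never modified that setting.
const newDefaults = { exampleTimeout: 45, exampleDebug: false };
console.log(backup, restoreSettings(backup, newDefaults));
```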
2020-10-03 12:34:21 -04:00
Raymond Hill
4c7635514a
Fine tuning changes to click-to-subscribe code
Related feedback:
- https://github.com/uBlockOrigin/uBlock-issues/issues/763#issuecomment-691682195

Additionally, enable an existing subscription when
subscribing again to it.
2020-09-13 11:44:42 -04:00
Raymond Hill
d4182add6e
Add ability to outright remove/ignore "really bad lists"
In addition to what is deemed really bad lists by consensus,
some lists will also be labelled "really bad list"
temporarily so as to force-remove them from the set of
filter lists.

This will be the case for filter lists which are not
necessarily "bad lists" but which were once part of
uBO's stock filter lists and have been removed since
then for various reasons.

This will ensure that the majority of users who do not
modify uBO's default listset will still have a
configuration which matches the official default listset.
2020-09-09 09:57:29 -04:00
Raymond Hill
0e9d4714e9
Minor: better variable name 2020-08-24 12:40:36 -04:00
Raymond Hill
4150c17f4a
Add concept of "really bad list" to badlists infrastructure
This commit adds the concept of "really bad list" to the
badlists infrastructure. Really bad lists won't be
fetched from a remote server, while plain bad lists
will be fetched but won't be compiled.

A really bad list is denoted by the `nofetch` token
following the URL.

Really bad lists can cause more serious issues such
as causing undue launch delays because the remote
server where a really bad list is hosted fails to
respond properly and times out.

An example of a really bad list is hpHosts, whose
original server no longer exists.
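
A minimal sketch of how the `nofetch` token could be read. It assumes
one URL per line, whitespace before the token, and `#` for comment
lines; the function names and sample URLs are hypothetical.

```javascript
// Sketch only: map each bad list URL to whether it is a really bad
// list, i.e. whether the URL is followed by the `nofetch` token.
function parseBadLists(text) {
    const badLists = new Map();
    for (const rawLine of text.split(/\r?\n/)) {
        const line = rawLine.trim();
        if (line === '' || line.startsWith('#')) { continue; }
        const fields = line.split(/\s+/);
        badLists.set(fields[0], fields[1] === 'nofetch');
    }
    return badLists;
}

// Really bad lists are not fetched at all.
function shouldFetch(badLists, url) {
    return badLists.get(url) !== true;
}

// Any bad list, plain or really bad, is not compiled.
function shouldCompile(badLists, url) {
    return badLists.has(url) === false;
}

const badLists = parseBadLists([
    '# sample content',
    'https://example.com/plain-bad.txt',
    'https://example.org/really-bad.txt nofetch',
].join('\n'));
console.log(shouldFetch(badLists, 'https://example.org/really-bad.txt'));
```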
2020-08-22 08:43:16 -04:00
Raymond Hill
23f08f0274
Add support for blocklist of filter lists
Many filter lists are known to cause serious filtering
issues in uBO and are not meant to be used in uBO.

Unfortunately, unwitting users keep importing these
filter lists and as a result this ends up causing
filtering issues for which the resolution is always
to remove the incompatible filter list.

Examples of incompatible filter lists:
- Reek's Anti-Adblock Killer
- AdBlock Warning Removal List
- ABP anti-circumvention filter list

uBO will use the following resource to know
which filter lists are incompatible:
- https://github.com/uBlockOrigin/uAssets/blob/master/filters/badlists.txt

Incompatible filter lists can still be imported into
uBO, useful for asset-viewing purpose, but their content
will be discarded at compile time.
2020-08-21 11:57:20 -04:00
Raymond Hill
24ef0cb753
Fix typo in comment 2020-08-13 09:40:43 -04:00
Raymond Hill
00b790ce72
Add support for more !#if pre-parser directive tokens
Related issue:
- https://github.com/uBlockOrigin/uBlock-issues/issues/1205
2020-08-13 09:32:34 -04:00