uBlock

mirror of https://github.com/gorhill/uBlock.git synced 2024-11-17 16:02:33 +01:00

Author	SHA1	Message	Date
Raymond Hill	ca34bc4f3e	Fix "Revert" button not resetting after saving changes Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/367	2019-05-18 17:48:19 -04:00
Raymond Hill	3cf71835c4	Set default delay for creating selfie to 3 minutes Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/bq49zi/	2019-05-18 14:43:44 -04:00
Raymond Hill	f7bbc80717	Improve "Whitelist pane"; remove now useless built-in switch rule Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/214 Built-in whitelist directives are now rendered differently than user-defined whitelist directives. Also, removing a built-in whitelist directive will only cause that directive to be commented out, so that users do not have to remember built-in directives should they want to bring them back. Related issue: https://github.com/uBlockOrigin/uBlock-issues/issues/494 The built-in per-site switch rule `no-scripting: behind-the-scene false` has been removed, it should not ever be needed since there will always be a valid root context for main- and sub-frames.	2019-05-18 14:20:05 -04:00
Raymond Hill	de41c1bf53	Fix parsing of recursive `!#if`-`!#endif directives Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/270	2019-05-18 10:31:04 -04:00
Raymond Hill	62387fb87a	Prevent picker's preview mode from modifying style attribute Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/551 The issue fixes previewing the hiding/unhiding of targeted elements in the element picker. However it does not address the case of previewing `:style(...)` operators -- this would require a much more complex fix, which I am not sure is worth the amount of work and increased code complexity.	2019-05-17 19:26:48 -04:00
Raymond Hill	9bfbbfec84	Adjust visual of cosmetic exception filters in logger The invariant prefixes `##` and `#@#` are now hidden, allowing to reveal more of the filter itself when the logger view is narrow.	2019-05-17 11:45:07 -04:00
Raymond Hill	0ca44b847c	Avoid duplicated strings in filterOrigin w/ new approach The new approach is simpler and should benefit selfie serialization/unserialization. This renders stringDeduplicater obsolete -- it has been removed.	2019-05-17 10:13:58 -04:00
Raymond Hill	1386429382	Fix regression in applying procedural cosmetic filters Related commit: - `3573b6b32c`	2019-05-16 17:22:20 -04:00
Raymond Hill	3573b6b32c	Add ability to report exception cosmetic filters in the logger Related issue: - https://github.com/gorhill/uBlock/issues/127 Additionally, the extended exception filters in the logger will be rendered with a line-through to more easily distinguish them from non-exception ones. Also, opportunistically converted revisited code to ES6 syntax.	2019-05-16 13:44:49 -04:00
Raymond Hill	fc109c8b7c	Revisit code to benefit from ES6 syntax	2019-05-15 14:49:12 -04:00
Raymond Hill	1fe3b54acc	Fix cosmetic exception filters not applying Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/575 Regression from: - `93f80eedfa` Specific cosmetic exception filters need to be returned so that they can be applied to generic cosmetic filters.	2019-05-15 14:43:59 -04:00
Raymond Hill	39e2a03edb	Fix comment	2019-05-14 09:31:51 -04:00
Raymond Hill	a14dcecf8f	Do not assume wildcards fall on label boundaries Related commit: - `fe0b7a0e0f` Related feedback: - https://github.com/uBlockOrigin/uBlock-issues/issues/572#issuecomment-492223980	2019-05-14 09:29:45 -04:00
Raymond Hill	93f80eedfa	Refactor runtime storage of specific cosmetic filters This was a TODO item: - `07cbae66a4/src/js/cosmetic-filtering.js (L375)` µBlock.staticExtFilteringEngine.HostnameBasedDB has been re-factored to accomodate the storing of specific cosmetic filters. As a result of this refactoring: - Memory usage has been further decreased - Performance of selector retrieval marginally improved - New internal representation opens the door to use a specialized version of HNTrie, which should further improve performance/memory usage	2019-05-14 08:52:34 -04:00
Raymond Hill	8a312b9bbb	Support cases with more than one wildcard Related commit: - `fe0b7a0e0f` Related feedback: - https://github.com/uBlockOrigin/uBlock-issues/issues/572#issuecomment-492147440	2019-05-14 06:52:13 -04:00
Raymond Hill	fe0b7a0e0f	Relax destination hostname requirements in redirect filters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/572 Wildcards are now allowed in the hostname part of redirect filters. There will be an attempt to find the longest right-hand portion of the hostname with no wildcard. If no non-empty hostname can be extracted, `*` will be used.	2019-05-13 20:19:10 -04:00
Raymond Hill	1e40f50eb3	Add benchmark method to cosmetic filtering engine To measure retrieval of site-specific selectors. From uBO's own dev console: µBlock.cosmeticFilteringEngine.benchmark();	2019-05-12 11:41:47 -04:00
Raymond Hill	57890d60ff	Fix incorrect use of `this` in static method Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/568 Regression from: - `19ece97b0c`	2019-05-11 17:40:55 -04:00
Raymond Hill	8a7e704080	Add support for `nth-ancestor` operator in HTML filtering Also opportunitisically converted some code to ES6's `class`.	2019-05-11 13:21:23 -04:00
Raymond Hill	915c1f1f3c	Report resources blocked by `csp=` option in logger Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/552	2019-05-11 10:40:34 -04:00
Raymond Hill	12bdd01595	Ensure "Ignore generic cosmetic filters" sticks on Fennec Related issue: - https://www.reddit.com/r/uBlockOrigin/comments/blkudl/ The setting was not sticking at first-install time.	2019-05-11 09:04:13 -04:00
Raymond Hill	e59bdb1485	Defuse `fixed` position on `body` element in element zapper The `fixed` style property on the `body` element will be defused if an overlay element is removed using the element zapper. Related: - https://www.reddit.com/r/uBlockOrigin/comments/bktxtb/scrolling_doesnt_work/emlscyz	2019-05-06 13:32:55 -04:00
Raymond Hill	3692bb4ada	Add HNTrieRef.dump() and STrieRef.dump() as dev tool To be used at the console, as an investigation tool for development purpose. Using it to verify the content of the largest FilterHostnameDict instance, I spotted an all-uppercase hostname in the HNTrieRef instance: µBlock.staticNetFilteringEngine.categories.get(0).get(0x10000000).dict.dump(); Thus the changes to static-net-filtering.js are to fix the erroneous insertion of filters with uppercase characters. The single instance found was a hostname entry in Malware Domain List (TRIANGLESERVICESLTD dot COM).	2019-05-06 11:12:39 -04:00
Raymond Hill	0e4fbefd07	Remove unecessary `null` placeholders FilterOriginHitSet et al. The `null` placeholder are not necessary, we can just use default arguments instead, and add the HNTrieContainer references if and only if they are instanciated.	2019-05-01 18:54:11 -04:00
Raymond Hill	9e4385243c	Web accessible secrets can be used for at most one second Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/550 Related Chromium issue (I can't access it): - https://bugs.chromium.org/p/chromium/issues/detail?id=957866 Findings so far: affects browsers based on Chromium 74. I could not reproduce the issue with either Chromium 73 or Google Chrome 75. This commit is a mitigation: to prevent sites from using uBO's internal WAR secret for tracking purpose. A secret can be used for at most one second, after which a new secret is generated. The original issue related to the implementation of secret-gated web accessible resources is: - https://github.com/gorhill/uBlock/issues/2823	2019-04-30 14:36:07 -04:00
Raymond Hill	73e2f25e95	Add new cosmetic procedural operator: `:nth-ancestor(n)` The purpose of this new `:nth-ancestor(n)` operator is to lookup the nth ancestor relative to the currently selected node. It is essentially equivalent to `:xpath(..)`, where ancestor distance is expressed as a number rather than a sequence of slash-separated `..`. The rationale to introduce this new procedural selector is to have a low overhead way to accomplish ancestor selection.	2019-04-30 09:02:14 -04:00
Raymond Hill	42bf659695	Revert "Order HNTrie nodes alphabetically to allow for early bailout" This reverts commit `f5f9e05071`.	2019-04-30 07:00:52 -04:00
Raymond Hill	f5f9e05071	Order HNTrie nodes alphabetically to allow for early bailout This commit implements the alphabetical ordering of HNTrie nodes, so as to make it possible to bail out early at HNTrie.matches() time. Contrary to what I expected, there is no performance gain observed to HNTrie.matches() as per benchmarks -- I find the results perplexing. Because of this I will revert this commit immediately. The purpose of this commit is to record the changes so that I can bring them back to life in the future whenever I want to investigate further.	2019-04-30 06:47:54 -04:00
Raymond Hill	adabb56dc9	Do not store impossible to match filters in HNTrie Consider the two following filters: example.com www.example.com This commit make it so that if the first filter is already present in a given HNTrie, the second filter will not be stored, since HNTrie will _always_ return the first filter as a match whenever the hostname to match is example.com or any subdomain of example.com. The detection of such pointless filters is virtually free when adding a hostname to an HNTrie instance (given how data is stored in the trie), so in practice no overhead is incurred to detect such pointless filters. The ability to ignore impossible to match filters in HNTrie instances will _especially_ benefit those using large hosts files. Examples of how this helps using real configurations: - Default lists: 444 filters out of 100,382 were ignored as a result of this commit. - Default lists + "Energized Ultimate Protection": 283,669 filters out of 903,235 were ignored as a result of this commit. Side note: There was no measurable difference between the two configurations above in the performance of the matching algorithm as reported by the built-in benchmark tool.	2019-04-29 13:15:16 -04:00
Raymond Hill	c4f9ae706a	Fix alternate code path introduced in `295f08da97` (oops)	2019-04-28 14:18:09 -04:00
Raymond Hill	295f08da97	Implement code path for when TextDecoder() is not available The primary purpose is to unbreak https://github.com/cliqz-oss/adblocker/tree/master/bench/comparison	2019-04-28 14:07:21 -04:00
Raymond Hill	ac58b8e688	Make token hashes fit within a 32-bit integer The staticNetFilteringEngine uses token hashes to store/lookup filters into Map objects. Before this commit, the tokens were encoded into token hashes as JS numbers (not exceeding MAX_SAFE_INTEGER) using at most the 8 first characters of the token. With this commit, token hashes are now restricted to fit into 32-bit integers, and are derived from at most the 7 first characters. This improves filter look-up performance as per built-in benchmark().	2019-04-28 10:15:15 -04:00
Raymond Hill	96dce22218	Increase resolution of known-token lookup table Related commit: - `69a43e07c4` Using 32 bits of token hash rather than just the 16 lower bits does help discard more unknown tokens. Using the default filter lists, the known-token lookup table is populated by 12,276 entries, out of 65,536, thus making the case that theoretically there is a lot of possible tokens which can be discarded. In practice, running the built-in staticNetFilteringEngine.benchmark() with default filter lists, I find that 1,518,929 tokens were skipped out of 4,441,891 extracted tokens, or 34%.	2019-04-27 08:18:01 -04:00
Raymond Hill	a8946c8d73	Fix list lookup of multi-hostname `domain=` filters in logger Related commit: - `3f3a1543ea` The regression was preventing uBO to find from which list a filter originated. This affected only filters for which the `domain=` option had multiple hostnames.	2019-04-27 07:04:43 -04:00
Raymond Hill	69a43e07c4	Ignore unknown tokens in urlTokenizer.getTokens() Given that all tokens extracted from one single URL are potentially iterated multiple times in a single URL-matching cycle, it pays to ignore extracted tokens which are known to not be used anywhere in the static filtering engine. The gain in processing a single network request in the static filtering engine can become especially high when dealing with long and random-looking URLs, which URLs have a high likelihood of containing a majority of tokens which are known to not be in use.	2019-04-26 17:14:00 -04:00
Raymond Hill	19ece97b0c	Leverage compile-time token information in new fitler classes Related commit: - `99390390fc` The token information available at compile time can be stored in the filter to be used at match() time. This allows the use of startsWith() rather than a more costly indexOf() call as a first quick test to detect mismatches.	2019-04-26 11:16:47 -04:00
Raymond Hill	e0d2285da0	Convert HNTrie code to ES6 `class`	2019-04-25 19:38:07 -04:00
Raymond Hill	155abfba18	Cache and reuse result of HNTrieRef.matches() when possible Due to how web pages typically load secondary resources and due to how HNTrieContainer instances are used in uBO, there is a great likelihood that the result of a previous call to HNTrieRef.matches() can be reused in a subsequent call. This has been confirmed by instrumenting HNTrieRef.matches(). Since uBO uses distinct HNTrieContainer instances to either match against the request or the origin hostnames, this means a high likelihood of repeated calls to HNTrieRef.matches() with the same hostname as argument, hence a performance gain when caching the argument+result -- as despite that HNTrie.matches() is fast, comparing two short strings is even faster if this allows to skip HNTrie.matches() altogether.	2019-04-25 18:36:03 -04:00
Raymond Hill	99390390fc	Introduce three more specialized filter classes to avoid regexes Performance- and memory-related work. Three more classes have been created to avoid regex-based filters internally. Purpose is to enforce filters which have only one single wildcard in their pattern, a common occurrence. The filter pattern is split in two literal string segments. Similar as above, with the added condition that the filter is hostname-anchored (`\|\|`). The "Wildcard2" variant is a further specialization to enforce filters where the only wildcard is immediately preceded by the `^` special character, again a very common occurrence. Using two literal string segments in lieu of regexes allows to quickly detect a mismatch by just testing the first segment. Additionally, this reduces memory footprint as regexes are much more expensive memory-wise than plain strings. These three new filter classes allow to replace the use of 5276 regex-based filters internally with plain string-based filters. Often-called isHnAnchored() has been further fine-tuned to avoid as much work as possible. I have also observed that using an arrow function for closure-purpose helps measurably performance, as per built-in benchmark.	2019-04-25 17:48:08 -04:00
Raymond Hill	fff2bb6290	Assume media elements with no Content-Length header to be of size 0 Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/543	2019-04-24 08:30:54 -04:00
Raymond Hill	72bbcdd93c	Prevent search expression in CodeMirror editor from crossing line boundaries Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/493	2019-04-23 19:26:02 -04:00
Raymond Hill	43ecffc295	Fix overzealous strict blocking (regression) Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/536 Regression from: - `3f3a1543ea (diff-522a16ddeed280252d7c3a351261b441R2767)`	2019-04-21 09:17:31 -04:00
Raymond Hill	f10b100379	Fix the handling of pseudoclass-based generic cosmetic filters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/464 Regression from: `261ef8c510 (diff-3b15596213ed9ba37fb5b8bb1402a6c2R599)` Pseudoclass-based generic cosmetic filters were improperly seen as invalid following the regression.	2019-04-21 07:49:44 -04:00
Raymond Hill	7735b35e21	Fix uncaught rejected promise in assets.fetchText() Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/534 Regression from `a52b07ff6e`	2019-04-21 06:12:20 -04:00
Raymond Hill	97f91f8be9	Small code review of `a52b07ff6e`	2019-04-20 19:10:34 -04:00
Raymond Hill	f0d5205bd7	Discard existing lines when importing from file in "My filters" Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/519	2019-04-20 18:57:16 -04:00
Raymond Hill	537271f26b	Fix how `*$`, `\|https://`, `http://` filters are reported in logger This was a regression introduced in `3f3a1543ea` Reported in issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/528#issuecomment-485163348	2019-04-20 17:25:32 -04:00
Raymond Hill	a52b07ff6e	Make `userResourcesLocation` able to support multiple URLs The URLs must be space-separated. Reminders: - The additional resources will be updated at the same time the built-in resource file is updated - Purging the cache of 'uBlock filters' will also purge the cache of the built-in resource file -- and hence force a reload of the user's custom resources if any Related issues: - https://github.com/gorhill/uBlock/issues/3307 - https://github.com/uBlockOrigin/uAssets/issues/5184#issuecomment-475875189 Addtionally: - Opportunitically promisified assets.fetchText() - Fixed https://github.com/gorhill/uBlock/issues/3586	2019-04-20 17:16:49 -04:00
Raymond Hill	fa83744b58	Use a sequence of base 64 numbers to encode array buffers The purpose of using a custom base128 encoder is to convert array buffers into strings, to allow a direct string-to-array buffer conversion at load time: string => array buffer Whereas a JSON array would require an extra step: JSON array as string => JS array => array buffer Turns out that the current use of a custom base128 encoding results in a significantly larger selfie storage usage when converting array buffers into strings. Speculation: possibly the browser convert the strings to save into JSON strings internally. Since the custom base128 encoder is likely to cause the resulting string to contain a lot of unprintable ASCII characters, these will need to be escaped when converted to JSON -- escaped characters occupy more space than non-escaped ones. Using a sequence of base 64 numbers means only printable will be present in the output string, hence no escaping necessary. I have observed significant reduction in storage usage for selfie purpose.	2019-04-20 09:06:54 -04:00
Raymond Hill	3f3a1543ea	Add HNTrie-based filter classes to store origin-only filters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/528#issuecomment-484408622 Following STrie-related work in above issue, I noticed that a large number of filters in EasyList were filters which only had to match against the document origin. For instance, among just the top 10 most populous buckets, there were four such buckets with over hundreds of entries each: - bits: 72, token: "http", 146 entries - bits: 72, token: "https", 139 entries - bits: 88, token: "http", 122 entries - bits: 88, token: "https", 118 entries These filters in these buckets have to be matched against all the network requests. In order to leverage HNTrie for these filters[1], they are now handled in a special way so as to ensure they all end up in a single HNTrie (per bucket), which means that instead of scanning hundreds of entries per URL, there is now a single scan per bucket per URL for these apply-everywhere filters. Now, any filter which fulfill ALL the following condition will be processed in a special manner internally: - Is of the form `\|https://` or `\|http://` or ``; and - Does have a `domain=` option; and - Does not have a negated domain in its `domain=` option; and - Does not have `csp=` option; and - Does not have a `redirect=` option If a filter does not fulfill ALL the conditions above, no change in behavior. A filter which matches ALL of the above will be processed in a special manner: - The `domain=` option will be decomposed so as to create as many distinct filter as there is distinct value in the `domain=` option - This also apply to the `badfilter` version of the filter, which means it now become possible to `badfilter` only one of the distinct filter without having to `badfilter` all of them. - The logger will always report these special filters with only a single hostname in the `domain=` option. ** [1] HNTrie is currently WASM-ed on Firefox.	2019-04-19 16:33:46 -04:00

1 2 3 4 5 ...

1702 Commits