uBlock

mirror of https://github.com/gorhill/uBlock.git synced 2024-11-17 16:02:33 +01:00

Author	SHA1	Message	Date
Raymond Hill	fee217c59c	Fix regression introduced in `2f63fb3fd4` Related feedback: - `2f63fb3fd4 (commitcomment-34222571)`	2019-07-08 12:33:11 -04:00
Raymond Hill	2f63fb3fd4	Prevent adding known invalid URL-based rules Related discussion: - https://github.com/uBlockOrigin/uBlock-issues/issues/662#issuecomment-509220702 Currently, `doc` (aka `main_frame`) rules are not evaluated to decide whether a network request must be blocked or not, by design. This commit adjusts uBO's UI to account for this.	2019-07-08 10:49:53 -04:00
Raymond Hill	e55cae6232	Fine tune new resources-related code Make sure the parser is safely compatible with old resources format -- for those users still using custom resources (via `userResourcesLocation`). Prepare code for future fix to <https://github.com/uBlockOrigin/uBlock-issues/issues/156>: This commit introduces a new private Map() object, `uBOSafe`, accessible by all injected scriptlets. This private safe can be used to store data which can be shared with different scriptlets. The idea is for scriptlets to use that safe to gracefully deal with the need to install multiple listeners for the same property.	2019-07-08 08:56:36 -04:00
Raymond Hill	da4c4ded8d	Add a way to reload resources in dev build Since resources are now immutable, by default they are only compiled once each time uBO updates to a new version. However I need a way to force a re-compiling of the resource in the dev build. This commit adds code to invalidate the resources selfie when forcing the update of any filter list.	2019-07-08 08:41:28 -04:00
Raymond Hill	ad9b34bc7a	Code review of `9d1913a16e` Also eat backslash for `\\`, to allow searching for literal `\n`, `\t`.	2019-07-07 07:52:37 -04:00
Raymond Hill	47a5caef54	Fix last newline not being automatically appended Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/657	2019-07-07 06:57:30 -04:00
Raymond Hill	9d1913a16e	Eat backslashes only for common control characters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/658	2019-07-07 06:29:14 -04:00
Raymond Hill	6f5aa947fb	Finalize converting resources.txt into immutable resources With hindsight, I revised decisions made earlier during this development cycle: Un-redirectable scriptlets have been removed from /web_accessible_resources and instead put in the new /assets/resources/scriptlets.js, which contains all scriptlets used for web page injection purpose. uBO will no longer fetch a remote version of built-in resources. Advanced setting `userResourcesLocation` will still be honoured by uBO, and if set, will be fetched every time at least one asset is updated.	2019-07-06 12:36:28 -04:00
Raymond Hill	ae56c4dfe8	Fix whitelist status evaluation of tabless network requests Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/651 The `behind-the-scene` context was wrongly used to evaluate the whitelist status of the context of tabless network requests. The document origin must be used instead when it's available. Additionally, much code has been revisited for better ES6 syntax compliance.	2019-07-05 17:44:08 -04:00
Raymond Hill	9693d07a6d	Code review of https://github.com/gorhill/uBlock/commit/f930da7ad64a	2019-07-05 12:33:14 -04:00
Raymond Hill	f930da7ad6	Fix regression of reverse-lookup of scriptlet filters in logger Related commit: - `5552d6717d`	2019-07-05 11:44:40 -04:00
Raymond Hill	e107b6bcf1	Fix typo in comment	2019-07-05 10:32:19 -04:00
Raymond Hill	5552d6717d	Implement scriptlet token normalization The goal is to be able to specify a scriptlet token without the `.js` part at the end, because that part is essentially redundant with the `+js` part of the syntax. When the next stable release is in widespread use (to be determined), scriptlet tokens will have to be specified without the `.js` part, and with this commit the logger will already report the normalized version of scriptlets. Eventually, when the migration to sans-`.js` is complete (also to be determined), the internal normalization of the token will be removed and this will become official syntax. Filter list maintainers will have to mind that uAssets is now used beyond uBO (i.e. Brave) when skipping the `.js` part -- hopefully Brave will go along with the change here, which is to remove a bit of tediousness for filter list maintainers.	2019-07-05 10:10:59 -04:00
Raymond Hill	6220e1d3eb	Add missing newline	2019-07-05 08:22:26 -04:00
Raymond Hill	a992875c94	Save only modified immediate hidden settings	2019-07-05 07:33:09 -04:00
Raymond Hill	1fb9845c35	Remove useless code	2019-07-04 14:10:23 -04:00
Raymond Hill	f9e680f111	Convert more resources as immutable Related commit: - `152cea2dfe`	2019-07-04 14:08:56 -04:00
Raymond Hill	8e245c8919	Convert more resources as immutable Related commit: - `152cea2dfe`	2019-07-03 19:26:09 -04:00
Raymond Hill	0ba9a35818	Convert more resources as immutable Related commit: - `152cea2dfe`	2019-07-03 14:33:06 -04:00
Raymond Hill	152cea2dfe	Refactor management of injectable resources This is a first step, the ultimate goal is to remove the need for resources.txt, or at least to reduce it to only hotfixes or trivial resources targeting very specific websites. Most resources will become immutable, i.e. they will be part of uBO's code base. Advantages include easier code maintenance (jshint, syntax highlight), and making scriptlets easier to code review by external parties (for example extension store reviewers). TODO: - More scriptlets need to be imported before next release. - Need to make legacy versions of uBO use a legacy version of resources.txt, as all the now obsolete scriptlets will have to be removed once uBO's next release becomes widespread. - Possibly need to add code to load binary resources so that they can be injected as data: URI. So far it's unclear whether this is really needed. For example, this would be needed if a xmlhttprequest is redirected to an image resource.	2019-07-03 09:47:56 -04:00
Raymond Hill	41636c59fb	Strict-block only if match is anchored to end of hostname As per feedback from filter list maintainers.	2019-07-02 11:56:27 -04:00
Raymond Hill	730a83377e	Minor code review re. context menu code Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/151 I have been unsuccessful fixing the above issue, but I will keep the changes made in the process of trying to fix it.	2019-07-02 09:43:26 -04:00
Raymond Hill	cdd1aac442	Add convenience link to network resources in logger Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/648	2019-06-30 16:15:19 -04:00
Raymond Hill	2bcf671dae	Put back erroneously removed line Regression from `1dfdc40e09`	2019-06-30 12:54:05 -04:00
Raymond Hill	c8860ff61d	Code review of `c1bdc123f2`	2019-06-30 10:22:06 -04:00
Raymond Hill	1dfdc40e09	Add ability to suspend network request handler at will This works only for platforms supporting the return of Promise by network listeners, i.e. only Firefox at this point. When filter lists are reloaded[1], there is a small time window in which some network requests which should have normally been blocked are not being blocked because the static network filtering engine may not have yet loaded all the filters in memory. This is now addressed by suspending the network request handler when filter lists are reloaded -- again, this works only on supported platforms. [1] Examples: when a filter list update session completes; when user filters change; when adding/removing filter lists.	2019-06-30 10:09:27 -04:00
Raymond Hill	c1bdc123f2	Fix use of sibling-related CSS syntax at prefix position Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/c6iem5/	2019-06-29 14:07:54 -04:00
Raymond Hill	cf4345ffc4	Fix some element picker-related issues Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/c5do7w/ Make the element picker better reflect network filters as parsed by the static network filtering engine. Additionally, discard single alphanumeric character-based filters. Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/c62irc/ Inject newly created cosmetic filters into the DOM filterer, in order for these filters to be enforced by the DOM filterer in subsequent dynamic DOM changes.	2019-06-29 11:06:03 -04:00
Raymond Hill	6c34b3c3c9	Use "relax" instead of "toggle" Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/371	2019-06-27 08:16:18 -04:00
Raymond Hill	693687fd74	Add keyboard support for toggling down blocking profile Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/371 By default, no specific keyboard shortcut is predefined, this will have to be assigned by the user. The command name in English is "Toggle blocking profile". The default behavior is to toggle down according to one of the following scenarios. a) If script execution is disabled through the no-scripting switch, the no-scripting switch will be locally toggled so as to allow script execution. The page will be automatically reloaded. b) If script execution is not blocked but the 3rd-party script and/or frame cells are blocked, local no-op rules will be set so as to no longer block 3rd-party scripts and/or frames. The page will be automatically reloaded. Given this, it may take more than one toggle down command to reach the lowest blocking profile, which is one where JavaScript execution is not blocked and 3rd-party scripts and frames resources block rules, if any, are bypassed with local no-op rules. TODO: At this point, I haven't yet decided whether toggling from the lowest profile should restore the original highest blocking profile.	2019-06-26 07:47:14 -04:00
Raymond Hill	d1df2b5e73	Fix merging multiple URLs in element picker Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/c5do7w/ Fixed: - Expect that the differ can return the first input as is when there is no difference between the two items. - Better deal with extraneous whitespace in `srcset`	2019-06-25 17:09:04 -04:00
Raymond Hill	9065bbdd48	Code review of whitelisting-related code - Use `Map()` instead of `{}` for internal data structure - Export as array of directives instead of as a string	2019-06-25 11:57:14 -04:00
Raymond Hill	8e7384ba84	Prevent duplicate inline-script entries in the logger Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/c4340z/filter_problem/ervpjd8/	2019-06-24 11:40:14 -04:00
Raymond Hill	41685f4cce	Replace `exec` with `transpose` in procedural filters The purpose is to avoid having to iterate through all input nodes at each operator implementation level. The `transpose` method deals with only one input node, and the iteration is performed by the main procedural filtering entry points. Additionally: - Add `:spath` to HTML filtering - Rename `:watch-attrs` to `:watch-attr` - `:watch=attrs` is deprecated and will be kept around until it is safe to remove it completely	2019-06-23 08:05:53 -04:00
Raymond Hill	b428a25c3f	Add new procedural operator: `:min-text-length(x)` Where `x` is the minimal text length of the subject DOM element. DOM elements whose text length is greater than or equal to `x` will be selected. The original rationale for such procedural cosmetic operator[1] is to be able to remove inline script elements according to a minimum text length using HTML filtering. [1] As a result of internal discussion with filter list maintainers @ uAssets.	2019-06-20 14:11:54 -04:00
Raymond Hill	822e0a133d	Provide visual feedback for invalid entries in "My rules" Related issue: - https://github.com/gorhill/uBlock/issues/1039	2019-06-19 18:28:44 -04:00
Raymond Hill	be2a950541	Code review of HNTrie/staticNetFilteringEngine - Remove HNTrieContainer class from global context by storing it as a property of µBlock. - Use block scope to isolate HNTrie-related constants from global context. - Prevent filters which are pure IP addresses from being stored in an HNTrie instance -- as this could cause false positives.	2019-06-19 10:00:19 -04:00
Raymond Hill	7303776757	Use async/await in Matrix.benchmark()	2019-06-19 08:37:48 -04:00
Raymond Hill	cfc2ce333d	Implement bidirectional plain-string trie The bidirectional trie allows storing the right and left parts of a string into a trie given a pivot position. Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/528 Additionally, the mandatory token-at-index-0 rule for FilterPlainHnAnchored has been lifted, thus allowing the engine to pick a potentially better token at any position in the filter string. *** TODO: Eventually rename `strie.js` to `biditrie.js`. TODO: Fix dump() method, it currently only shows the right-hand side of a filter string.	2019-06-18 19:16:39 -04:00
Raymond Hill	2eb9b726a5	Fix `generichide` not being evaluated for local context Related issue: - https://github.com/uBlockOrigin/uAssets/issues/5704	2019-06-03 06:37:39 -04:00
Raymond Hill	27e8c8d468	Normalize tabless xhr to image/media in onHeadersReceived() Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/610 The service worker-related issue affects both Chromium/Firefox: the type of resources fetched from a service worker are uniformly set to `xmlhttprequest`, hence losing a key piece of information for the purpose of accurate content filtering.	2019-05-31 09:02:07 -04:00
Raymond Hill	8828522fe8	Fix errors with cosmetic filter exception in the logger Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/602	2019-05-28 07:21:16 -04:00
Raymond Hill	85b89fbe63	Fix broken import-from-file in Whitelist pane Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/bt2d1f/	2019-05-26 08:03:44 -04:00
Raymond Hill	a7bfff03d6	Avoid spurious diff at edit time in "My rules" Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/593 The issue was caused by the lack of an empty last line, since the differ takes newline characters into account.	2019-05-25 10:04:31 -04:00
Raymond Hill	80a8750d35	Select existing "Advanced settings" page if any Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/591 Additionally, I added a link to the logger in the "About" pane in the dashboard in order to be able to access the logger without having to go through the popup panel.	2019-05-25 08:31:06 -04:00
Raymond Hill	fb6d69f543	Discard whole filter with bad `csp=` content Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/bshn7z/ uBO was just removing the bad option, while the whole filter needs to be discarded.	2019-05-24 15:41:37 -04:00
Raymond Hill	1e9528e2a6	Fix regression affecting `*$csp=`-like filters Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/bshn7z/filter_question/ Regression introduced in: - `3f3a1543ea`	2019-05-24 12:15:32 -04:00
Raymond Hill	26708b37c1	Integrate bare-bone filter hit stats in the logger Related issue: - https://github.com/gorhill/uBlock/issues/983 - https://github.com/gorhill/uBlock/issues/1353 The current implementation reports statistics for all static filters, and the presentation/featureset is intentionally minimal: Do not open issues about this. It's still a work in progress and it will be worked on slowly and thoughtfully over time and as time allows. Pausing the logger will not pause the collation of filter hit statistics, thus it is possible to lower the logger overhead by pausing logger output without losing filter hit collation.	2019-05-24 11:18:39 -04:00
Raymond Hill	eef76c49ae	Add a link to the remote asset in asset viewer The link will be present if and only if the content of the currently viewed asset has been fetched from a remote location.	2019-05-23 19:29:59 -04:00
Raymond Hill	294ea41fde	Import emergency fix `5a29a21c81` in dev build	2019-05-23 10:22:51 -04:00
Raymond Hill	1f398134f9	Minor code review of `4430ec11e2`	2019-05-23 08:15:26 -04:00
Raymond Hill	7b8c087fdd	Start using async/await where it makes sense	2019-05-22 19:23:04 -04:00
Raymond Hill	4430ec11e2	Rearrange inner loop of static network filtering engine The motivations for the re-arrangement: - Reducing the number of entry points: matchStringExactString() has been removed and matchString() is simply reused with a modifier parameter to enable matching variants. - Presumption that most matches, if any, occur early with the left-most tokens in a URL. This gives a very small marginal performance gain as per built-in benchmark.	2019-05-22 17:51:03 -04:00
Raymond Hill	e8c2f7eea3	Fix "Close this window" not working on document-blocked page Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/breeux/	2019-05-21 18:56:59 -04:00
Raymond Hill	32b04fa262	Re-arrange parsing of type options to be order-independent Related commit: - `1888033070` This removes the need to place `all` before any negated type in the list of options.	2019-05-21 14:04:21 -04:00
Raymond Hill	5eff4a027a	Fix https://github.com/gorhill/uBlock/issues/3541	2019-05-20 18:29:28 -04:00
Raymond Hill	1888033070	Add support for `all` filter option Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/bqnsoa/ The `all` option is equivalent to specifying all network-based types + `popup`, `document`, `inline-font`, `inline-script`. Example from discussion: ||bet365.com^$all Above will block all network requests, block all popups, prevent inline fonts/scripts from `bet365.com`. EasyList-compatible syntax does not allow to accomplish that semantic when using only `||bet365.com^`. If using specific negated type options along with `all`, the order in which the options appear is important. In such case `all` should always be first, followed by the negated type option(s).	2019-05-20 13:46:36 -04:00
Raymond Hill	72d9758faa	Ensure the "Filter lists" pane is in sync with update status Related issue: - https://github.com/gorhill/uBlock/issues/2394 Additionally, I added a new advanced setting to control how long after launch an auto-update session should be started -- value is in seconds: autoUpdateDelayAfterLaunch 180	2019-05-19 18:31:12 -04:00
Raymond Hill	a0ac1b7ee8	Fix handling of `data:` for filtering purpose in logger Related issue: - https://github.com/gorhill/uBlock/issues/2469	2019-05-19 17:00:49 -04:00
Raymond Hill	f677443878	Warn when navigating away from pane with unsaved changes Related issue: - https://github.com/gorhill/uBlock/issues/3271 When navigating away by clicking another pane tab button, there will be an embedded warning, which can be ignored in order to proceed to the new pane, or dismissed by either clicking on the "Stay" button or anywhere else in the dashboard. When navigating away by trying to close the tab, there will be a built-in browser warning asking for confirmation.	2019-05-19 15:35:00 -04:00
Raymond Hill	1caff7429e	Add optional support for generic procedural cosmetic filters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/131 The new advanced setting and its default value is: allowGenericProceduralFilters false Whenever this setting is toggled, the user is responsible for forcing a reload of all filter lists so as to allow uBO to process differently any existing generic procedural cosmetic filters.	2019-05-18 18:57:32 -04:00
Raymond Hill	ca34bc4f3e	Fix "Revert" button not resetting after saving changes Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/367	2019-05-18 17:48:19 -04:00
Raymond Hill	3cf71835c4	Set default delay for creating selfie to 3 minutes Related discussion: - https://www.reddit.com/r/uBlockOrigin/comments/bq49zi/	2019-05-18 14:43:44 -04:00
Raymond Hill	f7bbc80717	Improve "Whitelist pane"; remove now useless built-in switch rule Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/214 Built-in whitelist directives are now rendered differently than user-defined whitelist directives. Also, removing a built-in whitelist directive will only cause that directive to be commented out, so that users do not have to remember built-in directives should they want to bring them back. Related issue: https://github.com/uBlockOrigin/uBlock-issues/issues/494 The built-in per-site switch rule `no-scripting: behind-the-scene false` has been removed, it should not ever be needed since there will always be a valid root context for main- and sub-frames.	2019-05-18 14:20:05 -04:00
Raymond Hill	de41c1bf53	Fix parsing of recursive `!#if`-`!#endif` directives Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/270	2019-05-18 10:31:04 -04:00
Raymond Hill	62387fb87a	Prevent picker's preview mode from modifying style attribute Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/551 This commit fixes previewing the hiding/unhiding of targeted elements in the element picker. However it does not address the case of previewing `:style(...)` operators -- this would require a much more complex fix, which I am not sure is worth the amount of work and increased code complexity.	2019-05-17 19:26:48 -04:00
Raymond Hill	9bfbbfec84	Adjust visual of cosmetic exception filters in logger The invariant prefixes `##` and `#@#` are now hidden, allowing to reveal more of the filter itself when the logger view is narrow.	2019-05-17 11:45:07 -04:00
Raymond Hill	0ca44b847c	Avoid duplicated strings in filterOrigin w/ new approach The new approach is simpler and should benefit selfie serialization/unserialization. This renders stringDeduplicater obsolete -- it has been removed.	2019-05-17 10:13:58 -04:00
Raymond Hill	1386429382	Fix regression in applying procedural cosmetic filters Related commit: - `3573b6b32c`	2019-05-16 17:22:20 -04:00
Raymond Hill	3573b6b32c	Add ability to report exception cosmetic filters in the logger Related issue: - https://github.com/gorhill/uBlock/issues/127 Additionally, the extended exception filters in the logger will be rendered with a line-through to more easily distinguish them from non-exception ones. Also, opportunistically converted revisited code to ES6 syntax.	2019-05-16 13:44:49 -04:00
Raymond Hill	fc109c8b7c	Revisit code to benefit from ES6 syntax	2019-05-15 14:49:12 -04:00
Raymond Hill	1fe3b54acc	Fix cosmetic exception filters not applying Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/575 Regression from: - `93f80eedfa` Specific cosmetic exception filters need to be returned so that they can be applied to generic cosmetic filters.	2019-05-15 14:43:59 -04:00
Raymond Hill	39e2a03edb	Fix comment	2019-05-14 09:31:51 -04:00
Raymond Hill	a14dcecf8f	Do not assume wildcards fall on label boundaries Related commit: - `fe0b7a0e0f` Related feedback: - https://github.com/uBlockOrigin/uBlock-issues/issues/572#issuecomment-492223980	2019-05-14 09:29:45 -04:00
Raymond Hill	93f80eedfa	Refactor runtime storage of specific cosmetic filters This was a TODO item: - `07cbae66a4/src/js/cosmetic-filtering.js (L375)` µBlock.staticExtFilteringEngine.HostnameBasedDB has been re-factored to accommodate the storing of specific cosmetic filters. As a result of this refactoring: - Memory usage has been further decreased - Performance of selector retrieval marginally improved - New internal representation opens the door to use a specialized version of HNTrie, which should further improve performance/memory usage	2019-05-14 08:52:34 -04:00
Raymond Hill	8a312b9bbb	Support cases with more than one wildcard Related commit: - `fe0b7a0e0f` Related feedback: - https://github.com/uBlockOrigin/uBlock-issues/issues/572#issuecomment-492147440	2019-05-14 06:52:13 -04:00
Raymond Hill	fe0b7a0e0f	Relax destination hostname requirements in redirect filters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/572 Wildcards are now allowed in the hostname part of redirect filters. There will be an attempt to find the longest right-hand portion of the hostname with no wildcard. If no non-empty hostname can be extracted, `*` will be used.	2019-05-13 20:19:10 -04:00
Raymond Hill	1e40f50eb3	Add benchmark method to cosmetic filtering engine To measure retrieval of site-specific selectors. From uBO's own dev console: µBlock.cosmeticFilteringEngine.benchmark();	2019-05-12 11:41:47 -04:00
Raymond Hill	57890d60ff	Fix incorrect use of `this` in static method Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/568 Regression from: - `19ece97b0c`	2019-05-11 17:40:55 -04:00
Raymond Hill	8a7e704080	Add support for `nth-ancestor` operator in HTML filtering Also opportunistically converted some code to ES6's `class`.	2019-05-11 13:21:23 -04:00
Raymond Hill	915c1f1f3c	Report resources blocked by `csp=` option in logger Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/552	2019-05-11 10:40:34 -04:00
Raymond Hill	12bdd01595	Ensure "Ignore generic cosmetic filters" sticks on Fennec Related issue: - https://www.reddit.com/r/uBlockOrigin/comments/blkudl/ The setting was not sticking at first-install time.	2019-05-11 09:04:13 -04:00
Raymond Hill	e59bdb1485	Defuse `fixed` position on `body` element in element zapper The `fixed` style property on the `body` element will be defused if an overlay element is removed using the element zapper. Related: - https://www.reddit.com/r/uBlockOrigin/comments/bktxtb/scrolling_doesnt_work/emlscyz	2019-05-06 13:32:55 -04:00
Raymond Hill	3692bb4ada	Add HNTrieRef.dump() and STrieRef.dump() as dev tool To be used at the console, as an investigation tool for development purpose. Using it to verify the content of the largest FilterHostnameDict instance, I spotted an all-uppercase hostname in the HNTrieRef instance: µBlock.staticNetFilteringEngine.categories.get(0).get(0x10000000).dict.dump(); Thus the changes to static-net-filtering.js are to fix the erroneous insertion of filters with uppercase characters. The single instance found was a hostname entry in Malware Domain List (TRIANGLESERVICESLTD dot COM).	2019-05-06 11:12:39 -04:00
Raymond Hill	0e4fbefd07	Remove unnecessary `null` placeholders FilterOriginHitSet et al. The `null` placeholders are not necessary, we can just use default arguments instead, and add the HNTrieContainer references if and only if they are instantiated.	2019-05-01 18:54:11 -04:00
Raymond Hill	9e4385243c	Web accessible secrets can be used for at most one second Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/550 Related Chromium issue (I can't access it): - https://bugs.chromium.org/p/chromium/issues/detail?id=957866 Findings so far: affects browsers based on Chromium 74. I could not reproduce the issue with either Chromium 73 or Google Chrome 75. This commit is a mitigation: to prevent sites from using uBO's internal WAR secret for tracking purpose. A secret can be used for at most one second, after which a new secret is generated. The original issue related to the implementation of secret-gated web accessible resources is: - https://github.com/gorhill/uBlock/issues/2823	2019-04-30 14:36:07 -04:00
Raymond Hill	73e2f25e95	Add new cosmetic procedural operator: `:nth-ancestor(n)` The purpose of this new `:nth-ancestor(n)` operator is to lookup the nth ancestor relative to the currently selected node. It is essentially equivalent to `:xpath(..)`, where ancestor distance is expressed as a number rather than a sequence of slash-separated `..`. The rationale to introduce this new procedural selector is to have a low overhead way to accomplish ancestor selection.	2019-04-30 09:02:14 -04:00
Raymond Hill	42bf659695	Revert "Order HNTrie nodes alphabetically to allow for early bailout" This reverts commit `f5f9e05071`.	2019-04-30 07:00:52 -04:00
Raymond Hill	f5f9e05071	Order HNTrie nodes alphabetically to allow for early bailout This commit implements the alphabetical ordering of HNTrie nodes, so as to make it possible to bail out early at HNTrie.matches() time. Contrary to what I expected, there is no performance gain observed to HNTrie.matches() as per benchmarks -- I find the results perplexing. Because of this I will revert this commit immediately. The purpose of this commit is to record the changes so that I can bring them back to life in the future whenever I want to investigate further.	2019-04-30 06:47:54 -04:00
Raymond Hill	adabb56dc9	Do not store impossible to match filters in HNTrie Consider the two following filters: example.com www.example.com This commit makes it so that if the first filter is already present in a given HNTrie, the second filter will not be stored, since HNTrie will _always_ return the first filter as a match whenever the hostname to match is example.com or any subdomain of example.com. The detection of such pointless filters is virtually free when adding a hostname to an HNTrie instance (given how data is stored in the trie), so in practice no overhead is incurred to detect such pointless filters. The ability to ignore impossible to match filters in HNTrie instances will _especially_ benefit those using large hosts files. Examples of how this helps using real configurations: - Default lists: 444 filters out of 100,382 were ignored as a result of this commit. - Default lists + "Energized Ultimate Protection": 283,669 filters out of 903,235 were ignored as a result of this commit. Side note: There was no measurable difference between the two configurations above in the performance of the matching algorithm as reported by the built-in benchmark tool.	2019-04-29 13:15:16 -04:00
Raymond Hill	c4f9ae706a	Fix alternate code path introduced in `295f08da97` (oops)	2019-04-28 14:18:09 -04:00
Raymond Hill	295f08da97	Implement code path for when TextDecoder() is not available The primary purpose is to unbreak https://github.com/cliqz-oss/adblocker/tree/master/bench/comparison	2019-04-28 14:07:21 -04:00
Raymond Hill	ac58b8e688	Make token hashes fit within a 32-bit integer The staticNetFilteringEngine uses token hashes to store/lookup filters into Map objects. Before this commit, the tokens were encoded into token hashes as JS numbers (not exceeding MAX_SAFE_INTEGER) using at most the 8 first characters of the token. With this commit, token hashes are now restricted to fit into 32-bit integers, and are derived from at most the 7 first characters. This improves filter look-up performance as per built-in benchmark().	2019-04-28 10:15:15 -04:00
Raymond Hill	96dce22218	Increase resolution of known-token lookup table Related commit: - `69a43e07c4` Using 32 bits of token hash rather than just the 16 lower bits does help discard more unknown tokens. Using the default filter lists, the known-token lookup table is populated by 12,276 entries, out of 65,536, thus making the case that theoretically there are a lot of possible tokens which can be discarded. In practice, running the built-in staticNetFilteringEngine.benchmark() with default filter lists, I find that 1,518,929 tokens were skipped out of 4,441,891 extracted tokens, or 34%.	2019-04-27 08:18:01 -04:00
Raymond Hill	a8946c8d73	Fix list lookup of multi-hostname `domain=` filters in logger Related commit: - `3f3a1543ea` The regression was preventing uBO to find from which list a filter originated. This affected only filters for which the `domain=` option had multiple hostnames.	2019-04-27 07:04:43 -04:00
Raymond Hill	69a43e07c4	Ignore unknown tokens in urlTokenizer.getTokens() Given that all tokens extracted from one single URL are potentially iterated multiple times in a single URL-matching cycle, it pays to ignore extracted tokens which are known to not be used anywhere in the static filtering engine. The gain in processing a single network request in the static filtering engine can become especially high when dealing with long and random-looking URLs, which URLs have a high likelihood of containing a majority of tokens which are known to not be in use.	2019-04-26 17:14:00 -04:00
Raymond Hill	19ece97b0c	Leverage compile-time token information in new filter classes Related commit: - `99390390fc` The token information available at compile time can be stored in the filter to be used at match() time. This allows the use of startsWith() rather than a more costly indexOf() call as a first quick test to detect mismatches.	2019-04-26 11:16:47 -04:00
Raymond Hill	e0d2285da0	Convert HNTrie code to ES6 `class`	2019-04-25 19:38:07 -04:00
Raymond Hill	155abfba18	Cache and reuse result of HNTrieRef.matches() when possible Due to how web pages typically load secondary resources and due to how HNTrieContainer instances are used in uBO, there is a great likelihood that the result of a previous call to HNTrieRef.matches() can be reused in a subsequent call. This has been confirmed by instrumenting HNTrieRef.matches(). Since uBO uses distinct HNTrieContainer instances to either match against the request or the origin hostnames, this means a high likelihood of repeated calls to HNTrieRef.matches() with the same hostname as argument, hence a performance gain when caching the argument+result: although HNTrie.matches() is fast, comparing two short strings is even faster if it allows skipping HNTrie.matches() altogether.	2019-04-25 18:36:03 -04:00
Raymond Hill	99390390fc	Introduce three more specialized filter classes to avoid regexes Performance- and memory-related work. Three more classes have been created to avoid regex-based filters internally. Purpose is to enforce filters which have only one single wildcard in their pattern, a common occurrence. The filter pattern is split in two literal string segments. Similar as above, with the added condition that the filter is hostname-anchored (`||`). The "Wildcard2" variant is a further specialization to enforce filters where the only wildcard is immediately preceded by the `^` special character, again a very common occurrence. Using two literal string segments in lieu of regexes allows to quickly detect a mismatch by just testing the first segment. Additionally, this reduces memory footprint as regexes are much more expensive memory-wise than plain strings. These three new filter classes allow to replace the use of 5276 regex-based filters internally with plain string-based filters. Often-called isHnAnchored() has been further fine-tuned to avoid as much work as possible. I have also observed that using an arrow function for closure-purpose measurably helps performance, as per built-in benchmark.	2019-04-25 17:48:08 -04:00
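
The two-literal-segment idea described in `99390390fc` can be illustrated with a minimal sketch. This is not uBO's actual code: the class name `SingleWildcardFilter` and its `match()` method are hypothetical, and the sketch ignores anchoring (`||`) and the `^` separator handled by the real filter classes. It only shows how a pattern with exactly one `*` can be tested with two `indexOf()` calls instead of a regex, bailing out as soon as the first segment is missing.

```javascript
// Hypothetical sketch (not uBO's implementation): match a pattern that
// contains exactly one `*` wildcard using two literal string segments.
class SingleWildcardFilter {
    constructor(pattern) {
        const pos = pattern.indexOf('*');
        if ( pos === -1 || pattern.indexOf('*', pos + 1) !== -1 ) {
            throw new Error('Pattern must contain exactly one wildcard');
        }
        // Literal segments on each side of the wildcard.
        this.left = pattern.slice(0, pos);
        this.right = pattern.slice(pos + 1);
    }
    // True if `left` occurs in `url` and `right` occurs somewhere after it.
    // Testing from the first occurrence of `left` is enough: any later
    // occurrence followed by `right` implies the first one is as well.
    match(url) {
        const leftPos = url.indexOf(this.left);
        if ( leftPos === -1 ) { return false; } // quick mismatch
        return url.indexOf(this.right, leftPos + this.left.length) !== -1;
    }
}

const filter = new SingleWildcardFilter('/ads/*.js');
console.log(filter.match('https://example.com/ads/banner.js'));
console.log(filter.match('https://example.com/banner.js'));
```

The filter object stores only two plain strings, which is the memory argument made in the commit message.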


1813 Commits
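
The redundancy check described in `adabb56dc9` (skipping `www.example.com` when `example.com` is already stored) can be sketched as follows. This is an illustration under stated assumptions, not uBO's code: it uses a plain `Set` rather than a trie, and the class name `HostnameSet` and its `ignored` counter are hypothetical. As in the commit message, a hostname is skipped only when a covering entry is already present at insertion time.

```javascript
// Hypothetical sketch (Set-based, not the real HNTrie): a hostname store
// that matches a hostname or any of its subdomains, and refuses to store
// entries which are already covered by an existing entry.
class HostnameSet {
    constructor() {
        this.hostnames = new Set();
        this.ignored = 0; // count of entries skipped as redundant
    }
    // True if `hostname` or one of its parent domains is stored.
    matches(hostname) {
        let hn = hostname;
        for (;;) {
            if ( this.hostnames.has(hn) ) { return true; }
            const pos = hn.indexOf('.');
            if ( pos === -1 ) { return false; }
            hn = hn.slice(pos + 1); // strip the left-most label
        }
    }
    // Store `hostname` unless an existing entry already matches it.
    add(hostname) {
        if ( this.matches(hostname) ) {
            this.ignored += 1;
            return false;
        }
        this.hostnames.add(hostname);
        return true;
    }
}

const set = new HostnameSet();
set.add('example.com');
set.add('www.example.com'); // already covered by example.com, not stored
console.log(set.hostnames.size, set.ignored);
```

In this sketch the check costs a few extra `Set` lookups per insertion; the commit message states that in the trie the same detection is virtually free because of how hostnames are stored.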