uBlock

mirror of https://github.com/gorhill/uBlock.git synced 2024-11-07 03:12:33 +01:00

Author	SHA1	Message	Date
Raymond Hill	acb12d2a1d	New revision for dev build	2019-04-27 08:37:37 -04:00
Raymond Hill	96dce22218	Increase resolution of known-token lookup table Related commit: - `69a43e07c4` Using 32 bits of token hash rather than just the 16 lower bits does help discard more unknown tokens. Using the default filter lists, the known-token lookup table is populated by 12,276 entries, out of 65,536, thus making the case that theoretically there is a lot of possible tokens which can be discarded. In practice, running the built-in staticNetFilteringEngine.benchmark() with default filter lists, I find that 1,518,929 tokens were skipped out of 4,441,891 extracted tokens, or 34%.	2019-04-27 08:18:01 -04:00
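The message above does not say how 32 bits of hash are used to index a 65,536-entry table. The following is a minimal sketch of one plausible approach, folding the upper 16 bits into the lower 16 with XOR. All names (`knownTokens`, `foldIndex`, `addKnownToken`, `isKnownToken`) are hypothetical and not taken from the commit.

```javascript
// Hypothetical sketch: names and the folding scheme are assumptions.
const TABLE_SIZE = 65536;
const knownTokens = new Uint8Array(TABLE_SIZE);

// Fold a 32-bit hash into a 16-bit index so the upper bits also matter.
function foldIndex(tokenHash) {
    return ((tokenHash & 0xFFFF) ^ (tokenHash >>> 16)) & 0xFFFF;
}

// Mark a token hash as being used by at least one filter.
function addKnownToken(tokenHash) {
    knownTokens[foldIndex(tokenHash)] = 1;
}

// A zero entry means the token is certainly unused; 1 means "maybe used".
function isKnownToken(tokenHash) {
    return knownTokens[foldIndex(tokenHash)] === 1;
}

addKnownToken(0x12345678);
```

With an index built from the lower 16 bits only, `0x12345678` and `0x00005678` would share a slot. With folding they land in different slots, so the second hash can be discarded.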
Raymond Hill	60938451ab	Make Firefox dev build auto-update	2019-04-27 07:23:11 -04:00
Raymond Hill	8715dadd02	New revision for dev build	2019-04-27 07:17:07 -04:00
Raymond Hill	a8946c8d73	Fix list lookup of multi-hostname `domain=` filters in logger Related commit: - `3f3a1543ea` The regression was preventing uBO to find from which list a filter originated. This affected only filters for which the `domain=` option had multiple hostnames.	2019-04-27 07:04:43 -04:00
Raymond Hill	761a0ef27c	Make Firefox dev build auto-update	2019-04-26 17:33:59 -04:00
Raymond Hill	4c9e760a10	Import translation work from https://crowdin.com/project/ublock	2019-04-26 17:30:19 -04:00
Raymond Hill	671908a45f	New revision for dev build	2019-04-26 17:27:41 -04:00
Raymond Hill	69a43e07c4	Ignore unknown tokens in urlTokenizer.getTokens() Given that all tokens extracted from one single URL are potentially iterated multiple times in a single URL-matching cycle, it pays to ignore extracted tokens which are known to not be used anywhere in the static filtering engine. The gain in processing a single network request in the static filtering engine can become especially high when dealing with long and random-looking URLs, which URLs have a high likelihood of containing a majority of tokens which are known to not be in use.	2019-04-26 17:14:00 -04:00
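As a rough illustration of the idea described above, the sketch below drops tokens that no filter uses while tokenizing a URL. The hash function, the table layout, and the names `tokenHash`, `addKnownTokens`, and `getTokens` are assumptions made for the example, not the actual `urlTokenizer` code.

```javascript
// Illustrative only: the hash, table, and function names are assumptions.
const knownTokens = new Uint8Array(65536);

// Simple string hash reduced to 16 bits (a stand-in for the real token hash).
function tokenHash(token) {
    let h = 0;
    for (let i = 0; i < token.length; i++) {
        h = (Math.imul(h, 31) + token.charCodeAt(i)) | 0;
    }
    return h & 0xFFFF;
}

// Register every token that at least one filter relies on.
function addKnownTokens(tokens) {
    for (const t of tokens) { knownTokens[tokenHash(t)] = 1; }
}

// Extract tokens from a URL, skipping those whose table slot is empty.
function getTokens(url) {
    const out = [];
    const re = /[0-9a-z%]+/g;
    const lower = url.toLowerCase();
    let m;
    while ((m = re.exec(lower)) !== null) {
        if (knownTokens[tokenHash(m[0])] === 0) { continue; }
        out.push({ token: m[0], pos: m.index });
    }
    return out;
}

addKnownTokens(['ads', 'banner']);
const tokens = getTokens('https://example.com/x9f3k2/ads/q7z.js');
```

Because the table is indexed by a hash, an unused token can occasionally collide with a used one and be kept. That only costs a little extra work later and never causes a used token to be dropped.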
Raymond Hill	19ece97b0c	Leverage compile-time token information in new filter classes Related commit: - `99390390fc` The token information available at compile time can be stored in the filter to be used at match() time. This allows the use of startsWith() rather than a more costly indexOf() call as a first quick test to detect mismatches.	2019-04-26 11:16:47 -04:00
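A minimal sketch of why a stored token offset allows `startsWith()` in place of `indexOf()`: if the tokenizer reports where the token sits in the URL, the only position where the pattern can match is known in advance. The class `PlainTokenFilter` and its fields are hypothetical.

```javascript
// Hypothetical filter class; field names are assumptions for illustration.
class PlainTokenFilter {
    constructor(pattern, token) {
        this.pattern = pattern;
        // Offset of the token inside the pattern, computed once at compile time.
        this.tokenOffset = pattern.indexOf(token);
    }
    // tokenPos: index in `url` where the tokenizer found the token.
    match(url, tokenPos) {
        const start = tokenPos - this.tokenOffset;
        // One comparison at a fixed position instead of a scan of the URL.
        return start >= 0 && url.startsWith(this.pattern, start);
    }
}

const f = new PlainTokenFilter('/ads/banner', 'banner');
const url = 'https://example.com/ads/banner.png';
const hit = f.match(url, url.indexOf('banner'));
```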
Raymond Hill	1f8f616faf	Make Firefox dev build auto-update	2019-04-25 19:44:33 -04:00
Raymond Hill	74c24dbf37	New revision for dev build	2019-04-25 19:40:48 -04:00
Raymond Hill	f667fc2d65	Fix page count computation in publicSuffixList.enableWASM()	2019-04-25 19:40:07 -04:00
Raymond Hill	e0d2285da0	Convert HNTrie code to ES6 `class`	2019-04-25 19:38:07 -04:00
Raymond Hill	155abfba18	Cache and reuse result of HNTrieRef.matches() when possible Due to how web pages typically load secondary resources and due to how HNTrieContainer instances are used in uBO, there is a great likelihood that the result of a previous call to HNTrieRef.matches() can be reused in a subsequent call. This has been confirmed by instrumenting HNTrieRef.matches(). Since uBO uses distinct HNTrieContainer instances to either match against the request or the origin hostnames, this means a high likelihood of repeated calls to HNTrieRef.matches() with the same hostname as argument, hence a performance gain when caching the argument+result -- as despite that HNTrie.matches() is fast, comparing two short strings is even faster if this allows to skip HNTrie.matches() altogether.	2019-04-25 18:36:03 -04:00
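The caching idea can be sketched as a one-entry memo keyed on the last hostname. In this example a `Set` walked over parent domains stands in for the trie, and `CachedMatcher`, `slowMatches`, and `slowCalls` are hypothetical names introduced only for illustration.

```javascript
// Sketch only: a Set stands in for the trie; names are hypothetical.
class CachedMatcher {
    constructor(hostnames) {
        this.hostnames = new Set(hostnames);
        this.lastNeedle = '';
        this.lastResult = false;
        this.slowCalls = 0; // instrumentation, to observe cache hits
    }
    // Stand-in for the trie lookup: test the hostname, then each parent domain.
    slowMatches(hostname) {
        this.slowCalls += 1;
        let hn = hostname;
        for (;;) {
            if (this.hostnames.has(hn)) { return true; }
            const pos = hn.indexOf('.');
            if (pos === -1) { return false; }
            hn = hn.slice(pos + 1);
        }
    }
    // Reuse the previous result when the same hostname is asked again.
    matches(hostname) {
        if (hostname !== this.lastNeedle) {
            this.lastNeedle = hostname;
            this.lastResult = this.slowMatches(hostname);
        }
        return this.lastResult;
    }
}

const matcher = new CachedMatcher(['example.com']);
const first = matcher.matches('www.example.com');
const second = matcher.matches('www.example.com');
```

The second call costs a single string comparison, which is the saving the commit message describes.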
Raymond Hill	99390390fc	Introduce three more specialized filter classes to avoid regexes Performance- and memory-related work. Three more classes have been created to avoid regex-based filters internally. Purpose is to enforce filters which have only one single wildcard in their pattern, a common occurrence. The filter pattern is split in two literal string segments. Similar as above, with the added condition that the filter is hostname-anchored (`||`). The "Wildcard2" variant is a further specialization to enforce filters where the only wildcard is immediately preceded by the `^` special character, again a very common occurrence. Using two literal string segments in lieu of regexes allows to quickly detect a mismatch by just testing the first segment. Additionally, this reduces memory footprint as regexes are much more expensive memory-wise than plain strings. These three new filter classes allow to replace the use of 5276 regex-based filters internally with plain string-based filters. Often-called isHnAnchored() has been further fine-tuned to avoid as much work as possible. I have also observed that using an arrow function for closure-purpose helps measurably performance, as per built-in benchmark.	2019-04-25 17:48:08 -04:00
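A minimal sketch of the two-segment idea, assuming a pattern with exactly one `*`. The class name `SingleWildcardFilter` and the `beg` parameter are hypothetical. The hostname-anchored and `^` variants mentioned above are not covered.

```javascript
// Hypothetical class: assumes the pattern contains exactly one "*".
class SingleWildcardFilter {
    constructor(pattern) {
        const pos = pattern.indexOf('*');
        this.s1 = pattern.slice(0, pos);   // literal segment before the wildcard
        this.s2 = pattern.slice(pos + 1);  // literal segment after the wildcard
    }
    // `beg` is where the first segment is expected to start in the URL.
    match(url, beg) {
        // Cheap early exit: most mismatches are caught by the first segment.
        if (url.startsWith(this.s1, beg) === false) { return false; }
        // The second segment may appear anywhere after the first one.
        return url.indexOf(this.s2, beg + this.s1.length) !== -1;
    }
}

const wf = new SingleWildcardFilter('/ads/*.gif');
const gifUrl = 'https://example.com/ads/top/banner.gif';
const matched = wf.match(gifUrl, gifUrl.indexOf('/ads/'));
```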
Raymond Hill	dfd6076a5e	Make Firefox dev build auto-update	2019-04-24 08:37:58 -04:00
Raymond Hill	b59f7d44ee	New revision for dev build	2019-04-24 08:32:33 -04:00
Raymond Hill	fff2bb6290	Assume media elements with no Content-Length header to be of size 0 Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/543	2019-04-24 08:30:54 -04:00
Raymond Hill	72bbcdd93c	Prevent search expression in CodeMirror editor from crossing line boundaries Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/493	2019-04-23 19:26:02 -04:00
Raymond Hill	cd1a11fa9d	Update to CodeMirror version 5.46	2019-04-23 19:06:03 -04:00
Raymond Hill	3efb0daa66	Make Firefox dev build auto-update	2019-04-23 09:46:46 -04:00
Raymond Hill	c535c624bd	Import translation work from https://crowdin.com/project/ublock	2019-04-23 09:32:15 -04:00
Raymond Hill	dd7125378b	New revision for dev build	2019-04-23 09:29:49 -04:00
Raymond Hill	3c5102811a	Fix the logger's rendering of hostnames starting with digits Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/541	2019-04-23 09:28:00 -04:00
Raymond Hill	16a76aa524	Add filter expressions in logger's expression picker - Added `media` - Include `generichide` in `dom` filter expression - Include `beacon`/`csp_report`/`ping` in `other` filter expression	2019-04-22 10:23:58 -04:00
Raymond Hill	bb406bd883	Make Firefox dev build auto-update	2019-04-21 17:07:24 -04:00
Raymond Hill	cd832bb102	New revision for dev build	2019-04-21 17:03:49 -04:00
Raymond Hill	43ecffc295	Fix overzealous strict blocking (regression) Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/536 Regression from: - `3f3a1543ea (diff-522a16ddeed280252d7c3a351261b441R2767)`	2019-04-21 09:17:31 -04:00
Raymond Hill	cb18ec54f0	Make Firefox dev build auto-update	2019-04-21 08:04:17 -04:00
Raymond Hill	918116af52	New revision for dev build	2019-04-21 08:00:50 -04:00
Raymond Hill	f10b100379	Fix the handling of pseudoclass-based generic cosmetic filters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/464 Regression from: `261ef8c510 (diff-3b15596213ed9ba37fb5b8bb1402a6c2R599)` Pseudoclass-based generic cosmetic filters were improperly seen as invalid following the regression.	2019-04-21 07:49:44 -04:00
Raymond Hill	59f4fd1f43	Make Firefox dev build auto-update	2019-04-21 06:20:55 -04:00
Raymond Hill	fae91c7c55	New revision for dev build	2019-04-21 06:15:13 -04:00
Raymond Hill	7735b35e21	Fix uncaught rejected promise in assets.fetchText() Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/534 Regression from `a52b07ff6e`	2019-04-21 06:12:20 -04:00
Raymond Hill	1c63aa719d	Make Firefox dev build auto-update	2019-04-20 19:29:47 -04:00
Raymond Hill	605adfe689	New revision for dev build	2019-04-20 19:25:34 -04:00
Raymond Hill	97f91f8be9	Small code review of `a52b07ff6e`	2019-04-20 19:10:34 -04:00
Raymond Hill	f0d5205bd7	Discard existing lines when importing from file in "My filters" Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/519	2019-04-20 18:57:16 -04:00
Raymond Hill	1de75ced5c	Make Firefox dev build auto-update	2019-04-20 17:53:36 -04:00
Raymond Hill	ca7745697a	New revision for dev build	2019-04-20 17:33:48 -04:00
Raymond Hill	537271f26b	Fix how `*$`, `|https://`, `http://` filters are reported in logger This was a regression introduced in `3f3a1543ea` Reported in issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/528#issuecomment-485163348	2019-04-20 17:25:32 -04:00
Raymond Hill	a52b07ff6e	Make `userResourcesLocation` able to support multiple URLs The URLs must be space-separated. Reminders: - The additional resources will be updated at the same time the built-in resource file is updated - Purging the cache of 'uBlock filters' will also purge the cache of the built-in resource file -- and hence force a reload of the user's custom resources if any Related issues: - https://github.com/gorhill/uBlock/issues/3307 - https://github.com/uBlockOrigin/uAssets/issues/5184#issuecomment-475875189 Additionally: - Opportunistically promisified assets.fetchText() - Fixed https://github.com/gorhill/uBlock/issues/3586	2019-04-20 17:16:49 -04:00
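For illustration, a space-separated setting value could be split into individual URLs as follows. The helper `parseResourceLocations` and the example URLs are hypothetical, and dropping duplicates is a choice made for this sketch rather than something the commit states.

```javascript
// Hypothetical helper: split a space-separated setting into distinct URLs.
function parseResourceLocations(setting) {
    const parts = setting.split(/\s+/).filter(s => s !== '');
    // Drop duplicates while preserving the original order.
    return Array.from(new Set(parts));
}

const locations = parseResourceLocations(
    ' https://example.com/a.txt  https://example.org/b.txt https://example.com/a.txt '
);
```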
Raymond Hill	d9fe40f1ce	Make Firefox dev build auto-update	2019-04-20 09:36:16 -04:00
Raymond Hill	78dcf5949a	New revision for dev build	2019-04-20 09:33:01 -04:00
Raymond Hill	fa83744b58	Use a sequence of base 64 numbers to encode array buffers The purpose of using a custom base128 encoder is to convert array buffers into strings, to allow a direct string-to-array buffer conversion at load time: string => array buffer Whereas a JSON array would require an extra step: JSON array as string => JS array => array buffer Turns out that the current use of a custom base128 encoding results in a significantly larger selfie storage usage when converting array buffers into strings. Speculation: possibly the browser converts the strings to save into JSON strings internally. Since the custom base128 encoder is likely to cause the resulting string to contain a lot of unprintable ASCII characters, these will need to be escaped when converted to JSON -- escaped characters occupy more space than non-escaped ones. Using a sequence of base 64 numbers means only printable characters will be present in the output string, hence no escaping necessary. I have observed significant reduction in storage usage for selfie purpose.	2019-04-20 09:06:54 -04:00
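The sketch below shows one way to write 32-bit values as base-64 numbers using only printable characters. The commit does not give the exact format, so the fixed width of six digits per value, the digit alphabet, and the names `DIGITS`, `encode`, and `decode` are all assumptions.

```javascript
// Sketch under assumptions: fixed 6 digits per 32-bit value, custom alphabet.
const DIGITS = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz@%';
const VALUES = new Map([...DIGITS].map((c, i) => [c, i]));

// Encode each 32-bit value as six base-64 digits (6 x 6 bits covers 32 bits).
function encode(u32) {
    let out = '';
    for (const v of u32) {
        let n = v;
        let s = '';
        for (let i = 0; i < 6; i++) {
            s = DIGITS[n % 64] + s;
            n = Math.floor(n / 64);
        }
        out += s;
    }
    return out;
}

// Rebuild the typed array directly from the string, with no JSON array step.
function decode(str) {
    const u32 = new Uint32Array(str.length / 6);
    for (let i = 0; i < u32.length; i++) {
        let n = 0;
        for (let j = 0; j < 6; j++) {
            n = n * 64 + VALUES.get(str[i * 6 + j]);
        }
        u32[i] = n;
    }
    return u32;
}

const original = new Uint32Array([0, 1, 63, 64, 0xFFFFFFFF]);
const encoded = encode(original);
const restored = decode(encoded);
```

Since every digit is a printable character that JSON does not escape, `JSON.stringify(encoded)` adds only the two surrounding quotes.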
Raymond Hill	a0c4183cad	Make Firefox dev build auto-update	2019-04-19 17:15:21 -04:00
Raymond Hill	69cb5d8abd	Import translation work from https://crowdin.com/project/ublock	2019-04-19 17:07:27 -04:00
Raymond Hill	b08e6b009f	New revision for dev build	2019-04-19 17:02:04 -04:00
Raymond Hill	3f3a1543ea	Add HNTrie-based filter classes to store origin-only filters Related issue: - https://github.com/uBlockOrigin/uBlock-issues/issues/528#issuecomment-484408622 Following STrie-related work in above issue, I noticed that a large number of filters in EasyList were filters which only had to match against the document origin. For instance, among just the top 10 most populous buckets, there were four such buckets with over hundreds of entries each: - bits: 72, token: "http", 146 entries - bits: 72, token: "https", 139 entries - bits: 88, token: "http", 122 entries - bits: 88, token: "https", 118 entries These filters in these buckets have to be matched against all the network requests. In order to leverage HNTrie for these filters[1], they are now handled in a special way so as to ensure they all end up in a single HNTrie (per bucket), which means that instead of scanning hundreds of entries per URL, there is now a single scan per bucket per URL for these apply-everywhere filters. Now, any filter which fulfill ALL the following condition will be processed in a special manner internally: - Is of the form `|https://` or `|http://` or ``; and - Does have a `domain=` option; and - Does not have a negated domain in its `domain=` option; and - Does not have `csp=` option; and - Does not have a `redirect=` option If a filter does not fulfill ALL the conditions above, no change in behavior. A filter which matches ALL of the above will be processed in a special manner: - The `domain=` option will be decomposed so as to create as many distinct filter as there is distinct value in the `domain=` option - This also apply to the `badfilter` version of the filter, which means it now become possible to `badfilter` only one of the distinct filter without having to `badfilter` all of them. - The logger will always report these special filters with only a single hostname in the `domain=` option. ** [1] HNTrie is currently WASM-ed on Firefox.	2019-04-19 16:33:46 -04:00
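The decomposition of the `domain=` option described in commit `3f3a1543ea` can be sketched as below. This is a simplified illustration, not the engine's parser: the helper `decomposeDomainOption` is hypothetical, it handles only the `|https://` and `|http://` pattern forms (the third form is blank in the listing above and is left out), and it ignores `badfilter` handling.

```javascript
// Hypothetical helper: real filter parsing handles many more cases.
function decomposeDomainOption(filter) {
    const optPos = filter.indexOf('$');
    if (optPos === -1) { return [filter]; }
    const pattern = filter.slice(0, optPos);
    // Only the two pattern forms spelled out in the commit message are handled.
    if (pattern !== '|https://' && pattern !== '|http://') { return [filter]; }
    const options = filter.slice(optPos + 1).split(',');
    const i = options.findIndex(o => o.startsWith('domain='));
    if (i === -1) { return [filter]; }
    // csp= and redirect= options disqualify the filter.
    if (options.some(o => o.startsWith('csp=') || o.startsWith('redirect='))) {
        return [filter];
    }
    const hostnames = options[i].slice('domain='.length).split('|');
    // So does any negated hostname in the domain= option.
    if (hostnames.some(hn => hn.startsWith('~'))) { return [filter]; }
    // Emit one filter per hostname, keeping all other options unchanged.
    return hostnames.map(hn => {
        const opts = options.slice();
        opts[i] = 'domain=' + hn;
        return pattern + '$' + opts.join(',');
    });
}

const parts = decomposeDomainOption('|https://$script,domain=a.com|b.com');
```

Each resulting filter carries a single hostname, which is what lets all of them be stored in one hostname trie per bucket and lets the logger report a single hostname per filter.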

6510 Commits