Ensure serialization returns copy of data rather than live
references to data. This allows to immediately deserialize() the
result of serialize().
Also, adjust code to modified behavior of filterQuery().
This commit is a rewrite of the static filtering parser into a
tree-based data structure, for easier maintenance and better
abstraction of parsed filters.
This simplifies greatly syntax coloring of filters and also
simplify extending filter syntax.
The minimum version of Chromium-based browsers has been raised
to version 73 because of usage of String.matchAll().
Fixed serious regression in previous dev build in applying
`csp=` filters. Reported internally by uBO team.
Promote usage of `removeparam` in code instead of `queryprune`,
which is to be deprecated.
Removed test against previously tested hostname in
FilterHostnameDict since as per various benchmark, the
test does not really help.
Remove serialization API in Node.js code as the API is now
present in SNFE itself.
The main nodejs flavor is "npm", which is to be used to
lint/test and the publication of an official npm
package -- and by design it has dependencies on mocha,
eslint, etc.
A new flavor "dig" has been created with minimal
dependencies and which purpose is to easily allow to
write specialized code to investigate local code changes
in uBO -- and it's not meant for publication.
Consequently, "make nodejs" has been replaced with
"make npm", and a new "dig" target has been added to the
makefile, to be used for instrumenting local code changes
for investigation purpose.
Whereas before the string segment was encoded as:
LL OOOOOOOOOOOO
where L are the upper 8 bits and used to encode the length
of the segment, and O are the lower 24 bits and used to
encode the offset of the string data in the character
buffer, the new code encode as follow:
OOOOOOOOOOOO LL
And furthermore the most significant bit of the length
LL is now used to mark whether the current string segment
is a label boundary.
This means a cell can't reference a segment longer then
127 characters. To work around this limitation for when a
segment is longer than 127 characters (a rare occurrence),
the algorithm will simply split the segment into multiple
adjacent cells.
As a result, there is no longer a need to encode
"boundariness" into special cells, which simplifies
both the storing and matching algorithms.
Additionally, added minimal documentation for the NPM
package on how to import and use HNTrieContainer as a
standalone API.