The same filter infrastructure that can be applied to image URLs now
also works for manga chapters and other delegated URLs.
TODO: actually provide metadata for these URLs (currently only
deviantart and imagefap supply any).
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.
(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems too convoluted when
just 'NAME' could be enough.)
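A minimal sketch of why the rename matters, assuming filter-expressions
are evaluated with the metadata as local names. The metadata values and
variable names below are made up for illustration:

```python
# Hypothetical metadata before and after the rename.
old_style = {"gallery-id": 123}
new_style = {"gallery_id": 123}

# With an underscore the key is a valid identifier and can be used directly.
renamed_ok = eval("gallery_id == 123", {}, new_style)

# With a minus sign the expression parses as 'gallery - id', a subtraction,
# so looking up the name 'gallery' fails.
try:
    eval("gallery-id == 123", {}, old_style)
    hyphen_failed = False
except NameError:
    hyphen_failed = True

# The workaround mentioned above still works, but is clumsy to type.
via_locals = eval('locals()["gallery-id"] == 123', {}, old_style)
```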
This allows filtering images with Python expressions that use the same
metadata that is also used to build filenames (see --list-keywords).
The usually shunned eval() function is used to evaluate
filter-expressions, but it seemed quite appropriate in this case and
shouldn't introduce any new security issues, as any attacker that could do
> gallery-dl --filter "delete-everything()" ...
could as well do
> python -c "delete-everything()"
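The idea can be sketched roughly as follows. This is not the actual
implementation; 'build_filter' and the metadata keys are hypothetical
names chosen for the example:

```python
def build_filter(expression):
    """Compile a filter-expression into a predicate over a metadata dict."""
    code = compile(expression, "<filter>", "eval")

    def predicate(metadata):
        # The metadata keys become the local names visible to the expression.
        return bool(eval(code, {}, metadata))

    return predicate


# Made-up image metadata for illustration.
images = [
    {"width": 800, "extension": "jpg"},
    {"width": 1920, "extension": "png"},
    {"width": 2560, "extension": "jpg"},
]

keep = build_filter("width >= 1000 and extension == 'jpg'")
selected = [image for image in images if keep(image)]
```

Only the last entry satisfies both conditions, so 'selected' ends up
with a single item.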
Duplicate URLs might occur if, for example, an artist adds another
image to their gallery while an extractor is running and images are being
downloaded on sites like pixiv/nijie/hentaifoundry.
The next image on the next page will have already been downloaded and
will cause a premature end if '--abort-on-skip' is being used.
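The scenario can be reproduced with a small simulation. It assumes a
newest-first gallery that is read in pages of two items; 'fetch_page'
and the numeric image IDs are invented for this example:

```python
def fetch_page(gallery, page, size=2):
    """Return one page of a newest-first gallery listing."""
    start = page * size
    return gallery[start:start + size]


gallery = [5, 4, 3, 2, 1]   # hypothetical image IDs, newest first
seen = []
duplicates = []
page = 0

while True:
    items = fetch_page(gallery, page)
    if not items:
        break
    for item in items:
        if item in seen:
            duplicates.append(item)   # would trigger an abort on skip
        else:
            seen.append(item)
    if page == 0:
        # The artist uploads a new image after the first page was read,
        # which shifts every existing image down by one position.
        gallery.insert(0, 6)
    page += 1
```

Image 4 shows up again at the top of the second page even though three
more images still follow it.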
Metadata of several-year-old lists shouldn't change as much as it
would for newer ones, which makes metadata-comparisons of the output
of build_testresult_db.py easier.
All FoolFuuka based 4chan-archive extractors can now be configured using
their own config keys (extractor.<category>) as well as a common shared
one (extractor.foolfuuka).
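A rough sketch of such a lookup, assuming the extractor's own section
takes precedence over the shared one (the commit message does not state
the order). 'lookup', the category "examplearchive" and the option
names are hypothetical:

```python
# Made-up configuration for illustration.
config = {
    "extractor": {
        "foolfuuka": {"sleep": 1.0, "directory": ["archive"]},
        "examplearchive": {"sleep": 0.5},
    }
}


def lookup(config, category, key, default=None, shared="foolfuuka"):
    """Look up 'key' in extractor.<category>, then in the shared section."""
    extractor_conf = config.get("extractor", {})
    for section in (category, shared):
        values = extractor_conf.get(section, {})
        if key in values:
            return values[key]
    return default


own_value = lookup(config, "examplearchive", "sleep")
shared_value = lookup(config, "examplearchive", "directory")
missing_value = lookup(config, "examplearchive", "timeout", default=30)
```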
This is done by prepending "group-" to an extractor's subcategory
if the URL belongs to a group ("folder" becomes "group-folder" and
so on). This changes the configuration-path being used and is also
reflected in the output of '--list-keywords'.
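As a minimal sketch, assuming the configuration-path is built from the
category and subcategory. Both helper functions and the category
"examplesite" are invented for this example:

```python
def effective_subcategory(subcategory, is_group):
    """Prepend 'group-' when the URL belongs to a group."""
    return "group-" + subcategory if is_group else subcategory


def config_path(category, subcategory, is_group=False):
    """Build the configuration-path that would be used for an extractor."""
    return ("extractor", category,
            effective_subcategory(subcategory, is_group))


user_path = config_path("examplesite", "folder")
group_path = config_path("examplesite", "folder", is_group=True)
```

Settings placed under "group-folder" then apply only to group URLs,
while "folder" keeps applying to everything else.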
Using argument groups is a definite improvement over how things looked
previously, but general group membership of individual items might be
a thing to reconsider.
When using two or more config files, the values of the second would
improperly overwrite nested dictionaries of the first one.
The new method properly combines these nested dictionaries as well.
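One way to combine nested dictionaries is a recursive merge like the
one below. It is a sketch under that assumption, not the code from this
commit; 'combine' and the option names are hypothetical:

```python
def combine(first, second):
    """Recursively merge 'second' into 'first' and return 'first'."""
    for key, value in second.items():
        if isinstance(value, dict) and isinstance(first.get(key), dict):
            # Both sides hold a dictionary: merge instead of replacing.
            combine(first[key], value)
        else:
            first[key] = value
    return first


# Made-up contents of two config files.
first = {"extractor": {"pixiv": {"username": "a"}, "timeout": 30}}
second = {"extractor": {"pixiv": {"password": "b"}, "timeout": 60}}

merged = combine(first, second)
```

A plain 'first.update(second)' would have replaced the whole
"extractor" dictionary and dropped "username".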