This change introduces 'extractor.*.retries/timeout/verify' options
as a general way to set these values for all HTTP requests.
'downloader.http.retries/timeout/verify' is a way to override these
options for file downloads only and will fall back to 'extractor.*.…*
values if they haven't been explicitly set.
Also: downloader classes now take an extractor object as first argument
instead of a requests.session.
URLs starting with 'ytdl:' will now be handled by youtube-dl.
There is probably a lot to fix and improve, but the basic use case
works.
TODO:
- format selection and ytdl options in general
- better filename/path handling
- ytdl support for "unsupported URLs"
- ...
Enabling this option will detect videos in tweets and output them as
"unsupported" URLs, so that these can then be downloaded with youtube-dl
There are a lot of improvements to be made to the current
implementation, but it works and does what it is supposed to, even if
inefficient as can be ...
and some smaller changes ...
'user' is the name of the account an image is listed at and
'artist' is now the name of the account who created the image.
For example "https://www.hentai-foundry.com/user/Tenpura/faves/pictures"
- 'user': Tenpura
- 'artist' of the only image: LewdBrush
- rename "deleted" to "same-blog"
- change test for deleted original post to test if
original post owner has the same UUID (full blog name) as the one
being downloaded from
- add 'blog[uuid]' metadata to allow comparison with
'reblogged_from_uuid'
Setting 'reblogs' to "deleted" will check if the parent post of a
reblog has been deleted and download its media content if that is the
case, otherwise it will be skipped.
This is a rather costly operation (1 API request per reblogged post)
and should therefore be used with care.
Each post-processor config dict now supports a list of extractor
categories for which it should/shouldn't be active for.
For example:
"postprocessors": [
{"name": "classify",
"whitelist": ["tumblr", "deviantart"],
...
}
]