- uses an OrderedDict for session.headers (since 2.9.2)
- ships with urllib3 1.16, which is the first version to have an
'allowed_gai_family()' function
The selective-checkout scriptlet is only used during the build step; don't let it make it into the final snap.
Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
Flickr started serving images from live.staticflickr.com (see ec88ff1),
but the old farmN.staticflickr.com URLs still work - at least for the
time being.
Filesize (and most likely quality as well) for images from live.… is
severely reduced compared to images from farmN.… for non-original files,
so all live URLs are replaced to point to a randomly chosen farm server.
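
A minimal sketch of the URL rewrite described above. The function name
`live_to_farm`, the `FARM_IDS` tuple and the example paths are illustrative
assumptions and not taken from the extractor itself; in particular, the set
of valid farm numbers is not stated here.

```python
import random
import re

# Hypothetical farm numbers; which values are actually valid is an assumption.
FARM_IDS = (1, 2)

_LIVE = re.compile(r"^https?://live\.staticflickr\.com/")


def live_to_farm(url, rng=random):
    """Point a live.staticflickr.com URL at a randomly chosen farm server."""
    if not _LIVE.match(url):
        return url  # leave farmN and unrelated URLs untouched
    farm = rng.choice(FARM_IDS)
    prefix = "https://farm{}.staticflickr.com/".format(farm)
    return _LIVE.sub(prefix, url, count=1)
```

Only the host part changes; the path after the domain is kept as-is.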
- 'post_id' and 'image_id' are only unique per user
- /image/ pages only show a maximum of 24 images, but there can be more
images than that in a blog post
- let extraction run in its own thread and maybe improve speed
- #190
This commit adds support for the two new JS expressions embedded in the
overall challenge code.
It does compute the correct 'js_answer' value, but the HTTP request to
/cdn-cgi/l/chk_jschl to get the 'cf_clearance' cookie always results in
a 403 response with a CAPTCHA inside (hence 'wip').
All attempts to make this HTTP request indistinguishable from that of a
regular web browser (which passes the test) have had no effect. This includes:
- using the exact same HTTP headers as a web browser
- following the browser's query argument order
- trying different wait times
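
For reference, one way to keep query arguments in a fixed order is to pass
a list of pairs instead of a dict. This is only a sketch; the helper
`build_query` and the parameter names are hypothetical and do not reflect
the real challenge parameters.

```python
from urllib.parse import urlencode


def build_query(pairs):
    """Encode (name, value) pairs in exactly the order they are given."""
    return urlencode(list(pairs))


# Hypothetical parameter names, for illustration only.
query = build_query([("first", "abc"), ("second", "1.5"), ("third", "x y")])
```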
Images are now randomly served from the 'live.staticflickr.com' domain
instead of the "old" 'farmN.staticflickr.com' one, making it impossible
to use static 'url' and 'keyword' hashes as results.
Image quality doesn't appear to be affected by which image server is
used. Files from 'farmN' and 'live' are the same.
The build failed due to a missing `requests` build dependency; this patch
drops the unused component from the build to avoid the problem.
The manpages are still built for the upcoming read-manual workaround.
Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
* fixup! snap: Support official config paths via *-files confinement interfaces (#197)
* FIXME no longer applied
* Obsoleted HOME environment variable assignment
Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
* snap: Migrate to selective-checkout
The selective-pull stage snap is superseded by selective-checkout, prefer the new one.
Refer-to: Selective-checkout: Check out the tagged release revision if it isn't promoted to the stable channel <https://forum.snapcraft.io/t/the-selective-pull-scriptlet-stage-snap-workaround/10389>
Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
* snap: Support bash completion
Refer-to: Scriptlets <https://docs.snapcraft.io/scriptlets/4892>
Refer-to: Tab completion for snaps <https://docs.snapcraft.io/tab-completion-for-snaps/2261>
Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
* snap: Implement interface connection warning in the launcher
This patch ensures that the user is notified of the missing
connection to the `removable-media` interface.
Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
removes basically all metadata, but that can be compensated for with the
right search query. Writing "parsers" for all 4 possible views that have
been introduced in the latest changes is too much of a hassle ...
… a man-page containing all of gallery-dl's configuration file options.
This implementation relies on Python dicts preserving their insertion
order. Python 3.4 and 3.5 need to use OrderedDict or they produce
randomly ordered man-page sections.
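
A small sketch of the ordering concern, assuming sections are collected
from (section, option) pairs. The helper `build_sections` and the sample
names are hypothetical. Plain dicts keep insertion order in CPython 3.6 and
by language guarantee from Python 3.7 on; OrderedDict gives the same result
on 3.4 and 3.5.

```python
from collections import OrderedDict


def build_sections(entries):
    """Group (section, option) pairs by section, keeping first-seen order."""
    sections = OrderedDict()
    for section, option in entries:
        sections.setdefault(section, []).append(option)
    return sections


# Hypothetical section and option names, for illustration only.
entries = [
    ("Section B", "option-1"),
    ("Section A", "option-2"),
    ("Section B", "option-3"),
]
sections = build_sections(entries)
```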
The man-page formatting is a bit rough around the edges, but it works
for the most part. The only real "problem" is inline-links, but it's
better if they are left in there.
Add support for hashtags (TagPage-s), i.e. explore/tags/<tag> URLs.
This also introduces a get_metadata() method in order to append
possible further metadata per (sub)extractor.
Refactor and generalize _extract_profilepage() to _extract_page()
in order to be reused by _extract_profilepage() and _extract_tagpage()
simply by passing the type of page (`ProfilePage' or `TagPage') and picking up
the respective fields in shared data.
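
The shape of this refactoring could be sketched as follows. The data layout
and the `PAGE_FIELDS` mapping are hypothetical stand-ins for the real shared
data, and the function names drop the leading underscore to mark them as
illustrations rather than the extractor's methods.

```python
# Hypothetical field names per page type; the real shared-data layout
# is an assumption here.
PAGE_FIELDS = {
    "ProfilePage": ("user", "posts"),
    "TagPage": ("tag", "posts"),
}


def extract_page(shared_data, page_type):
    """Yield media entries for either page type from one code path."""
    owner_key, media_key = PAGE_FIELDS[page_type]
    page = shared_data[page_type]
    yield from page[owner_key][media_key]


def extract_profilepage(shared_data):
    return extract_page(shared_data, "ProfilePage")


def extract_tagpage(shared_data):
    return extract_page(shared_data, "TagPage")
```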
* [instagram] Add support for GraphSidecar media types
Refactor _extract_postpage() to always return a list of media.
Fetch common keywords and gracefully handle the GraphSidecar media type
by extracting each single media and adding `sidecar_media_id' and
`sidecar_shortcode' keywords to indicate the parent of sidecar
children.
While here, join the copyright comment lines into a single one.
Closes #178.
* [instagram] Use `yield from' instead of `for ... yield' (thanks @mikf)!
* [instagram] Adjust filename for GraphSidecar media
Add a possible leading `media_id' of the sidecar for GraphSidecar
media.
Thanks to @mikf for the suggestion!
* [instagram] Add extra metadata for youtube-dl in GraphSidecar children
The ytdl: URLs of GraphSidecar children, when consumed by youtube-dl,
redirect to the URL of their parent. In GraphSidecar-s with
multiple GraphVideo-s this leads to downloading the same video
multiple times.
Add a `_ytdl_index' field to indicate the index in the youtube-dl
playlist corresponding to each child of the sidecar.
This will be used by the `ytdl' downloader.
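
The sidecar handling from the commits above could be sketched like this.
The input layout (`typename`, `children`, `id`, `shortcode`) and the
function names are hypothetical, and treating `_ytdl_index` as a 1-based
position for every child is an assumption, not the extractor's confirmed
behaviour.

```python
def extract_sidecar(parent):
    """Yield one metadata dict per child of a GraphSidecar post."""
    for index, child in enumerate(parent["children"], 1):
        yield {
            "media_id": child["id"],
            # keywords pointing back to the parent sidecar
            "sidecar_media_id": parent["id"],
            "sidecar_shortcode": parent["shortcode"],
            # assumption: 1-based position of this child in the playlist
            "_ytdl_index": index,
        }


def extract_post(post):
    """Always produce an iterable of media, sidecar or not."""
    if post["typename"] == "GraphSidecar":
        yield from extract_sidecar(post)
    else:
        yield {"media_id": post["id"]}
```

Using `yield from' here delegates to the child generator without an
explicit `for ... yield' loop.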
This patch adds a copy of the youtube-dl package to the snap to enable the video downloading feature.
Tested with the Twitter extractor.
Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>