Mike Fährmann
6c71e9cf5d
[deviantart] add separate 'sta.sh' extractor ( #113 )
...
- supports multiple stashed deviations per page
- explicitly mentions sta.sh support on supportedsites.rst
2018-12-26 18:56:57 +01:00
Mike Fährmann
f9ace0f4a3
[mangapark] fix manga extraction ... again
2018-12-26 18:56:57 +01:00
Mike Fährmann
28f9539551
[tumblr] change default values for post types and inline media
2018-12-26 18:55:59 +01:00
Mike Fährmann
5be95034ba
[tumblr] add option to download avatars ( #137 )
2018-12-26 14:29:30 +01:00
Mike Fährmann
7471933d5f
use extractor.request for all other API calls
...
- deviantart
- pawoo
- pixiv
- reddit
2018-12-22 14:42:23 +01:00
Mike Fährmann
995844c915
[instagram] relax test pattern even more
2018-12-22 14:25:55 +01:00
Mike Fährmann
2e5f82e59e
[tumblr] don't follow 'external' Tumblr URLs ( #139 )
2018-12-22 14:05:43 +01:00
Mike Fährmann
c5d4f558c9
allow missing field access keys in format strings ( #136 )
2018-12-22 13:54:14 +01:00
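A minimal sketch of what tolerating missing keys in a format string can look like, built on the standard library's `string.Formatter`. This is not the project's actual implementation; the class name `LenientFormatter` and its `default` parameter are hypothetical and exist only for illustration.

```python
import string


class LenientFormatter(string.Formatter):
    """Formatter that substitutes a default for missing fields.

    Hypothetical helper for illustration only.
    """

    def __init__(self, default="None"):
        self.default = default

    def get_field(self, field_name, args, kwargs):
        # The base class resolves names such as "a[b]" or "a.b" and
        # raises if any part of the lookup chain is missing.
        try:
            return super().get_field(field_name, args, kwargs)
        except (KeyError, IndexError, AttributeError, TypeError):
            # Fall back to the default instead of aborting the format.
            return self.default, field_name


fmt = LenientFormatter()
print(fmt.format("{user[name]}_{id}", user={}, id=1))
```

A missing top-level field and a missing nested key are handled the same way here; a format spec that only makes sense for numbers would still fail on the string default.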
Mike Fährmann
0c9762f00e
[mangapark] fix extraction
2018-12-22 13:52:48 +01:00
Mike Fährmann
c9ef5ed364
[luscious] ensure URLs have a scheme
2018-12-21 17:56:51 +01:00
Mike Fährmann
851ee9f89f
[sensescans] replace tests
...
the old ones got removed
2018-12-21 16:05:07 +01:00
Mike Fährmann
c14d44e1bc
[downloader:common] retry downloads on SSL errors ( #130 )
2018-12-14 16:33:04 +01:00
Mike Fährmann
0be7ee3106
[hitomi] fix image subdomains ( closes #142 )
...
galleries with an ID ending in 1 need some special treatment
2018-12-14 16:15:06 +01:00
Mike Fährmann
fe96835d25
[kissmanga] add fallback for chapter-string parsing ( #20 )
2018-12-14 16:08:36 +01:00
Mike Fährmann
4d73cc785d
update test results
2018-12-14 16:07:32 +01:00
Mike Fährmann
049a9575c4
[tumblr] fix inline extraction #2
...
Using only the "comment" field isn't enough ...
[ci skip]
2018-12-11 21:57:20 +01:00
Mike Fährmann
f6bf66f72c
[pixiv] create directory for each "work" item ( #136 )
2018-12-11 20:37:47 +01:00
Mike Fährmann
79f6755c60
[postprocessor:classify] handle missing "extension" ( #138 )
2018-12-11 20:10:02 +01:00
Mike Fährmann
b7a9f6cc49
[tumblr] improve inline extraction ( #137 )
2018-12-11 20:02:48 +01:00
Mike Fährmann
010da8372a
[instagram] relax test pattern
2018-12-11 19:59:28 +01:00
Mike Fährmann
1c6b9ba322
[readcomiconline] use HTTPS
2018-12-09 14:54:55 +01:00
Leonardo Taccari
2655a2ea02
Add support for instagram.com user profiles and pages ( #134 )
...
* [instagram] Add extractor for instagram.com user profiles and pages
The extractor scrapes `instagram.com/<user>' timelines and
`instagram.com/p/<shortcode>' pages by mimicking the behaviour of a web
browser and extracting the sharedData JSON of the single pages.
Please note that this means that for user timelines we also make an
extra request to the `instagram.com/p/<shortcode>' page, but this
lets us get consistent (and complete) information about the media
fetched.
The MD5 logic used for X-Instagram-GIS was documented in
<https://stackoverflow.com/questions/49786980/>
* [instagram] Test for keywords, not URLs, for GraphImage and GraphSidecar
URLs returned by Instagram do not seem to be stable, so avoid testing
for them and test for the returned keywords instead.
* [instagram] Improve test of InstagramProfilepageExtractor
Also check the count of media returned.
* [instagram] Several cleanups and improvements
- Change description and subcategories to generate a better description in
docs/supportedsites.rst
- Remove the unneeded InstagramExtractor.__init__()
- Use text.parse_int() instead of calling int() directly (the former is more
robust)
- Use self.request().json() instead of calling json.loads() on
self.request().text
- Add `pattern:' to check URLs in cases where they are not stable.
It seems that only the subdomain is not stable.
Thanks to @mikf!
2018-12-09 12:52:14 +01:00
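To illustrate why a forgiving integer parser is more robust than a bare int() call when scraping, here is a minimal sketch. It is an illustrative stand-in written for this note, not a copy of the project's text.parse_int(); the `default` parameter and its value of 0 are assumptions.

```python
def parse_int(value, default=0):
    """Convert value to int, returning default instead of raising."""
    # Empty strings and None are common in scraped data.
    if not value:
        return default
    try:
        return int(value)
    except (ValueError, TypeError):
        # Non-numeric text or unsupported types fall back to the default.
        return default


print(parse_int("42"), parse_int(""), parse_int("n/a", -1))
```

A bare int("") or int(None) would raise and abort the extraction, whereas this version lets the caller carry on with a placeholder value.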
HRXN
e80ee77d71
tumblr.py: update regex for video ( #133 )
...
Apparently there is another sub-domain for videos.
Not just
`vt(.media).tumblr`
`vtt(media).tumblr`
But also
`ve(.media).tumblr`
2018-12-09 09:07:46 +01:00
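A rough sketch of a pattern covering the three video sub-domain families named in the commit above. The exact host names, the optional `.media` part, and the `.com` suffix are assumptions read off the shorthand in the message; this is not the regex from tumblr.py, and `VIDEO_HOST` and `is_video_url` are hypothetical names.

```python
import re

# Assumed hosts: vt, vtt and ve, each optionally followed by ".media".
VIDEO_HOST = re.compile(r"https?://(?:vtt?|ve)(?:\.media)?\.tumblr\.com/")


def is_video_url(url):
    """Return True if url points at one of the assumed video hosts."""
    return VIDEO_HOST.match(url) is not None


print(is_video_url("https://ve.media.tumblr.com/tumblr_example.mp4"))
```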
Mike Fährmann
9a98b6769d
use extractor.request for API calls ( #130 )
...
... at least for OAuth1.0 based APIs (flickr, smugmug, tumblr)
2018-12-04 21:29:06 +01:00
Mike Fährmann
0225d90078
add exception name and traceback for OSErrors
2018-12-04 19:24:50 +01:00
Mike Fährmann
ad2cefda6b
[tumblr] in case of exception use filename as 'hash' ( #129 )
...
While a filename might not be a real 'hash', or comparable to what
Tumblr usually provides, it is still better than an empty string.
At least as long as "alternatives" in format strings aren't implemented.
2018-12-04 19:15:23 +01:00
Mike Fährmann
95636418ad
[tumblr] catch exception for 'hash' extraction ( fixes #129 )
2018-12-02 19:48:09 +01:00
Mike Fährmann
40e30694f3
[pinterest] fix pin.it redirects
2018-12-02 19:38:50 +01:00
Mike Fährmann
770200888e
[gfycat] use public API endpoint
2018-12-02 18:56:53 +01:00
Mike Fährmann
b1e22e8354
release version 1.6.1
2018-11-28 15:34:01 +01:00
Mike Fährmann
be52069cbc
update CHANGELOG and docs/supportedsites
2018-11-28 14:53:27 +01:00
Mike Fährmann
5d6e219fb2
[joyreactor] update tests
2018-11-28 14:52:19 +01:00
Mike Fährmann
c59f56fe7e
[gfycat] fix extraction
...
/cajax/get/<id> doesn't work anymore
2018-11-28 13:26:21 +01:00
Mike Fährmann
ba56827f36
[newgrounds] add user-, video-, image-extractors ( #119 )
2018-11-27 15:44:53 +01:00
Mike Fährmann
15890930ea
[mangafox] fix extraction
...
use mobile version since desktop version is obfuscated
2018-11-26 16:13:41 +01:00
Mike Fährmann
a4263fb253
[luscious] add extractor for search results ( closes #127 )
2018-11-25 18:57:51 +01:00
Mike Fährmann
fb53b5dd55
fix control+c during -j and range tests
2018-11-25 18:54:05 +01:00
Mike Fährmann
a0ae156edc
[pornreactor] add tag-, user-, post-extractors ( #114 )
2018-11-23 14:41:26 +01:00
Mike Fährmann
bacbc2e7bd
[joyreactor] try to prevent JsonDecodeErrors ( #114 )
2018-11-23 14:32:37 +01:00
Mike Fährmann
503d42a1c2
[joyreactor] add tag-, user-, post-extractors ( #114 )
2018-11-23 09:25:02 +01:00
Mike Fährmann
59bb434ba5
[flickr] add ability to download all albums of a user
...
for example with 'https://www.flickr.com/photos/shona_s/albums'
2018-11-23 09:09:37 +01:00
Mike Fährmann
13cb270326
set target directory before postprocessor init ( fixes #126 )
2018-11-21 22:21:26 +01:00
Mike Fährmann
9e188f6a21
[4chan] support 4channel.org domain
2018-11-21 17:40:38 +01:00
Mike Fährmann
041bd501fc
[hentaifoundry] unescape YII_CSRF_TOKEN value
...
This fixes the POST requests to /site/filters
2018-11-19 21:46:17 +01:00
Mike Fährmann
b828473aa3
retry HTTP requests for more exception classes
2018-11-19 15:49:13 +01:00
Mike Fährmann
c2e59b9a7d
update CHANGELOG.md
...
[ci skip]
2018-11-18 22:33:35 +01:00
Mike Fährmann
d4b2b73bef
release version 1.6.0
2018-11-17 18:28:02 +01:00
Mike Fährmann
ea9d1b6501
update README.rst
...
- point to pip3/python3 in installation-instructions (#118, #121)
- add dependency list
- update URLs to external resources
- remove incomplete list of supported sites
2018-11-17 17:46:19 +01:00
Mike Fährmann
c47482b110
smaller changes, missing docs, etc.
...
- make 'netrc' extractor-specific
- rename 'downloader.enable' to 'enabled'
- document 'downloader.ytdl.format'
- consistent newlines in configuration.rst
2018-11-16 18:18:07 +01:00
Mike Fährmann
b17a5d6f3b
give downloader classes proper names
2018-11-16 14:40:05 +01:00