Mike Fährmann
1adafdd3d0
document cache file requirement for DeviantArt refresh tokens
2019-10-13 23:01:57 +02:00
Mike Fährmann
df2b3c6888
restore OAuth2 authentication error messages
2019-10-13 22:48:01 +02:00
Mike Fährmann
6779512fc7
[nozomi] add post and tag extractors ( #388 )
2019-10-13 22:16:03 +02:00
Mike Fährmann
6abe5f5bbb
[patreon] fix pagination ( #444 )
...
The Patreon-provided URLs for the next set of posts aren't
always complete, i.e. they can be missing their scheme and
the subsequent double slash: "www.patreon.com/…"
2019-10-12 22:30:51 +02:00
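The Patreon pagination fix above amounts to repairing "next" URLs that arrive without a scheme. A minimal sketch of that kind of repair follows; the helper name `ensure_scheme` and the default scheme are assumptions for illustration, not the code from the commit.

```python
def ensure_scheme(url, scheme="https"):
    """Return 'url' unchanged if it has a scheme, else prepend one.

    Hypothetical helper: the name and the 'https' default are
    illustrative assumptions, not taken from the actual commit.
    """
    if url.startswith(("http://", "https://")):
        return url
    if url.startswith("//"):
        # Protocol-relative URL: only the scheme and colon are missing
        return scheme + ":" + url
    # Scheme and double slash are both missing, e.g. "www.patreon.com/..."
    return scheme + "://" + url
```

Applying it to every "next" link before requesting it leaves complete URLs untouched and repairs the incomplete ones.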
Mike Fährmann
ff1e4a86aa
release version 1.10.6
2019-10-11 20:15:56 +02:00
Mike Fährmann
d4ffd6c952
[yaplog] improve metadata extraction ( #443 )
...
- provide a fallback if there is no numerical image ID
- add a 'filename' field
- convert 'date' to an actual datetime object
2019-10-11 18:39:52 +02:00
Mike Fährmann
15af2f8464
[hitomi] fall back to /reader/ page if main page returns 404
...
Some galleries return a 404: Not Found error when trying to access
them through the main gallery URL, but their content is still
available on the respective /reader/ page.
2019-10-11 18:39:52 +02:00
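The hitomi fallback described above can be sketched as a generic "try the main URL, retry on the alternative URL after a 404" helper. Everything here is an assumption for illustration: `fetch` is a hypothetical callable returning a `(status_code, text)` pair, and both URLs are supplied by the caller rather than taken from the site.

```python
def fetch_with_fallback(fetch, main_url, reader_url):
    """Fetch 'main_url'; on a 404 response, try 'reader_url' instead.

    'fetch' is a hypothetical callable: fetch(url) -> (status, text).
    """
    status, text = fetch(main_url)
    if status == 404:
        # The main gallery page is gone, but the /reader/ page
        # may still serve the same content
        status, text = fetch(reader_url)
    return status, text
```

Passing the request function in as an argument keeps the sketch independent of any particular HTTP library.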
Mike Fährmann
8af59a4bba
fix & update docs
...
- update Requests links
- add example for --exec
- set '-dev' version
2019-10-11 18:36:25 +02:00
Mike Fährmann
dc6ad81e2e
[yaplog] prevent crash on empty posts ( #443 )
2019-10-10 21:19:09 +02:00
Mike Fährmann
94eb7c6cad
[deviantart] fix sta.sh extraction ( #436 )
2019-10-10 18:40:15 +02:00
Mike Fährmann
1032cfa34b
[downloader:http] extend mimetype map with archive formats
2019-10-10 18:30:23 +02:00
Mike Fährmann
27b5b2497e
[deviantart] fix download URLs ( #436 )
...
... except for sta.sh content.
Instead of using the old '/api/v1/oauth2/deviation/download' endpoint,
which started delivering URLs to 404 pages a while ago,
it is possible to get a download URL from the relatively new
'/_napi/da-browse/shared_api/deviation/extended_fetch' endpoint
used by DeviantArt's Eclipse interface.
The current strategy is therefore:
- Iterate over deviations using the OAuth2 API
- Fetch original download URLs with the new NAPI/Shared API
2019-10-09 20:35:52 +02:00
Mike Fährmann
93aac8dfea
[yaplog] fix incomplete image URLs ( #443 )
2019-10-09 17:42:15 +02:00
Mike Fährmann
a782b009b8
[yaplog] match blog names with '-' ( #443 )
2019-10-09 17:40:30 +02:00
Mike Fährmann
cf5e716b9d
[hitomi] fix image URLs
2019-10-09 17:21:37 +02:00
Mike Fährmann
ad81c07204
[postprocessor] match logger names of downloader modules
...
The logger name for a postprocessor object got changed to
"postprocessor.<module-name>" instead of just
"postprocessor"
2019-10-06 23:30:18 +02:00
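The naming scheme described above maps onto the dotted logger hierarchy of Python's standard `logging` module. A minimal sketch, assuming a hypothetical helper named `postprocessor_logger` (the function name is illustrative, not from the commit):

```python
import logging


def postprocessor_logger(module_name):
    """Return a logger named 'postprocessor.<module-name>'.

    Hypothetical helper; only the naming scheme comes from the
    commit message.
    """
    return logging.getLogger("postprocessor." + module_name)
```

Because logger names are hierarchical, settings applied to the "postprocessor" logger still reach "postprocessor.exec" and its siblings.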
Mike Fährmann
03bc8adfc7
[postprocessor:exec] run after file moved to target location ( #421 )
2019-10-06 23:12:22 +02:00
Mike Fährmann
35958bebd4
[postprocessor:exec] fix filename quoting on Windows ( #421 )
2019-10-06 15:09:00 +02:00
Mike Fährmann
b06c372e4d
[postprocessor:exec] improve; add command-line option ( #421 )
2019-10-05 23:46:55 +02:00
Mike Fährmann
5a54efa025
[xhamster] unescape 'title' and 'description'
2019-10-04 14:44:51 +02:00
Mike Fährmann
1b9bf4fc6e
[behance] fix 'tags' extraction
2019-10-03 17:36:02 +02:00
Mike Fährmann
bb97e87989
[komikcast] ignore banner image
2019-10-03 17:34:06 +02:00
Mike Fährmann
0ff90a3f7d
[gfycat] include title in default filenames ( closes #434 )
2019-10-02 21:46:01 +02:00
Mike Fährmann
fabdc3b0c6
release version 1.10.5
2019-09-28 22:13:41 +02:00
Mike Fährmann
de4e2029d1
[nsfwalbum] update test album
...
the old one is no longer available
2019-09-28 20:48:15 +02:00
Mike Fährmann
1faec285d1
[nijie] further improvements ( closes #423 )
...
- provide a 'user_name' metadata field
  (usually the same as 'artist_id', except for favorite downloads)
- extract the whole description text and properly escape HTML entities
- fixed an issue with titles or tags containing double quotes
2019-09-27 23:14:32 +02:00
Mike Fährmann
6d0a533d68
[reddit] respect 'comments:0' for single submissions ( #429 )
2019-09-27 23:11:28 +02:00
Mike Fährmann
803d8f814e
[oauth] update scope for reddit tokens ( #428 )
...
'/user/<username>/...' requires the 'history' scope to be accessible
(https://www.reddit.com/dev/api/#GET_user_{username}_{where})
2019-09-27 17:38:55 +02:00
Mike Fährmann
46ba173ded
[reddit] fix documentation inconsistencies ( closes #429 )
...
- Require 'reddit.comments' to be a number and convert it to an
integer to be extra sure
- Link to the README's OAuth section where appropriate
2019-09-27 17:34:10 +02:00
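The first point of the reddit commit above, requiring a number and converting it to an integer, could look roughly like this. The function name `comments_limit` and the choice to raise `ValueError` are assumptions for illustration; the actual commit may handle invalid values differently.

```python
import numbers


def comments_limit(value):
    """Validate a 'comments' option value and return it as an int.

    Hypothetical helper: rejects non-numeric values (and booleans,
    which Python otherwise treats as integers).
    """
    if isinstance(value, bool) or not isinstance(value, numbers.Real):
        raise ValueError("'comments' must be a number, got %r" % (value,))
    # int() truncates floats such as 10.0 to a plain integer
    return int(value)
```

With this check a value of `0` stays a valid setting, which matches the 'comments:0' case mentioned in the neighboring commit.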
Mike Fährmann
20eb6c401f
[nijie] improvements and fixes ( #423 )
...
- ignore unavailable image pages
- more metadata fields: artist_name, date, tags
- rename 'index' to 'num'
- improved code structure
2019-09-26 21:45:01 +02:00
Mike Fährmann
d1ea08c67d
[weibo] fixes and improvements
...
- ignore unavailable videos (fixes #427 )
- handle empty 'geo' fields
- consistent metadata fields for images and videos
2019-09-26 14:57:35 +02:00
Mike Fährmann
38d97f3da6
[deviantart] add debug message about API credentials ( #424 )
2019-09-25 21:20:55 +02:00
Mike Fährmann
80c2104fb5
[deviantart] fix 429 handling if 'fatal' is False ( closes #424 )
2019-09-25 21:16:35 +02:00
Mike Fährmann
913460240d
[reddit] fix 'extractor.blacklist()' arguments
...
The second argument must support 'append()'.
2019-09-24 23:01:12 +02:00
Mike Fährmann
22bac14452
[pixiv] match '/artworks/' URLs
2019-09-24 21:53:14 +02:00
Mike Fährmann
66cac207ac
[twitter] match and use 'i/web' status URLs
2019-09-24 21:18:05 +02:00
Mike Fährmann
5a1a0f5325
change text representation of user extractors to "User Profiles"
2019-09-22 22:21:48 +02:00
Mike Fährmann
946f2751e2
[reddit] add 'user' extractor ( closes #350 )
2019-09-22 22:18:17 +02:00
Mike Fährmann
c14abb9fb8
[reddit] improve URL parameter handling for subreddit links
2019-09-22 22:03:22 +02:00
Mike Fährmann
ee8b654464
[instagram] implement 'highlights' option ( closes #329 )
2019-09-21 23:38:20 +02:00
Mike Fährmann
f63c3097a9
[instagram] rework some code paths
...
- combine fetching an HTML page and extracting its 'shared_data'
- move 'shared_data' and field access info out of '_extract_page()'
- introduce a '_request_graphql()' method
2019-09-21 23:10:41 +02:00
Mike Fährmann
4330133114
[imgur] add 'favorite' extractor ( closes #420 )
...
… and use a newer site-internal API endpoint for user posts
2019-09-19 15:54:26 +02:00
Mike Fährmann
ee5e20221f
[imgth] fix image URLs
2019-09-19 14:56:48 +02:00
Mike Fährmann
b63b126808
[hentaicafe] extend URL pattern
2019-09-18 19:08:45 +02:00
Mike Fährmann
d780f0357e
[imgur] add user extractor
2019-09-17 22:58:18 +02:00
Mike Fährmann
11ea689013
[simplyhentai] fix image and video URLs
2019-09-16 21:37:16 +02:00
Mike Fährmann
15632a1570
[tsumino] fix extraction
2019-09-15 22:09:59 +02:00
Mike Fährmann
d92802fd37
[luscious] fix detection of unavailable galleries
2019-09-15 21:16:25 +02:00
Mike Fährmann
f99da2b866
[imgbb] detect invalid album and user profile links
...
and update test results, since the old album got deleted
2019-09-14 23:22:08 +02:00
Mike Fährmann
01bc7adadc
[deviantart] improve journal detection ( #419 )
...
Some journal-like posts are not reported to be journals (isJournal
is set to False), even though they have a textContent field.
https://www.deviantart.com/gliitchlord/art/brashstrokes-812942668
2019-09-14 22:45:22 +02:00
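The journal detection described above treats the presence of a textContent field as a second signal besides the isJournal flag. A minimal sketch, assuming the deviation is available as a plain dict with those two keys; the function name `looks_like_journal` is hypothetical and the real extractor may inspect the data differently.

```python
def looks_like_journal(deviation):
    """Return True if a deviation should be handled as a journal.

    Hypothetical check: trust 'isJournal' when it is set, and
    otherwise fall back to the presence of a 'textContent' field.
    """
    return bool(deviation.get("isJournal")) or "textContent" in deviation
```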