Mike Fährmann
102c482f5e
[reddit] skip invalid/failed gallery items (fixes #1127)
2020-11-21 17:34:38 +01:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
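A minimal sketch of what dropping '&' from such a character class changes. The base URL and both pattern strings are hypothetical stand-ins for illustration, not the project's actual patterns.

```python
import re

# Hypothetical patterns for illustration only; not the project's real ones.
BASE = r"https?://example\.org/r/(\w+)"
OLD_END = r"(?:[/?&#].*)?$"  # '&' was also accepted as a delimiter
NEW_END = r"(?:[/?#].*)?$"   # only '/', '?' and '#' delimit components

old_pattern = re.compile(BASE + OLD_END)
new_pattern = re.compile(BASE + NEW_END)


def matches(pattern, url):
    """Return True if 'pattern' matches 'url' from its start."""
    return pattern.match(url) is not None
```

With these stand-ins, a URL such as "https://example.org/r/pics&sort=new" is accepted by the old pattern and rejected by the new one, while '/', '?' and '#' continue to end the path component.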
Mike Fährmann
76dfa11a65
[reddit] add 'date' metadata field (closes #1068)
2020-10-16 15:48:04 +02:00
Mike Fährmann
d4ff767291
[reddit] improve gallery extraction (fixes #955)
2020-08-23 22:06:06 +02:00
Mike Fährmann
a57b6b3c3a
[reddit] handle deleted galleries (fixes #953)
2020-08-20 20:14:07 +02:00
Mike Fährmann
60ba3cb946
[reddit] support gallery posts (closes #920)
2020-08-03 22:06:15 +02:00
Mike Fährmann
5a6e750704
[reddit] fix AttributeError when using 'recursion' (fixes #879)
2020-07-09 19:19:05 +02:00
Mike Fährmann
94a08f0bcb
[reddit] limit title length in default filenames (#873)
2020-07-09 18:19:33 +02:00
Mike Fährmann
be04e44e2c
[reddit] catch JSON decode errors (#765)
2020-06-11 18:32:52 +02:00
Mike Fährmann
dfcf2a2c91
write OAuth token to cache by default (#616)
2020-05-25 22:35:45 +02:00
Mike Fährmann
0bf0146bfe
[reddit] don't send OAuth headers for file downloads (fixes #729)
2020-05-08 21:42:52 +02:00
Mike Fährmann
d02f7c1118
improve Extractor.wait()
...
- allow 'until' to be a datetime object
- do "time calculations" with UTC timestamps
- set a default 'reason'
2020-04-05 21:23:05 +02:00
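The three points above could look roughly like the following sketch. It is not the actual Extractor.wait() implementation: the function names, the 'sleep' and 'now' parameters, the default reason text, and the treatment of naive datetimes as UTC are all assumptions made for illustration.

```python
import time
from datetime import datetime, timezone


def seconds_until(until, now=None):
    """Return the non-negative number of seconds left until 'until'.

    'until' may be a UTC timestamp (int/float) or a datetime object.
    """
    if isinstance(until, datetime):
        if until.tzinfo is None:
            # Assumption: naive datetime objects are interpreted as UTC.
            until = until.replace(tzinfo=timezone.utc)
        until = until.timestamp()
    if now is None:
        now = time.time()  # current time as a UTC-based timestamp
    return max(0.0, float(until) - now)


def wait(until, reason="rate limit", sleep=time.sleep, now=None):
    """Sleep until 'until'; 'reason' has a (hypothetical) default value."""
    seconds = seconds_until(until, now)
    if seconds > 0:
        sleep(seconds)
    return seconds
```

Doing the arithmetic on UTC timestamps avoids mixing local and UTC wall-clock times when a rate-limit reset time arrives as a datetime.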
Mike Fährmann
dff33b260c
[reddit] add 'videos' option
2020-01-31 23:45:02 +01:00
Mike Fährmann
d086f30b42
[reddit] restore archive keys for i.redd.it images
2020-01-29 22:12:55 +01:00
Mike Fährmann
56f1c96168
implement 'parent-directory' option (#551)
2020-01-29 18:32:37 +01:00
Mike Fährmann
ae07f92f7e
[reddit] rewrite extractor logic (closes #551)
...
Handle images and videos hosted on Reddit "natively",
allowing them to use reddit-specific metadata to build directory
and file names.
2020-01-29 17:57:25 +01:00
Mike Fährmann
569747a78d
implement extractor.wait()
2020-01-04 23:42:07 +01:00
Mike Fährmann
ce54b8c04c
let extractors opt-out of cookie option usage
...
useful to avoid sending unnecessary cookies when all authentication
is done through OAuth tokens
2020-01-01 21:12:37 +01:00
Mike Fährmann
48e42e73fb
[reddit] change default value for 'comments' to '0'
2019-12-20 16:54:59 +01:00
Mike Fährmann
9c0928457a
[reddit] fix errors with 't1_…' submissions
2019-12-20 16:49:44 +01:00
Mike Fährmann
df2b3c6888
restore OAuth2 authentication error messages
2019-10-13 22:48:01 +02:00
Mike Fährmann
6d0a533d68
[reddit] respect 'comments:0' for single submissions (#429)
2019-09-27 23:11:28 +02:00
Mike Fährmann
46ba173ded
[reddit] fix documentation inconsistencies (closes #429)
...
- Require 'reddit.comments' to be a number and convert it to an
integer to be extra sure
- Link to the README's OAuth section where appropriate
2019-09-27 17:34:10 +02:00
Mike Fährmann
913460240d
[reddit] fix 'extractor.blacklist()' arguments
...
The second argument must support 'append()'.
2019-09-24 23:01:12 +02:00
Mike Fährmann
946f2751e2
[reddit] add 'user' extractor (closes #350)
2019-09-22 22:18:17 +02:00
Mike Fährmann
c14abb9fb8
[reddit] improve URL parameter handling for subreddit links
2019-09-22 22:03:22 +02:00
Mike Fährmann
f4bc75e854
fix rate limit handling for OAuth APIs (#368)
2019-08-03 13:43:00 +02:00
Mike Fährmann
09f37fde39
[reddit] move date-min/-max handling into Extractor class
2019-07-16 22:54:39 +02:00
Mike Fährmann
fdec59f8e2
replace extractor.request() 'expect' argument
...
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
2019-07-05 00:42:16 +02:00
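One way to read the two replacement arguments is sketched below. The exception classes, the helper name check_status, and the order in which the two options are evaluated are assumptions for illustration; the commit message only states what each argument is for.

```python
class HttpError(Exception):
    """Hypothetical error for unexpected HTTP status codes."""


class NotFoundError(HttpError):
    """Hypothetical error raised for a 404 response."""


def check_status(code, fatal=True, notfound=None):
    """Return True for success, False for a tolerated 4xx, else raise.

    fatal=False : allow 4xx status codes instead of raising
    notfound    : if set, raise NotFoundError(notfound) on 404
    """
    if code < 400:
        return True
    if not fatal and 400 <= code < 500:
        # Assumption: a tolerated 4xx takes precedence over 'notfound'.
        return False
    if notfound and code == 404:
        raise NotFoundError(notfound)
    raise HttpError("unexpected status code {}".format(code))
```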
Mike Fährmann
a2af2d2965
adjust cache maxage values
2019-03-14 22:21:49 +01:00
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
...
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)
Example: "https://example.org/path/filename.ext"
before:
- filename : filename.ext
- name : filename
- extension: ext
now:
- filename : filename
- extension: ext
2019-02-14 16:07:17 +01:00
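The "now" behaviour described above can be sketched as follows. This is a stand-in written for illustration, not the code of text.nameext_from_url(); the function name and its handling of query strings and percent-encoding are assumptions.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def nameext_from_url_sketch(url, data=None):
    """Store 'filename' (without extension) and 'extension' in 'data'."""
    if data is None:
        data = {}
    # Last path segment, ignoring any query string or fragment.
    name = unquote(urlsplit(url).path.rpartition("/")[2])
    filename, ext = posixpath.splitext(name)
    data["filename"] = filename
    data["extension"] = ext[1:]  # drop the leading '.'
    return data
```

For the example URL from the commit message, this yields 'filename' = "filename" and 'extension' = "ext", with no separate 'name' key.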
Mike Fährmann
2e516a1e3e
store the full original URL in Extractor.url
2019-02-12 18:46:48 +01:00
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
2019-02-11 13:31:10 +01:00
Mike Fährmann
abbd45d0f4
update handling of extractor URL patterns
...
When loading extractor classes during 'extractor.find(…)', their
'pattern' attribute will be replaced with a compiled version of itself.
2019-02-08 20:08:16 +01:00
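A rough sketch of the lazy compilation described above. The ExampleExtractor class, its pattern, and this find() signature are hypothetical and only show the idea of swapping a string 'pattern' for its compiled form on first lookup.

```python
import re


class ExampleExtractor:
    # Hypothetical extractor; 'pattern' starts out as a plain string.
    pattern = r"https?://example\.org/r/(\w+)"

    def __init__(self, match):
        self.match = match


def find(url, classes):
    """Return an instance of the first class whose pattern matches 'url'."""
    for cls in classes:
        if isinstance(cls.pattern, str):
            # Replace the class attribute with a compiled version of itself,
            # so later lookups skip the compilation step.
            cls.pattern = re.compile(cls.pattern)
        match = cls.pattern.match(url)
        if match:
            return cls(match)
    return None
```

After the first call to find(), ExampleExtractor.pattern is a compiled pattern object rather than a string.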
Mike Fährmann
6284731107
simplify extractor constants
...
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
6126615698
update URLs for supportedsites.rst
2019-01-30 16:18:22 +01:00
Mike Fährmann
4ab0960083
[reddit] add metadata to extracted URLs
2018-12-29 17:52:43 +01:00
Mike Fährmann
7471933d5f
use extractor.request for all other API calls
...
- deviantart
- pawoo
- pixiv
- reddit
2018-12-22 14:42:23 +01:00
Mike Fährmann
966a9ca3a0
update test results
2018-11-10 19:14:54 +01:00
Mike Fährmann
c9b8e6aefc
[reddit] fix submission-ID parsing (#104)
...
Uppercase characters caused a ValueError exception
2018-09-07 18:27:54 +02:00
Mike Fährmann
4313c95bc9
improve error message for OAuth2 authentication
2018-08-11 23:54:25 +02:00
Mike Fährmann
92fc199b07
[reddit] allow arbitrary subdomains
2018-05-13 11:23:23 +02:00
Mike Fährmann
3cec533c28
Merge branch 'archive'
2018-02-12 18:07:58 +01:00
Mike Fährmann
20af86b2ea
add more extractor tests
...
for mangastream, reddit and imgur
2018-02-12 17:07:18 +01:00
Mike Fährmann
34873dbd90
set 'archive_fmt' values
...
These are going to be used to create a unique ID for each image.
2018-02-01 15:30:49 +01:00
Mike Fährmann
cc0c2cca57
[reddit] add extractor for reddit-hosted images (closes #68)
2018-01-14 18:55:42 +01:00
Mike Fährmann
676602056c
[reddit] unescape output URLs
2017-12-19 22:22:43 +01:00
Mike Fährmann
864a63ed33
fix typo
...
[skip ci]
2017-10-10 17:42:06 +02:00
Mike Fährmann
f3fbaa5c3e
[reddit] allow users to override the API User-Agent
...
Only overriding the Client-ID is not enough if you want to follow
Reddit's API access rules [1].
[1] https://github.com/reddit/reddit/wiki/API#rules
2017-10-10 17:29:46 +02:00
Mike Fährmann
0dedbe759c
enable '--chapter-filter'
...
The same filter infrastructure that can be applied to image URLs now
also works for manga chapters and other delegated URLs.
TODO: actually provide any metadata (currently only deviantart and
imagefap are supported).
2017-09-12 16:19:00 +02:00