gallery-dl

mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-23 19:22:32 +01:00

Author	SHA1	Message	Date
Vrihub	96fcff182c	generic extractor (#735) * Generic extractor, see issue #683 * Fix failed test_names test, no subcategory needed * Prefix directory_fmt with "generic" * Relax regex (would break some urls) * Flake8 compliance * pattern: don't require a scheme This fixes a bug when we force the generic extractor on urls without a scheme (that are allowed by all other extractors). * Fix using g: and r: on urls without http(s) scheme Almost all extractors accept urls without an initial http(s) scheme. Many extractors also allow for generic subdomains in their "pattern" variable; some of them implement this with the regex character class "[^.]+" (everything but a dot). This leads to a problem when the extractor is given a url starting with g: or r: (to force using the generic or recursive extractor) and without the http(s) scheme: e.g. with "r:foobar.tumblr.com" the "r:" is wrongly considered part of the subdomain. This commit fixes the bug, replacing the too generic "[^.]+" with the more specific "[\w-]+" (letters, digits and "-", the only characters allowed in domain names), which is already used by some extractors. * Relax imageurl_pattern_ext: allow relative urls * First round of small suggested changes * Support image urls starting with "//" * self.baseurl: remove trailing slash * Relax regexp (didn't catch some image urls) * Some fixes and cleanup * Fix domain pattern; option to enable extractor Fixed the domain section for "pattern", to pass "test_add" and "test_add_module" tests. Added the "enabled" configuration option (default False) to enable the generic extractor. Using "g(eneric):URL" forces using the extractor.	2021-12-29 22:39:29 +01:00
Mike Fährmann	bd08ee2859	remove most 'yield Message.Version' statements only leave them in oauth.py as noop results	2021-08-16 03:10:48 +02:00
Mike Fährmann	968d3e8465	remove '&' from URL patterns '/?&#' -> '/?#' and '?&#' -> '?#' According to https://www.ietf.org/rfc/rfc3986.txt, URLs are "organized hierarchically" by using "the slash ("/"), question mark ("?"), and number sign ("#") characters to delimit components"	2020-10-22 23:31:25 +02:00
Mike Fährmann	19bf76bcf8	update extractor test results	2020-08-03 21:57:00 +02:00
Mike Fährmann	e62ebb4643	update CHANGELOG before building sdist and wheel packages	2020-06-27 19:45:09 +02:00
Mike Fährmann	921914141e	[imgbb] improve redirect handling	2020-04-20 23:36:57 +02:00
Mike Fährmann	ce5e2a58fe	[imgbb] update test results Image server domain changed from https://image.ibb.co/ to https://i.ibb.co/	2020-03-01 20:38:25 +01:00
Mike Fährmann	521fcd2eb9	[imgbb] fix error in galleries without user info (closes #471)	2019-11-10 17:10:51 +01:00
Mike Fährmann	8061263d4c	[imgbb] improve pagination logic - avoid unnecessary API calls for small or empty galleries - combine duplicate code	2019-11-10 17:07:27 +01:00
Mike Fährmann	de83ae4576	make 'method' argument of Extractor.request keyword-only	2019-11-05 17:28:09 +01:00
Mike Fährmann	f99da2b866	[imgbb] detect invalid album and user profile links and update test results, since the old album got deleted	2019-09-14 23:22:08 +02:00
Mike Fährmann	189acbeac9	[imgbb] add extractor for individual images (closes #363)	2019-08-05 22:52:08 +02:00
Mike Fährmann	2c839f3760	[imgbb] add user extractor + login support (#361)	2019-08-01 21:39:20 +02:00
Mike Fährmann	2153206093	[imgbb] add album extractor (#361)	2019-07-30 23:11:19 +02:00

14 Commits
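
The subdomain fix described in commit 96fcff182c can be illustrated with a minimal sketch. The two patterns below are simplified, hypothetical stand-ins and are not copied from any gallery-dl extractor; the helper name `subdomain` is likewise an assumption made for this example. It only shows why "[^.]+" swallows the "r:" prefix on a URL without a scheme, while "[\w-]+" does not.

```python
import re

# Hypothetical, simplified patterns for illustration only;
# the real extractor patterns in gallery-dl are more involved.
OLD_PATTERN = re.compile(r"(?:https?://)?([^.]+)\.tumblr\.com")
NEW_PATTERN = re.compile(r"(?:https?://)?([\w-]+)\.tumblr\.com")


def subdomain(pattern, url):
    """Return the captured subdomain, or None if the pattern does not match."""
    match = pattern.match(url)
    return match.group(1) if match else None


# "[^.]+" accepts anything except a dot, so the "r:" prefix is captured too.
print(subdomain(OLD_PATTERN, "r:foobar.tumblr.com"))

# "[\w-]+" rejects ":", so the prefixed URL no longer matches this pattern
# and the prefix can be handled separately.
# Note: in Python, \w also matches "_" in addition to letters and digits.
print(subdomain(NEW_PATTERN, "r:foobar.tumblr.com"))

# URLs without the prefix still match, with or without a scheme.
print(subdomain(NEW_PATTERN, "foobar.tumblr.com"))
print(subdomain(NEW_PATTERN, "https://foobar.tumblr.com"))
```

With the stricter character class, a prefixed URL such as "r:foobar.tumblr.com" simply fails to match the site-specific pattern instead of producing a wrong subdomain.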