gallery-dl

mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-23 19:22:32 +01:00

Author	SHA1	Message	Date
Mike Fährmann	32c30754d1	[tumblr] warn when unable to fetch higher-resolution images (#2957) and download the smaller version instead of failing with a 404 error	2022-09-26 12:05:34 +02:00
Mike Fährmann	46fe469c53	[tumblr] implement 'ratelimit' option (#2919)	2022-09-17 14:10:33 +02:00
Mike Fährmann	7a799df17f	[tumblr] pre-compile regular expressions	2022-09-13 17:50:48 +02:00
blankie	9745b48830	[tumblr] attempt to fetch high-quality inline images (#2877) * [tumblr] attempt to fetch high-quality images (again) Fixes #1846, and fixes #1344 * slight refactor * update configuration.rst entry	2022-08-31 10:53:50 +02:00
blankie	e4cff67aaa	[tumblr] add count metadata field (#2804) Fixes #2778	2022-08-18 18:24:37 +02:00
Mike Fährmann	a27b17481f	[tumblr] restrict condition for calling _original_image	2022-08-11 12:20:39 +02:00
Mike Fährmann	df1c643dda	[tumblr] attempt to extract full-resolution photos - for photos with apparent width == 2048 or height == 3072 - can be disabled with 'original' option	2022-08-10 20:01:46 +02:00
blankie	5b63df46c0	[tumblr] attempt to get higher-quality images (#2761)	2022-07-27 10:47:43 +02:00
Mike Fährmann	a566e63cdf	[tumblr] support '/blog/view' URLs (#2760)	2022-07-15 15:22:54 +02:00
Mike Fährmann	6ea3ff5173	[tumblr] notify users about registering an oauth application if they hit the daily rate limit and are using default API credentials	2022-03-06 16:28:53 +01:00
Vrihub	96fcff182c	generic extractor (#735) * Generic extractor, see issue #683 * Fix failed test_names test, no subcategory needed * Prefix directory_fmt with "generic" * Relax regex (would break some urls) * Flake8 compliance * pattern: don't require a scheme This fixes a bug when we force the generic extractor on urls without a scheme (that are allowed by all other extractors). * Fix using g: and r: on urls without http(s) scheme Almost all extractors accept urls without an initial http(s) scheme. Many extractors also allow for generic subdomains in their "pattern" variable; some of them implement this with the regex character class "[^.]+" (everything but a dot). This leads to a problem when the extractor is given a url starting with g: or r: (to force using the generic or recursive extractor) and without the http(s) scheme: e.g. with "r:foobar.tumblr.com" the "r:" is wrongly considered part of the subdomain. This commit fixes the bug, replacing the too generic "[^.]+" with the more specific "[\w-]+" (letters, digits and "-", the only characters allowed in domain names), which is already used by some extractors. * Relax imageurl_pattern_ext: allow relative urls * First round of small suggested changes * Support image urls starting with "//" * self.baseurl: remove trailing slash * Relax regexp (didn't catch some image urls) * Some fixes and cleanup * Fix domain pattern; option to enable extractor Fixed the domain section for "pattern", to pass "test_add" and "test_add_module" tests. Added the "enabled" configuration option (default False) to enable the generic extractor. Using "g(eneric):URL" forces using the extractor.	2021-12-29 22:39:29 +01:00
Mike Fährmann	ddd48ceee5	update extractor test results	2021-03-28 23:06:44 +02:00
Mike Fährmann	968d3e8465	remove '&' from URL patterns '/?&#' -> '/?#' and '?&#' -> '?#' According to https://www.ietf.org/rfc/rfc3986.txt, URLs are "organized hierarchically" by using "the slash ("/"), question mark ("?"), and number sign ("#") characters to delimit components"	2020-10-22 23:31:25 +02:00
Mike Fährmann	3918b69677	remove 'extractor.blacklist' context manager	2020-09-11 13:17:35 +02:00
Mike Fährmann	7876a03ece	[tumblr] create directories for each post (fixes #965) This changes the identifiers for directory format string fields. Everything blog related is now inside a 'blog' object and not at the "base level" anymore. E.g. '{name}' for directories is now '{blog[name]}' (or '{blog_name}', since that is also available)	2020-08-31 21:58:20 +02:00
Mike Fährmann	2ecf1efb16	update extractor test results - tumblr: remove deleted post - jaiminisbox: replace removed manga/chapters - smugmug: one inconsequential field got removed	2020-07-18 15:12:28 +02:00
Mike Fährmann	5e5be67c26	[tumblr] prevent KeyErrors when using reblogs=same-blog (fixes #851)	2020-06-25 19:00:12 +02:00
Mike Fährmann	09cc9dbec0	prevent flake8 errors from comments looking like type annotations	2020-05-12 20:08:05 +02:00
Mike Fährmann	d02f7c1118	improve Extractor.wait() - allow 'until' to be a datetime object - do "time calculations" with UTC timestamps - set a default 'reason'	2020-04-05 21:23:05 +02:00
Mike Fährmann	d94215d119	[tumblr] replace '-' with ' ' in tag searches (fixes #611) To search for tags with actual minus signs in them (there shouldn't be too many,) manually replace those with url-encoded minus characters ('-' -> '%2d') before inputting them into gallery-dl: https://s679874.tumblr.com/tagged/tag-with-minus -> https://s679874.tumblr.com/tagged/tag%2dwith%2dminus	2020-02-17 23:29:13 +01:00
Mike Fährmann	3811fd8a25	fix time formatting for Python 3.4 and 3.5 'datetime.time.isoformat()' only has an optional 'timespec' argument since Python 3.6.	2020-01-05 00:47:10 +01:00
Mike Fährmann	569747a78d	implement extractor.wait()	2020-01-04 23:42:07 +01:00
Mike Fährmann	ce54b8c04c	let extractors opt-out of cookie option usage useful to avoid sending unnecessary cookies when all authentication is done through OAuth tokens	2020-01-01 21:12:37 +01:00
Mike Fährmann	c4702ec9b6	simplify some logging calls	2019-12-10 21:30:08 +01:00
Mike Fährmann	4409d00141	embed error messages in StopExtraction exceptions	2019-10-28 16:39:49 +01:00
Mike Fährmann	d5fbb2d9de	[tumblr] ignore audio links from Spotify etc.	2019-09-07 18:18:12 +02:00
Mike Fährmann	1133b7fcbd	[smugmug] update unit tests The account used for tests before has been deleted.	2019-07-19 17:16:24 +02:00
Mike Fährmann	8d1ae9b715	[tumblr] enable date-min/-max/-format options (#337)	2019-07-17 14:36:41 +02:00
Mike Fährmann	208202b962	[tumblr] improve error handling (#297) In some cases Tumblr's API responds with an HTML document. Trying to decode it as JSON would raise an uncaught exception.	2019-06-04 14:02:17 +02:00
Mike Fährmann	add7e693d0	[tumblr] provide parsed 'date' metadata (#232)	2019-04-29 17:30:42 +02:00
Mike Fährmann	fb14f80d62	[tumblr] fix avatar URLs for non-OAuth1.0 calls (closes #193)	2019-03-17 11:07:22 +01:00
Mike Fährmann	d0059cab79	[tumblr] check for null URLs (closes #165)	2019-02-19 13:49:55 +01:00
Mike Fährmann	5530871b5a	change results of text.nameext_from_url() Instead of getting a complete 'filename' from an URL and splitting that into 'name' and 'extension', the new approach gets rid of the complete version and renames 'name' to 'filename'. (Using anything other than {extension} for a filename extension doesn't really work anyway) Example: "https://example.org/path/filename.ext" before: - filename : filename.ext - name : filename - extension: ext now: - filename : filename - extension: ext	2019-02-14 16:07:17 +01:00
Mike Fährmann	4b1880fa5e	propagate 'match' to base extractor constructor	2019-02-11 13:31:10 +01:00
Mike Fährmann	6284731107	simplify extractor constants - single strings for URL patterns - tuples instead of lists for 'directory_fmt' and 'test' - single-tuple tests where applicable	2019-02-08 13:45:40 +01:00
Mike Fährmann	0afa913de4	[tumblr] add tests for hidden and private blogs (#145) Hidden / dashboard-only blogs are pretty straightforward and "only" require a valid 'access-token' and 'access-token-secret' for the given 'api-key' and 'api-secret', so that signed OAuth1.0 requests are possible. Private / password protected blogs on the other hand are a bit cumbersome. In addition to a valid 'access-token' and 'access-token-secret', they also require the account belonging to those tokens to be a member of the blog itself. Knowing the password and entering it in the website isn't enough to access a blog through the API. Following a private blog is also impossible, so that option can't work either.	2019-01-03 16:12:24 +01:00
Mike Fährmann	2f4f60de33	[tumblr] add tests for each post type	2018-12-27 22:41:42 +01:00
Mike Fährmann	28f9539551	[tumblr] change default values for post types and inline media	2018-12-26 18:55:59 +01:00
Mike Fährmann	5be95034ba	[tumblr] add option to download avatars (#137)	2018-12-26 14:29:30 +01:00
Mike Fährmann	2e5f82e59e	[tumblr] don't follow 'external' Tumblr URLs (#139)	2018-12-22 14:05:43 +01:00
Mike Fährmann	049a9575c4	[tumblr] fix inline extraction #2 Using only the "comment" field isn't enough ... [ci skip]	2018-12-11 21:57:20 +01:00
Mike Fährmann	b7a9f6cc49	[tumblr] improve inline extraction (#137)	2018-12-11 20:02:48 +01:00
HRXN	e80ee77d71	tumblr.py: update regex for video (#133) There seems to be another sub-domain for videos, apparently.. Not just `vt(.media).tumblr` `vtt(media).tumblr` But also `ve(.media).tumblr`	2018-12-09 09:07:46 +01:00
Mike Fährmann	9a98b6769d	use extractor.request for API calls (#130) ... at least for OAuth1.0 based APIs (flickr, smugmug, tumblr)	2018-12-04 21:29:06 +01:00
Mike Fährmann	ad2cefda6b	[tumblr] in case of exception use filename as 'hash' (#129) While a filename might not be a real 'hash', or comparable to what tumbler usually provides, it is still better than an empty string. At least as long as "alternatives" in format strings aren't implemented.	2018-12-04 19:15:23 +01:00
Mike Fährmann	95636418ad	[tumblr] catch exception for 'hash' extraction (fixes #129)	2018-12-02 19:48:09 +01:00
Mike Fährmann	7742cf8601	[tumblr] change 'reblogs' option (#103) - rename "deleted" to "same-blog" - change test for deleted original post to test if original post owner has the same UUID (full blog name) as the one being downloaded from - add 'blog[uuid]' metadata to allow comparison with 'reblogged_from_uuid'	2018-09-10 15:40:25 +02:00
Mike Fährmann	d4d95d3154	[tumblr] improve rewrite rules for video URLs	2018-09-09 14:09:47 +02:00
Mike Fährmann	a666ddd16b	[tumblr] extend 'reblogs' functionality (#103) Setting 'reblogs' to "deleted" will check if the parent post of a reblog has been deleted and download its media content if that is the case, otherwise it will be skipped. This is a rather costly operation (1 API request per reblogged post) and should therefore be used with care.	2018-09-07 19:13:52 +02:00
Mike Fährmann	b4eca2633e	[tumblr] support /archive URLs	2018-09-06 11:09:13 +02:00
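
Note on commit 96fcff182c: the subdomain problem described in that message can be reproduced with a short sketch. The two patterns below are simplified stand-ins written for illustration and are not the actual patterns from tumblr.py; the names LOOSE and STRICT are hypothetical.

```python
import re

# Hypothetical, simplified patterns (NOT gallery-dl's real ones).
# LOOSE uses "[^.]+" for the subdomain, STRICT uses "[\w-]+".
LOOSE = re.compile(r"(?:https?://)?([^.]+)\.tumblr\.com")
STRICT = re.compile(r"(?:https?://)?([\w-]+)\.tumblr\.com")

url = "r:foobar.tumblr.com"

# "[^.]+" accepts ':' and so swallows the "r:" prefix into the subdomain.
loose_match = LOOSE.match(url)
print(loose_match.group(1))

# "[\w-]+" cannot cross the ':' so the prefixed URL is not matched at all,
# leaving the "r:" prefix to be handled elsewhere.
print(STRICT.match(url))

# Without the prefix, the strict pattern captures the blog name only.
print(STRICT.match("foobar.tumblr.com").group(1))
```

In Python, "\w" also matches the underscore and non-ASCII letters, so "[\w-]+" is slightly broader than "letters, digits and '-'"; it still excludes ':' and '.', which is what matters for this bug.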


104 Commits
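
Note on commit d94215d119: the tag handling described in that message ('-' becomes ' ', while '%2d' survives as a real minus sign) could be sketched as below. This is an assumption about how such a conversion might be ordered, not a copy of gallery-dl's code; the function name tag_from_url_segment is hypothetical.

```python
from urllib.parse import unquote


def tag_from_url_segment(segment):
    """Turn the part after '/tagged/' into a search tag (illustrative only).

    Literal '-' characters are replaced with spaces first; percent-escapes
    are decoded afterwards, so an encoded minus ('%2d') stays a minus.
    """
    return unquote(segment.replace("-", " "))


print(tag_from_url_segment("tag-with-minus"))
print(tag_from_url_segment("tag%2dwith%2dminus"))
```

Replacing before decoding is the design point: if the escapes were decoded first, the encoded minus signs would be indistinguishable from the literal ones and would also be turned into spaces.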