gallery-dl

mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-23 11:12:40 +01:00

Author	SHA1	Message	Date
Mike Fährmann	38bc6430d3	[downloader:http] don't overwrite existing '_mtime' fields	2020-04-10 23:08:03 +02:00
Mike Fährmann	115fd2c6f2	"fix" incomplete MIME types (#632) e-/exhentai's original image downloads currently send incomplete/invalid Content-Type headers, "jpg" instead of "image/jpg" etc, since the last update. (https://forums.e-hentai.org/index.php?showtopic=236113) This change prepends any Content-Type value missing a media type specification with "image/", transforming it into a valid MIME type. (A global solution to a local problem, but it shouldn't cause any issues anywhere else)	2020-03-03 21:21:57 +01:00
Mike Fährmann	adcd7cb24a	[downloader:http] add another MIME type for '.rar' files (#628)	2020-03-01 20:42:13 +01:00
Mike Fährmann	380b693fad	[downloader:http] add more MIME types for '.bmp' files (#621)	2020-02-23 16:51:04 +01:00
Mike Fährmann	760b9b4db4	add remove_file() and remove_directory() helpers these functions call os.unlink() or os.rmdir() while catching and suppressing potential OSErrors	2020-01-18 00:21:26 +01:00
Mike Fährmann	c4702ec9b6	simplify some logging calls	2019-12-10 21:30:08 +01:00
Mike Fährmann	c59b98c81b	[downloader:http] improve rate limit handling - Move the download "logic" with rate limit checks into its own method that only gets used if a rate limit should be enforced - Fix an issue where suspending gallery-dl during a download would basically ignore the rate limit for the remaining download when resuming its execution.	2019-12-09 20:34:22 +01:00
Mike Fährmann	bbbafc1c24	[downloader:http] catch both possible SSLException instances With pyOpenSSL installed, but disabled, the SSLError exception would be set to the one from pyOpenSSL, which could never get raised. This commit solves this problem by catching both, the native SSLError exception as well as the one from pyOpenSSL (if available)	2019-12-09 20:34:10 +01:00
Mike Fährmann	bbbeff4c41	[downloader.http] implement file-specific HTTP headers	2019-11-19 23:50:54 +01:00
Mike Fährmann	d44f790e81	adjust output for HTTP status related errors	2019-10-27 23:55:02 +01:00
Mike Fährmann	1032cfa34b	[downloader:http] extend mimetype map with archive formats	2019-10-10 18:30:23 +02:00
Mike Fährmann	8eaae58045	[downloader:http] change log message level to 'debug'	2019-08-29 23:05:47 +02:00
Mike Fährmann	ebabc5caf1	[downloader:http] treat 416 without downloaded data as error Downloading https://pbs.twimg.com/media/EB2cGUYX4AI2Vuu.jpg:orig (NSFW) sometimes returns a 416 status code, even though no 'Range' header was sent and no data was downloaded prior. This code usually means a file has already been downloaded completely and the download method indicates success, but in this case it causes an exception down the pipeline since no file was created.	2019-08-20 00:15:17 +02:00
Mike Fährmann	0bb873757a	update PathFormat class - change 'has_extension' from a simple flag/bool to a field that contains the original filename extension - rename 'keywords' to 'kwdict' and some other stuff as well - inline 'adjust_path()' - put enumeration index before filename extension (#306)	2019-08-12 21:40:37 +02:00
Mike Fährmann	b7fb93e2b2	[downloader:http] add 'adjust-extensions' option	2019-08-08 16:54:20 +02:00
Mike Fährmann	16c582aaf9	implement 'mtime' post-processor (#332) This can set a file's modification time according to a UNIX timestamp or a datetime object from its metadata.	2019-07-14 22:39:17 +02:00
Mike Fährmann	8966930c5c	[downloader:http] try to import SSL exception class from OpenSSL (#324)	2019-07-01 20:10:26 +02:00
Mike Fährmann	69205df68d	allow '-1' for infinite retries (#300)	2019-06-30 23:10:47 +02:00
Mike Fährmann	f7b5c4c3e7	use values of 'retries' options correctly The RE-tries option now specifies exactly that: the maximum number a failed HTTP request is re-tried. For example a value of 2 will now correctly stop after 3 attempts: the initial one + 2 re-tries. The maximum wait-time now also caps at 30min and increases exponentially for both extractor.request() and downloader.http.download().	2019-06-30 23:10:18 +02:00
Mike Fährmann	db3f52881a	add 'mtime' option	2019-06-20 17:19:44 +02:00
Mike Fährmann	f4ba98771d	use Last-Modified header to set file modification time (#236, #277)	2019-06-19 23:16:32 +02:00
Mike Fährmann	179d112083	[downloader] overhaul http and text modules Get rid of the modular structure and simplify/specialize those modules.	2019-06-19 22:56:11 +02:00
Mike Fährmann	b17a5d6f3b	give downloader classes proper names	2018-11-16 14:40:05 +01:00
Mike Fährmann	4a348990f4	adjust value resolution for retries/timeout/verify options This change introduces 'extractor.*.retries/timeout/verify' options as a general way to set these values for all HTTP requests. 'downloader.http.retries/timeout/verify' is a way to override these options for file downloads only and will fall back to 'extractor..…* values if they haven't been explicitly set. Also: downloader classes now take an extractor object as first argument instead of a requests.session.	2018-10-07 21:13:39 +02:00
Mike Fährmann	cc36f88586	rename safe_int to parse_int; move parse_* to text module	2018-04-20 14:53:21 +02:00
Mike Fährmann	ebe9b0a04c	another attempt at downloader retry behavior This commit changes the general behavior from 'Retry on every exception and abort on DownloadError' to 'Only retry on DownloadRetry exceptions and abort on every other one' The previous version would have retried on several states which would have no chance of ever succeeding (invalid URLs, etc.)	2017-12-07 15:31:14 +01:00
Mike Fährmann	8f518e03f8	add options to set maximum download rate - -r/--limit-rate as cmdline option - downloader.http.rate as config option This implementation very roughly uses the idea of the token bucket algorithm [1] and mostly uses Wget's approach [2] as inspiration. [1] https://en.wikipedia.org/wiki/Token_bucket [2] http://git.savannah.gnu.org/cgit/wget.git/tree/src/retr.c?h=v1.19.2&id=ba6b44f6745b14dce414761a8e4b35d31b176bba#n111	2017-12-02 01:47:26 +01:00
Mike Fährmann	3dc1169736	use own mapping before relying on the 'mimetypes' module	2017-12-01 13:50:31 +01:00
Mike Fährmann	79bcaa8726	improve downloader retry behavior - only retry download on 5xx and 429 status codes - immediately fail on 4xx status codes	2017-11-10 21:46:18 +01:00
Mike Fährmann	707b15b586	create missing directories for 'part-directory' also some code improvements regarding downloader config values	2017-10-27 12:22:45 +02:00
Mike Fährmann	caf26412dd	add option to set alternate location of .part files (#29) Note: The path set for 'downloader.*.part-directory' needs to point to an already existing directory.	2017-10-26 00:16:48 +02:00
Mike Fährmann	963670d73b	add options to control usage of .part files (#29) - '--no-part' command line option to disable them - 'downloader.http.part' and 'downloader.text.part' config options Disabling .part files restores the behaviour of the old downloader implementation.	2017-10-24 23:33:44 +02:00
Mike Fährmann	b0353aa02d	rewrite download modules (#29) - use '.part' files during file-download - implement continuation of incomplete downloads - check if file size matches the one reported by server	2017-10-24 12:53:03 +02:00
Mike Fährmann	2e982f56af	use 'Content-Length' to determine incomplete downloads (#29)	2017-10-20 18:56:18 +02:00
Mike Fährmann	b8862ff15e	add 'downloader.http.verify' option (also: change the default 'timeout' from None to 30)	2017-08-31 15:21:08 +02:00
Mike Fährmann	58e95a7487	share extractor and downloader sessions There was never any "good" reason for the strict separation between extractors and downloaders. This change allows for reduced resource usage (probably unnoticeable) and less lines of code at the "cost" of tighter coupling.	2017-06-30 19:38:14 +02:00
Mike Fährmann	fac6c02224	[downloader] fix extension from content-type	2017-06-19 09:24:00 +02:00
Mike Fährmann	48a5b11204	fix error if no file extension is found	2017-04-26 12:31:42 +02:00
Mike Fährmann	e3212dd98f	fix some smaller stuff - remove support for old windows config paths - catch exception if cache-database can't be opened - fix username/password settings for unit tests - rename variable 'max_tries' to 'retries'	2017-03-27 14:30:32 +02:00
Mike Fährmann	e2b5cd9918	change config-path for 'retries' and 'timeout'	2017-03-26 18:24:46 +02:00
Mike Fährmann	0b5076815d	always delete incompletely downloaded files	2017-03-21 15:53:43 +01:00
Mike Fährmann	22910f9562	improve error handling of http file downloads (#10)	2017-03-16 04:17:35 +01:00
Mike Fährmann	4f123b8513	code adjustments according to pep8	2017-01-30 19:40:15 +01:00
Mike Fährmann	3c1daef839	don't delete downloaded files in certain edge cases	2016-11-27 23:43:25 +01:00
Mike Fährmann	2b2bdce366	don't raise an exception if a download fails (#5)	2016-11-23 13:07:44 +01:00
Mike Fährmann	dd8236e733	enable non-standard MIME types	2016-09-30 16:41:49 +02:00
Mike Fährmann	29692c5784	get extension from Content-Type header if not provided	2016-09-30 12:32:48 +02:00
Mike Fährmann	4b377ccc09	use output-module during downloads	2015-12-01 21:22:58 +01:00
Mike Fährmann	28fa7c53b4	docstrings and other small fixes for downloaders	2015-04-10 21:45:41 +02:00
Mike Fährmann	5545624da1	use separate session in http downloader	2015-04-10 19:19:12 +02:00
Mike Fährmann	cd4a699dd2	add 'Headers' and 'Cookies' message	2015-04-08 19:06:50 +02:00
Mike Fährmann	deef91eddc	initial commit	2014-10-12 21:56:44 +02:00
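
Two behaviours described in the log above lend themselves to a short illustration: the Content-Type fix-up from 115fd2c6f2 and the 'retries' semantics from f7b5c4c3e7 and 69205df68d. The following is a minimal sketch and not gallery-dl's actual code. The function names (normalize_content_type, wait_times, download) are hypothetical, and the wait sequence starting at one second and doubling is an assumption; the log only states that the wait grows exponentially and caps at 30 minutes.

```python
import itertools

MAX_WAIT = 30 * 60  # 30-minute cap named in the commit message, in seconds


def normalize_content_type(value):
    """Return the media type, prepending 'image/' if no type is given.

    Example: "jpg" becomes "image/jpg"; "image/png" is left unchanged.
    """
    mime = value.partition(";")[0].strip()  # drop parameters like charset
    if "/" not in mime:
        mime = "image/" + mime
    return mime


def wait_times(retries):
    """Yield the wait in seconds before each re-try.

    A negative value (e.g. -1) means "retry forever".
    The 1s, 2s, 4s, ... progression is assumed for illustration.
    """
    counter = itertools.count(1) if retries < 0 else range(1, retries + 1)
    for n in counter:
        yield min(2 ** (n - 1), MAX_WAIT)


def download(fetch, retries=2):
    """Call fetch() once, then re-try up to `retries` more times.

    `fetch` is a hypothetical callable returning True on success.
    Returns a (success, attempts) tuple.
    """
    attempts = 1
    if fetch():
        return True, attempts
    for _wait in wait_times(retries):
        # A real downloader would sleep for `_wait` seconds here.
        attempts += 1
        if fetch():
            return True, attempts
    return False, attempts
```

With retries=2, a download that always fails stops after three attempts in total: the initial one plus two re-tries, matching the example given in the f7b5c4c3e7 message.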

102 Commits