Mike Fährmann
9c65db2a92
consistent 'with open(…) as fp:' syntax
2024-06-14 01:22:00 +02:00
Mike Fährmann
179d112083
[downloader] overhaul http and text modules
...
Get rid of the modular structure and simplify/specialize those modules.
2019-06-19 22:56:11 +02:00
Mike Fährmann
b17a5d6f3b
give downloader classes proper names
2018-11-16 14:40:05 +01:00
Mike Fährmann
4a348990f4
adjust value resolution for retries/timeout/verify options
...
This change introduces 'extractor.*.retries/timeout/verify' options
as a general way to set these values for all HTTP requests.
'downloader.http.retries/timeout/verify' is a way to override these
options for file downloads only and will fall back to 'extractor.*.…'
values if they haven't been explicitly set.
Also: downloader classes now take an extractor object as first argument
instead of a requests.session.
2018-10-07 21:13:39 +02:00
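The fallback order described in the commit above can be sketched roughly as follows. This is only an illustration: `resolve_option` and the nested-dict layout are hypothetical and are not the project's actual config API. The sketch shows a value under 'downloader.http' taking precedence, with the matching 'extractor' value used when it has not been set.

```python
def resolve_option(config, name, default=None):
    """Hypothetical helper: prefer downloader.http.<name>, else extractor.<name>."""
    value = config.get("downloader", {}).get("http", {}).get(name)
    if value is not None:
        # Explicitly set for file downloads (False is a valid explicit value)
        return value
    # Not set for the downloader: fall back to the general extractor value
    return config.get("extractor", {}).get(name, default)


# Assumed example configuration, for illustration only
config = {
    "extractor": {"retries": 5, "timeout": 30.0, "verify": True},
    "downloader": {"http": {"timeout": 120.0, "verify": False}},
}

retries = resolve_option(config, "retries")   # falls back to the extractor value
timeout = resolve_option(config, "timeout")   # overridden for file downloads
verify = resolve_option(config, "verify")     # explicit False is respected
```

Testing against `None` rather than truthiness matters here, since `verify: false` and `retries: 0` are legitimate explicit settings.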
Mike Fährmann
707b15b586
create missing directories for 'part-directory'
...
also some code improvements regarding downloader config values
2017-10-27 12:22:45 +02:00
Mike Fährmann
caf26412dd
add option to set alternate location of .part files ( #29 )
...
Note: The path set for 'downloader.*.part-directory' needs to point to an
already existing directory.
2017-10-26 00:16:48 +02:00
Mike Fährmann
9a41002b77
fix partial downloads for 'text:' URLs
...
Using a filesize in bytes as offset into a Python string is not
a good idea if said file contains non-ASCII characters.
2017-10-25 15:04:45 +02:00
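A small illustration of the problem named in the commit above, using a made-up sample string: the size of a partially written file is a byte count, while indexing a Python `str` counts characters, so the two offsets diverge as soon as a multi-byte character appears.

```python
text = "naïve café"                  # sample content with two non-ASCII characters
encoded = text.encode("utf-8")       # what actually ends up on disk

# Suppose the first 4 bytes ("naï") were already written to the partial file
byte_offset = 4

# Wrong: treats the byte count as a character index and skips the "v"
wrong_rest = text[byte_offset:]

# Correct: slice the encoded bytes, then decode the remainder
right_rest = encoded[byte_offset:].decode("utf-8")
```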
Mike Fährmann
963670d73b
add options to control usage of .part files ( #29 )
...
- '--no-part' command line option to disable them
- 'downloader.http.part' and 'downloader.text.part' config options
Disabling .part files restores the behaviour of the old downloader
implementation.
2017-10-24 23:33:44 +02:00
Mike Fährmann
b0353aa02d
rewrite download modules ( #29 )
...
- use '.part' files during file-download
- implement continuation of incomplete downloads
- check if file size matches the one reported by server
2017-10-24 12:53:03 +02:00
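The three points of the commit above can be sketched as follows. This is a simplified illustration, not the module's actual code: `download_with_part` and the `fetch(offset)` callable are hypothetical stand-ins for the real transfer logic. Over HTTP, the offset would typically be sent to the server in a `Range` request header.

```python
import os
import tempfile


def download_with_part(fetch, path, expected_size):
    """Hypothetical sketch: write to '<path>.part', resume, verify size, rename."""
    part = path + ".part"
    # Continue an incomplete download from the size of the existing .part file
    offset = os.path.getsize(part) if os.path.exists(part) else 0
    with open(part, "ab") as fp:
        for chunk in fetch(offset):
            fp.write(chunk)
    # Compare against the size reported by the (assumed) server
    if os.path.getsize(part) != expected_size:
        return False  # keep the .part file so a later attempt can resume
    os.replace(part, path)
    return True


# Usage with in-memory data standing in for a server
data = b"0123456789"
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file.bin")
    with open(target + ".part", "wb") as fp:
        fp.write(data[:4])  # simulate an interrupted earlier download
    ok = download_with_part(lambda offset: [data[offset:]], target, len(data))
    with open(target, "rb") as fp:
        result = fp.read()
```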
Mike Fährmann
d70c66c516
fix "text:" downloader
2017-08-16 12:11:47 +02:00
Mike Fährmann
107d29ad8a
improve handling of text:... URLs
...
- don't require // after the colon
- open output files in text mode
2017-05-12 14:10:25 +02:00
Mike Fährmann
4f123b8513
code adjustments according to pep8
2017-01-30 19:40:15 +01:00
Mike Fährmann
3c1daef839
don't delete downloaded files in certain edge cases
2016-11-27 23:43:25 +01:00
Mike Fährmann
29692c5784
get extension from Content-Type header if not provided
2016-09-30 12:32:48 +02:00
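One way to derive an extension from a Content-Type header, shown here with the standard library's `mimetypes` module. The commit does not say how the mapping is done, so `extension_from_content_type` and the `"bin"` fallback are assumptions made for illustration.

```python
import mimetypes


def extension_from_content_type(content_type, default="bin"):
    """Hypothetical helper: map a Content-Type header value to a file extension."""
    # Drop parameters such as '; charset=utf-8' and normalize case
    mime = content_type.split(";", 1)[0].strip().lower()
    ext = mimetypes.guess_extension(mime)  # e.g. '.png', or None if unknown
    return ext[1:] if ext else default


png_ext = extension_from_content_type("image/png")
unknown_ext = extension_from_content_type("application/x-unknown-type-xyz")
```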
Mike Fährmann
4b377ccc09
use output-module during downloads
2015-12-01 21:22:58 +01:00
Mike Fährmann
28fa7c53b4
docstrings and other small fixes for downloaders
2015-04-10 21:45:41 +02:00
Mike Fährmann
5545624da1
use separate session in http downloader
2015-04-10 19:19:12 +02:00
Mike Fährmann
deef91eddc
initial commit
2014-10-12 21:56:44 +02:00