mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-22 10:42:34 +01:00

125 Commits

Author SHA1 Message Date
Mike Fährmann
2d4887b75b
improve KeywordJob output for "parent" extractors (closes #548) 2019-12-28 22:26:49 +01:00
Mike Fährmann
2e2fc7f0ad
prevent infinite recursion when spawning extractors (closes #489) 2019-12-26 23:38:16 +01:00
Mike Fährmann
1921c127a5
make OSErrors during file downloads nonfatal (closes #512)
… except ENOSPC (No space left on device), since there is no reason to
continue downloading in that case.

All other errors that would prevent downloading data and writing it to
disk are already raised during directory creation and are therefore not
checked here.
2019-12-19 18:34:05 +01:00
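
The error handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not gallery-dl's actual code: `download_all`, `tasks`, and `log` are hypothetical names, and each task is assumed to be a callable that may raise OSError.

```python
import errno


def download_all(tasks, log=print):
    """Run each download task; OSErrors are nonfatal except ENOSPC.

    'tasks' is a hypothetical iterable of zero-argument callables.
    Returns the number of failed downloads.
    """
    failed = 0
    for task in tasks:
        try:
            task()
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # No space left on device: continuing is pointless.
                raise
            log(f"download failed: {exc}")
            failed += 1
    return failed
```

Re-raising only ENOSPC keeps a single unreadable or unwritable file from aborting the whole job while still stopping when the disk is full.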
Mike Fährmann
63e6993716
merge 'bypost' functionality into metadata postprocessor 2019-12-16 17:19:23 +01:00
Gio
c0b9ad678d
Separate metadata from handle_url into handle_metadata, commenting 2019-12-09 16:02:15 -06:00
Gio
6ed4fc07ff
Don't print intentional metadata skips to the console. 2019-12-09 01:02:17 -06:00
Gio
cfc70a97ab
Added an additional channel for downloading the metadata of an entire post or gallery. 2019-12-09 00:56:27 -06:00
Mike Fährmann
f5604492c3
update interface of config functions 2019-11-24 00:42:28 +01:00
Mike Fährmann
3fc1e12949
[postprocessor:metadata] filter private entries
i.e. keys starting with an underscore
2019-11-21 16:58:44 +01:00
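
Filtering out keys that start with an underscore might look like the sketch below. `filter_private` is a hypothetical helper name, and all keys are assumed to be strings.

```python
def filter_private(kwdict):
    """Return a copy of 'kwdict' without keys starting with an underscore."""
    return {
        key: value
        for key, value in kwdict.items()
        if not key.startswith("_")
    }
```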
Mike Fährmann
9e88e7a344
[postprocessor:exec] improve (#421, #413)
- add 'final' option
- include job status in pp finalization
- improve and extend documentation
2019-11-03 21:45:45 +01:00
Mike Fährmann
5af291ba5c
include failed downloads and child extractors in exit status 2019-10-29 15:56:54 +01:00
Mike Fährmann
322c2e7ed4
renaming variables
mostly 'keyword(s)' to 'kwdict'
2019-10-29 15:46:35 +01:00
Mike Fährmann
4409d00141
embed error messages in StopExtraction exceptions 2019-10-28 16:39:49 +01:00
Mike Fährmann
c887493a80
overhaul exception stuff 2019-10-27 23:53:37 +01:00
Mike Fährmann
389d2d7e38
implement 'cookies-update' option (#445) 2019-10-19 15:23:55 +02:00
Mike Fährmann
03bc8adfc7
[postprocessor:exec] run after file moved to target location
(#421)
2019-10-06 23:12:22 +02:00
Mike Fährmann
776e9e073f
close archive on job completion (#417) 2019-09-10 22:43:51 +02:00
Mike Fährmann
9178b54eae
handle errors when opening download archive file (#417) 2019-09-10 16:44:47 +02:00
Mike Fährmann
682105b8ee
prevent crash when loading unavailable downloader (#405) 2019-08-31 21:58:33 +02:00
Mike Fährmann
5f8621b29d
improve output of active post processor modules 2019-08-15 13:31:04 +02:00
Mike Fährmann
0bb873757a
update PathFormat class
- change 'has_extension' from a simple flag/bool to a field that
  contains the original filename extension
- rename 'keywords' to 'kwdict' and some other stuff as well
- inline 'adjust_path()'
- put enumeration index before filename extension (#306)
2019-08-12 21:40:37 +02:00
Mike Fährmann
8dc42bb178
implement 'enumerate' for 'extractor.skip' (#306)
[ci skip]
2019-08-08 18:37:54 +02:00
Mike Fährmann
20f7b07312
ensure postproc finalize() is called during C-c or crash (#355) 2019-07-27 11:14:52 +02:00
Mike Fährmann
7b77ecc35a
fix paths for files without extension (#220) 2019-07-15 16:39:03 +02:00
Mike Fährmann
62097284fe
add 'download' option (#220) 2019-07-14 18:48:18 +02:00
Mike Fährmann
fe7805de7c
improve attribute access in DownloadJob.handle_url()
Storing a value in a local variable and accessing it that way is faster
than going through 'self' if it is accessed more than once.
2019-07-13 21:42:07 +02:00
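
The idea behind this commit can be illustrated with a small sketch. The class and method names here are hypothetical and unrelated to the real DownloadJob; both methods return the same result, but the second performs the attribute lookup on `self` only once.

```python
class Job:
    def __init__(self, prefix):
        self.prefix = prefix

    def names_via_self(self, items):
        # Looks up 'self.prefix' once per item.
        return [self.prefix + item for item in items]

    def names_via_local(self, items):
        # Looks up 'self.prefix' once, then uses the local variable.
        prefix = self.prefix
        return [prefix + item for item in items]
```

Any speed difference depends on the interpreter and workload; the standard `timeit` module is one way to measure it.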
Mike Fährmann
f2000a69aa
implement 'image-unique' and 'chapter-unique' options (#303)
The default value for both is 'false', i.e. duplicate URLs are NOT
ignored.

The previous behavior was to always ignore duplicate URLs to make
'--abort-on-skip' work properly when new images were added to the
beginning of a collection while gallery-dl is running.
2019-06-29 22:50:17 +02:00
Mike Fährmann
ee4d7c3d89
update downloader.find() and related code
Instead of replacing 'https' with 'http' for every URL in
'get_downloader()', this now only happens once during downloader
initialization. Also unit tests.
2019-06-20 16:59:44 +02:00
Mike Fährmann
523ebc9b0b
Fix serialization of 'datetime' objects in '--write-metadata'
Simplified universal serialization support in json.dump() can be achieved
by passing 'default=str', which was already the case in DataJob.run()
for -j/--dump-json, but not for the 'metadata' post-processor.

This commit introduces util.dump_json() that (more or less) unifies the
JSON output procedure of both --write-metadata and --dump-json.

(#251, #252)
2019-05-09 16:49:22 +02:00
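
A minimal sketch of the 'default=str' approach is shown below. The commit names util.dump_json(), but its real signature is not given here, so the parameters and the other json.dump() arguments in this version are assumptions made for illustration.

```python
import datetime
import io
import json


def dump_json(obj, fp, indent=4):
    """Write 'obj' as JSON, converting unsupported types with str()."""
    json.dump(obj, fp, ensure_ascii=False, indent=indent,
              default=str, sort_keys=True)
    fp.write("\n")


buffer = io.StringIO()
dump_json({"date": datetime.datetime(2019, 5, 9, 16, 49, 22)}, buffer)
```

Without 'default=str', json.dump() raises a TypeError for the 'datetime' value.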
Mike Fährmann
b09a8184ca
move TestJob into test module; test _extractor values 2019-02-17 18:18:31 +01:00
Mike Fährmann
ae353ed3b0
provide "extractor" and "job" keys for logging output
This allows for stuff like "{extractor.url}" and "{extractor.category}"
in logging format strings.
Accessing 'extractor' and 'job' in any way will return "None" if those
fields aren't defined, i.e. in general logging messages.
2019-02-14 11:09:58 +01:00
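
One way to get this behavior with the standard logging module is sketched below. It is not necessarily how gallery-dl implements it: `_Missing` and `ContextFilter` are hypothetical names, and the sketch relies on '{'-style format strings.

```python
import logging


class _Missing:
    """Placeholder whose attributes all resolve to None."""

    def __getattr__(self, name):
        return None

    def __str__(self):
        return "None"


class ContextFilter(logging.Filter):
    """Ensure every record has 'extractor' and 'job' attributes."""

    def filter(self, record):
        for key in ("extractor", "job"):
            if not hasattr(record, key):
                setattr(record, key, _Missing())
        return True


formatter = logging.Formatter(
    "[{extractor.category}] {message}", style="{")
```

Records logged with `extra={"extractor": ...}` use the real object; all other records fall back to the placeholder and render "None".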
Mike Fährmann
89ee8cd7e4
filter "private" kwdict entries 2019-02-13 13:22:11 +01:00
Mike Fährmann
61741d7333
provide type information for Queue messages
Child extractors are now directly constructed with Extractor.from_url()
if the extractor class is known beforehand, instead of using
extractor.find() and searching through all possible extractor classes.
2019-02-12 21:32:32 +01:00
Mike Fährmann
277b52101a
add 'category-transfer' option
[ci skip]
2019-01-19 20:28:19 +01:00
Mike Fährmann
5f38ac9609
[postprocessor:exec] add a better error message (#155) 2019-01-13 13:59:11 +01:00
Mike Fährmann
0225d90078
add exception name and traceback for OSErrors 2018-12-04 19:24:50 +01:00
Mike Fährmann
fb53b5dd55
fix control+c during -j and range tests 2018-11-25 18:54:05 +01:00
Mike Fährmann
13cb270326
set target directory before postprocessor init (fixes #126) 2018-11-21 22:21:26 +01:00
Mike Fährmann
b828473aa3
retry HTTP requests for more exception classes 2018-11-19 15:49:13 +01:00
Mike Fährmann
c47482b110
smaller changes, missing docs, etc.
- make 'netrc' extractor-specific
- rename 'downloader.enable' to 'enabled'
- document 'downloader.ytdl.format'
- consistent newlines in configuration.rst
2018-11-16 18:18:07 +01:00
Mike Fährmann
3c25fa2dad
update build_testresult_db.py script 2018-11-15 22:58:14 +01:00
Mike Fährmann
8ef84a6823
add option to enable/disable specific downloader modules
... and write URLs with no (active) downloader to unsupported-file
2018-11-13 18:06:36 +01:00
Mike Fährmann
d3d7f01543
add 'prepare()' step for post-processors
This allows post-processors to modify the destination path before
checking if a file already exists.
2018-10-18 22:32:03 +02:00
Mike Fährmann
6ed629f2b6
allow specifying number of skips before abort/exit (closes #115)
In addition to 'abort' and 'exit', it is now possible to specify
'abort:N' and 'exit:N' (where N is any integer) as value for 'skip'
to abort/exit after consecutively skipping N downloads.
2018-10-13 17:21:55 +02:00
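
Parsing such a value and counting consecutive skips could look roughly like this. `parse_skip` and `SkipCounter` are hypothetical names, and treating a plain 'abort' or 'exit' as a limit of 1 is an assumption of this sketch.

```python
def parse_skip(value):
    """Split a 'skip' value into (action, limit).

    'abort:3' -> ('abort', 3); 'exit' -> ('exit', 1) (assumed default).
    Any other value is returned unchanged with a limit of None.
    """
    if isinstance(value, str):
        action, sep, count = value.partition(":")
        if action in ("abort", "exit"):
            return action, int(count) if sep else 1
    return value, None


class SkipCounter:
    """Track consecutive skips and report when the limit is reached."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def skipped(self):
        self.count += 1
        return self.count >= self.limit

    def downloaded(self):
        # A successful download breaks the run of consecutive skips.
        self.count = 0
```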
Mike Fährmann
48a8717a7c
add 'output.num-to-str' option
... to convert any numeric values to string when outputting them as JSON
(during '--dump-json' or otherwise)
2018-10-08 20:28:54 +02:00
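
The conversion this option describes might be sketched as a recursive walk over the data before it is written as JSON. `numbers_to_strings` is a hypothetical helper, and leaving booleans untouched is an assumption of this sketch.

```python
def numbers_to_strings(obj):
    """Recursively convert int and float values to strings."""
    if isinstance(obj, bool):
        # bool is a subclass of int; kept as-is here (assumption).
        return obj
    if isinstance(obj, (int, float)):
        return str(obj)
    if isinstance(obj, dict):
        return {key: numbers_to_strings(value)
                for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [numbers_to_strings(value) for value in obj]
    return obj
```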
Mike Fährmann
0514d6a0ae
make --filter and --range config-file options
The functionality of --(chapter-)filter and --(chapter-)range are now
also exposed as the following config-file options:

- extractor.*.image-filter
- extractor.*.image-range
- extractor.*.chapter-filter
- extractor.*.chapter-range

TODO: update configuration.rst
2018-10-07 21:39:56 +02:00
Mike Fährmann
4a348990f4
adjust value resolution for retries/timeout/verify options
This change introduces 'extractor.*.retries/timeout/verify' options
as a general way to set these values for all HTTP requests.

'downloader.http.retries/timeout/verify' is a way to override these
options for file downloads only and will fall back to 'extractor.*.…'
values if they haven't been explicitly set.

Also: downloader classes now take an extractor object as first argument
instead of a requests.session.
2018-10-07 21:13:39 +02:00
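
The fallback order described in this commit can be sketched with plain dictionaries. The nested-dict layout and the `resolve_http_option` helper are assumptions for illustration and do not reflect gallery-dl's actual config functions.

```python
_MISSING = object()


def resolve_http_option(config, name, default=None):
    """Look up 'name' for file downloads.

    Prefers downloader.http.<name>; falls back to extractor.<name>
    and finally to 'default' if neither is set.
    """
    http = config.get("downloader", {}).get("http", {})
    value = http.get(name, _MISSING)
    if value is _MISSING:
        value = config.get("extractor", {}).get(name, default)
    return value


config = {
    "extractor": {"retries": 4, "timeout": 30.0},
    "downloader": {"http": {"timeout": 60.0}},
}
```

A sentinel object is used instead of None so that an explicitly configured null or false value still counts as set.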
Mike Fährmann
ca6ac4db6a
fix 'content' tests 2018-10-05 21:10:33 +02:00
Mike Fährmann
188876d814
implement youtube-dl downloader module
URLs starting with 'ytdl:' will now be handled by youtube-dl.
There is probably a lot to fix and improve, but the basic use case
works.

TODO:
- format selection and ytdl options in general
- better filename/path handling
- ytdl support for "unsupported URLs"
- ...
2018-10-05 18:05:11 +02:00
Mike Fährmann
8c8da11bb8
do not create directory structures when using '-s' 2018-09-21 17:55:04 +02:00