1
0
mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-23 03:02:50 +01:00
Commit Graph

210 Commits

Author SHA1 Message Date
Mike Fährmann
69a5e6ddb3
Merge branch 'master' into 1.4-dev 2018-05-04 10:19:02 +02:00
Mike Fährmann
16e014baaa
[smugmug] added image and album extractor
just some initial code that still requires a lot of work ...

TODO:
- folders
- old-style albums (which are nearly all of them ...)
- images from users
- OAuth

It could also happen that the API credentials used will become invalid
whenever my 14 day trial period ends (7 days remaining), but that
would just require users to supply their own.
2018-04-29 21:27:25 +02:00
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module 2018-04-20 14:53:21 +02:00
Mike Fährmann
51ea699083
add 'abort()' as function to filter expressions
calling 'abort()' in a filter aborts the current extractor run
in a cleaner way than using something like 1/0, which
causes an error message to be printed
2018-04-12 17:07:12 +02:00
Mike Fährmann
3f2dd6b6f8
avoid double path-separators
(#74)
2018-03-22 10:24:59 +01:00
Mike Fährmann
b69cc94f0e
[util] implement bencode() 2018-03-14 13:17:34 +01:00
Mike Fährmann
749fbbfa6c
[mangadex] add chapter- and manga-extractor 2018-03-05 18:37:21 +01:00
Mike Fährmann
2fad0b1f1b
add 'U' conversion for format strings to unquote their content
(#74)
2018-02-25 21:57:59 +01:00
Mike Fährmann
8cdce21dcb
make archive keys user-configurable 2018-02-25 21:57:01 +01:00
Mike Fährmann
e1e0668ca8
add option to set default replacement field value
Missing or undefined keywords will now be replaced with the value
set for 'keywords-default'. The default is Python's 'None', which
is equivalent to setting this option to JSON's 'null'.
2018-02-23 00:59:20 +01:00
Mike Fährmann
ac3da8115e
[util] don't add text: URLs to list of downloaded URLs 2018-02-20 18:14:27 +01:00
Mike Fährmann
b50bdbf3d7
change config specifiers in input file format
Instead of a dictionary/object, input file options are now specified
by a 'key=value' pair starting with '-' for options only applying to
the next URL or '-G' for Global options applying to all following URLs.

See the docstring of parse_inputfile() for details.

Example option specifiers:

- filename = "{id}.{extension}"
- extractor.pixiv.user.directory = ["Pixiv Users", "{user[id]}"]
-spaces="are_optional"
-G keywords = {"global": "option"}
2018-02-16 03:10:41 +01:00
Mike Fährmann
f970a8f13c
fix adding keys to download archive when using skip=false 2018-02-13 23:45:30 +01:00
Mike Fährmann
179bcdd349
adjust archive-ids 2018-02-13 04:50:45 +01:00
Mike Fährmann
3cec533c28
Merge branch 'archive' 2018-02-12 18:07:58 +01:00
Mike Fährmann
b73b8b4f50
add OAuth unittests 2018-02-12 17:07:07 +01:00
Mike Fährmann
4d2fadfb6f
restore skip actions with download archive 2018-02-12 16:56:45 +01:00
Mike Fährmann
65773263fc
[util] implement OAuthSession.urlencode() (closes #75)
- Python's own urllib.parse.urlencode() has no quote_via argument in
  Python 3.3 and 3.4, which is necessary to follow  OAuth 1.0 quoting
  rules.
2018-02-10 21:56:13 +01:00
Mike Fährmann
057668e17e
extend input-file format with per-URL config and comments
- see docstring of parse_inputfile() for details
- TODO: unittests, recursion (currently setting for example
  {"extractor": {"key": "value"}} will override the whole "extractor"
  branch instead of merging {"key": "value"} into the already existing
  dictionary)
2018-02-07 21:47:27 +01:00
Mike Fährmann
347baf7ac5
improve util.parse_range() performance
It is never going to actually matter, but using partition() instead
of split() is twice as fast.
2018-02-05 22:28:11 +01:00
Mike Fährmann
aa38eab2be
allow not-defined fields in format strings
... and replace them with "None", for now
2018-02-03 22:28:41 +01:00
Mike Fährmann
84a52a9256
add DownloadArchive class 2018-01-30 15:23:23 +01:00
Mike Fährmann
db7f04dd97
emit log messages on download failure
and when retrying with fallback URLs
2018-01-28 18:44:10 +01:00
Mike Fährmann
6174a5c4ef
[download] adjust filename extension on filetype mismatch
(closes #63)
2018-01-17 18:37:06 +01:00
Mike Fährmann
f10ffc0839
update extractor blacklist to also allow classes 2018-01-14 18:47:22 +01:00
Mike Fährmann
29d75fc3fa
[tumblr] add support for OAuth authentication (#65) 2018-01-11 14:11:37 +01:00
Mike Fährmann
d241a0fb60
[util] replace '/' with '\' in base-directory paths
... on Windows to have consistent path separators.
2017-12-21 21:56:24 +01:00
Mike Fährmann
93482a1f88
implement 'util.advance()' 2017-12-03 01:38:24 +01:00
Mike Fährmann
a718c6c6cd
implement 'util.parse_bytes()' 2017-12-02 01:24:49 +01:00
Mike Fährmann
caf26412dd
add option to set alternate location of .part files (#29)
Note: The path set for 'downloader.*.part-directory' needs to point to an
already existing directory.
2017-10-26 00:16:48 +02:00
Mike Fährmann
ea8ca4cfa4
add 'util.expand_path()' 2017-10-26 00:04:28 +02:00
Mike Fährmann
963670d73b
add options to control usage of .part files (#29)
- '--no-part' command line option to disable them
- 'downloader.http.part' and 'downloader.text.part' config options

Disabling .part files restores the behaviour of the old downloader
implementation.
2017-10-24 23:33:44 +02:00
Mike Fährmann
b0353aa02d
rewrite download modules (#29)
- use '.part' files during file-download
- implement continuation of incomplete downloads
- check if file size matches the one reported by server
2017-10-24 12:53:03 +02:00
Mike Fährmann
832b8b76ac
[util] extend global namespace for filter expressions 2017-10-09 22:12:58 +02:00
Mike Fährmann
8e6a767109
[util] restructure formatter for better exception propagation 2017-10-06 17:10:35 +02:00
Mike Fährmann
8df023e144
[util:filter] re-enable builtins
Trying to restrict access to Python's builtin functions (exec,
print, __import__, ...) can easily be circumvented and is
therefore completely pointless.

This also adds 'safe_int()' and the 'datetime' module to the global
namespace used when evaluating filter expressions.
2017-10-04 16:00:12 +02:00
Mike Fährmann
b319f4bab3
smaller code and text changes 2017-10-01 18:23:40 +02:00
Mike Fährmann
c1f0afe4c6
add custom string formatter class 2017-09-28 17:12:39 +02:00
Mike Fährmann
9fc1d0c901
implement and use 'util.safe_int()'
same as Python's 'int()', except it doesn't raise any exceptions and
accepts a default value
2017-09-24 15:59:25 +02:00
Mike Fährmann
9b21d3f13c
add '--filter' command-line option
This allows for image filtering via Python expressions by the same
metadata that is also used to build filenames (--list-keywords).

The usually shunned eval() function is used to evaluate
filter-expressions, but it seemed quite appropriate in this case and
shouldn't introduce any new security issues, as any attacker that could do
> gallery-dl --filter "delete-everything()" ...
could as well do
> python -c "delete-everything()"
2017-09-08 17:52:00 +02:00
Mike Fährmann
268cfa3cfe
filter duplicate URLs (#36)
Duplicate URLs might occur if, for example,  an artist adds another
image to his gallery while an extractor is running and images are being
downloaded on sites like pixiv/nijie/hentaifoundry.
The next image on the next page will have already been downloaded and
will cause a premature end if '--abort-on-skip' is being used.
2017-09-06 17:08:50 +02:00
Mike Fährmann
9bf9d64ad8
update unittests for util.py 2017-08-13 14:31:22 +02:00
Mike Fährmann
e3bfb8325a
fix circular dependency
- util.py imported config.py and vice versa
- Python < 3.5 doesn't like this
2017-08-12 21:32:24 +02:00
Mike Fährmann
004456d5d5
properly update the config-dictionary
When using 2 or more config files, the values of the second would
improperly overwrite nested dictionaries of the first one.
The new method properly combines these nested dictionaries as well.
2017-08-12 20:07:27 +02:00
Mike Fährmann
ae2d61e5b3
handle format string exceptions separately 2017-08-11 21:48:37 +02:00
Mike Fährmann
d74a635e41
[util] update 'default' values and improve test coverage
for 'code_to_language()' and 'language_to_code()'
2017-08-08 19:22:04 +02:00
rachmadani haryono
dcd573806e chg: dev: fix error (#32)
* fix: dev: error

* fix: dev: AttributeError when getting artist

* fix: dev: typo on luscious parser
2017-08-04 15:01:10 +02:00
Mike Fährmann
0610ae5000
skip login if cookies are present 2017-07-17 10:33:36 +02:00
Mike Fährmann
2993206c4b
smaller fixes and "security" measures
- move the OAuthSession class into util.py
- block special extractors for reddit and recursive
- ignore 'only matching' tests for testresults script
2017-06-16 21:01:40 +02:00
Mike Fährmann
72f1c6f87a
[flickr] add support for flic.kr/p/... URLs
Example:
    https://flic.kr/p/FPVo9U
2017-06-02 09:01:35 +02:00
Mike Fährmann
107d29ad8a
improve handling of text:... URLs
- don't require // after the colon
- open output files in text mode
2017-05-12 14:10:25 +02:00
Mike Fährmann
ef90a2de2f
implement the "exit" option for the "skip" config-key 2017-05-05 15:49:58 +02:00
Mike Fährmann
fc9223c072
add '--abort-on-skip' option and ability to control skip behavior
the 'skip' config option controls skipping behavior:
    true    - skip download if file already exist (default)
    false   - download and overwrite files even if it exists
    "abort" - abort extractor run if a download would be skipped
              (same as '--abort-on-skip')
2017-05-03 15:26:04 +02:00
Mike Fährmann
841fd50242
move code into util.py 2017-03-28 13:12:44 +02:00
Mike Fährmann
7a9d66fbce
implement basic way to tell extractors to skip ahead 2017-03-03 17:26:50 +01:00
Mike Fährmann
6208d9dd79
implement '--images' and '--chapters' options
- the former '--items' has been renamed to '--chapters'
- #6
2017-02-23 21:51:29 +01:00
Mike Fährmann
2a32b12043
add '--items' option
this allows to specify which manga-chapters/comic-issues to download
when using gallery-dl on a manga/comic URL
2017-02-20 22:02:49 +01:00
Mike Fährmann
513808d156 move code from util.py 2015-04-08 01:46:04 +02:00
Mike Fährmann
b630753e5e add 'method' parameter 2014-10-31 23:38:21 +01:00
Mike Fährmann
deef91eddc initial commit 2014-10-12 21:56:44 +02:00