gallery-dl

mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-22 18:53:21 +01:00

Author	SHA1	Message	Date
Mike Fährmann	590c0b3ad5	re-implement and improve filename formatter A format string now gets parsed only once instead of re-parsing it each time it is applied to a set of data. The initial parsing causes directory path creation to be at about 2x slower than before, since each format string there is used only once, but building a filename, the more common operation, is at least 2x faster. The "directory slowness" cancels at about 5 filenames and everything above that is significantly faster.	2018-08-25 10:45:14 +02:00
Mike Fährmann	c83fc62abc	prioritize archive over disk access (#87 )	2018-07-30 17:48:23 +02:00
Mike Fährmann	e0dd8dff5f	implement L<maxlen>/<replacement>/ format option The L option allows for the contents of a format field to be replaced with <replacement> if its length is greater than <maxlen>. Example: {f:L5/too long/} -> "foo" (if "f" is "foo") -> "too long" (if "f" is "foobar") (#92) (#94)	2018-07-29 13:52:07 +02:00
Mike Fährmann	8fe9056b16	implement string slicing for format strings It is now possible to slice string (or list) values of format string replacement fields with the same syntax as in regular Python code. "{digits}" -> "0123456789" "{digits[2:-2]}" -> "234567" "{digits[:5]}" -> "01234" The optional third parameter (step) has been left out to simplify things.	2018-07-14 09:53:15 +02:00
Mike Fährmann	a9e276bc37	reset delete-flag Since 'PathFormat' objects are being reused, setting `delete` to True once caused all files downloaded after to be deleted as well.	2018-06-20 18:12:59 +02:00
Mike Fährmann	baccf8a958	improve postprocessor handling - add pathfmt argument for __init__() - add finalization step - add option to keep or delete zipped files	2018-06-08 17:39:02 +02:00
Mike Fährmann	7646bdbcfd	improve postprocessor initialization code	2018-06-07 22:29:54 +02:00
Mike Fährmann	821535b458	adjust PathFormat class	2018-06-06 20:17:17 +02:00
Mike Fährmann	6a31ada9e3	re-implement OAuth1.0 code OAuth support for SmugMug needs some additional features (auth-rebuild on redirect, query parameters in URL, ...) and fixing this in the old code wouldn't work all that well.	2018-05-10 18:47:05 +02:00
Mike Fährmann	69a5e6ddb3	Merge branch 'master' into 1.4-dev	2018-05-04 10:19:02 +02:00
Mike Fährmann	16e014baaa	[smugmug] added image and album extractor just some initial code that still requires a lot of work ... TODO: - folders - old-style albums (which are nearly all of them ...) - images from users - OAuth It could also happen that the API credentials used will become invalid whenever my 14 day trial period ends (7 days remaining), but that would just require users to supply their own.	2018-04-29 21:27:25 +02:00
Mike Fährmann	cc36f88586	rename safe_int to parse_int; move parse_* to text module	2018-04-20 14:53:21 +02:00
Mike Fährmann	51ea699083	add 'abort()' as function to filter expressions calling 'abort()' in a filter aborts the current extractor run in a cleaner way than using something like 1/0, which causes an error message to be printed	2018-04-12 17:07:12 +02:00
Mike Fährmann	3f2dd6b6f8	avoid double path-separators (#74)	2018-03-22 10:24:59 +01:00
Mike Fährmann	b69cc94f0e	[util] implement bencode()	2018-03-14 13:17:34 +01:00
Mike Fährmann	749fbbfa6c	[mangadex] add chapter- and manga-extractor	2018-03-05 18:37:21 +01:00
Mike Fährmann	2fad0b1f1b	add 'U' conversion for format strings to unquote their content (#74)	2018-02-25 21:57:59 +01:00
Mike Fährmann	8cdce21dcb	make archive keys user-configurable	2018-02-25 21:57:01 +01:00
Mike Fährmann	e1e0668ca8	add option to set default replacement field value Missing or undefined keywords will now be replaced with the value set for 'keywords-default'. The default is Python's 'None', which is equivalent to setting this option to JSON's 'null'.	2018-02-23 00:59:20 +01:00
Mike Fährmann	ac3da8115e	[util] don't add text: URLs to list of downloaded URLs	2018-02-20 18:14:27 +01:00
Mike Fährmann	b50bdbf3d7	change config specifiers in input file format Instead of a dictionary/object, input file options are now specified by a 'key=value' pair starting with '-' for options only applying to the next URL or '-G' for Global options applying to all following URLs. See the docstring of parse_inputfile() for details. Example option specifiers: - filename = "{id}.{extension}" - extractor.pixiv.user.directory = ["Pixiv Users", "{user[id]}"] -spaces="are_optional" -G keywords = {"global": "option"}	2018-02-16 03:10:41 +01:00
Mike Fährmann	f970a8f13c	fix adding keys to download archive when using skip=false	2018-02-13 23:45:30 +01:00
Mike Fährmann	179bcdd349	adjust archive-ids	2018-02-13 04:50:45 +01:00
Mike Fährmann	3cec533c28	Merge branch 'archive'	2018-02-12 18:07:58 +01:00
Mike Fährmann	b73b8b4f50	add OAuth unittests	2018-02-12 17:07:07 +01:00
Mike Fährmann	4d2fadfb6f	restore skip actions with download archive	2018-02-12 16:56:45 +01:00
Mike Fährmann	65773263fc	[util] implement OAuthSession.urlencode() (closes #75 ) - Python's own urllib.parse.urlencode() has no quote_via argument in Python 3.3 and 3.4, which is necessary to follow OAuth 1.0 quoting rules.	2018-02-10 21:56:13 +01:00
Mike Fährmann	057668e17e	extend input-file format with per-URL config and comments - see docstring of parse_inputfile() for details - TODO: unittests, recursion (currently setting for example {"extractor": {"key": "value"}} will override the whole "extractor" branch instead of merging {"key": "value"} into the already existing dictionary)	2018-02-07 21:47:27 +01:00
Mike Fährmann	347baf7ac5	improve util.parse_range() performance It is never going to actually matter, but using partition() instead of split() is twice as fast.	2018-02-05 22:28:11 +01:00
Mike Fährmann	aa38eab2be	allow not-defined fields in format strings ... and replace them with "None", for now	2018-02-03 22:28:41 +01:00
Mike Fährmann	84a52a9256	add DownloadArchive class	2018-01-30 15:23:23 +01:00
Mike Fährmann	db7f04dd97	emit log messages on download failure and when retrying with fallback URLs	2018-01-28 18:44:10 +01:00
Mike Fährmann	6174a5c4ef	[download] adjust filename extension on filetype mismatch (closes #63)	2018-01-17 18:37:06 +01:00
Mike Fährmann	f10ffc0839	update extractor blacklist to also allow classes	2018-01-14 18:47:22 +01:00
Mike Fährmann	29d75fc3fa	[tumblr] add support for OAuth authentication (#65 )	2018-01-11 14:11:37 +01:00
Mike Fährmann	d241a0fb60	[util] replace '/' with '\' in base-directory paths ... on Windows to have consistent path separators.	2017-12-21 21:56:24 +01:00
Mike Fährmann	93482a1f88	implement 'util.advance()'	2017-12-03 01:38:24 +01:00
Mike Fährmann	a718c6c6cd	implement 'util.parse_bytes()'	2017-12-02 01:24:49 +01:00
Mike Fährmann	caf26412dd	add option to set alternate location of .part files (#29 ) Note: The path set for 'downloader.*.part-directory' needs to point to an already existing directory.	2017-10-26 00:16:48 +02:00
Mike Fährmann	ea8ca4cfa4	add 'util.expand_path()'	2017-10-26 00:04:28 +02:00
Mike Fährmann	963670d73b	add options to control usage of .part files (#29 ) - '--no-part' command line option to disable them - 'downloader.http.part' and 'downloader.text.part' config options Disabling .part files restores the behaviour of the old downloader implementation.	2017-10-24 23:33:44 +02:00
Mike Fährmann	b0353aa02d	rewrite download modules (#29 ) - use '.part' files during file-download - implement continuation of incomplete downloads - check if file size matches the one reported by server	2017-10-24 12:53:03 +02:00
Mike Fährmann	832b8b76ac	[util] extend global namespace for filter expressions	2017-10-09 22:12:58 +02:00
Mike Fährmann	8e6a767109	[util] restructure formatter for better exception propagation	2017-10-06 17:10:35 +02:00
Mike Fährmann	8df023e144	[util:filter] re-enable builtins Trying to restrict access to Python's builtin functions (exec, print, __import__, ...) can easily be circumvented and is therefore completely pointless. This also adds 'safe_int()' and the 'datetime' module to the global namespace used when evaluating filter expressions.	2017-10-04 16:00:12 +02:00
Mike Fährmann	b319f4bab3	smaller code and text changes	2017-10-01 18:23:40 +02:00
Mike Fährmann	c1f0afe4c6	add custom string formatter class	2017-09-28 17:12:39 +02:00
Mike Fährmann	9fc1d0c901	implement and use 'util.safe_int()' same as Python's 'int()', except it doesn't raise any exceptions and accepts a default value	2017-09-24 15:59:25 +02:00
Mike Fährmann	9b21d3f13c	add '--filter' command-line option This allows for image filtering via Python expressions by the same metadata that is also used to build filenames (--list-keywords). The usually shunned eval() function is used to evaluate filter-expressions, but it seemed quite appropriate in this case and shouldn't introduce any new security issues, as any attacker that could do > gallery-dl --filter "delete-everything()" ... could as well do > python -c "delete-everything()"	2017-09-08 17:52:00 +02:00
Mike Fährmann	268cfa3cfe	filter duplicate URLs (#36 ) Duplicate URLs might occur if, for example, an artist adds another image to his gallery while an extractor is running and images are being downloaded on sites like pixiv/nijie/hentaifoundry. The next image on the next page will have already been downloaded and will cause a premature end if '--abort-on-skip' is being used.	2017-09-06 17:08:50 +02:00

1 2

69 Commits