gallery-dl

mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-22 18:53:21 +01:00

Author	SHA1	Message	Date
Mike Fährmann	2d2953a5bf	add 'text.parse_float()' + cleanup in text.py	2019-01-29 16:46:21 +01:00
Mike Fährmann	0c32dc5858	[hentaifox] add extractor for search results (#160)	2019-01-28 22:38:32 +01:00
Mike Fährmann	217a0687ef	[behance] add 'collection' extractor (closes #157)	2019-01-19 18:11:20 +01:00
Mike Fährmann	b8fed34548	add generalized extractors for Mastodon instances (#144) Extractors for Mastodon instances can now be dynamically generated, based on the instance names in the 'extractor.mastodon.*' config path. Example: { "extractor": { "mastodon": { "pawoo.net": { ... }, "mastodon.xyz": { ... }, "tabletop.social": { ... }, ... } } } Each entry requires an 'access-token' value, which can be generated with 'gallery-dl oauth:mastodon:<instance URL>'. An 'access-token' (as well as a 'client-id' and 'client-secret') for pawoo.net is always available, but can be overwritten as necessary.	2019-01-19 14:28:59 +01:00
Mike Fährmann	66460337f1	[mangapark] fix extraction	2019-01-17 21:24:53 +01:00
Mike Fährmann	79c01ec7ae	implement J<separator>/ format option J joins list elements by calling <separator>.join(list): Example: {f:J - /} -> "a - b - c" (if "f" is ["a", "b", "c"])	2019-01-17 17:01:58 +01:00
Mike Fährmann	9bbbadd93a	[hbrowse] use HTTPS	2019-01-15 18:07:39 +01:00
Mike Fährmann	98c6520384	[pinterest] update root URL of API calls	2019-01-14 15:22:04 +01:00
Mike Fährmann	751e535948	[nhentai] fix extraction (closes #156) Use JSON embedded in webpage since API endpoints have been disabled	2019-01-14 07:57:50 +01:00
Mike Fährmann	1734a6c879	[reactor] detect "circular" redirects (#148)	2019-01-09 14:59:15 +01:00
Mike Fährmann	e53cdfd6a8	update build_supportedsites.py	2019-01-09 14:58:35 +01:00
Mike Fährmann	0afa913de4	[tumblr] add tests for hidden and private blogs (#145) Hidden / dashboard-only blogs are pretty straightforward and "only" require a valid 'access-token' and 'access-token-secret' for the given 'api-key' and 'api-secret', so that signed OAuth1.0 requests are possible. Private / password protected blogs on the other hand are a bit cumbersome. In addition to a valid 'access-token' and 'access-token-secret', they also require the account belonging to those tokens to be a member of the blog itself. Knowing the password and entering it in the website isn't enough to access a blog through the API. Following a private blog is also impossible, so that option can't work either.	2019-01-03 16:12:24 +01:00
Mike Fährmann	fa7fa2f8ff	[deviantart] update tests	2019-01-01 15:39:34 +01:00
Mike Fährmann	259123732f	[readcomiconline] improve comic-page parsing	2018-12-30 13:19:23 +01:00
Mike Fährmann	6c71e9cf5d	[deviantart] add separate 'sta.sh' extractor (#113) - supports multiple stashed deviations per page - explicitly mentions sta.sh support on supportedsites.rst	2018-12-26 18:56:57 +01:00
Mike Fährmann	c5d4f558c9	allow missing field access keys in format strings (#136)	2018-12-22 13:54:14 +01:00
Mike Fährmann	4d73cc785d	update test results	2018-12-14 16:07:32 +01:00
Mike Fährmann	010da8372a	[instagram] relax test pattern	2018-12-11 19:59:28 +01:00
Mike Fährmann	15890930ea	[mangafox] fix extraction use mobile version since desktop version is obfuscated	2018-11-26 16:13:41 +01:00
Mike Fährmann	fb53b5dd55	fix control+c during -j and range tests	2018-11-25 18:54:05 +01:00
Mike Fährmann	59bb434ba5	[flickr] add ability to download all albums of a user for example with 'https://www.flickr.com/photos/shona_s/albums'	2018-11-23 09:09:37 +01:00
Mike Fährmann	041bd501fc	[hentaifoundry] unescape YII_CSRF_TOKEN value This fixes the POST requests to /site/filters	2018-11-19 21:46:17 +01:00
Mike Fährmann	d4b2b73bef	release version 1.6.0	2018-11-17 18:28:02 +01:00
Mike Fährmann	3c25fa2dad	update build_testresult_db.py script	2018-11-15 22:58:14 +01:00
Mike Fährmann	7f6a0be982	adjust some tests	2018-11-15 22:50:04 +01:00
Mike Fährmann	966a9ca3a0	update test results	2018-11-10 19:14:54 +01:00
Mike Fährmann	c9861ca812	adjust message for status_code based exceptions from: 5xx HTTP Error: Reason to : 5xx: Reason The "HTTP Error" part was in there to emulate Request's error messages from response.raise_for_status(), but it reads a lot better without.	2018-10-18 15:09:49 +02:00
Mike Fährmann	c00dce2adc	[behance] enable 'categorytransfer'	2018-10-09 23:40:49 +02:00
Mike Fährmann	1532d1b690	fix 'range' tests and update a few test results	2018-10-08 23:53:58 +02:00
Mike Fährmann	0514d6a0ae	make --filter and --range config-file options The functionality of --(chapter-)filter and --(chapter-)range is now also exposed as the following config-file options: - extractor.*.image-filter - extractor.*.image-range - extractor.*.chapter-filter - extractor.*.chapter-range TODO: update configuration.rst	2018-10-07 21:39:56 +02:00
Mike Fährmann	4a348990f4	adjust value resolution for retries/timeout/verify options This change introduces 'extractor.*.retries/timeout/verify' options as a general way to set these values for all HTTP requests. 'downloader.http.retries/timeout/verify' is a way to override these options for file downloads only and will fall back to 'extractor..…* values if they haven't been explicitly set. Also: downloader classes now take an extractor object as first argument instead of a requests.session.	2018-10-07 21:13:39 +02:00
Mike Fährmann	ca6ac4db6a	fix 'content' tests	2018-10-05 21:10:33 +02:00
Mike Fährmann	d70db2d555	Revert "[komikcast] fix extraction" This reverts commit `5507f5ce2e`.	2018-10-02 20:38:42 +02:00
Mike Fährmann	5507f5ce2e	[komikcast] fix extraction	2018-09-29 16:37:30 +02:00
Mike Fährmann	17611bfec0	update build_supportedsites.py script	2018-09-28 12:43:19 +02:00
Mike Fährmann	e066f35118	update extractor tests	2018-09-21 11:25:56 +02:00
Mike Fährmann	22ab509a70	[bobx] rename "model" to "idol" extractor	2018-09-14 18:11:36 +02:00
Mike Fährmann	8a23b21d0e	[tests] let 'pattern' require at least 1 URL	2018-09-02 21:19:44 +02:00
Mike Fährmann	0bc8ef51c8	[smugmug] Handle albums with no explicit owner (#100)	2018-09-01 12:55:02 +02:00
Mike Fährmann	590c0b3ad5	re-implement and improve filename formatter A format string now gets parsed only once instead of re-parsing it each time it is applied to a set of data. The initial parsing causes directory path creation to be about 2x slower than before, since each format string there is used only once, but building a filename, the more common operation, is at least 2x faster. The "directory slowness" cancels out at about 5 filenames and everything above that is significantly faster.	2018-08-25 10:45:14 +02:00
Mike Fährmann	34b556922d	update/restore tests	2018-08-23 15:47:40 +02:00
Mike Fährmann	e3055d356c	release version 1.5.1	2018-08-17 13:21:36 +02:00
Mike Fährmann	f9ded38d89	[test:results] add support for "range" options in tests	2018-08-15 21:49:44 +02:00
Mike Fährmann	c9e6ccbd7c	[test:extractor] small fixes and improvements	2018-08-15 21:49:33 +02:00
Mike Fährmann	7f4e41c989	increase timeout during extractor tests cloudflare's 522 response takes longer than 30 seconds	2018-08-10 16:51:05 +02:00
Mike Fährmann	b55e39d1ee	[mangadex] improve extraction - cache manga API results - add artist, author and date fields to chapter metadata - remove Manga-/ChapterExtractor inheritance - minor code simplifications and improvements	2018-08-10 16:50:07 +02:00
Mike Fährmann	2a9f3341a2	[behance] fix title extraction	2018-08-08 10:48:58 +02:00
Mike Fährmann	a86f2bfc80	[pinterest] update not-found redirects	2018-08-07 12:13:19 +02:00
Mike Fährmann	7442d2940c	release version 1.5.0	2018-08-03 17:50:27 +02:00
Mike Fährmann	b040ca0718	[rule34] small unit test fixes	2018-08-03 17:28:47 +02:00
Mike Fährmann	f3793660ef	update tests	2018-08-02 14:57:28 +02:00
Mike Fährmann	42a346413b	fix "re:" prefix for keyword tests	2018-08-02 14:48:51 +02:00
Mike Fährmann	e0dd8dff5f	implement L<maxlen>/<replacement>/ format option The L option allows for the contents of a format field to be replaced with <replacement> if its length is greater than <maxlen>. Example: {f:L5/too long/} -> "foo" (if "f" is "foo") -> "too long" (if "f" is "foobar") (#92) (#94)	2018-07-29 13:52:07 +02:00
Mike Fährmann	bb89a1e6d7	[mangahere] use http:// invalid SSL cert for quite some time now	2018-07-26 18:11:31 +02:00
Mike Fährmann	ce34d82cb4	fix skipping tests on 5xx status codes	2018-07-19 18:47:23 +02:00
Mike Fährmann	a6fe2bb594	[whatisthisimnotgoodwithcomputers] remove extractor	2018-07-14 09:53:16 +02:00
Mike Fährmann	0ba93650e0	[8chan] replace unit test URL the other thread is no longer accessible	2018-07-14 09:53:16 +02:00
Mike Fährmann	8fe9056b16	implement string slicing for format strings It is now possible to slice string (or list) values of format string replacement fields with the same syntax as in regular Python code. "{digits}" -> "0123456789" "{digits[2:-2]}" -> "234567" "{digits[:5]}" -> "01234" The optional third parameter (step) has been left out to simplify things.	2018-07-14 09:53:15 +02:00
Mike Fährmann	269dc2bbd5	[sankaku] add 'tags' option (#94)	2018-07-14 09:53:01 +02:00
Mike Fährmann	764331823b	release version 1.4.2	2018-07-06 16:02:40 +02:00
Mike Fährmann	2eefaa99a3	[mangapark] support .net and .com mirrors	2018-07-05 14:45:05 +02:00
Mike Fährmann	188e956c4e	[imagefap] use HTTPS + update test results	2018-06-30 19:40:46 +02:00
Mike Fährmann	a699787d01	[deviantart] update URL patterns to new format DeviantArt changed its URL format from https://<name>.deviantart.com/... to https://www.deviantart.com/<name>/... With this change both formats will be supported.	2018-06-28 20:21:59 +02:00
Mike Fährmann	0055fdd714	change OAuth test server DNS record for oauthbin.com expired	2018-06-28 14:32:02 +02:00
Mike Fährmann	b8c97d2295	use 'extractor.request()' for more HTTP requests	2018-06-25 23:40:59 +02:00
Mike Fährmann	7a98cc9798	[smugmug] update tests My test account expired and all uploaded images got deleted.	2018-06-22 15:04:31 +02:00
Mike Fährmann	4eb94aca17	[postprocessor:ugoira] pass '-f' if not present	2018-06-22 13:26:17 +02:00
Mike Fährmann	a9e276bc37	reset delete-flag Since 'PathFormat' objects are being reused, setting `delete` to True once caused all files downloaded after to be deleted as well.	2018-06-20 18:12:59 +02:00
Mike Fährmann	6ac403c5d3	add postprocessor config example	2018-06-08 18:31:59 +02:00
Mike Fährmann	2403c405e3	Merge branch 'postprocessor'	2018-06-08 17:43:11 +02:00
Mike Fährmann	b344f2290f	fix downloader tests	2018-06-07 22:27:36 +02:00
Mike Fährmann	a47c6136cd	[simplyhentai] avoid redirects for all-pages.json (#89)	2018-06-01 22:06:34 +02:00
Mike Fährmann	ae9a37a528	implement text.split_html()	2018-05-27 15:00:41 +02:00
Mike Fährmann	0a1863fce3	[pixiv] respect more query parameters for user URLs The API endpoint responsible for user illustrations does not provide sufficient filter capabilities* to match the actual website, so we are spinning our own filters. Respected parameters are 'type': illust, manga, ugoira 'tag' : any image tag (this was already supported) 'p' : the page to start on * - API can filter for illustrations and manga, but not for ugoira. - 'offset' is applied before filtering - no 'tag' filter	2018-05-18 15:36:30 +02:00
Mike Fährmann	7f899bd5d8	Merge branch 'master' into 1.4-dev	2018-05-14 14:50:02 +02:00
Mike Fährmann	4cea886177	[imgur] allow longer album hashes	2018-05-13 11:21:51 +02:00
Mike Fährmann	e1e23165a0	[pinterest] catch JSON decode errors	2018-05-11 17:37:27 +02:00
Mike Fährmann	6a31ada9e3	re-implement OAuth1.0 code OAuth support for SmugMug needs some additional features (auth-rebuild on redirect, query parameters in URL, ...) and fixing this in the old code wouldn't work all that well.	2018-05-10 18:47:05 +02:00
Mike Fährmann	e2157f594e	[mangadex] fix manga extraction (closes #84) Chapter listings for manga now use https://mangadex.org/manga/<id>/_/chapters/2/ as URL instead of https://mangadex.org/manga/<id>/_//2/	2018-05-06 17:43:50 +02:00
Mike Fährmann	69a5e6ddb3	Merge branch 'master' into 1.4-dev	2018-05-04 10:19:02 +02:00
Mike Fährmann	3fe653d940	fix test_results for empty sets {} is an empty dict and doesn't support set operations	2018-04-29 22:43:37 +02:00
Mike Fährmann	d96b3474e5	[puremashiro] remove module site has been unreachable for a couple of weeks and now the DNS record is gone as well	2018-04-28 14:24:20 +02:00
Mike Fährmann	b44a296404	[gomanga] remove module site has been unreachable for a couple of weeks and the cloudflare status page shows host errors	2018-04-28 14:24:21 +02:00
Mike Fährmann	2395d870dd	[pinterest] unquote board and user names, better errors	2018-04-26 16:38:12 +02:00
Mike Fährmann	55d4d23860	[pinterest] use Pinterest's "Web" API (#83) no access tokens, no user credentials of any kind ...	2018-04-24 22:28:10 +02:00
Mike Fährmann	f471161920	Merge branch 'master' into 1.4-dev	2018-04-21 12:15:40 +02:00
Mike Fährmann	cc36f88586	rename safe_int to parse_int; move parse_* to text module	2018-04-20 14:53:21 +02:00
Mike Fährmann	10cc59f3b5	fix extractor names	2018-04-18 18:12:57 +02:00
Mike Fährmann	b1325d4d2c	fix extractor docstrings	2018-04-18 18:03:43 +02:00
Mike Fährmann	df7e18399e	[luscious] fix image order	2018-04-17 17:32:21 +02:00
Mike Fährmann	d10579edb5	[pinterest] improve PinterestAPI code; remove OAuth mentions on another note: access_tokens have been set to only allow for 10 requests per hour (from 200 yesterday)	2018-04-17 17:12:42 +02:00
Mike Fährmann	4bd182c107	[pinterest] implement `oauth:pinterest` (#83) Pinterest access tokens are rate limited at 200 requests per hour (or maybe per 2 or 3 hours?) so having just one access token for all users isn't going to work in the long run.	2018-04-16 20:03:28 +02:00
Mike Fährmann	dbe250f7e5	[pinterest] update access_token (#83)	2018-04-16 09:46:45 +02:00
Mike Fährmann	5c487300ee	improve 'parse_query()' and add tests - another irrelevant micro-optimization ! - use urllib.parse.parse_qsl directly instead of parse_qs, which just packs the results of parse_qsl in a different data structure - reduced memory requirements since no additional dict and lists are created	2018-04-15 19:05:29 +02:00
Mike Fährmann	4ffa94f634	remove 'shorten_path()' and 'shorten_filename()'	2018-04-15 18:44:13 +02:00
Mike Fährmann	27eab4e467	rewrite text tests and improve functions - test more edge cases - consistently return an empty string for invalid arguments - remove the ungreedy-flag in 'remove_html()'	2018-04-15 18:13:46 +02:00
Mike Fährmann	e3f2bd4087	add tests for 'text.clean_xml()' and improve it	2018-04-14 22:07:01 +02:00
Mike Fährmann	6d8b191ea7	improve 'parse_query()' and add tests - another irrelevant micro-optimization ! - use urllib.parse.parse_qsl directly instead of parse_qs, which just packs the results of parse_qsl in a different data structure - reduced memory requirements since no additional dict and lists are created	2018-04-13 19:21:32 +02:00
Mike Fährmann	51ea699083	add 'abort()' as function to filter expressions calling 'abort()' in a filter aborts the current extractor run in a cleaner way than using something like 1/0, which causes an error message to be printed	2018-04-12 17:07:12 +02:00
Mike Fährmann	48a83a89e9	[loveisover] remove module archive.loveisover.me was shut down on 2018-03-29; https://www.archiveteam.org/index.php?title=4chan#archive.loveisover.me	2018-04-09 16:05:15 +02:00
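
The format-string commits above (79c01ec7ae, e0dd8dff5f, 8fe9056b16) describe three behaviours: joining list elements with J, replacing over-long values with L, and slicing with [start:stop]. The following is only a rough sketch of those behaviours in plain Python. The helper names apply_join, apply_maxlen and apply_slice are hypothetical and are not taken from gallery-dl's formatter code; the inputs are the examples given in the commit messages.

```python
# Hypothetical helpers that mimic the behaviour described in the commit
# messages; this is NOT gallery-dl's actual formatter implementation.

def apply_join(value, separator):
    """J<separator>/ : join list elements via <separator>.join(list)."""
    return separator.join(value)


def apply_maxlen(value, maxlen, replacement):
    """L<maxlen>/<replacement>/ : use <replacement> if value is too long."""
    return replacement if len(value) > maxlen else value


def apply_slice(value, start=None, stop=None):
    """[start:stop] : slice a string or list (no step parameter)."""
    return value[start:stop]


# Examples taken from the commit messages
joined = apply_join(["a", "b", "c"], " - ")         # {f:J - /}
short = apply_maxlen("foo", 5, "too long")          # {f:L5/too long/}
long_ = apply_maxlen("foobar", 5, "too long")       # {f:L5/too long/}
digits = "0123456789"
middle = apply_slice(digits, 2, -2)                 # {digits[2:-2]}
head = apply_slice(digits, None, 5)                 # {digits[:5]}

print(joined)   # a - b - c
print(short)    # foo
print(long_)    # too long
print(middle)   # 234567
print(head)     # 01234
```

The real formatter parses these options out of the format string itself; the sketch skips parsing and only shows the resulting value transformations.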

314 Commits
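
The 'parse_query()' commits (5c487300ee, 6d8b191ea7) mention using urllib.parse.parse_qsl directly instead of parse_qs. As a small illustration of the difference between the two standard-library functions (the query string below is made up, and gallery-dl's own 'parse_query()' may differ in its details):

```python
from urllib.parse import parse_qs, parse_qsl

# Made-up query string for illustration only
query = "type=manga&tag=foo&p=2"

# parse_qs groups values into one list per key
as_lists = parse_qs(query)

# parse_qsl returns a flat list of (key, value) pairs
as_pairs = parse_qsl(query)

# A flat dict can be built directly from the pairs, without creating
# the intermediate per-key lists; a repeated key keeps its last value
as_dict = dict(as_pairs)

print(as_lists)  # {'type': ['manga'], 'tag': ['foo'], 'p': ['2']}
print(as_pairs)  # [('type', 'manga'), ('tag', 'foo'), ('p', '2')]
print(as_dict)   # {'type': 'manga', 'tag': 'foo', 'p': '2'}
```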