gallery-dl

mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-24 11:42:33 +01:00

Author	SHA1	Message	Date
Mike Fährmann	5d4494b15f	add "ascii" as a special 'path-restrict' value	2021-01-09 02:41:20 +01:00
Mike Fährmann	5818c928c4	refactor 'path-restrict' parsing	2021-01-09 02:33:42 +01:00
Mike Fährmann	aac00a2024	add 'd' conversion for format strings to convert a timestamp to a formattable 'datetime' object. For example '{created_at!d:%Y-%m-%d}' transforms the timestamp in 'created_at' into a 'datetime' object and then formats its content using '%Y-%m-%d' as template. 1262304000 -> datetime(2010, 1, 1) -> "2010-01-01"	2021-01-09 01:58:44 +01:00
Mike Fährmann	20bd9cd296	[wikiart] add extractor for single paintings (closes #1233 ) There is no API endpoint for single paintings from what I can tell, so this uses the site's search.	2021-01-08 23:19:00 +01:00
Mike Fährmann	e2d4ca4955	[deviantart] improve '--range' for favorites (closes #1226 )	2021-01-08 22:57:35 +01:00
Mike Fährmann	56ccb9951a	[gfycat] add 'date' metadata field (#1138 )	2021-01-08 17:45:09 +01:00
Mike Fährmann	f2b83b8578	[gfycat] convert IDs to lowercase Redgifs expects all IDs and names to be lowercase and throws a 404 if an ID contains an uppercase letter. Gfycat on the other hand doesn't care about case, so it's fine to just convert all IDs. (#1138)	2021-01-08 17:41:45 +01:00
Mike Fährmann	b3bc646236	[redgifs] match embedded URLs https://redgifs.com/ifr/<ID>	2021-01-08 16:01:01 +01:00
Mike Fährmann	98e0d21383	[instagram] categorize single highlight URLs as 'highlights' They were categorized as 'stories' before. (fixes #1222)	2021-01-08 15:56:27 +01:00
Mike Fährmann	1c9435e0df	add '-G' command-line option (#1217 ) A "stronger" version of '-g', resolving all intermediate URLs.	2021-01-07 19:07:05 +01:00
Mike Fährmann	fa8ee6eac4	[derpibooru] add search and gallery extractors (#862 )	2021-01-07 18:05:32 +01:00
Mike Fährmann	3759d0cb42	[redgifs] fix search results The metadata for Redgifs search results got stripped down to a bare minimum, including download URLs. (Clicking on search results on the website itself is broken as well) As a workaround, we make an extra call to '/v1/gfycats/<ID>' for each search result entry to fetch the missing data.	2021-01-06 18:16:06 +01:00
Mike Fährmann	8a88025dc4	[pinterest] support generic user URLs (#1205 ) i.e. https://www.pinterest.com/USERNAME also renames 'BoardsExtractor' to 'UserExtractor'	2021-01-02 02:36:53 +01:00
Mike Fährmann	56b460dcea	[foolfuuka] add 'search' extractors (#1174 )	2021-01-02 02:34:06 +01:00
Mike Fährmann	fb64183d53	[foolfuuka] add 'board' extractors (closes #1044 )	2021-01-01 19:33:35 +01:00
Mike Fährmann	0594821fcd	[downloader:http] add MIME type and signature for .ico files (closes #1211)	2021-01-01 16:07:33 +01:00
Mike Fährmann	b0beed7a06	[sankaku] add support for book searches (closes #1204 )	2020-12-29 17:36:37 +01:00
Mike Fährmann	6cdbab07b5	[pinterest] add support for getting all boards of a user (#1205)	2020-12-29 16:57:03 +01:00
Mike Fährmann	25074aec47	[twitter] fetch media from pinned tweets (#1203 )	2020-12-29 16:27:43 +01:00
Mike Fährmann	2475176d99	[twitter] fetch tweets from 'homeConversation' entries When logged in, some entries returned by Twitter's API are so called 'homeConversation's (they would be regular tweet entries otherwise.) Those weren't picked up before and resulted in missing files compared to accessing a timeline as guest. ('/media' timelines and search results were not affected)	2020-12-29 00:42:46 +01:00
Mike Fährmann	3af9350648	[twitter] update API calls - use 'https://twitter.com/i/api' for all requests except '/guest/activate.json' - update (default) URL parameters - update GraphQL endpoints	2020-12-28 22:05:48 +01:00
Mike Fährmann	b656b829db	[twitter] fix login with username & password It is no longer possible to get an 'authenticity_token' from Twitter's Javascript-free login form, which got disabled a few days ago. Generating a random 16 byte hex string client-side and sending that as a cookie alongside the regular login form works just as well.	2020-12-28 16:10:19 +01:00
Mike Fährmann	d1903589a5	release version 1.16.1	2020-12-27 18:28:33 +01:00
Mike Fährmann	912eea29bc	update extractor test results	2020-12-27 17:41:08 +01:00
Mike Fährmann	47a7a51944	[sankaku] fix 'invalid_token' detection	2020-12-27 02:31:01 +01:00
Mike Fährmann	ba5df84f7e	[keenspot] improve redirect handling Before it would use http:// for all requests and get a redirect to a https:// version if those are supported. Now the redirect only happens once during the first request.	2020-12-26 21:38:40 +01:00
Mike Fährmann	d781e6ac44	[e621] return pool posts in order (closes #1195 ) … and add a 'num' enumeration index. A bit more code than the PR version, but it prints some helpful messages and doesn't call 'metadata()' twice.	2020-12-26 19:00:29 +01:00
Mike Fährmann	e7d446a8f7	[danbooru] slight code refactoring	2020-12-25 22:06:25 +01:00
Mike Fährmann	e41e2be2f9	[booru] split '_prepare_post()'	2020-12-24 01:13:54 +01:00
Mike Fährmann	53222445d5	[hentaicafe] simplify default filenames	2020-12-23 01:03:08 +01:00
Mike Fährmann	712c792fbe	[hentaicafe] prefer title of /hc.fyi/ pages (closes #1106 )	2020-12-23 01:01:15 +01:00
Mike Fährmann	2c4d4a75db	[mangadex] respect 'chapter-reverse' settings (closes #1194 ) The extractor in question doesn't inherit from MangaExtractor and therefore didn't do this automatically.	2020-12-22 15:08:10 +01:00
Mike Fährmann	3bd08acc8f	[pixiv] output debug message on failed login attempt (#1192)	2020-12-22 14:59:31 +01:00
Mike Fährmann	b58e605dc7	raise error when required username or password are missing do not try to login as 'None' (#1192)	2020-12-22 14:40:18 +01:00
Mike Fährmann	b233531aaa	[sankaku] use '/posts' endpoint for single posts	2020-12-22 02:44:40 +01:00
Mike Fährmann	459a0af4f8	[sankaku] add support for sankaku.app URLs (closes #1193 )	2020-12-22 01:57:53 +01:00
Mike Fährmann	371e9ca6df	[pinterest] implement video support (closes #1189 )	2020-12-21 16:09:06 +01:00
Mike Fährmann	537742c0ee	[sankaku] normalize 'created_at' metadata (closes #1190 )	2020-12-21 02:06:29 +01:00
Mike Fährmann	ae6748996a	[pornhub] update tests	2020-12-21 02:06:28 +01:00
Mike Fährmann	bf629a2818	[instagram] add 'include' option (closes #1180 ) Split the functionality of the old 'user' extractor into separate 'posts' and 'highlights' extractors, which respond to virtual URLs ('/<user>/posts' and '/<user>/highlights')	2020-12-21 02:06:28 +01:00
Mike Fährmann	78061658ea	[booru] reduce exceptions caught during _prepare_post() don't catch HttpErrors etc.	2020-12-21 02:05:59 +01:00
Mike Fährmann	212ae0c399	[mangapanda] remove module site now redirects to mangareader.net	2020-12-20 17:42:15 +01:00
Mike Fährmann	337b118e25	[instagram] warn about private profiles (#1187 )	2020-12-19 22:32:28 +01:00
Mike Fährmann	e8c64dd961	[postprocessor:exec] do not auto-add '{}' to command (#1185 ) This was initially done to mimic youtube-dl's behavior and implementation of --exec, and it seemed reasonable at the time.	2020-12-19 20:53:46 +01:00
Mike Fährmann	0a3bbc9c63	[postprocessor:exec] update output	2020-12-19 20:36:39 +01:00
Mike Fährmann	511d8d3fa3	increase SQLite connection timeouts (#1173 )	2020-12-19 20:15:07 +01:00
Mike Fährmann	465015f75a	[sankaku] reimplement login support (#1176 , #1182 )	2020-12-17 16:12:59 +01:00
Mike Fährmann	8d2e4e5f13	[booru] improve error handling e.g. for posts without a valid 'file_url' (#1176)	2020-12-17 01:16:45 +01:00
Mike Fährmann	1f9121fecb	release version 1.16.0	2020-12-12 23:08:25 +01:00
Mike Fährmann	1d753542c2	[hentainexus] fix extraction (fixes #1166 )	2020-12-12 20:30:51 +01:00
Mike Fährmann	b6f1fe59cb	add deprecation warnings for exec.final and metadata.bypost	2020-12-12 16:58:23 +01:00
Mike Fährmann	476d563ec2	[downloader:http] add MIME type and signature for .swf files	2020-12-11 14:21:04 +01:00
Mike Fährmann	a00b60fbe7	[twitter] update 'x-csrf-token' header (fixes #1170) Twitter started using a bigger (80 instead of 16 bytes) CSRF token for logged in users, and expects those to be used as 'x-csrf-token' header when sent via 'ct0' cookie. Generating an 80 byte token ourselves doesn't work, and Twitter will still insist on using its own.	2020-12-11 13:46:58 +01:00
Mike Fährmann	b88c97b873	[instagram] add 'cursor' option (#1149 ) To enable at least 'some' way to continue downloading from the middle of a user profile listing.	2020-12-11 13:46:58 +01:00
Mike Fährmann	0d406c8daf	[common] restrict values used in 'generate_extractors()'	2020-12-11 13:46:47 +01:00
Mike Fährmann	fe0265c7a5	[downloader.http] small improvements to file signature list - specify multiple entries for gif, mp3, zip - add entries for pdf	2020-12-08 21:20:18 +01:00
Mike Fährmann	b2c55f0a72	[sankaku] remove login support The old login method for 'https://chan.sankakucomplex.com/user/login' and the cookies it produces have no effect on the results from 'beta.sankakucomplex.com'.	2020-12-08 21:05:47 +01:00
Mike Fährmann	7f3d811d7b	[moebooru] inherit from BooruExtractor	2020-12-08 18:34:56 +01:00
Mike Fährmann	a3a863fc13	[booru] add generalized extractors for *booru sites similar to `cc15fbe7`	2020-12-08 18:34:30 +01:00
Mike Fährmann	5f23441e12	[piczel] update API URLs	2020-12-07 15:56:32 +01:00
Mike Fährmann	47114339a2	[webtoons] update 'ageGate' cookie	2020-12-07 14:56:32 +01:00
Mike Fährmann	4225f12783	[nozomi] handle empty 'date' fields (fixes #1163 )	2020-12-07 00:08:53 +01:00
Mike Fährmann	2b93515ee0	[instagram] reimplement support for stories (#1149 )	2020-12-06 21:32:10 +01:00
Mike Fährmann	ecdea799dd	[sankaku] use 'beta.sankakucomplex.com' API endpoints	2020-12-05 22:08:58 +01:00
Mike Fährmann	b3ecc89a9a	[instagram] use double quotes for strings when possible	2020-12-05 19:33:42 +01:00
Mike Fährmann	76285eb60d	[instagram] reimplement support for story highlights (#1149 )	2020-12-05 19:13:00 +01:00
Mike Fährmann	8ca7f54750	rename '_request_…' variables - remove '_' at the beginning - _request_last -> request_timestamp	2020-12-05 00:09:15 +01:00
Mike Fährmann	15a122aff3	[instagram] update 'X-IG-WWW-Claim' headers	2020-12-04 20:58:34 +01:00
Mike Fährmann	e5d81bdc7b	[mangadex] handle 'external' chapters (closes #1154 )	2020-12-04 20:56:30 +01:00
Mike Fährmann	447488fb18	[instagram] rewrite (#1113, #1122, #1128, #1130, #1149) Rely on the results of GraphQL queries instead of requesting data for each post separately via '/p/<shortcode>/?__a=1'. This might result in some missing metadata, and there might be some issues for '/channel/' and '/saved/' URLs, but at least downloading from the regular post listings should work without issues and without getting users blocked/banned. TODO: reimplement support for stories	2020-12-03 14:30:59 +01:00
Mike Fährmann	cc15fbe71a	[moebooru] add generalized extractors for moebooru sites - add support for sakugabooru.com (closes #1136) - add support for lolibooru.moe (closes #1050) This allows users to dynamically add support for moebooru/myimouto based sites by adding an entry to their config file (like for foolslide, foolfuuka, etc) For example: { "extractor": { "moebooru": { "new-site-1": {"root": "https://site1.net"}, "new-site-2": {"root": "https://www.site2.moe"} } } }	2020-12-01 22:27:18 +01:00
Mike Fährmann	43120407cc	[paheal] create directory for each post (closes #1147 )	2020-12-01 12:14:55 +01:00
Mike Fährmann	63e61a0932	[twitter] update image URL format (#1145 ) use '/<name>?format=<fmt>&name=<size>' instead of the potentially deprecated '/<name>.<fmt>:<size>' but keep all of them as fallback URLs	2020-12-01 11:53:51 +01:00
Mike Fährmann	1a4b61f7eb	[downloader:http] fix issues with chunked transfer encoding (fixes #1144)	2020-11-30 01:10:45 +01:00
Mike Fährmann	536c088462	[downloader:http] improve 'adjust-extensions' (#776) Check file headers against a list of file signatures before downloading the whole file and writing it to disk. The file signature check needs some improvements (*), but it produces usable results for the most part. (*) - 'webp', 'wav', and others start with 'RIFF' - 'svg' uses the same "signature" as all XML documents - 'webm' has the same signature as 'mkv' files - only 'mp3' files in an ID3v2 container get recognized	2020-11-29 20:55:35 +01:00
Mike Fährmann	46323ae6ff	initialize 'hooks' as empty tuple follow-up to `9c29fc4e` Prevent a "race" between initializing 'pathfmt' and 'hooks', and receiving a signal in between (e.g. ctrl+c), which would then crash in 'handle_finalize()'.	2020-11-28 18:18:49 +01:00
Mike Fährmann	9c29fc4e55	always initialize DownloadJob.hooks (fixes #1135 ) and not just when any (potential) post processors are defined	2020-11-28 00:09:19 +01:00
Mike Fährmann	ae6a1d5fbc	[mangoxo] fix extraction 2	2020-11-27 13:55:30 +01:00
Mike Fährmann	f6a684bc37	[hentainexus] update data decoding procedure (#1125 )	2020-11-25 11:26:26 +01:00
Mike Fährmann	c57a918f4a	[e621] implement delay via '_request_interval_min'	2020-11-25 00:19:32 +01:00
Mike Fährmann	93ce7466e2	[2chan] skip external links	2020-11-24 16:41:47 +01:00
Mike Fährmann	b214e89b5c	[mangoxo] fix extraction	2020-11-24 12:50:46 +01:00
Mike Fährmann	578dcf805c	[mangapanda] don't force https://	2020-11-21 20:24:37 +01:00
Mike Fährmann	102c482f5e	[reddit] skip invalid/failed gallery items (fixes #1127 )	2020-11-21 17:34:38 +01:00
Mike Fährmann	174945d2b2	[hentainexus] fix extraction (fixes #1125 )	2020-11-20 22:31:35 +01:00
Mike Fährmann	ca59bd691c	[postprocessor:metadata] add 'event' and 'filename' options	2020-11-20 22:29:11 +01:00
Mike Fährmann	9c3568c397	[postprocessor:exec] add 'event' option and remove 'final' option -- use '"event": "finalize"' instead.	2020-11-19 02:30:48 +01:00
Mike Fährmann	9fffa9c343	rework post processor callbacks	2020-11-19 02:29:06 +01:00
Mike Fährmann	f99c6031e0	apply post processor blacklists/whitelists to basecategories (#1103)	2020-11-17 02:02:31 +01:00
Mike Fährmann	1e3dd7330e	merge SharedConfigMixin functionality into Extractor	2020-11-17 00:34:07 +01:00
Mike Fährmann	ddfb4fd07a	[twitter] use 'https://twitter.com/i/api/' for logged in users Doesn't seem to make a difference from what I can tell, i.e. downloaded files are the same, but the website does it.	2020-11-16 11:26:37 +01:00
Mike Fährmann	42ccae53c4	[mangadex] switch to API v2 https://mangadex.org/api/v2/ https://mangadex.org/thread/351011	2020-11-16 11:05:17 +01:00
Mike Fährmann	ca44111726	[flickr] update - ensure every photo has an 'owner' (#828) - change default directories to a more consistent schema - create directory for each photo	2020-11-15 10:44:29 +01:00
Mike Fährmann	9b1bd09454	change 'extension-map' default Replace all JPEG filename extensions with 'jpg'.	2020-11-14 22:40:31 +01:00
Mike Fährmann	e5438b8a29	release version 1.15.3	2020-11-13 15:50:05 +01:00
Mike Fährmann	de0c57886d	[twitter] add 'list-members' extractor (closes #1096 )	2020-11-13 06:47:45 +01:00
Mike Fährmann	904ba08568	[gfycat] fix default filename format	2020-11-13 06:37:21 +01:00
Mike Fährmann	a46561bc16	[500px] update query hashes	2020-11-13 06:36:11 +01:00
Mike Fährmann	2e3a0dff21	[8kun] fix file URLs of older posts (fixes #1101 )	2020-11-07 23:10:37 +01:00
Mike Fährmann	00825cddf5	[hentaifoundry] use scheme from input URL (fixes #1095 ) Let the user choose between http and https, instead of always forcing https.	2020-11-07 22:40:02 +01:00
Mike Fährmann	8a98d3549a	[weasyl] create directory for each favorite submission (#1032)	2020-11-07 18:47:55 +01:00
Mike Fährmann	91db8df1c7	[deviantart] add 'index_base36' metadata field (closes #1099 ) This is the same ID as found in 'filename' without the 'd' in front, which is just 'index' encoded in base36.	2020-11-07 18:39:50 +01:00
Mike Fährmann	b9bfa4c675	update extractor test results	2020-11-07 02:03:22 +01:00
Mike Fährmann	1b5b789401	[mangoxo] fix metadata extraction	2020-11-07 01:35:29 +01:00
Mike Fährmann	41d4968866	[twitter] add 'list' extractor (#1096 )	2020-11-05 22:55:38 +01:00
Mike Fährmann	5d10520f4c	[twitter] update GraphQL endpoint & fix width/height entries	2020-11-05 22:53:29 +01:00
Mike Fährmann	9b2e5f72d6	[exhentai] update image URL parsing (#1094 )	2020-11-02 15:28:54 +01:00
Mike Fährmann	e3480bc8de	implement 'extension-map' option (#318 )	2020-11-02 15:27:07 +01:00
Mike Fährmann	98a4d86a01	[sankakucomplex] extract videos and embeds (closes #308 )	2020-10-30 01:21:11 +01:00
Mike Fährmann	c3f01dc4e6	implement 'util.unique()'	2020-10-29 23:33:41 +01:00
Mike Fährmann	558cde139c	[paheal] fix extraction (fixes #1088 )	2020-10-28 21:51:31 +01:00
Mike Fährmann	0211af7ca8	[hentaifoundry] update 'YII_CSRF_TOKEN' cookie handling (fixes #1083)	2020-10-28 21:49:03 +01:00
Mike Fährmann	d83b95fd28	[postprocessor:metadata] accept a string-list for 'content-format' (closes #1080)	2020-10-27 20:09:58 +01:00
Mike Fährmann	198c33ec36	also collect post processors from 'basecategory' entries (fixes #1084)	2020-10-27 19:56:48 +01:00
Mike Fährmann	350b1afe1c	speed up _list_classes() after iterating over all modules once	2020-10-26 22:18:15 +01:00
Mike Fährmann	5bcf28de93	add a 'extractor.modules' option	2020-10-25 03:05:10 +01:00
Mike Fährmann	18213dc5ba	release version 1.15.2	2020-10-24 18:57:29 +02:00
Mike Fährmann	de4a1e45c9	improve 'generate_csrf_token()' no need to use hashlib.md5()	2020-10-24 02:56:40 +02:00
Mike Fährmann	b788712844	[fallenangels] fix extraction of '.5' chapters	2020-10-23 16:56:08 +02:00
Mike Fährmann	28d8541cb3	[mangafox] ensure download URLs have a scheme	2020-10-23 02:45:15 +02:00
Mike Fährmann	8e3a324c91	[mangakakalot] ignore "Go Home" buttons in chapter pages	2020-10-23 02:33:35 +02:00
Mike Fährmann	c14c5d82d6	[newgrounds] use generator for fallback URLs	2020-10-23 00:39:19 +02:00
Mike Fährmann	a09f42f6b3	improve filename_from_url() performance Manually extracting the part between the last '/' and '?' instead of relying on the standard library's 'urllib.parse.urlsplit()' increases performance by ~400%. urlsplit() : 3.64 secs per 1.000.000 iterations partition(): 0.87 secs per 1.000.000 iterations	2020-10-23 00:14:06 +02:00
Mike Fährmann	968d3e8465	remove '&' from URL patterns '/?&#' -> '/?#' and '?&#' -> '?#' According to https://www.ietf.org/rfc/rfc3986.txt, URLs are "organized hierarchically" by using "the slash ("/"), question mark ("?"), and number sign ("#") characters to delimit components"	2020-10-22 23:31:25 +02:00
Mike Fährmann	1686dc1757	[twitter] support media from Cards (#1005 , #937 ) Can be enabled with 'extractor.twitter.cards', but for now disabled by default because cards can redirect to rather large videos from YouTube or Twitch.	2020-10-22 21:33:53 +02:00
Mike Fährmann	ffd38215a4	[hitomi] fix image URLs and URL pattern - non-webp files are now hosted on [a-c]b.hitomi.la - removed ampersand from invalid slug characters	2020-10-22 15:15:34 +02:00
Mike Fährmann	286718950c	[mangahere] ensure download URLs have a scheme (fixes #1070 )	2020-10-17 22:43:59 +02:00
Mike Fährmann	76dfa11a65	[reddit] add 'date' metadata field (closes #1068 )	2020-10-16 15:48:04 +02:00
Mike Fährmann	3f2ba629ea	[newgrounds] provide fallback URLs for video downloads (#1042 )	2020-10-16 01:16:12 +02:00
Mike Fährmann	a3ca2f6080	update fallback URL handling remove Message.Urllist and use a '_fallback' field inside a kwdict	2020-10-16 01:09:55 +02:00
Mike Fährmann	43dab3a228	[mangadex] unescape more metadata fields (fixes #1066 ) like 'manga', 'author', 'artist', etc.	2020-10-16 00:41:15 +02:00
Mike Fährmann	ec61696316	add 't' format string conversion (closes #1065) to trim whitespace from the beginning and end of strings. Example: '{field!t}' becomes 'foo' for 'field' == " \nfoo\t\r"	2020-10-16 00:37:22 +02:00
Mike Fährmann	5565025221	[xhamster] fix user profile extraction	2020-10-15 18:57:35 +02:00
Mike Fährmann	07432d6262	[seiga] fix flake8 and cookie test (#1063 )	2020-10-15 15:37:58 +02:00
Mike Fährmann	b8daabc3ca	[pinterest] implement login support (closes #1055) being logged in allows access to secret/protected boards	2020-10-15 15:14:18 +02:00
Mike Fährmann	1b1cf01d0d	add a general 'generate_csrf_token()' function	2020-10-15 15:14:18 +02:00
Mike Fährmann	7a0ba370d1	[gelbooru] rewrite mp4 video URLs (fixes #1048 )	2020-10-15 15:14:18 +02:00
Mike Fährmann	6491db3eaf	[blogger] handle URLs with specified width/height (closes #1061 ) get highest quality for images with /wXXX-hXXX/ instead of the usual /sXXX/	2020-10-15 15:14:18 +02:00
Mike Fährmann	783e0af26d	[hentaifoundry] update and simplify	2020-10-15 15:14:17 +02:00
Mike Fährmann	5b844a72b7	[newgrounds] handle embeds without scheme (#1033 )	2020-10-15 15:13:54 +02:00
kurumigi	7e0e872f4f	[seiga] Add metadata for single image downloads (#1063 ) * [seiga] Support image metadata. * [seiga] Update test data. * [seiga] Fix cookie check. * [test_cookies] [seiga] Fit test_cookies.py to the last commit.	2020-10-15 15:13:27 +02:00
Zanny	3ec60e894a	[weasyl] api-key authentication (#1057 ) * [weasyl] support api keys * [weasyl] document api-key authentication * [weasyl] usernames can contain ~	2020-10-15 15:12:09 +02:00
Mike Fährmann	35056a07d1	release version 1.15.1	2020-10-11 18:44:46 +02:00
Mike Fährmann	844793847c	update extractor test results	2020-10-11 18:15:41 +02:00
Mike Fährmann	ddd6840509	[behance] fix 'collection' extraction	2020-10-11 18:15:41 +02:00
Mike Fährmann	c5e3971b18	[newgrounds] extract image embeds (closes #1033 )	2020-10-11 18:15:40 +02:00
dawidsowa	43b156fb40	[reactor] match URLs without subdomain (#1053 )	2020-10-11 18:15:06 +02:00
Mike Fährmann	fd20093c96	allow blacklist/whitelist to be empty lists/strings (#1051 )	2020-10-08 14:55:21 +02:00
Mike Fährmann	3ebb174f2c	add missing extractor info when spawning new ones (fixes #1051 ) Not having this information causes the blacklist/whitelist logic to trigger and prevents things from functioning as intended when using default settings. Fixes issues for 8muses, deviantart, exhentai, and mangoxo.	2020-10-08 14:34:53 +02:00
Mike Fährmann	f9c1684af7	[newgrounds] restore original video URLs (#1042 )	2020-10-07 22:53:53 +02:00
Mike Fährmann	73373c06ec	[weibo] handle posts with more than 9 images (closes #926 ) Responses from '/api/container/getIndex' don't list more than 9 images per 'status' object, but the embedded JSON from a '/detail/<ID>' page does.	2020-10-06 18:16:08 +02:00
Mike Fährmann	dd1e545597	[hentaifoundry] rename GalleryExtractor to PicturesExtractor	2020-10-04 22:53:23 +02:00
Mike Fährmann	c874071f5a	[kissmanga] remove module	2020-10-04 22:46:41 +02:00
Mike Fährmann	93e04bf9a9	[500px] update query hashes	2020-10-03 19:25:28 +02:00
Mike Fährmann	844502cad5	update extractor test results	2020-10-03 19:24:19 +02:00
Mike Fährmann	fad7748b6b	[xvideos] fix 'title' extraction	2020-10-01 22:04:14 +02:00
Mike Fährmann	5b927c15df	[newgrounds] fix video extraction (closes #1042 )	2020-10-01 20:14:16 +02:00
Mike Fährmann	bdc6c8f074	improve message for 'oauth:deviantart' etc (closes #989 )	2020-09-29 21:25:24 +02:00
Mike Fährmann	430b6d6e2e	[twitter] extend 'retweets' option (closes #1026 ) Setting 'retweets' to '"original"' will use metadata from the original retweeted Tweets, and not from the Retweet entry.	2020-09-28 23:03:35 +02:00
Mike Fährmann	b9bdd2c564	[hentaifoundry] add support for stories (closes #734 )	2020-09-27 02:27:40 +02:00
Mike Fährmann	9a9d1924d8	[hentaicafe] add 'manga_id' metadata field (closes #1036 ) This field is only available when using a non-foolslide URL like '/hc.fyi/9874' or '/hazuki-yuuto-summer-blues/'	2020-09-26 14:34:48 +02:00
Mike Fährmann	cc4ac80302	[weasyl] add 'favorite' extractor (#1032 )	2020-09-26 13:09:03 +02:00
Mike Fährmann	e9cc719497	[weasyl] update and simplify - simplify 'pattern' regexps - parse 'posted_at' as 'date' - use unaltered 'title' ({title!l:R /_/} to lowercase and replace spaces)	2020-09-26 02:10:45 +02:00
Mike Fährmann	6514312126	[nijie] add 'include' option (closes #1018 )	2020-09-25 18:18:35 +02:00
Mike Fährmann	0d43456323	[hentaifoundry] add 'include' option	2020-09-25 18:18:03 +02:00
Zanny	ebb7737b9b	Weasyl Extractor (#977 ) * weasyl extractor * @kattjevfel suggested changes * @mikf changes	2020-09-25 15:18:21 +02:00
Mike Fährmann	d5fa716d89	fix crash when using 'skip=false' and archive (fixes #1023 ) Separating the archive check from pathfmt.exists() in `b5243297` had some unintended side effects. It is also not possible to monkey-patch a dunder method like __contains__ because of the special method lookup that gets performed for them.	2020-09-23 19:07:40 +02:00
Mike Fährmann	aeb0d32333	[twitter] improve twitpic extraction (fixes #1019 ) - ignore twitpic.com/photos/… URLs - ignore empty image URLs	2020-09-22 22:22:35 +02:00
Mike Fährmann	2184ec5d78	release version 1.15.0	2020-09-20 22:06:46 +02:00
Mike Fährmann	7cd383c0f9	update extractor test results	2020-09-20 21:54:39 +02:00
Mike Fährmann	1e313d5b84	implement 'sleep-request' option	2020-09-20 20:28:17 +02:00
Mike Fährmann	65744a7a31	use alternative for all falsey values in format strings … and not just None (#525) It would be better to consistently use None for all non-existent fields and/or fields without a valid value, but this is a good enough workaround for now.	2020-09-19 22:02:47 +02:00
Mike Fährmann	c43b3894be	[myhentaigallery] update and fix extraction (#1001 ) - extract more metadata - match "/show/" URLs - complete test results - fix missing images for lines starting with " <img" - fix missing comma in supportedsites.py	2020-09-17 18:14:23 +02:00
choeronline	05b9ac8d37	[myhentaigallery] add extractor (#1001 ) * adds support for myhentaigallery * fixes linting issues in myhentaigallery extractor	2020-09-17 17:32:54 +02:00
Mike Fährmann	2626629117	[danbooru] handle posts without 'id' (fixes #1004 )	2020-09-16 21:35:27 +02:00
Mike Fährmann	cc1fb0b4ea	[500px] update query hash	2020-09-16 01:26:31 +02:00
Mike Fährmann	da87a5fb7e	[exhentai] fix accessing config before main constructor bug introduced with `055c32e0` Making 'Extractor.config()' quite a bit faster is worth the "cost" of having to set _cfgpath in exhentai constructors, I think.	2020-09-15 18:09:50 +02:00
Mike Fährmann	f5b7ae01c1	update extractor test results	2020-09-15 18:07:08 +02:00
Mike Fährmann	136df52d1f	[deviantart] support watchers-only/paid deviations (#995 )	2020-09-15 16:03:46 +02:00
Mike Fährmann	055c32e0f7	precompute extractor config paths	2020-09-14 22:06:54 +02:00
Mike Fährmann	231dd4c800	accumulate postprocessor objects (#994 ) Instead of one 'postprocessors' setting overwriting all others lower in the hierarchy, all postprocessors along the config path will now get collected into one big list. For example '--mtime-from-date' will therefore no longer cause other postprocessor settings in a config file to get ignored.	2020-09-14 21:51:55 +02:00
Mike Fährmann	392d022b04	implement 'config.accumulate()' (#994 )	2020-09-14 21:13:08 +02:00
Mike Fährmann	3afd362e2e	add 'sleep-extractor' option (closes #964 ) (would have been nice if this were possible without code duplication)	2020-09-12 21:04:47 +02:00
Mike Fährmann	3108e85b89	[worldthree] remove extractors http://www.slide.world-three.org/ hasn't been accessible for a long time.	2020-09-11 18:12:57 +02:00
Mike Fährmann	8fed3eb8cb	[jaiminisbox] remove extractors https://jaiminisbox.com/post.html	2020-09-11 18:09:35 +02:00
Mike Fährmann	dcf3ad7eef	[furaffinity] update download URL extraction (fixes #988 ) support the new 'd2.facdn.net' subdomain	2020-09-11 13:23:57 +02:00
Mike Fährmann	3918b69677	remove 'extractor.blacklist' context manager	2020-09-11 13:17:35 +02:00
Mike Fährmann	c78aa17506	add general 'blacklist' and 'whitelist' options (#492 , #844 )	2020-09-11 13:17:12 +02:00
Mike Fährmann	abda352a5b	add '--no-skip' command-line option (closes #986 )	2020-09-11 01:23:39 +02:00
Mike Fährmann	5912727b88	support format string replacement fields in archive paths (closes #985)	2020-09-10 22:09:30 +02:00
Mike Fährmann	2b8d57f0ab	[twitter] support '/intent/user?user_id=…' URLs (#980 )	2020-09-08 23:17:50 +02:00
Mike Fährmann	a3b473bd2f	[twitter] support specifying users by ID (#980 ) by using 'id:…' as their screen name, i.e. https://www.twitter.com/id:2976459548/media instead of https://twitter.com/supernaturepics/media The user ID can, for example, be obtained from the output of $ gallery-dl -j --range 1 https://twitter.com/<screen-name>	2020-09-08 22:56:52 +02:00
Mike Fährmann	a0d916ed41	[exhentai] update wait time before original image download (#978 ) depend on 'wait-max', don't use a hard-coded value	2020-09-07 23:48:28 +02:00
Mike Fährmann	f6fd449b59	reduce wait time growth rate from exponential to linear Waiting for 2**N seconds after each error grows too fast. Simply waiting N seconds seems far more reasonable.	2020-09-06 22:38:25 +02:00
Mike Fährmann	bc48514d84	[aryion] get post ID via gallery-item (fixes #981 , closes #982 ) this even works when fetching post IDs from '/latest.php?id='	2020-09-06 22:17:23 +02:00
Mike Fährmann	799ca07fc8	[imgur] update - fix image/album detection for galleries - use new API endpoints for image/album data	2020-09-06 21:11:32 +02:00
Mike Fährmann	b5243297ff	write skipped files to archive (closes #550 )	2020-09-03 18:37:38 +02:00
Mike Fährmann	ac3036ef56	add 'filesize-min' and 'filesize-max' options (closes #780 )	2020-09-03 18:21:04 +02:00
Mike Fährmann	7876a03ece	[tumblr] create directories for each post (fixes #965 ) This changes the identifiers for directory format string fields. Everything blog related is now inside a 'blog' object and not at the "base level" anymore. E.g. '{name}' for directories is now '{blog[name]}' (or '{blog_name}', since that is also available)	2020-08-31 21:58:20 +02:00
Mike Fährmann	fd0685d9b5	[postprocessor:zip] defer zip file creation (fixes #968) don't try to create zip files on postprocessor construction, wait until directory creation during file download	2020-08-31 21:53:18 +02:00
Mike Fährmann	33fe67b594	release version 1.14.5	2020-08-30 21:20:26 +02:00
Mike Fährmann	d50f3b333a	update extractor test results	2020-08-30 20:55:22 +02:00
Mike Fährmann	0f55b8e80a	[exhentai] fix type check from `dbbbb21` (#940 ) 'bool' is a subclass of 'int', and therefore 'isinstance(self.limits, int)' also returns True when 'self.limits' has a boolean value	2020-08-30 20:51:22 +02:00
Mike Fährmann	e33293fdd8	[hentaihand] update to new site layout	2020-08-30 00:41:03 +02:00
Mike Fährmann	fda9e296dd	[gelbooru] fix extraction without API	2020-08-28 22:33:37 +02:00
Mike Fährmann	69e4871005	update extractor test results - sensescans: replace 404d chapters - mangapark: replace 404d chapters - subscribestar: update test for attached files	2020-08-28 22:32:32 +02:00
Mike Fährmann	ab1af66a97	[imgur] add 'search' extractor (#934 )	2020-08-27 22:46:17 +02:00
Mike Fährmann	e4bbc1fb5c	[imgur] add 'tag' extractor (#934 )	2020-08-27 22:46:17 +02:00
Mike Fährmann	deaacc70bb	[hitomi] update URL pattern for tag searches	2020-08-27 22:46:03 +02:00
ArtaxIsSleeping	0e941553ec	[aryion] Add username/password support (#960 ) * Add username/password support to aryion extractor * Update docs to match * Fix code style	2020-08-27 22:45:30 +02:00
Mike Fährmann	84e04cc23b	[500px] fix extraction and update URL patterns (fixes #956 ) - rewrite most API calls to GraphQL queries - match '500px.com/p/<user>' URLs	2020-08-24 18:25:31 +02:00
Mike Fährmann	d4ff767291	[reddit] improve gallery extraction (fixes #955 )	2020-08-23 22:06:06 +02:00
Mike Fährmann	7140fe7e6d	[hitomi] fix redirect processing	2020-08-23 15:18:44 +02:00
Mike Fährmann	a57b6b3c3a	[reddit] handle deleted galleries (fixes #953 )	2020-08-20 20:14:07 +02:00
Mike Fährmann	063c71cd84	[furaffinity] add 'search' extractor (closes #915 )	2020-08-18 21:26:46 +02:00
Mike Fährmann	dbbbb21180	[exhentai] add ability to specify custom image limit (#940 )	2020-08-17 22:29:20 +02:00
Mike Fährmann	b2009ea39e	[aryion] update folder mime type list (fixes #945 )	2020-08-16 22:30:15 +02:00
Mike Fährmann	688bd046fc	release version 1.14.4	2020-08-15 21:29:02 +02:00
Mike Fährmann	d06ad148c7	[shopify] use alternate regex for products on collection pages when the first one doesn't yield any results	2020-08-15 18:24:14 +02:00
Mike Fährmann	7619152988	[reactor] sort 'tags' to ensure a consistent order for test results	2020-08-15 18:22:31 +02:00
Mike Fährmann	cd9de613a2	[exhentai] adjust image limit costs (#940 ) Each original file costs 10 points per 10^6 bytes, not 10 per 2^20 == 1048576 bytes.	2020-08-15 18:19:33 +02:00
Mike Fährmann	2e6f6ee1c1	[mangoxo] fix login	2020-08-13 22:30:37 +02:00
Mike Fährmann	a6a080656c	[pixnet] detect password-protected albums (#177 )	2020-08-08 20:48:47 +02:00
Mike Fährmann	67ac6667af	[mangareader] fix extraction	2020-08-07 22:30:10 +02:00
Mike Fährmann	2b88c90f6f	[blogger] add search extractor (#925 )	2020-08-06 19:43:39 +02:00
Mike Fährmann	d5067c51c5	[instagram] support '/reel/' URLs	2020-08-06 19:20:25 +02:00
Mike Fährmann	2c9766b29f	fix UnboundLocalError in Extractor.request() introduced in `d6a271d`	2020-08-05 21:52:04 +02:00
Mike Fährmann	aa64149583	[blogger] support searching posts by labels (closes #925 )	2020-08-04 22:49:37 +02:00
Mike Fährmann	60ba3cb946	[reddit] support gallery posts (closes #920 )	2020-08-03 22:06:15 +02:00
Mike Fährmann	0d84d3af55	[subscribestar] extract attached media files (#852 )	2020-08-03 22:02:42 +02:00
Mike Fährmann	19bf76bcf8	update extractor test results	2020-08-03 21:57:00 +02:00
Mike Fährmann	0762d6b29c	[inkbunny] add 'num' field (#283 )	2020-07-30 19:26:09 +02:00
Mike Fährmann	fbc4278fe4	[instagram] wait before GraphQL requests (#901 )	2020-07-30 19:26:09 +02:00
Mike Fährmann	ec5870576d	[imgur] handle 403 overcapacity responses (closes #910 )	2020-07-30 19:26:01 +02:00
Mike Fährmann	d6a271d2c7	add 'response' objects to 'HttpError's	2020-07-30 18:23:26 +02:00
Mike Fährmann	72c5578a27	[hentainexus] improve/simplify code	2020-07-30 00:35:49 +02:00
Mike Fährmann	627d2141d3	[xhamster] fix extraction (closes #917 )	2020-07-29 22:51:34 +02:00
Mike Fährmann	3f73cc6855	allow 'parent-directory' to work recursively (fixes #905 )	2020-07-29 00:31:23 +02:00
Mike Fährmann	27e31f4a16	[myportfolio] raise 'NotFoundError' for deleted posts	2020-07-27 16:15:24 +02:00
Mike Fährmann	f317a57c5e	[simplyhentai] fix 'gallery_id' extraction	2020-07-27 16:14:06 +02:00
Mike Fährmann	daeef8a5e3	[vsco] handle missing 'description' fields	2020-07-27 14:45:17 +02:00
Mike Fährmann	26a967cbd4	[pinterest] match 'pinterest.co.uk' URLs (fixes #914 )	2020-07-27 14:41:34 +02:00
Mike Fährmann	c5aaa1de77	[inkbunny] simplify metadata structure (#283 ) Just put everything at the top level, instead of having a separate 'post' object.	2020-07-26 23:43:50 +02:00
Mike Fährmann	b921fee24d	[inkbunny] fix submission order (#283 ) Getting detailed submission info via /api_submissions.php reordered the input submissions and sorted them by ID. InkbunnyAPI.detail() now sorts them back and ensures they are returned in their original order. This commit also removes the 'metadata' option and always requests submission descriptions.	2020-07-26 23:12:45 +02:00
Mike Fährmann	e50c75628c	[subscribestar] update 'date' parsing	2020-07-24 22:27:36 +02:00
Mike Fährmann	c4ed9f4faa	[inkbunny] add 'metadata' option (#283 )	2020-07-24 18:05:53 +02:00
Mike Fährmann	493cadb1e7	[inkbunny] add 'orderby' option (#283 )	2020-07-24 17:50:32 +02:00
Mike Fährmann	336e682a7a	[inkbunny] handle gallery/scraps URLs (#283 )	2020-07-24 17:05:00 +02:00
Mike Fährmann	8dbf827649	[bobx] remove module	2020-07-24 17:00:43 +02:00
Mike Fährmann	8f64585ff2	[twitter] handle 429 responses without x-rate-limit-reset header	2020-07-23 22:38:17 +02:00
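
Note on aac00a2024 ('d' conversion): the commit message describes a two-step transformation, timestamp to 'datetime' object to formatted string. The following is only a minimal sketch of that idea in plain Python, not the project's implementation. The function names are hypothetical, and the use of a timezone-aware UTC datetime is an assumption made here so the result does not depend on the local timezone.

```python
from datetime import datetime, timezone


def convert_timestamp(value):
    """Step 1: turn a Unix timestamp into a UTC 'datetime' object.

    Hypothetical helper; UTC is an assumption of this sketch.
    """
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


def format_timestamp(value, template):
    """Step 2: format the resulting datetime with a strftime template."""
    return convert_timestamp(value).strftime(template)


# Mirrors the example in the commit message:
# 1262304000 -> datetime(2010, 1, 1) -> "2010-01-01"
print(format_timestamp(1262304000, "%Y-%m-%d"))
```

In a gallery-dl format string the same effect is written as '{created_at!d:%Y-%m-%d}', as stated in the commit message.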

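
Note on a09f42f6b3 (filename_from_url): the commit message contrasts 'urllib.parse.urlsplit()' with manually extracting the part between the last '/' and '?'. The sketch below shows both approaches side by side. Both function names are hypothetical and the code is an illustration of the technique, not a copy of the project's function; the timings quoted in the commit are not reproduced here.

```python
from urllib.parse import urlsplit


def filename_via_urlsplit(url):
    """Parse the whole URL, then take the last path segment."""
    return urlsplit(url).path.rpartition("/")[2]


def filename_via_partition(url):
    """Cut off everything from the first '?', then take the text
    after the last '/'. No full URL parsing is performed."""
    return url.partition("?")[0].rpartition("/")[2]


url = "https://example.org/a/b/image.png?size=large"
print(filename_via_urlsplit(url))
print(filename_via_partition(url))
```

The string-based variant trades generality for speed: for a URL without a path, such as "https://example.org", it returns the host name, while the urlsplit() variant returns an empty string.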

2783 Commits
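
Note on 0f55b8e80a (exhentai type check): the commit message points out that 'bool' is a subclass of 'int', so 'isinstance(value, int)' is also True for boolean values. The snippet below demonstrates that behavior and one possible way to tell the two apart. The helper name 'is_custom_limit' is hypothetical, and this is not necessarily how the commit resolved the issue.

```python
def is_custom_limit(value):
    """Return True only for real integers, not for True/False.

    Hypothetical helper: 'bool' must be excluded explicitly because
    it is a subclass of 'int'.
    """
    return isinstance(value, int) and not isinstance(value, bool)


print(issubclass(bool, int))    # bool inherits from int
print(isinstance(True, int))    # so this check cannot separate them
print(is_custom_limit(500))     # a numeric limit
print(is_custom_limit(True))    # a plain on/off switch
```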