gallery-dl

mirror of https://github.com/mikf/gallery-dl.git synced 2025-02-01 03:51:42 +01:00

Author	SHA1	Message	Date
Mike Fährmann	a453335a9f	remove test results in extractor modules and add generic example URLs	2023-09-11 16:30:55 +02:00
Mike Fährmann	a383eca7f6	decouple extractor initialization. Introduce an 'initialize()' function that does the actual init (session, cookies, config options) and can be called separately from the constructor __init__(). This allows, for example, adjusting config access inside a Job before most of it has already happened when calling 'extractor.find()'.	2023-07-25 22:16:16 +02:00
Mike Fährmann	dd884b02ee	replace json.loads with direct calls to JSONDecoder.decode	2023-02-09 15:22:00 +01:00
sudo	a6305d031c	[hitomi] apply format check for every image (#3030) (#3280)	2022-11-27 15:55:25 +01:00
Mike Fährmann	b2b0b1c455	[hitomi] fall back to webp when format not available (#3030)	2022-10-11 10:48:28 +02:00
Mike Fährmann	2eb0ddd083	[hitomi] fix error when number of tag results is multiple of 25 (#2870)	2022-08-28 17:06:11 +02:00
Mike Fährmann	946643c23c	[hitomi] use maxage for gg.js cache (#2863). Cached values become invalid after 1-2 hours.	2022-08-26 17:57:17 +02:00
Mike Fährmann	37d584a9b2	[hitomi] update metadata extraction (fixes #2444). Remove 'hitomi.metadata' option, as it is no longer necessary to make additional HTTP requests to fetch all metadata.	2022-03-26 12:46:18 +01:00
Mike Fährmann	dee0d22561	update extractor test results	2022-02-06 21:39:24 +01:00
Mike Fährmann	86fa412b47	[hitomi] add 'format' option (#2260). Default is 'webp' since downloading original files is no longer allowed.	2022-02-03 23:32:19 +01:00
Mike Fährmann	17c9c47ca0	[hitomi] fix 'tag' extraction (fixes #2189)	2022-01-13 16:45:46 +01:00
Mike Fährmann	8b910dd8ae	[hitomi] fix image URLs again and again ...	2022-01-06 18:21:26 +01:00
Mike Fährmann	38e2af29d6	[hitomi] fix image URLs. Update '_parse_gg()' yet again.	2022-01-03 16:41:00 +01:00
Mike Fährmann	1e0278702d	[hitomi] update '_parse_gg()'	2022-01-01 17:55:58 +01:00
Mike Fährmann	becc7f85a6	[hitomi] fix image URLs	2021-12-29 22:46:17 +01:00
Mike Fährmann	099ed72de7	[hitomi] disable extra 'metadata' by default. Saves one HTTP request that is not needed with default filename settings.	2021-12-16 22:21:07 +01:00
Mike Fährmann	211de95dd0	update extractor test results	2021-11-01 02:58:53 +01:00
YongChan Cho	14852f7050	[hitomi] fix image path (#1988)	2021-10-30 21:45:01 +02:00
Ryu juheon	d4614e5ba4	[hitomi] fix image URLs (#1982)	2021-10-28 19:29:48 +02:00
Ryu juheon	6b6d92d51c	[hitomi]: fix image URLs (#1975)	2021-10-26 19:35:01 +02:00
Mike Fährmann	47a780942c	update extractor test results	2021-09-03 19:36:12 +02:00
Ryu JuHeon	9429eaa0a3	[hitomi]: fix image URLs (#1765)	2021-08-12 14:39:10 +02:00
Mike Fährmann	5612ca31c2	[hitomi] fix image URLs (closes #1679)	2021-07-09 18:01:49 +02:00
Mike Fährmann	e98fa01c44	[hitomi] update image URL code (fixes #1637)	2021-06-18 16:44:22 +02:00
Mike Fährmann	968d3e8465	remove '&' from URL patterns: '/?&#' -> '/?#' and '?&#' -> '?#'. According to https://www.ietf.org/rfc/rfc3986.txt, URLs are "organized hierarchically" by using "the slash ("/"), question mark ("?"), and number sign ("#") characters to delimit components".	2020-10-22 23:31:25 +02:00
Mike Fährmann	ffd38215a4	[hitomi] fix image URLs and URL pattern: non-webp files are now hosted on [a-c]b.hitomi.la; removed ampersand from invalid slug characters	2020-10-22 15:15:34 +02:00
Mike Fährmann	7cd383c0f9	update extractor test results	2020-09-20 21:54:39 +02:00
Mike Fährmann	deaacc70bb	[hitomi] update URL pattern for tag searches	2020-08-27 22:46:03 +02:00
Mike Fährmann	7140fe7e6d	[hitomi] fix redirect processing	2020-08-23 15:18:44 +02:00
Mike Fährmann	a3de234e70	[hitomi] add extractor for tag searches (closes #697)	2020-04-20 21:55:19 +02:00
Mike Fährmann	55ac408bdf	[hitomi] fix extraction of galleries without tags	2020-04-20 21:42:14 +02:00
Mike Fährmann	59edcdc822	[hitomi] restore metadata fields from before f33b13a ... and add a 'metadata' option to disable visiting the gallery page and extracting data from it if this is not needed.	2020-03-12 23:43:41 +01:00
Mike Fährmann	f33b13aacf	[hitomi] simplify metadata extraction. Use the data from https://ltn.hitomi.la/galleries/<id>.js for both image URLs and metadata and ignore any gallery or reader pages. This removes the 'artist', 'characters', 'group', and 'parody' metadata fields, since this information is, as of now, only available in gallery pages.	2020-03-04 01:22:32 +01:00
Mike Fährmann	80ecb99089	[hitomi] fix extraction	2020-02-22 22:07:21 +01:00
Mike Fährmann	5607dd3646	[hitomi] follow multiple redirects	2020-02-20 18:22:13 +01:00
Mike Fährmann	d1de7dc296	[hitomi] implement workaround for "broken" redirects. Some galleries redirect to a new "version" with a different gallery id. This new version might not be available any more, but the /reader/ page for the original gallery id can still work.	2020-02-02 17:24:23 +01:00
Mike Fährmann	8bb32ee188	[hitomi] fix image URLs	2020-01-14 12:04:48 +01:00
Mike Fährmann	f8ac67ce50	[hitomi] extend URL pattern + follow redirects	2019-11-01 21:40:10 +01:00
Mike Fährmann	8361d874d7	[hitomi] fix extraction	2019-10-29 16:23:20 +01:00
Mike Fährmann	1693d97bd3	update extractor class hierarchies: let the GalleryExtractor class inherit directly from Extractor; make ChapterExtractor a subclass of GalleryExtractor; change enumeration field names of GalleryExtractors to 'num'	2019-10-16 18:15:29 +02:00
Mike Fährmann	15af2f8464	[hitomi] fall back to /reader/ page if main page returns 404. Some galleries return a 404: Not Found error when trying to access them through the main gallery URL, but their content is still available on the respective /reader/ page.	2019-10-11 18:39:52 +02:00
Mike Fährmann	cf5e716b9d	[hitomi] fix image URLs	2019-10-09 17:21:37 +02:00
Mike Fährmann	a732e9c430	[instagram] update query hashes and headers	2019-08-10 14:13:08 +02:00
Mike Fährmann	055102431f	[hitomi] handle Game CG galleries with scenes (fixes #321)	2019-06-27 20:25:40 +02:00
Mike Fährmann	b51baa9a4b	[hitomi] fix empty language detection; parse datetime	2019-06-17 20:02:58 +02:00
Mike Fährmann	fc5e4f2b21	[hitomi] simplify data extraction code	2019-05-01 11:14:21 +02:00
Mike Fährmann	2756cc8dde	[hitomi] set Referer header (fixes #239)	2019-05-01 10:56:00 +02:00
Mike Fährmann	26c4365baa	adjust metadata types for GalleryExtractors	2019-03-02 14:53:04 +01:00
Mike Fährmann	3595cd582f	use GalleryExtractor as common base class	2019-03-01 14:13:16 +01:00
Mike Fährmann	5530871b5a	change results of text.nameext_from_url(). Instead of getting a complete 'filename' from a URL and splitting that into 'name' and 'extension', the new approach gets rid of the complete version and renames 'name' to 'filename'. (Using anything other than {extension} for a filename extension doesn't really work anyway.) Example: "https://example.org/path/filename.ext". Before: filename = filename.ext, name = filename, extension = ext. Now: filename = filename, extension = ext.	2019-02-14 16:07:17 +01:00
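
The "now" behaviour described in commit 5530871b5a can be sketched roughly as follows. This is an illustrative stand-in, not gallery-dl's actual text.nameext_from_url(): the function body, its signature, and the handling of names without a dot are assumptions made for the example.

```python
from urllib.parse import unquote, urlsplit


def nameext_from_url(url, data=None):
    """Hypothetical sketch: derive 'filename' and 'extension' from a URL."""
    if data is None:
        data = {}
    # Take the last path segment; query string and fragment are ignored.
    name = unquote(urlsplit(url).path.rpartition("/")[2])
    # Split at the last dot, so 'filename' no longer carries the extension.
    stem, dot, ext = name.rpartition(".")
    if dot and stem:
        data["filename"] = stem
        data["extension"] = ext
    else:
        # Assumption: names without a usable dot get an empty extension.
        data["filename"] = name
        data["extension"] = ""
    return data


print(nameext_from_url("https://example.org/path/filename.ext"))
# {'filename': 'filename', 'extension': 'ext'}
```

Under the old scheme the same URL would have yielded three fields ('filename', 'name', 'extension'); here only the two fields from the commit message's "now" case remain.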

73 Commits