Mike Fährmann
de752eb7b1
[naverwebtoon] support '/webtoon/' paths for all comics ( #5123 )
2024-02-04 21:38:46 +01:00
Mike Fährmann
22647c2626
[naverwebtoon] fix 'title' for comics with empty tags ( #5120 )
2024-01-27 16:24:03 +01:00
Mike Fährmann
1f7101d606
[archivedmoe] fix thebarchive webm URLs ( #5116 )
2024-01-27 00:24:41 +01:00
Mike Fährmann
34a4ddc399
[sankaku] add 'id-format' option ( #5073 )
2024-01-26 17:56:08 +01:00
Mike Fährmann
afd20ef42c
[kemonoparty] implement filtering duplicate revisions ( #5013 )
...
set 'revisions' to '"unique"' to have it ignore duplicate revisions
2024-01-26 14:44:15 +01:00
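The duplicate filtering described in the entry above can be sketched roughly as follows. This is only an illustration of the idea, not the extractor's actual code: the function name `unique_revisions` and the assumption that each revision is a dict carrying a precomputed hash under a `revision_hash` key are hypothetical.

```python
def unique_revisions(revisions, key="revision_hash"):
    """Return revisions in their original order, dropping any whose
    hash value was already seen (hypothetical helper)."""
    seen = set()
    result = []
    for rev in revisions:
        digest = rev[key]
        if digest not in seen:
            seen.add(digest)
            result.append(rev)
    return result


# Example data; the hash values are placeholders, not real digests.
revs = [
    {"id": 3, "revision_hash": "aaa"},
    {"id": 2, "revision_hash": "bbb"},
    {"id": 1, "revision_hash": "aaa"},  # same content as id 3
]
filtered = unique_revisions(revs)
```

Keeping the first occurrence and preserving order is a design choice of this sketch; the actual option may pick a different representative among duplicates.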
Mike Fährmann
c28475d325
[kemonoparty] fix deleting 'name' in original objects ( #5103 )
...
... when computing 'revision_hash'
regression caused by 3d68eda4
dict.copy() only creates a shallow copy
I know that and still managed to get it wrong ...
2024-01-25 23:46:19 +01:00
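The shallow-copy pitfall mentioned in the entry above can be reproduced with a few lines. This is a minimal sketch with made-up data; the `post` structure and its keys are assumptions for illustration and do not mirror the extractor's real objects.

```python
import copy

# Hypothetical nested metadata object.
post = {"id": 1, "user": {"name": "alice", "id": 7}}

# dict.copy() duplicates only the outer dict; the nested 'user'
# dict is still shared between 'post' and 'shallow'.
shallow = post.copy()
del shallow["user"]["name"]
shallow_leaked = "name" not in post["user"]  # the original was modified too

# copy.deepcopy() duplicates nested containers as well,
# so the original stays untouched.
post2 = {"id": 1, "user": {"name": "alice", "id": 7}}
deep = copy.deepcopy(post2)
del deep["user"]["name"]
deep_safe = "name" in post2["user"]
```

When only one nested dict needs changing, copying just that inner dict (for example `user = dict(post["user"])`) avoids the cost of a full deep copy.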
Mike Fährmann
beacfa7436
[bunkr] update domain to 'bunkr.sk' ( #5114 )
2024-01-25 23:45:41 +01:00
Mike Fährmann
f3ad91b44f
[bunkr] update domain ( #5088 )
2024-01-21 03:00:57 +01:00
Mike Fährmann
c7a42880ab
[wikimedia] support fandom wikis ( #1443 , #2677 , #3378 )
...
Wikis hosted on fandom.com are just MediaWiki instances
and support the MediaWiki API.
2024-01-21 00:52:02 +01:00
blankie
df718887c2
[webtoons] fix extracting comic and episode name with commas
2024-01-21 09:50:27 +11:00
Mike Fährmann
0d367ce1b9
[tests] update extractor results
2024-01-20 18:02:36 +01:00
Mike Fährmann
9ca6117c67
[hbrowse] remove module
...
website gone
2024-01-20 02:53:44 +01:00
Mike Fährmann
375eefb886
[chevereto] remove 'pixl.li'
...
"Pixl is closing down"
"All images will be deleted January 1st."
2024-01-20 02:21:40 +01:00
Mike Fährmann
b0a441f1e3
[nitter] remove 'nitter.lacontrevoie.fr'
...
"Fermeture de Nitter / Closing down Nitter"
2024-01-19 19:34:16 +01:00
Mike Fährmann
a1c1e80f67
[giantessbooru] update domain
2024-01-19 14:21:56 +01:00
Mike Fährmann
2007cb2f59
[tests] check extractor category values
2024-01-19 14:21:09 +01:00
Mike Fährmann
93b4120e77
[gelbooru] support 'all' and empty tag ( #5076 )
2024-01-18 21:49:33 +01:00
Mike Fährmann
a416d4c3d5
[sankaku] support post URLs with alphanumeric IDs ( #5073 )
2024-01-18 16:23:14 +01:00
Mike Fährmann
ea553a1d55
[wikimedia] generalize ( #1443 )
...
- support mediawiki.org
- support mariowiki.com (#3660 )
- combine code into a single extractor
(use prefix as subcategory)
- handle non-wiki instances
- unescape titles
2024-01-18 15:36:16 +01:00
Mike Fährmann
c3c1635ef3
[wikimedia] update
...
- rewrite using BaseExtractor
- support most Wiki* domains
- update docs/supportedsites
- add tests
2024-01-17 00:08:06 +01:00
Mike Fährmann
3d68eda4ab
[kemonoparty] add 'revision_hash' metadata ( #4706 , #4727 , #5013 )
...
A SHA1 hexdigest of other relevant metadata fields like
title, content, file and attachment URLs.
This value does NOT reflect which revisions are listed on the website.
Neither does 'edited' or any other metadata field (combinations).
2024-01-16 00:38:10 +01:00
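A content hash of this kind could be computed along the following lines. This is a hedged sketch, not the extractor's implementation: the function name, the exact set of fields, the `path` key used for file and attachment URLs, and the JSON serialization step are all assumptions made for the example.

```python
import hashlib
import json


def revision_hash(post):
    """SHA1 hexdigest over the content-relevant fields of a post
    (hypothetical field names)."""
    relevant = {
        "title": post.get("title"),
        "content": post.get("content"),
        "file": (post.get("file") or {}).get("path"),
        "attachments": [
            a.get("path") for a in (post.get("attachments") or [])
        ],
    }
    # Sorted keys and fixed separators give a stable serialization,
    # so equal content always produces the same digest.
    data = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


a = {"title": "t", "content": "c", "file": {"path": "/f.png"},
     "attachments": [{"path": "/a.zip"}], "edited": "2024-01-01"}
b = dict(a, edited="2024-01-02")   # differs only in 'edited'
c = dict(a, title="other")         # differs in a hashed field
```

Because fields such as `edited` are left out of the hashed data, two revisions that differ only in those fields get the same digest in this sketch.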
Mike Fährmann
799a8206ad
merge #5061 : [webtoons] extract more metadata
...
- author_name
- comic_name
- episode_name
- username
2024-01-15 18:27:12 +01:00
Mike Fährmann
8ffa0cd3c8
[webtoons] small optimization
...
don't extract the entire 'author_area' and
avoid creating a second 'text.extract_from()' object
2024-01-15 18:24:47 +01:00
Mike Fährmann
68196589c4
[2ch] update
...
- simplify extractor code
- more metadata
- add tests
2024-01-15 04:09:05 +01:00
Mike Fährmann
69726fc82c
[tests] skip tests requiring auth when none is provided
2024-01-14 22:47:16 +01:00
blankie
bb446b1598
[webtoons] extract more metadata
2024-01-14 19:26:49 +11:00
Mike Fährmann
355b909f46
merge #5041 : [steamgriddb] add support ( #5033 )
2024-01-13 00:59:15 +01:00
Mike Fährmann
71e2c3e5a2
merge #5037 : [hatenablog] add support ( #5036 )
2024-01-13 00:57:21 +01:00
Mike Fährmann
2dcfb012ea
[patreon] download 'm3u8' manifests with ytdl
2024-01-12 02:33:27 +01:00
Mike Fährmann
2191e29e14
[nijie] fix image URL for single image posts ( #5049 )
2024-01-11 05:07:38 +01:00
Mike Fährmann
39904c9e4e
[deviantart:avatar] add 'formats' option ( #4995 )
2024-01-10 17:13:34 +01:00
Mike Fährmann
887ade30a5
[batoto] support more mirror domains ( #5042 )
2024-01-09 18:02:49 +01:00
blankie
2ccb7d3bd3
[steamgriddb] add support
2024-01-09 17:12:56 +11:00
blankie
2cfe788f93
[hatenablog] fix extractor naming errors
2024-01-09 01:42:57 +11:00
blankie
61f3b2f820
[hatenablog] add support
2024-01-09 01:29:47 +11:00
Mike Fährmann
657ed93a22
[batoto] improve v2 manga URL pattern
...
and add tests
2024-01-07 22:23:30 +01:00
Mike Fährmann
33f228756a
[mangadex] add 'list' extractor ( #5025 )
...
supports listing manga and chapters from list feed
2024-01-07 02:59:35 +01:00
Mike Fährmann
c25bdbae91
[komikcast] fix 'manga' extractor ( #5027 )
2024-01-06 14:19:44 +01:00
Mike Fährmann
8e1a2b5446
[komikcast] update domain to 'komikcast.lol' ( #5027 )
2024-01-06 02:16:43 +01:00
Mike Fährmann
a441249ea2
merge #4979 : [batoto] add 'chapter' and 'manga' extractors ( #1434 , #2111 )
2024-01-06 01:53:26 +01:00
Mike Fährmann
b11c352d66
[bato] rename to 'batoto'
...
to use the same category name as the previous bato.to site
2024-01-06 01:49:34 +01:00
Mike Fährmann
3aa24c3744
[bato] simplify and update
2024-01-06 01:10:04 +01:00
Mike Fährmann
11150a7d72
[nudecollect] remove module
2024-01-05 21:32:04 +01:00
Mike Fährmann
c158927c38
merge #5016 : [zzup] add 'gallery' extractor ( #4517 , #4604 , #4659 , #4863 )
2024-01-05 21:25:46 +01:00
Mike Fährmann
217fa7f8a1
include 'test/results' in flake8 checks
2024-01-05 18:16:33 +01:00
Mike Fährmann
e61f016465
[szurubooru] support 'snootbooru.com' ( #5023 )
2024-01-05 17:56:39 +01:00
Mike Fährmann
b4bcf40278
[weibo] fix AttributeError in 'user' extractor ( #5022 )
...
yet another bug caused by a383eca7
2024-01-05 17:18:33 +01:00
Mike Fährmann
0ab0a10d2d
[jpgfish] update domain
2024-01-05 02:27:20 +01:00
enduser420
0f30136109
[zzup] add 'gallery' extractor
2024-01-04 21:38:59 +05:30
Mike Fährmann
7eaf648f2e
[fanbox] add 'metadata' option ( #4921 )
...
extracts 'plan' and extended 'user' metadata
2024-01-04 15:01:33 +01:00