Mike Fährmann
5d9ec91896
[azurlanewiki] supportedsites + test
2024-02-29 21:49:13 +01:00
Mike Fährmann
76581c13f7
handle URLs without '/' after their TLD ( #5252 )
2024-02-29 15:05:46 +01:00
Mike Fährmann
ba062712ad
[tests] '__main__' -> "__main__"
2024-02-27 02:10:05 +01:00
Mike Fährmann
8a11b72253
remove extractor/test.py ( #4504 )
2024-02-27 01:37:57 +01:00
Mike Fährmann
fde9e25c9f
[tests:kemonoparty] '.party' -> '.su'
2024-02-26 22:25:04 +01:00
Mike Fährmann
13443f40a3
[xvideos] support '/channels/' URLs ( #5244 )
2024-02-26 00:08:37 +01:00
Mike Fährmann
cc6b9e4c18
[zerochan] use API by default ( #3669 )
...
add 'pagination' option
2024-02-25 00:36:14 +01:00
blankie
962f55cc68
[artstation] fix handling usernames with dashes
2024-02-21 17:39:37 +11:00
Mike Fährmann
741fd00cec
[deviantart] extend 'metadata' option ( #5175 )
...
allow fetching extended metadata in addition to the usual
'description', 'tags', etc by setting 'metadata' to a list of
'camera', 'stats', 'submission', 'collection', and 'gallery'
for example "metadata": "stats,submission"
2024-02-18 23:14:14 +01:00
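
A minimal sketch of how such an option value could be normalized, assuming the setting lives under the usual extractor/deviantart nesting of the config file. The helper `parse_metadata_option` and the constant `EXTENDED` are hypothetical names for illustration and are not taken from the extractor's code.

```python
import json

# Extended metadata groups named in the commit message above.
EXTENDED = ("camera", "stats", "submission", "collection", "gallery")


def parse_metadata_option(value):
    """Hypothetical helper: turn the option value into a list of group names.

    Accepts a comma-separated string or a list; anything else (for example
    a plain true/false) selects no extended groups in this sketch.
    """
    if isinstance(value, str):
        value = value.split(",")
    elif not isinstance(value, (list, tuple)):
        return []
    # Drop surrounding whitespace and ignore unknown names.
    return [name.strip() for name in value if name.strip() in EXTENDED]


# Assumed config layout: extractor -> deviantart -> metadata
config = {"extractor": {"deviantart": {"metadata": "stats,submission"}}}
print(json.dumps(config, indent=4))

selected = parse_metadata_option(config["extractor"]["deviantart"]["metadata"])
print(selected)
```

Accepting both a string and a list keeps the example from the commit body ("stats,submission") and a JSON array interchangeable.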
Mike Fährmann
8a63801311
[vsco] add 'spaces' extractor ( #5202 )
...
for spaces listed on a user page
2024-02-17 18:20:48 +01:00
Mike Fährmann
ccb413df71
[wikimedia] support 'pidgi.net' and 'bulbapedia.bulbagarden.net' ( #5205 , #5206 )
2024-02-17 17:35:10 +01:00
Mike Fährmann
7033cc14e9
[vsco] add 'space' extractor ( #5202 )
2024-02-17 01:54:05 +01:00
Mike Fährmann
c9efccc959
[tests] update extractor results
2024-02-16 22:42:06 +01:00
Mike Fährmann
c413834dfc
[bluesky] extend tests
2024-02-16 16:30:02 +01:00
Mike Fährmann
24c1317e0d
[batoto] fix crash when manga/chapter contains a '-' ( #5200 )
2024-02-16 00:10:08 +01:00
Mike Fährmann
0abd9723af
[bluesky] add 'metadata' option ( #4438 )
...
allow extracting 'user' metadata and
make 'facets' extraction optional
2024-02-15 23:30:16 +01:00
Mike Fährmann
c97b92cc35
[fanbox] add 'home' and 'supporting' extractors ( #5138 )
2024-02-14 23:25:39 +01:00
Mike Fährmann
04e4ffc64c
[deviantart] combine 'png' option with 'quality' ( #4846 )
...
"quality": "png" to download PNGs instead of JPEGs
2024-02-14 22:07:29 +01:00
Mike Fährmann
9cc4ec2c58
[deviantart] add 'png' option ( #4846 )
2024-02-14 01:03:15 +01:00
Mike Fährmann
1d1ffe3317
[pornpics] update 'channel' extraction & add test
...
change 'channel' to a list, since extracting both 'channel' and
'channels' does not really work with text.extract_from()
2024-02-13 23:48:46 +01:00
Mike Fährmann
139ff3f6ab
[kemonoparty] add 'posts' extractor ( #5194 )
2024-02-13 15:41:34 +01:00
Mike Fährmann
814ad9321e
[deviantart] skip locked/blurred posts ( #4567 , #5193 )
2024-02-13 14:15:12 +01:00
Mike Fährmann
f7f8ef8684
[twitter] support communities ( #4913 )
2024-02-13 01:30:23 +01:00
Mike Fährmann
8f27f43d4d
[tests] implement explicitly disabling auth
2024-02-13 00:08:27 +01:00
Mike Fährmann
06cb518d97
[bunkr] fix extraction ( #5088 , #5151 , #5153 )
...
- remove legacy code
- map legacy domains to bunkr.sk
- use input URL domain for newer domains
- update tests (some files got slightly modified or deleted)
2024-02-11 22:36:03 +01:00
Mike Fährmann
4641937ca3
[imagetwist] add 'gallery' extractor ( #5190 )
2024-02-11 18:41:02 +01:00
Mike Fährmann
fde82ab0ce
[imagechest] add 'user' extractor ( #5143 )
2024-02-11 18:38:33 +01:00
Mike Fährmann
4474cea31b
merge #5187 : [skeb] add 'num' and 'count' metadata fields
2024-02-10 19:36:59 +01:00
Mike Fährmann
4cfceb23cb
[skeb] rename 'data' -> 'file' & add tests
2024-02-10 19:35:50 +01:00
Mike Fährmann
c83d0a1596
[weibo] add 'gifs' option ( #5183 )
2024-02-10 18:17:07 +01:00
Mike Fährmann
af61d2b037
[wikimedia] combine most wikimedia.org sites ( #1443 )
...
add wikidata.org and wikivoyage.org
2024-02-10 03:00:58 +01:00
Mike Fährmann
c7d17f1111
[bluesky] extract 'hashtags', 'mentions', and 'uris' metadata ( #4438 )
2024-02-10 00:01:55 +01:00
Mike Fährmann
55bbd49a0e
[bluesky] download images in original resolution ( #4438 )
...
at least up to 2000 px
2024-02-09 21:33:33 +01:00
Mike Fährmann
aee5580c62
[idolcomplex] extract 'id_alnum' metadata ( #5171 )
2024-02-08 18:29:54 +01:00
Mike Fährmann
cf7d6be2d4
[bluesky] initial support ( #4438 , #4708 , #4722 , #5047 )
2024-02-07 19:09:33 +01:00
Mike Fährmann
6ef143ea31
[idolcomplex] support alphanumeric post IDs ( #5171 )
2024-02-07 14:57:13 +01:00
Mike Fährmann
d7823b9f81
[pinterest] fix section URLs for boards with /?# in name ( #5104 )
2024-02-05 15:54:06 +01:00
Mike Fährmann
de752eb7b1
[naverwebtoon] support '/webtoon/' paths for all comics ( #5123 )
2024-02-04 21:38:46 +01:00
Mike Fährmann
22647c2626
[naverwebtoon] fix 'title' for comics with empty tags ( #5120 )
2024-01-27 16:24:03 +01:00
Mike Fährmann
1f7101d606
[archivedmoe] fix thebarchive webm URLs ( #5116 )
2024-01-27 00:24:41 +01:00
Mike Fährmann
34a4ddc399
[sankaku] add 'id-format' option ( #5073 )
2024-01-26 17:56:08 +01:00
Mike Fährmann
afd20ef42c
[kemonoparty] implement filtering duplicate revisions ( #5013 )
...
set 'revisions' to '"unique"' to have it ignore duplicate revisions
2024-01-26 14:44:15 +01:00
Mike Fährmann
c28475d325
[kemonoparty] fix deleting 'name' in original objects ( #5103 )
...
... when computing 'revision_hash'
regression caused by 3d68eda4
dict.copy() only creates a shallow copy
I know that and still managed to get it wrong ...
2024-01-25 23:46:19 +01:00
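
The shallow-copy pitfall named in this commit can be reproduced in a few lines. This is an illustrative sketch, not the extractor's code: the post layout and the function names `strip_name_shallow` and `strip_name_deep` are assumptions.

```python
import copy


def strip_name_shallow(post):
    """Buggy variant: dict.copy() shares the nested 'file' dict."""
    rev = post.copy()
    del rev["file"]["name"]  # also removes 'name' from post["file"]
    return rev


def strip_name_deep(post):
    """Safe variant: deepcopy duplicates nested objects first."""
    rev = copy.deepcopy(post)
    del rev["file"]["name"]  # only the copy is modified
    return rev


# Hypothetical post object with one nested dict.
original = {"id": 1, "file": {"name": "a.png", "path": "/data/a.png"}}

strip_name_deep(original)
deep_kept_name = "name" in original["file"]

strip_name_shallow(original)
shallow_kept_name = "name" in original["file"]

print(deep_kept_name, shallow_kept_name)  # True False
```

Copying only the nested dict that gets modified would also work and avoids the cost of a full deep copy.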
Mike Fährmann
beacfa7436
[bunkr] update domain to 'bunkr.sk' ( #5114 )
2024-01-25 23:45:41 +01:00
Mike Fährmann
0d3af0d35b
[tests] ignore 'ytdl' categories when import fails ( #5095 )
2024-01-21 15:31:12 +01:00
Mike Fährmann
f3ad91b44f
[bunkr] update domain ( #5088 )
2024-01-21 03:00:57 +01:00
Mike Fährmann
c7a42880ab
[wikimedia] support fandom wikis ( #1443 , #2677 , #3378 )
...
Wikis hosted on fandom.com are just MediaWiki instances
and support its API.
2024-01-21 00:52:02 +01:00
blankie
df718887c2
[webtoons] fix extracting comic and episode name with commas
2024-01-21 09:50:27 +11:00
Mike Fährmann
0d367ce1b9
[tests] update extractor results
2024-01-20 18:02:36 +01:00
Mike Fährmann
9ca6117c67
[hbrowse] remove module
...
website gone
2024-01-20 02:53:44 +01:00
Mike Fährmann
375eefb886
[chevereto] remove 'pixl.li'
...
"Pixl is closing down"
"All images will be deleted January 1st."
2024-01-20 02:21:40 +01:00
Mike Fährmann
b0a441f1e3
[nitter] remove 'nitter.lacontrevoie.fr'
...
"Fermeture de Nitter / Closing down Nitter"
2024-01-19 19:34:16 +01:00
Mike Fährmann
a1c1e80f67
[giantessbooru] update domain
2024-01-19 14:21:56 +01:00
Mike Fährmann
2007cb2f59
[tests] check extractor category values
2024-01-19 14:21:09 +01:00
Mike Fährmann
93b4120e77
[gelbooru] support 'all' and empty tag ( #5076 )
2024-01-18 21:49:33 +01:00
Mike Fährmann
a416d4c3d5
[sankaku] support post URLs with alphanumeric IDs ( #5073 )
2024-01-18 16:23:14 +01:00
Mike Fährmann
ea553a1d55
[wikimedia] generalize ( #1443 )
...
- support mediawiki.org
- support mariowiki.com (#3660 )
- combine code into a single extractor
(use prefix as subcategory)
- handle non-wiki instances
- unescape titles
2024-01-18 15:36:16 +01:00
Mike Fährmann
c3c1635ef3
[wikimedia] update
...
- rewrite using BaseExtractor
- support most Wiki* domains
- update docs/supportedsites
- add tests
2024-01-17 00:08:06 +01:00
Mike Fährmann
3d68eda4ab
[kemonoparty] add 'revision_hash' metadata ( #4706 , #4727 , #5013 )
...
A SHA1 hexdigest of other relevant metadata fields like
title, content, file and attachment URLs.
This value does NOT reflect which revisions are listed on the website.
Neither does 'edited' or any other metadata field (combinations).
2024-01-16 00:38:10 +01:00
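
A rough sketch of the idea, assuming the hash is taken over a stable serialization of a few content fields. The field selection, the JSON serialization, and the function name `revision_hash` are assumptions for illustration; the commit only states that the value is a SHA1 hexdigest of relevant metadata fields.

```python
import hashlib
import json


def revision_hash(post, fields=("title", "content", "file", "attachments")):
    """Return a SHA1 hexdigest over selected fields of a post (sketch).

    Fields outside 'fields', such as 'edited', do not affect the result.
    """
    subset = {key: post.get(key) for key in fields}
    # sort_keys gives a stable byte string regardless of dict ordering.
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# Hypothetical post data.
post = {
    "title": "example",
    "content": "text",
    "file": {"path": "/data/a.png"},
    "attachments": [{"path": "/data/b.png"}],
    "edited": "2024-01-15",
}
print(revision_hash(post))
```

Two revisions that differ only in fields outside the selection produce the same digest, which is the property that makes such a value usable for spotting duplicate revisions.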
Mike Fährmann
799a8206ad
merge #5061 : [webtoons] extract more metadata
...
- author_name
- comic_name
- episode_name
- username
2024-01-15 18:27:12 +01:00
Mike Fährmann
8ffa0cd3c8
[webtoons] small optimization
...
don't extract the entire 'author_area' and
avoid creating a second 'text.extract_from()' object
2024-01-15 18:24:47 +01:00
Mike Fährmann
68196589c4
[2ch] update
...
- simplify extractor code
- more metadata
- add tests
2024-01-15 04:09:05 +01:00
Mike Fährmann
69726fc82c
[tests] skip tests requiring auth when none is provided
2024-01-14 22:47:16 +01:00
blankie
bb446b1598
[webtoons] extract more metadata
2024-01-14 19:26:49 +11:00
Mike Fährmann
355b909f46
merge #5041 : [steamgriddb] add support ( #5033 )
2024-01-13 00:59:15 +01:00
Mike Fährmann
71e2c3e5a2
merge #5037 : [hatenablog] add support ( #5036 )
2024-01-13 00:57:21 +01:00
Mike Fährmann
b97af09e03
[tests] include URL in failure report
2024-01-12 03:23:21 +01:00
Mike Fährmann
58e0665fbc
[tests] load config from external file
2024-01-12 03:21:44 +01:00
Mike Fährmann
2dcfb012ea
[patreon] download 'm3u8' manifests with ytdl
2024-01-12 02:33:27 +01:00
Mike Fährmann
2191e29e14
[nijie] fix image URL for single image posts ( #5049 )
2024-01-11 05:07:38 +01:00
Mike Fährmann
39904c9e4e
[deviantart:avatar] add 'formats' option ( #4995 )
2024-01-10 17:13:34 +01:00
Mike Fährmann
887ade30a5
[batoto] support more mirror domains ( #5042 )
2024-01-09 18:02:49 +01:00
blankie
2ccb7d3bd3
[steamgriddb] add support
2024-01-09 17:12:56 +11:00
blankie
2cfe788f93
[hatenablog] fix extractor naming errors
2024-01-09 01:42:57 +11:00
blankie
61f3b2f820
[hatenablog] add support
2024-01-09 01:29:47 +11:00
Mike Fährmann
657ed93a22
[batoto] improve v2 manga URL pattern
...
and add tests
2024-01-07 22:23:30 +01:00
Mike Fährmann
33f228756a
[mangadex] add 'list' extractor ( #5025 )
...
supports listing manga and chapters from list feed
2024-01-07 02:59:35 +01:00
Mike Fährmann
c25bdbae91
[komikcast] fix 'manga' extractor ( #5027 )
2024-01-06 14:19:44 +01:00
Mike Fährmann
8e1a2b5446
[komikcast] update domain to 'komikcast.lol' ( #5027 )
2024-01-06 02:16:43 +01:00
Mike Fährmann
a441249ea2
merge #4979 : [batoto] add 'chapter' and 'manga' extractors ( #1434 , #2111 )
2024-01-06 01:53:26 +01:00
Mike Fährmann
b11c352d66
[bato] rename to 'batoto'
...
to use the same category name as the previous bato.to site
2024-01-06 01:49:34 +01:00
Mike Fährmann
3aa24c3744
[bato] simplify and update
2024-01-06 01:10:04 +01:00
Mike Fährmann
11150a7d72
[nudecollect] remove module
2024-01-05 21:32:04 +01:00
Mike Fährmann
c158927c38
merge #5016 : [zzup] add 'gallery' extractor ( #4517 , #4604 , #4659 , #4863 )
2024-01-05 21:25:46 +01:00
Mike Fährmann
217fa7f8a1
include 'test/results' in flake8 checks
2024-01-05 18:16:33 +01:00
Mike Fährmann
e61f016465
[szurubooru] support 'snootbooru.com' ( #5023 )
2024-01-05 17:56:39 +01:00
Mike Fährmann
b4bcf40278
[weibo] fix AttributeError in 'user' extractor ( #5022 )
...
yet another bug caused by a383eca7
2024-01-05 17:18:33 +01:00
Mike Fährmann
0ab0a10d2d
[jpgfish] update domain
2024-01-05 02:27:20 +01:00
enduser420
0f30136109
[zzup] add 'gallery' extractor
2024-01-04 21:38:59 +05:30
Mike Fährmann
7eaf648f2e
[fanbox] add 'metadata' option ( #4921 )
...
extracts 'plan' and extended 'user' metadata
2024-01-04 15:01:33 +01:00
Mike Fährmann
4f3671458e
[deviantart] add 'avatar' and 'background' extractors ( #4995 )
2024-01-03 00:07:55 +01:00
Mike Fährmann
63f649cd92
[idolcomplex] fix extraction & update URL patterns ( #5002 )
2024-01-01 17:38:32 +01:00
Mike Fährmann
7aa1c9671b
[tests] fix 'invalid escape sequence' warnings
2024-01-01 16:12:43 +01:00
Mike Fährmann
b6903a4c90
[nijie] add 'count' metadata field
...
https://github.com/mikf/gallery-dl/issues/146#issuecomment-1812849102
2023-12-30 22:25:59 +01:00
Mike Fährmann
b93b351db9
merge #4962 : [poringa] add support ( #4675 )
2023-12-30 20:39:35 +01:00
Mike Fährmann
9f21c839ad
[poringa] improvements and fixes
...
- add 'num' and 'count' metadata fields
- prevent crash for "private" posts
- prevent crash when there's no 'main-info'
- update tests
2023-12-30 20:37:09 +01:00
Mike Fährmann
caceb14fc2
[tests] fail when a results file contains syntax errors
...
or is otherwise not importable
2023-12-30 17:26:57 +01:00
Mike Fährmann
085411f3f1
[rule34] recognize URLs with 'www' subdomain ( #4984 )
2023-12-30 16:07:56 +01:00
Antonio
e348da7a06
[poringa] add support
2023-12-27 00:07:23 -06:00
bug-assassin
74c225f94e
[bato] add support
2023-12-26 22:33:33 -05:00