mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-22 10:42:34 +01:00
Commit Graph

1067 Commits

Author SHA1 Message Date
Mike Fährmann
db9833c28a
merge #5870: [aryion] add 'favorite' extractor (#4511) 2024-07-21 12:37:47 +02:00
Mike Fährmann
156a70bec0
[aryion] update favorite extractor
- add test case
- add docs/supportedsites entry
- add custom directory_fmt and archive_fmt
- remove constructor
- appease flake8
2024-07-21 12:34:06 +02:00
Mike Fährmann
727e53f513
[bunkr] support 'bunkr.fi' URLs (#5872) 2024-07-21 06:41:50 +02:00
Mike Fährmann
287a7d13cf
[sankaku] implement 'notes' extraction (#5865) 2024-07-18 20:44:49 +02:00
Mike Fährmann
026e0b97db
merge #5824: [furaffinity] add 'folders' and 'thumbnail' (#1284) 2024-07-18 01:41:15 +02:00
Mike Fährmann
60a2fefedd
[tests] restrict 'test_unique_pattern_matches' to internal extractors
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2231649983
2024-07-17 23:35:41 +02:00
Mike Fährmann
3fa74ca4d7
[tests] enable test results for external extractors (#5262)
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2231649983
2024-07-17 22:29:09 +02:00
Mike Fährmann
f7383a56f8
wrap filters/conditionals in a try-except block
allows accessing undefined fields without exception or locals().get(…)
but hides mistakes/typos/etc by evaluating to False without feedback

performance loss compared to the previous version without try-except
is negligible (~20ns for me)
2024-07-12 22:51:11 +02:00
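The behaviour described in f7383a56f8 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `build_filter` and the way the metadata dict is passed to `eval()` are assumptions made for the example.

```python
def build_filter(expression):
    """Compile a filter expression into a predicate over a metadata dict.

    Hypothetical helper for illustration only.
    """
    code = compile(expression, "<filter>", "eval")

    def predicate(metadata):
        try:
            # Field names are looked up in the metadata mapping.
            return bool(eval(code, {}, metadata))
        except Exception:
            # An undefined field raises NameError; it is treated as
            # "no match" instead of aborting, which also means a typo
            # in a field name silently evaluates to False.
            return False

    return predicate


is_large = build_filter("width > 1000")
print(is_large({"width": 1200}))                          # matches
print(build_filter("widht > 1000")({"width": 1200}))      # typo: no match, no error
```

The trade-off named in the commit message is visible in the last line: the misspelled field produces no exception and no feedback.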
Mike Fährmann
c83c812a1e
[instagram][twitter] rename 'profile' to 'info' (#5262, #3623) 2024-07-11 00:22:39 +02:00
Mike Fährmann
21831eba1e
[tests] completely ignore '#auth' for 'only_matching' tests 2024-07-11 00:20:10 +02:00
Mike Fährmann
1527ad79e2
[tests] fix syntax for Python < 3.6
no f-strings
2024-07-06 18:39:59 +02:00
Mike Fährmann
da9916c01f
[pp:metadata] implement format strings for 'directory' (#5728) 2024-07-06 03:08:59 +02:00
Mike Fährmann
8f3f061daf
[hentainexus] fix error for spread pages (#5827) 2024-07-05 21:36:29 +02:00
Nicholas Bishop
f43bccb5be
[furaffinity] Add 'thumbnail' (#1284) and 'folders' properties
Retrieve 'thumbnail' and 'folders' properties for each post.
'thumbnail' (#1284):
 - Preview image used for search results, writing posts, music, etc.
 - Filename format: <post_id>@600-<directory_containing_full_image>.jpg
'Folders' (related to #1817):
 - A list of all gallery folders containing this post
 - Folder name format: [<folder_category> - ]<folder_name>
 - Only works on new layout; old layout does not show folders, so list will be empty

A test is included for each property.
2024-07-04 15:41:14 -04:00
Mike Fährmann
d10bfa9065
[vipergirls] improve 'thread' URL pattern
allow for query parameters and fragments at the end of URLs
2024-06-29 17:38:57 +02:00
Mike Fährmann
44896b0296
[instagram] add 'profile' extractor (#5262)
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2188915210
2024-06-28 22:54:07 +02:00
Mike Fährmann
51fdfbe6fc
[erome] extract 'date' metadata (#5796) 2024-06-27 21:07:23 +02:00
Mike Fährmann
8f50c04af2
[formatter] implement 'X' format specifier (#5770) 2024-06-21 20:56:19 +02:00
enduser420
5adbfe526d
[tcbscans] support other domains 2024-06-21 21:29:11 +05:30
Mike Fährmann
f58b0e6fc7
[twitter] ignore 'Unavailable' media (#5736)
… including geo-restricted content.

add 'unavailable' option to allow re-enabling them again
2024-06-21 00:15:10 +02:00
Mike Fährmann
8452d04a33
[fanbox] handle KeyError for no longer existing plans (#5759)
return the plan of the next higher tier instead
2024-06-21 00:13:25 +02:00
Mike Fährmann
ae3706286a
[speakerdeck] inherit from GalleryExtractor 2024-06-15 21:56:51 +02:00
Mike Fährmann
9c65db2a92
consistent 'with open(…) as fp:' syntax 2024-06-14 01:22:00 +02:00
Mike Fährmann
c6fc0281e8
[newgrounds] extend 'format' option (#5709)
- check more extensions for original formats (mp4, webm, m4v, mov, mkv)
- allow specifying which extensions and recoded formats to check
2024-06-12 20:46:45 +02:00
Mike Fährmann
86f0c3baaf
[szurubooru] support empty tag searches (#5711) 2024-06-11 20:25:06 +02:00
Mike Fährmann
2e11b6e756
[nijie] support downloading videos (#5707, #5617) 2024-06-08 22:55:28 +02:00
Mike Fährmann
f160859c5c
[hitomi] extract 'title_jpn' metadata (#5706) 2024-06-08 00:05:19 +02:00
Mike Fährmann
9abeab5ecf
[shimmie2] support 'vidya.pics' (#5632) 2024-06-06 15:08:56 +02:00
Mike Fährmann
7614bc458e
[util] extend CustomNone with comparison operators 2024-06-05 16:49:30 +02:00
Mike Fährmann
1ce5de0290
[formatter] implement 'C' format specifier (#5647)
to apply a conversion after ':' or
to apply multiple conversions

for example {tags:CSl} or {tags:J - /Cl}
to convert list to string and lowercase it
2024-06-05 16:49:29 +02:00
Mike Fährmann
9b99d2c886
[philomena] support downloading SVG files (#5643) 2024-06-05 16:48:51 +02:00
Mike Fährmann
8fce9ea6d5
[hentainexus] restore module (#5275)
revert 97641cd151
2024-06-05 16:48:25 +02:00
Mike Fährmann
4d11cd9ffb
[vichan] remove 'wikieat.club'
redirects to some non-vichan site
2024-06-02 18:16:24 +02:00
Mike Fährmann
31133b97fb
[nitter] remove instances 2024-06-02 18:15:53 +02:00
Mike Fährmann
ce228ee163
[photobucket] remove module
had been broken for years and the new site is paid access only
2024-06-02 01:40:31 +02:00
Mike Fährmann
009aa90c3f
[tests] update extractor results
at least some of them
2024-06-01 20:28:04 +02:00
Mike Fährmann
020050ea8b
merge #5641: [pixeldrain] add support for single file album download 2024-05-25 23:43:43 +02:00
Mike Fährmann
154a890399
[pixeldrain] integrate into 'album' extractor 2024-05-25 23:42:23 +02:00
HornyQT
24e70b956b
[pixeldrain] add support for single file album download 2024-05-25 16:06:50 +02:00
Mike Fährmann
0761b22a7f
[hiperdex] update domain to 'hiperdex.top' (#5635) 2024-05-24 17:13:10 +02:00
Mike Fährmann
f651b3b6ab
merge #5601: [twitter] match '/video/' Tweet URLs 2024-05-17 22:49:12 +02:00
Mike Fährmann
7f1ed909d5
[imgur] match gallery/album/image URLs with title slugs (#5593) 2024-05-17 22:44:37 +02:00
Delphox
8ba73e2ec9
[twitter] match /video/ tweet urls 2024-05-17 16:50:51 -03:00
Mike Fährmann
2ee9ffeed6
merge #5568: [furaffinity] match 'xfuraffinity' URLs 2024-05-09 19:20:12 +02:00
Delphox
11109d5bad
[furaffinity] match xfuraffinity.com 2024-05-08 12:15:47 -03:00
Mike Fährmann
699592498b
[tests] use random port number for local HTTP server
… and explicitly bind to 127.0.0.1 instead of all interfaces
2024-05-02 22:54:15 +02:00
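A minimal sketch of the approach named in 699592498b, using only the standard library. The function name `start_local_server` and the choice of `SimpleHTTPRequestHandler` are assumptions for illustration; the actual test code may differ.

```python
import http.server
import threading


def start_local_server(handler=http.server.SimpleHTTPRequestHandler):
    """Start an HTTP server on the loopback interface with an OS-chosen port."""
    # Port 0 asks the operating system for any free port, and
    # "127.0.0.1" binds to loopback only instead of all interfaces.
    server = http.server.HTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


server, thread = start_local_server()
host, port = server.server_address[:2]
print("serving on http://{}:{}/".format(host, port))
server.shutdown()
server.server_close()
```

Reading the port back from `server.server_address` avoids collisions with a hard-coded port that is already in use.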
Mike Fährmann
bd8e4797e5
[vsco] add 'avatar' extractor (#5341) 2024-05-02 18:12:19 +02:00
Mike Fährmann
d0cead105b
[formatter] allow dots etc in '…' literals (#5539)
don't parse fields starting with '

this disables the ability to directly apply […] to '…' literals,
but that's not really useful anyway and can still be done with _lit
2024-05-02 17:43:24 +02:00
Mike Fährmann
8ed70b3256
[tests] mark tests with missing auth as 'only_matching'
… instead of skipping them completely
2024-05-01 16:00:07 +02:00
Mike Fährmann
3cf5366143
[mastodon] add support for card images 2024-05-01 16:00:07 +02:00
Mike Fährmann
9b1995dda3
[mastodon] add 'favorite', 'list', and 'hashtag' extractors (#5529) 2024-05-01 15:59:34 +02:00
Mike Fährmann
6c57958806
merge #5511: [twitter] [furaffinity] match fixvx.com and fxfuraffinity/fxraffinity.net URLs 2024-04-25 22:00:19 +02:00
Delphox
1886721d82
update tests 2024-04-25 13:28:30 -03:00
Mike Fährmann
cd241bea0a
[downloader:http] add MIME type and signature for .m4v files (#5505) 2024-04-25 01:01:35 +02:00
Mike Fährmann
068ccfe0b3
[tests] allow filtering extractor result tests by URL or comment
python test_results.py twitter:+/i/web/
python test_results.py twitter:~twitpic
2024-04-19 23:02:55 +02:00
Mike Fährmann
c9d3b5e5d9
[pixiv] change 'sanity_level' debug message to a warning (#5180) 2024-04-19 16:41:31 +02:00
Mike Fährmann
257e9fb435
[gelbooru] improve pagination logic for meta tags (#5478)
similar to 494acabd38
2024-04-15 23:14:48 +02:00
Mike Fährmann
e02d2ff45d
[tapas] add 'creator' extractor (#5306) 2024-04-11 23:41:50 +02:00
Mike Fährmann
35d4a706ae
[pixiv:novel] add 'covers' option (#5373) 2024-04-11 22:27:49 +02:00
Mike Fährmann
b57051719f
[wikimedia] support wiki.gg wikis 2024-04-09 19:24:01 +02:00
Mike Fährmann
40c1a8e471
[wikimedia] fix exception for files with empty 'metadata' 2024-04-09 19:12:15 +02:00
Mike Fährmann
0e730ba980
[pp:mtime] do not overwrite '_mtime' for None values (#5439) 2024-04-07 02:33:19 +02:00
Mike Fährmann
647a87d17c
[twitter] match '/photo/' Tweet URLs (#5443)
fixes regression introduced in 40c05535
2024-04-06 17:56:21 +02:00
Mike Fährmann
40bd145637
remove 'contextlib' imports 2024-04-06 16:59:09 +02:00
Mike Fährmann
9a8403917a
restore LD_LIBRARY_PATH for PyInstaller builds (#5421) 2024-04-06 16:58:33 +02:00
Mike Fährmann
095e5ded6f
[reddit] support comment embeds (#5366) 2024-04-01 23:35:42 +02:00
Mike Fährmann
64948f2c09
[foolfuuka] improve 'board' pattern & support pages (#5408) 2024-04-01 22:31:25 +02:00
Mike Fährmann
ef0c90414c
[wikimedia] suppress exception for entries without 'imageinfo' (#5384) 2024-03-26 15:33:26 +01:00
Mike Fährmann
9cce461627
[kemonoparty] add 'announcements' option (#5262)
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2015919188
2024-03-26 15:20:14 +01:00
Mike Fährmann
72ac2c750d
[kemonoparty:favorite] support 'sort' and 'order' query params (#5375) 2024-03-26 02:27:36 +01:00
Mike Fährmann
d1d017ab5d
merge #5372: [twitter] match Tweet URLs with query parameters (#5371)
fixes regression introduced in 40c05535
2024-03-25 22:01:46 +01:00
fireattack
423599ce95
[twitter] fix pattern for single tweet (#5371)
- Add optional slash
- Update tests to include some non-standard tweet URLs
2024-03-25 21:57:35 +01:00
Mike Fährmann
15a4bc2584
[kemonoparty] fix KeyError for empty files (#5368) 2024-03-24 02:21:38 +01:00
Mike Fährmann
31e7ca73b6
[gelbooru] add 'order-posts' option for favorites (#5220) 2024-03-23 13:30:09 +01:00
Mike Fährmann
55e8fdad29
[tests] use 'datetime.timezone.utc' instead of 'datetime.UTC'
'datetime.UTC' was added in Python 3.11
and is not defined in older versions.
2024-03-22 18:16:24 +01:00
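The portability point in 55e8fdad29 can be shown in a few lines. The helper name `utc_now` is hypothetical and exists only for this example.

```python
import datetime


def utc_now():
    """Return the current time as a timezone-aware UTC datetime."""
    # datetime.timezone.utc is available in every Python 3 release,
    # while the datetime.UTC alias is only defined in Python 3.11+.
    return datetime.datetime.now(datetime.timezone.utc)


print(utc_now().isoformat())
```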
Mike Fährmann
4b6f47e571
[pornhub:gif] extract 'viewkey' and 'timestamp' metadata (#4463)
https://github.com/mikf/gallery-dl/issues/4463#issuecomment-2014550302
2024-03-22 18:00:20 +01:00
Mike Fährmann
7a7dc442a0
[tests] update extractor results 2024-03-22 17:57:04 +01:00
Mike Fährmann
1d6260f456
[bunkr] remove 'description' metadata
album descriptions are no longer available on album pages
and the previous code erroneously returned just '0'
2024-03-22 02:14:41 +01:00
Mike Fährmann
32262a048b
[idolcomplex] fix metadata extraction
- replace legacy 'id' values with alphanumeric ones, since the former are
  no longer available
- approximate 'vote_average', since the real value is no longer
  available
- fix 'vote_count'
2024-03-22 01:43:05 +01:00
Mike Fährmann
ddb2edfd32
[formatter] fix local DST datetime offsets for ':O'
'O' would get the *current* local UTC offset and apply it to all
'datetime' objects it gets applied to.
This would result in a wrong offset if the current offset includes
DST and the target 'datetime' does not or vice-versa.

'O' now determines the correct local UTC offset while respecting DST for
each individual 'datetime'.
2024-03-21 20:45:46 +01:00
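The fix described in ddb2edfd32 amounts to looking up the local UTC offset per 'datetime' rather than once. The sketch below shows one way to do that with the standard library; the function names are hypothetical, it assumes naive input values that represent UTC, and it is not necessarily how the formatter implements ':O'.

```python
import datetime


def local_utc_offset(dt):
    """Return the local UTC offset in effect at the instant 'dt'.

    'dt' is assumed to be a naive datetime representing UTC.
    """
    aware = dt.replace(tzinfo=datetime.timezone.utc)
    # astimezone() without arguments converts to the system's local
    # timezone using the rules (including DST) valid at that instant.
    return aware.astimezone().utcoffset()


def to_local(dt):
    """Shift a naive UTC datetime by its own local offset."""
    return dt + local_utc_offset(dt)


winter = datetime.datetime(2024, 1, 15, 12, 0, 0)
summer = datetime.datetime(2024, 7, 15, 12, 0, 0)
# In a timezone that observes DST these two offsets differ.
print(local_utc_offset(winter), local_utc_offset(summer))
```

Applying the current offset to both values would give the wrong result for whichever of the two falls outside the current DST period.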
Mike Fährmann
da6ba60331
[bluesky] add 'instance' metadata field (#4438) 2024-03-18 17:36:16 +01:00
Mike Fährmann
718c870430
[tests] show full path for nested values
'user.name' instead of just 'name' when testing for
"user": { … , "name": "…", … }
2024-03-18 17:36:16 +01:00
Mike Fährmann
26bc2d55f4
[hiperdex] update URL patterns & fix 'manga' metadata (#5340) 2024-03-18 17:36:16 +01:00
Mike Fährmann
8e694d85c4
[twitter] add 'birdwatch' metadata field (#5317)
should probably get a better name,
but this is what it's called internally by Twitter
2024-03-18 17:36:02 +01:00
Mike Fährmann
b8e7be225c
merge #5333: [imagefap] fix folder extractor 2024-03-15 23:46:43 +01:00
Herp
99c53f7fa8
Fix imagefap extractor 2024-03-15 23:44:25 +01:00
Mike Fährmann
1418c0ce38
[kemonoparty] add 'revision_count' metadata field (#5334) 2024-03-15 22:28:15 +01:00
Mike Fährmann
5716430c35
[deviantart:stash] recognize 'deviantart.com/stash/…' URLs 2024-03-15 18:14:55 +01:00
Mike Fährmann
76683c5f5c
[deviantart:stash] fix 'index' metadata (#5335) 2024-03-15 18:10:59 +01:00
Mike Fährmann
108abab537
[twitter] add 'protected' metadata field (#5327)
for 'author' and 'user'
2024-03-13 14:46:03 +01:00
blankie
225d849139
[mastodon] fix handling null 'moved' account field 2024-03-12 11:44:25 +11:00
Mike Fährmann
ac4e29f70a
[lensdump] support more direct link formats (#5293) 2024-03-09 23:33:58 +01:00
Mike Fährmann
d3003f8531
merge #5270: [imagefap] add 'folder' metadata 2024-03-07 01:31:40 +01:00
Mike Fährmann
05331f9cf1
[imagefap] flake8, cleanup, tests 2024-03-07 01:29:19 +01:00
Mike Fährmann
40c0553523
[twitter] add 'quotes' extractor (#5262)
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-1981571924

It's implemented as a search for 'quoted_tweet_id:…' on Twitter.
2024-03-07 00:52:50 +01:00
Mike Fährmann
790c0ffb8d
[lensdump] recognize direct image links (#5293) 2024-03-06 22:56:57 +01:00
Mike Fährmann
6d9e3c0eb1
[skeb] add extractor for followed users (#5290)
needs 'Authorization' header from browser session
-o headers.Authorization="Bearer ey…"
2024-03-06 22:43:01 +01:00
Mike Fährmann
ace16f00f5
[weibo] fix retweets (#2825, #3874, #5263)
- handle 快转 retweets
- disable 'retweets' by default
- skip all retweet media when 'retweets' are disabled
- extract all retweet media when 'retweets' is set to "original"
2024-03-06 19:36:53 +01:00
Mike Fährmann
0676a9d6ec
[weibo] fix 'livephoto' filename extensions (#5287) 2024-03-06 19:36:32 +01:00
Mike Fährmann
a8027745e3
[downloader:http] add MIME type and signature for .mov files (#5287) 2024-03-06 14:00:24 +01:00
Mike Fährmann
24873c2724
[warosu] fix crash for threads with deleted posts (#5289) 2024-03-06 01:27:45 +01:00
Mike Fährmann
f296067797
[naver] unescape post 'title' and 'description' 2024-03-06 00:46:19 +01:00
Mike Fährmann
a71cdab53e
merge #5126: [naver] fix EUC-KR encoding issue in old image URLs 2024-03-06 00:22:33 +01:00
Mike Fährmann
a8d3efbb99
[naver] simplify code + add test 2024-03-06 00:21:23 +01:00
Mike Fährmann
7b28418f69
[naver] recognize '.naver' URLs
https://blog.naver.com/PostView.naver?…
2024-03-05 22:30:29 +01:00
Mike Fährmann
a767832332
[deviantart:avatar] ignore default avatars (#5276) 2024-03-04 23:11:30 +01:00
Mike Fährmann
6482bbc525
[bluesky] handle different 'embed' structure 2024-03-03 20:41:01 +01:00
Mike Fährmann
1a9b9aa310
[artstation] support video clips (#2566, #3309, #3911)
- add 'videos' and 'previews' options
- fix 403 errors for video previews
2024-03-03 18:00:45 +01:00
Mike Fährmann
25d2854272
[deviantart] add 'comments-avatars' option (#4995) 2024-03-02 21:59:16 +01:00
Mike Fährmann
cf9e99c07b
[artstation] support collections (#146)
https://github.com/mikf/gallery-dl/issues/146#issuecomment-1972101003
2024-03-01 20:21:21 +01:00
Mike Fährmann
32ec695195
merge #5256: [wikimedia] add azurlane.koumakan.jp 2024-02-29 21:50:24 +01:00
Mike Fährmann
5d9ec91896
[azurlanewiki] supportedsites + test 2024-02-29 21:49:13 +01:00
Mike Fährmann
76581c13f7
handle URLs without '/' after their TLD (#5252) 2024-02-29 15:05:46 +01:00
Mike Fährmann
ba062712ad
[tests] '__main__' -> "__main__" 2024-02-27 02:10:05 +01:00
Mike Fährmann
8a11b72253
remove extractor/test.py (#4504) 2024-02-27 01:37:57 +01:00
Mike Fährmann
fde9e25c9f
[tests:kemonoparty] '.party' -> '.su' 2024-02-26 22:25:04 +01:00
Mike Fährmann
13443f40a3
[xvideos] support '/channels/' URLs (#5244) 2024-02-26 00:08:37 +01:00
Mike Fährmann
cc6b9e4c18
[zerochan] use API by default (#3669)
add 'pagination' option
2024-02-25 00:36:14 +01:00
blankie
962f55cc68
[artstation] fix handling usernames with dashes 2024-02-21 17:39:37 +11:00
Mike Fährmann
741fd00cec
[deviantart] extend 'metadata' option (#5175)
allow fetching extended metadata in addition to the usual
'description', 'tags', etc by setting 'metadata' to a list of
'camera', 'stats', 'submission', 'collection', and 'gallery'

for example "metadata": "stats,submission"
2024-02-18 23:14:14 +01:00
Mike Fährmann
8a63801311
[vsco] add 'spaces' extractor (#5202)
for spaces listed on a user page
2024-02-17 18:20:48 +01:00
Mike Fährmann
ccb413df71
[wikimedia] support 'pidgi.net' and 'bulbapedia.bulbagarden.net' (#5205, #5206) 2024-02-17 17:35:10 +01:00
Mike Fährmann
7033cc14e9
[vsco] add 'space' extractor (#5202) 2024-02-17 01:54:05 +01:00
Mike Fährmann
c9efccc959
[tests] update extractor results 2024-02-16 22:42:06 +01:00
Mike Fährmann
c413834dfc
[bluesky] extend tests 2024-02-16 16:30:02 +01:00
Mike Fährmann
24c1317e0d
[batoto] fix crash when manga/chapter contains a '-' (#5200) 2024-02-16 00:10:08 +01:00
Mike Fährmann
0abd9723af
[bluesky] add 'metadata' option (#4438)
allow extracting 'user' metadata and
make 'facets' extraction optional
2024-02-15 23:30:16 +01:00
Mike Fährmann
c97b92cc35
[fanbox] add 'home' and 'supporting' extractors (#5138) 2024-02-14 23:25:39 +01:00
Mike Fährmann
04e4ffc64c
[deviantart] combine 'png' option with 'quality' (#4846)
"quality": "png" to download PNGs instead of JPEGs
2024-02-14 22:07:29 +01:00
Mike Fährmann
9cc4ec2c58
[deviantart] add 'png' option (#4846) 2024-02-14 01:03:15 +01:00
Mike Fährmann
1d1ffe3317
[pornpics] update 'channel' extraction & add test
change 'channel' to a list, since extracting both 'channel' and
'channels' does not really work with text.extract_from()
2024-02-13 23:48:46 +01:00
Mike Fährmann
139ff3f6ab
[kemonoparty] add 'posts' extractor (#5194) 2024-02-13 15:41:34 +01:00
Mike Fährmann
814ad9321e
[deviantart] skip locked/blurred posts (#4567, #5193) 2024-02-13 14:15:12 +01:00
Mike Fährmann
f7f8ef8684
[twitter] support communities (#4913) 2024-02-13 01:30:23 +01:00
Mike Fährmann
8f27f43d4d
[tests] implement explicitly disabling auth 2024-02-13 00:08:27 +01:00
Mike Fährmann
06cb518d97
[bunkr] fix extraction (#5088, #5151, #5153)
- remove legacy code
- map legacy domains to bunkr.sk
- use input URL domain for newer domains
- update tests (some files got slightly modified or deleted)
2024-02-11 22:36:03 +01:00
Mike Fährmann
4641937ca3
[imagetwist] add 'gallery' extractor (#5190) 2024-02-11 18:41:02 +01:00
Mike Fährmann
fde82ab0ce
[imagechest] add 'user' extractor (#5143) 2024-02-11 18:38:33 +01:00
Mike Fährmann
4474cea31b
merge #5187: [skeb] add 'num' and 'count' metadata fields 2024-02-10 19:36:59 +01:00
Mike Fährmann
4cfceb23cb
[skeb] rename 'data' -> 'file' & add tests 2024-02-10 19:35:50 +01:00
Mike Fährmann
c83d0a1596
[weibo] add 'gifs' option (#5183) 2024-02-10 18:17:07 +01:00
Mike Fährmann
af61d2b037
[wikimedia] combine most wikimedia.org sites (#1443)
add wikidata.org and wikivoyage.org
2024-02-10 03:00:58 +01:00
Mike Fährmann
c7d17f1111
[bluesky] extract 'hashtags', 'mentions', and 'uris' metadata (#4438) 2024-02-10 00:01:55 +01:00
Mike Fährmann
55bbd49a0e
[bluesky] download images in original resolution (#4438)
at least up to 2000 px
2024-02-09 21:33:33 +01:00
Mike Fährmann
aee5580c62
[idolcomplex] extract 'id_alnum' metadata (#5171) 2024-02-08 18:29:54 +01:00
Mike Fährmann
cf7d6be2d4
[bluesky] initial support (#4438, #4708, #4722, #5047) 2024-02-07 19:09:33 +01:00
Mike Fährmann
6ef143ea31
[idolcomplex] support alphanumeric post IDs (#5171) 2024-02-07 14:57:13 +01:00
Mike Fährmann
d7823b9f81
[pinterest] fix section URLs for boards with /?# in name (#5104) 2024-02-05 15:54:06 +01:00
Mike Fährmann
de752eb7b1
[naverwebtoon] support '/webtoon/' paths for all comics (#5123) 2024-02-04 21:38:46 +01:00
Mike Fährmann
22647c2626
[naverwebtoon] fix 'title' for comics with empty tags (#5120) 2024-01-27 16:24:03 +01:00
Mike Fährmann
1f7101d606
[archivedmoe] fix thebarchive webm URLs (#5116) 2024-01-27 00:24:41 +01:00
Mike Fährmann
34a4ddc399
[sankaku] add 'id-format' option (#5073) 2024-01-26 17:56:08 +01:00
Mike Fährmann
afd20ef42c
[kemonoparty] implement filtering duplicate revisions (#5013)
set 'revisions' to '"unique"' to have it ignore duplicate revisions
2024-01-26 14:44:15 +01:00
Mike Fährmann
c28475d325
[kemonoparty] fix deleting 'name' in original objects (#5103)
... when computing 'revision_hash'

regression caused by 3d68eda4

dict.copy() only creates a shallow copy
I know that and still managed to get it wrong ...
2024-01-25 23:46:19 +01:00
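The pitfall named in c28475d325 can be reproduced with plain dictionaries. The data below is made up for illustration and is not taken from the extractor.

```python
import copy

# Hypothetical post object with a nested dict.
original = {"title": "example", "file": {"name": "a.png", "path": "/data/a.png"}}

# dict.copy() duplicates only the outer dict; the nested 'file' dict is shared.
shallow = original.copy()
shallow["file"].pop("name")
shallow_affected = "name" not in original["file"]

# Restore the field, then repeat with a deep copy, which duplicates nested objects.
original["file"]["name"] = "a.png"
deep = copy.deepcopy(original)
deep["file"].pop("name")
deep_affected = "name" not in original["file"]

print(shallow_affected, deep_affected)
```

Deleting a key through the shallow copy changes the original object, while the deep copy leaves it intact.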
Mike Fährmann
beacfa7436
[bunkr] update domain to 'bunkr.sk' (#5114) 2024-01-25 23:45:41 +01:00
Mike Fährmann
0d3af0d35b
[tests] ignore 'ytdl' categories when import fails (#5095) 2024-01-21 15:31:12 +01:00
Mike Fährmann
f3ad91b44f
[bunkr] update domain (#5088) 2024-01-21 03:00:57 +01:00
Mike Fährmann
c7a42880ab
[wikimedia] support fandom wikis (#1443, #2677, #3378)
Wikis hosted on fandom.com are just wikimedia instances
and support its API.
2024-01-21 00:52:02 +01:00
blankie
df718887c2
[webtoons] fix extracting comic and episode name with commas 2024-01-21 09:50:27 +11:00
Mike Fährmann
0d367ce1b9
[tests] update extractor results 2024-01-20 18:02:36 +01:00
Mike Fährmann
9ca6117c67
[hbrowse] remove module
website gone
2024-01-20 02:53:44 +01:00
Mike Fährmann
375eefb886
[chevereto] remove 'pixl.li'
"Pixl is closing down"
"All images will be deleted January 1st."
2024-01-20 02:21:40 +01:00
Mike Fährmann
b0a441f1e3
[nitter] remove 'nitter.lacontrevoie.fr'
"Fermeture de Nitter / Closing down Nitter"
2024-01-19 19:34:16 +01:00
Mike Fährmann
a1c1e80f67
[giantessbooru] update domain 2024-01-19 14:21:56 +01:00
Mike Fährmann
2007cb2f59
[tests] check extractor category values 2024-01-19 14:21:09 +01:00
Mike Fährmann
93b4120e77
[gelbooru] support 'all' and empty tag (#5076) 2024-01-18 21:49:33 +01:00
Mike Fährmann
a416d4c3d5
[sankaku] support post URLs with alphanumeric IDs (#5073) 2024-01-18 16:23:14 +01:00
Mike Fährmann
ea553a1d55
[wikimedia] generalize (#1443)
- support mediawiki.org
- support mariowiki.com (#3660)

- combine code into a single extractor
  (use prefix as subcategory)
- handle non-wiki instances
- unescape titles
2024-01-18 15:36:16 +01:00
Mike Fährmann
c3c1635ef3
[wikimedia] update
- rewrite using BaseExtractor
- support most Wiki* domains
- update docs/supportedsites
- add tests
2024-01-17 00:08:06 +01:00
Mike Fährmann
3d68eda4ab
[kemonoparty] add 'revision_hash' metadata (#4706, #4727, #5013)
A SHA1 hexdigest of other relevant metadata fields like
title, content, file and attachment URLs.

This value does NOT reflect which revisions are listed on the website.
Neither does 'edited' or any other metadata field (combinations).
2024-01-16 00:38:10 +01:00
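A rough sketch of the idea behind 3d68eda4ab: hash only the fields that define a revision's content. The function name, the exact field list, and the JSON serialization are assumptions for this example; the extractor's real field selection and encoding may differ.

```python
import hashlib
import json


def revision_hash(post, fields=("title", "content", "file", "attachments")):
    """Return a SHA1 hexdigest over selected metadata fields (hypothetical helper)."""
    relevant = {key: post.get(key) for key in fields}
    # sort_keys gives a stable serialization, so equal content
    # always produces the same digest.
    serialized = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


a = {"title": "t", "content": "c", "file": {"path": "/f"}, "edited": "2024-01-01"}
b = {"title": "t", "content": "c", "file": {"path": "/f"}, "edited": "2024-01-02"}
print(revision_hash(a) == revision_hash(b))  # 'edited' is not part of the hash
```

Two revisions that differ only in fields outside the hashed set get the same value, which is what makes the hash usable for spotting duplicates.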
Mike Fährmann
799a8206ad
merge #5061: [webtoons] extract more metadata
- author_name
- comic_name
- episode_name
- username
2024-01-15 18:27:12 +01:00
Mike Fährmann
8ffa0cd3c8
[webtoons] small optimization
don't extract the entire 'author_area' and
avoid creating a second 'text.extract_from()' object
2024-01-15 18:24:47 +01:00
Mike Fährmann
68196589c4
[2ch] update
- simplify extractor code
- more metadata
- add tests
2024-01-15 04:09:05 +01:00
Mike Fährmann
69726fc82c
[tests] skip tests requiring auth when none is provided 2024-01-14 22:47:16 +01:00
blankie
bb446b1598
[webtoons] extract more metadata 2024-01-14 19:26:49 +11:00
Mike Fährmann
355b909f46
merge #5041: [steamgriddb] add support (#5033) 2024-01-13 00:59:15 +01:00
Mike Fährmann
71e2c3e5a2
merge #5037: [hatenablog] add support (#5036) 2024-01-13 00:57:21 +01:00
Mike Fährmann
b97af09e03
[tests] include URL in failure report 2024-01-12 03:23:21 +01:00
Mike Fährmann
58e0665fbc
[tests] load config from external file 2024-01-12 03:21:44 +01:00
Mike Fährmann
2dcfb012ea
[patreon] download 'm3u8' manifests with ytdl 2024-01-12 02:33:27 +01:00
Mike Fährmann
2191e29e14
[nijie] fix image URL for single image posts (#5049) 2024-01-11 05:07:38 +01:00
Mike Fährmann
39904c9e4e
[deviantart:avatar] add 'formats' option (#4995) 2024-01-10 17:13:34 +01:00
Mike Fährmann
887ade30a5
[batoto] support more mirror domains (#5042) 2024-01-09 18:02:49 +01:00
blankie
2ccb7d3bd3
[steamgriddb] add support 2024-01-09 17:12:56 +11:00
blankie
2cfe788f93
[hatenablog] fix extractor naming errors 2024-01-09 01:42:57 +11:00
blankie
61f3b2f820
[hatenablog] add support 2024-01-09 01:29:47 +11:00
Mike Fährmann
657ed93a22
[batoto] improve v2 manga URL pattern
and add tests
2024-01-07 22:23:30 +01:00
Mike Fährmann
33f228756a
[mangadex] add 'list' extractor (#5025)
supports listing manga and chapters from list feed
2024-01-07 02:59:35 +01:00
Mike Fährmann
c25bdbae91
[komikcast] fix 'manga' extractor (#5027) 2024-01-06 14:19:44 +01:00
Mike Fährmann
8e1a2b5446
[komikcast] update domain to 'komikcast.lol' (#5027) 2024-01-06 02:16:43 +01:00
Mike Fährmann
a441249ea2
merge #4979: [batoto] add 'chapter' and 'manga' extractors (#1434, #2111) 2024-01-06 01:53:26 +01:00
Mike Fährmann
b11c352d66
[bato] rename to 'batoto'
to use the same category name as the previous bato.to site
2024-01-06 01:49:34 +01:00
Mike Fährmann
3aa24c3744
[bato] simplify and update 2024-01-06 01:10:04 +01:00
Mike Fährmann
11150a7d72
[nudecollect] remove module 2024-01-05 21:32:04 +01:00
Mike Fährmann
c158927c38
merge #5016: [zzup] add 'gallery' extractor (#4517, #4604, #4659, #4863) 2024-01-05 21:25:46 +01:00
Mike Fährmann
217fa7f8a1
include 'test/results' in flake8 checks 2024-01-05 18:16:33 +01:00
Mike Fährmann
e61f016465
[szurubooru] support 'snootbooru.com' (#5023) 2024-01-05 17:56:39 +01:00
Mike Fährmann
b4bcf40278
[weibo] fix AttributeError in 'user' extractor (#5022)
yet another bug caused by a383eca7
2024-01-05 17:18:33 +01:00
Mike Fährmann
0ab0a10d2d
[jpgfish] update domain 2024-01-05 02:27:20 +01:00
enduser420
0f30136109
[zzup] add 'gallery' extractor 2024-01-04 21:38:59 +05:30