Mike Fährmann
e02d2ff45d
[tapas] add 'creator' extractor ( #5306 )
2024-04-11 23:41:50 +02:00
Mike Fährmann
35d4a706ae
[pixiv:novel] add 'covers' option ( #5373 )
2024-04-11 22:27:49 +02:00
Mike Fährmann
b57051719f
[wikimedia] support wiki.gg wikis
2024-04-09 19:24:01 +02:00
Mike Fährmann
40c1a8e471
[wikimedia] fix exception for files with empty 'metadata'
2024-04-09 19:12:15 +02:00
Mike Fährmann
0e730ba980
[pp:mtime] do not overwrite '_mtime' for None values ( #5439 )
2024-04-07 02:33:19 +02:00
Mike Fährmann
647a87d17c
[twitter] match '/photo/' Tweet URLs ( #5443 )
...
fixes regression introduced in 40c05535
2024-04-06 17:56:21 +02:00
Mike Fährmann
40bd145637
remove 'contextlib' imports
2024-04-06 16:59:09 +02:00
Mike Fährmann
9a8403917a
restore LD_LIBRARY_PATH for PyInstaller builds ( #5421 )
2024-04-06 16:58:33 +02:00
Mike Fährmann
095e5ded6f
[reddit] support comment embeds ( #5366 )
2024-04-01 23:35:42 +02:00
Mike Fährmann
64948f2c09
[foolfuuka] improve 'board' pattern & support pages ( #5408 )
2024-04-01 22:31:25 +02:00
Mike Fährmann
ef0c90414c
[wikimedia] suppress exception for entries without 'imageinfo' ( #5384 )
2024-03-26 15:33:26 +01:00
Mike Fährmann
9cce461627
[kemonoparty] add 'announcements' option ( #5262 )
...
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2015919188
2024-03-26 15:20:14 +01:00
Mike Fährmann
72ac2c750d
[kemonoparty:favorite] support 'sort' and 'order' query params ( #5375 )
2024-03-26 02:27:36 +01:00
Mike Fährmann
d1d017ab5d
merge #5372 : [twitter] match Tweet URLs with query parameters ( #5371 )
...
fixes regression introduced in 40c05535
2024-03-25 22:01:46 +01:00
fireattack
423599ce95
[twitter] fix pattern for single tweet ( #5371 )
...
- Add optional slash
- Update tests to include some non-standard tweet URLs
2024-03-25 21:57:35 +01:00
Mike Fährmann
15a4bc2584
[kemonoparty] fix KeyError for empty files ( #5368 )
2024-03-24 02:21:38 +01:00
Mike Fährmann
31e7ca73b6
[gelbooru] add 'order-posts' option for favorites ( #5220 )
2024-03-23 13:30:09 +01:00
Mike Fährmann
55e8fdad29
[tests] use 'datetime.timezone.utc' instead of 'datetime.UTC'
...
'datetime.UTC' was added in Python 3.11
and is not defined in older versions.
2024-03-22 18:16:24 +01:00
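Note on the commit above: a minimal sketch of the portable spelling, not code taken from the test suite. The names 'UTC' and 'utc_now' are hypothetical and only used for illustration.

```python
import datetime

# 'datetime.timezone.utc' has been available since Python 3.2,
# while the 'datetime.UTC' alias is only defined on Python 3.11+.
UTC = datetime.timezone.utc


def utc_now():
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.datetime.now(UTC)


print(utc_now().isoformat())
```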
Mike Fährmann
4b6f47e571
[pornhub:gif] extract 'viewkey' and 'timestamp' metadata ( #4463 )
...
https://github.com/mikf/gallery-dl/issues/4463#issuecomment-2014550302
2024-03-22 18:00:20 +01:00
Mike Fährmann
7a7dc442a0
[tests] update extractor results
2024-03-22 17:57:04 +01:00
Mike Fährmann
1d6260f456
[bunkr] remove 'description' metadata
...
album descriptions are no longer available on album pages
and the previous code erroneously returned just '0'
2024-03-22 02:14:41 +01:00
Mike Fährmann
32262a048b
[idolcomplex] fix metadata extraction
...
- replace legacy 'id' values with alphanumeric ones, since the former are
no longer available
- approximate 'vote_average', since the real value is no longer
available
- fix 'vote_count'
2024-03-22 01:43:05 +01:00
Mike Fährmann
ddb2edfd32
[formatter] fix local DST datetime offsets for ':O'
...
'O' would get the *current* local UTC offset and apply it to all
'datetime' objects it gets applied to.
This would result in a wrong offset if the current offset includes
DST and the target 'datetime' does not or vice-versa.
'O' now determines the correct local UTC offset while respecting DST for
each individual 'datetime'.
2024-03-21 20:45:46 +01:00
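Note on the commit above: a rough sketch of a per-datetime offset lookup, assuming naive 'datetime' objects that hold UTC values. The helper names 'local_utc_offset' and 'to_local' are hypothetical and this is not necessarily how the formatter implements ':O'.

```python
import calendar
import datetime
import time


def local_utc_offset(dt):
    """Return the local UTC offset that applies at the naive UTC datetime 'dt'."""
    timestamp = calendar.timegm(dt.timetuple())
    # tm_gmtoff reflects DST as of 'timestamp', not as of "now"
    return datetime.timedelta(seconds=time.localtime(timestamp).tm_gmtoff)


def to_local(dt):
    """Shift a naive UTC datetime to local time using its own offset."""
    return dt + local_utc_offset(dt)


winter = datetime.datetime(2024, 1, 15, 12, 0)
summer = datetime.datetime(2024, 7, 15, 12, 0)
# In a time zone that observes DST these two offsets differ by one hour.
print(local_utc_offset(winter), local_utc_offset(summer))
```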
Mike Fährmann
da6ba60331
[bluesky] add 'instance' metadata field ( #4438 )
2024-03-18 17:36:16 +01:00
Mike Fährmann
718c870430
[tests] show full path for nested values
...
'user.name' instead of just 'name' when testing for
"user": { … , "name": "…", … }
2024-03-18 17:36:16 +01:00
Mike Fährmann
26bc2d55f4
[hiperdex] update URL patterns & fix 'manga' metadata ( #5340 )
2024-03-18 17:36:16 +01:00
Mike Fährmann
8e694d85c4
[twitter] add 'birdwatch' metadata field ( #5317 )
...
should probably get a better name,
but this is what it's called internally by Twitter
2024-03-18 17:36:02 +01:00
Mike Fährmann
b8e7be225c
merge #5333 : [imagefap] fix folder extractor
2024-03-15 23:46:43 +01:00
Herp
99c53f7fa8
Fix imagefap extractor
2024-03-15 23:44:25 +01:00
Mike Fährmann
1418c0ce38
[kemonoparty] add 'revision_count' metadata field ( #5334 )
2024-03-15 22:28:15 +01:00
Mike Fährmann
5716430c35
[deviantart:stash] recognize 'deviantart.com/stash/…' URLs
2024-03-15 18:14:55 +01:00
Mike Fährmann
76683c5f5c
[deviantart:stash] fix 'index' metadata ( #5335 )
2024-03-15 18:10:59 +01:00
Mike Fährmann
108abab537
[twitter] add 'protected' metadata field ( #5327 )
...
for 'author' and 'user'
2024-03-13 14:46:03 +01:00
blankie
225d849139
[mastodon] fix handling null 'moved' account field
2024-03-12 11:44:25 +11:00
Mike Fährmann
ac4e29f70a
[lensdump] support more direct link formats ( #5293 )
2024-03-09 23:33:58 +01:00
Mike Fährmann
d3003f8531
merge #5270 : [imagefap] add 'folder' metadata
2024-03-07 01:31:40 +01:00
Mike Fährmann
05331f9cf1
[imagefap] flake8, cleanup, tests
2024-03-07 01:29:19 +01:00
Mike Fährmann
40c0553523
[twitter] add 'quotes' extractor ( #5262 )
...
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-1981571924
It's implemented as a search for 'quoted_tweet_id:…' on Twitter.
2024-03-07 00:52:50 +01:00
Mike Fährmann
790c0ffb8d
[lensdump] recognize direct image links ( #5293 )
2024-03-06 22:56:57 +01:00
Mike Fährmann
6d9e3c0eb1
[skeb] add extractor for followed users ( #5290 )
...
needs 'Authorization' header from browser session
-o headers.Authorization="Bearer ey…"
2024-03-06 22:43:01 +01:00
Mike Fährmann
ace16f00f5
[weibo] fix retweets ( #2825 , #3874 , #5263 )
...
- handle 快转 retweets
- disable 'retweets' by default
- skip all retweet media when 'retweets' are disabled
- extract all retweet media when 'retweets' is set to "original"
2024-03-06 19:36:53 +01:00
Mike Fährmann
0676a9d6ec
[weibo] fix 'livephoto' filename extensions ( #5287 )
2024-03-06 19:36:32 +01:00
Mike Fährmann
a8027745e3
[downloader:http] add MIME type and signature for .mov files ( #5287 )
2024-03-06 14:00:24 +01:00
Mike Fährmann
24873c2724
[warosu] fix crash for threads with deleted posts ( #5289 )
2024-03-06 01:27:45 +01:00
Mike Fährmann
f296067797
[naver] unescape post 'title' and 'description'
2024-03-06 00:46:19 +01:00
Mike Fährmann
a71cdab53e
merge #5126 : [naver] fix EUC-KR encoding issue in old image URLs
2024-03-06 00:22:33 +01:00
Mike Fährmann
a8d3efbb99
[naver] simplify code + add test
2024-03-06 00:21:23 +01:00
Mike Fährmann
7b28418f69
[naver] recognize '.naver' URLs
...
https://blog.naver.com/PostView.naver?…
2024-03-05 22:30:29 +01:00
Mike Fährmann
a767832332
[deviantart:avatar] ignore default avatars ( #5276 )
2024-03-04 23:11:30 +01:00
Mike Fährmann
6482bbc525
[bluesky] handle different 'embed' structure
2024-03-03 20:41:01 +01:00
Mike Fährmann
1a9b9aa310
[artstation] support video clips ( #2566 , #3309 , #3911 )
...
- add 'videos' and 'previews' options
- fix 403 errors for video previews
2024-03-03 18:00:45 +01:00
Mike Fährmann
25d2854272
[deviantart] add 'comments-avatars' option ( #4995 )
2024-03-02 21:59:16 +01:00
Mike Fährmann
cf9e99c07b
[artstation] support collections ( #146 )
...
https://github.com/mikf/gallery-dl/issues/146#issuecomment-1972101003
2024-03-01 20:21:21 +01:00
Mike Fährmann
32ec695195
merge #5256 : [wikimedia] add azurlane.koumakan.jp
2024-02-29 21:50:24 +01:00
Mike Fährmann
5d9ec91896
[azurlanewiki] supportedsites + test
2024-02-29 21:49:13 +01:00
Mike Fährmann
76581c13f7
handle URLs without '/' after their TLD ( #5252 )
2024-02-29 15:05:46 +01:00
Mike Fährmann
ba062712ad
[tests] '__main__' -> "__main__"
2024-02-27 02:10:05 +01:00
Mike Fährmann
8a11b72253
remove extractor/test.py ( #4504 )
2024-02-27 01:37:57 +01:00
Mike Fährmann
fde9e25c9f
[tests:kemonoparty] '.party' -> '.su'
2024-02-26 22:25:04 +01:00
Mike Fährmann
13443f40a3
[xvideos] support '/channels/' URLs ( #5244 )
2024-02-26 00:08:37 +01:00
Mike Fährmann
cc6b9e4c18
[zerochan] use API by default ( #3669 )
...
add 'pagination' option
2024-02-25 00:36:14 +01:00
blankie
962f55cc68
[artstation] fix handling usernames with dashes
2024-02-21 17:39:37 +11:00
Mike Fährmann
741fd00cec
[deviantart] extend 'metadata' option ( #5175 )
...
allow fetching extended metadata in addition to the usual
'description', 'tags', etc by setting 'metadata' to a list of
'camera', 'stats', 'submission', 'collection', and 'gallery'
for example "metadata": "stats,submission"
2024-02-18 23:14:14 +01:00
Mike Fährmann
8a63801311
[vsco] add 'spaces' extractor ( #5202 )
...
for spaces listed on a user page
2024-02-17 18:20:48 +01:00
Mike Fährmann
ccb413df71
[wikimedia] support 'pidgi.net' and 'bulbapedia.bulbagarden.net' ( #5205 , #5206 )
2024-02-17 17:35:10 +01:00
Mike Fährmann
7033cc14e9
[vsco] add 'space' extractor ( #5202 )
2024-02-17 01:54:05 +01:00
Mike Fährmann
c9efccc959
[tests] update extractor results
2024-02-16 22:42:06 +01:00
Mike Fährmann
c413834dfc
[bluesky] extend tests
2024-02-16 16:30:02 +01:00
Mike Fährmann
24c1317e0d
[batoto] fix crash when manga/chapter contains a '-' ( #5200 )
2024-02-16 00:10:08 +01:00
Mike Fährmann
0abd9723af
[bluesky] add 'metadata' option ( #4438 )
...
allow extracting 'user' metadata and
make 'facets' extraction optional
2024-02-15 23:30:16 +01:00
Mike Fährmann
c97b92cc35
[fanbox] add 'home' and 'supporting' extractors ( #5138 )
2024-02-14 23:25:39 +01:00
Mike Fährmann
04e4ffc64c
[deviantart] combine 'png' option with 'quality' ( #4846 )
...
"quality": "png" to download PNGs instead of JPEGs
2024-02-14 22:07:29 +01:00
Mike Fährmann
9cc4ec2c58
[deviantart] add 'png' option ( #4846 )
2024-02-14 01:03:15 +01:00
Mike Fährmann
1d1ffe3317
[pornpics] update 'channel' extraction & add test
...
change 'channel' to a list, since extracting both 'channel' and
'channels' does not really work with text.extract_from()
2024-02-13 23:48:46 +01:00
Mike Fährmann
139ff3f6ab
[kemonoparty] add 'posts' extractor ( #5194 )
2024-02-13 15:41:34 +01:00
Mike Fährmann
814ad9321e
[deviantart] skip locked/blurred posts ( #4567 , #5193 )
2024-02-13 14:15:12 +01:00
Mike Fährmann
f7f8ef8684
[twitter] support communities ( #4913 )
2024-02-13 01:30:23 +01:00
Mike Fährmann
8f27f43d4d
[tests] implement explicitly disabling auth
2024-02-13 00:08:27 +01:00
Mike Fährmann
06cb518d97
[bunkr] fix extraction ( #5088 , #5151 , #5153 )
...
- remove legacy code
- map legacy domains to bunkr.sk
- use input URL domain for newer domains
- update tests (some files got slightly modified or deleted)
2024-02-11 22:36:03 +01:00
Mike Fährmann
4641937ca3
[imagetwist] add 'gallery' extractor ( #5190 )
2024-02-11 18:41:02 +01:00
Mike Fährmann
fde82ab0ce
[imagechest] add 'user' extractor ( #5143 )
2024-02-11 18:38:33 +01:00
Mike Fährmann
4474cea31b
merge #5187 : [skeb] add 'num' and 'count' metadata fields
2024-02-10 19:36:59 +01:00
Mike Fährmann
4cfceb23cb
[skeb] rename 'data' -> 'file' & add tests
2024-02-10 19:35:50 +01:00
Mike Fährmann
c83d0a1596
[weibo] add 'gifs' option ( #5183 )
2024-02-10 18:17:07 +01:00
Mike Fährmann
af61d2b037
[wikimedia] combine most wikimedia.org sites ( #1443 )
...
add wikidata.org and wikivoyage.org
2024-02-10 03:00:58 +01:00
Mike Fährmann
c7d17f1111
[bluesky] extract 'hashtags', 'mentions', and 'uris' metadata ( #4438 )
2024-02-10 00:01:55 +01:00
Mike Fährmann
55bbd49a0e
[bluesky] download images in original resolution ( #4438 )
...
at least up to 2000 px
2024-02-09 21:33:33 +01:00
Mike Fährmann
aee5580c62
[idolcomplex] extract 'id_alnum' metadata ( #5171 )
2024-02-08 18:29:54 +01:00
Mike Fährmann
cf7d6be2d4
[bluesky] initial support ( #4438 , #4708 , #4722 , #5047 )
2024-02-07 19:09:33 +01:00
Mike Fährmann
6ef143ea31
[idolcomplex] support alphanumeric post IDs ( #5171 )
2024-02-07 14:57:13 +01:00
Mike Fährmann
d7823b9f81
[pinterest] fix section URLs for boards with /?# in name ( #5104 )
2024-02-05 15:54:06 +01:00
Mike Fährmann
de752eb7b1
[naverwebtoon] support '/webtoon/' paths for all comics ( #5123 )
2024-02-04 21:38:46 +01:00
Mike Fährmann
22647c2626
[naverwebtoon] fix 'title' for comics with empty tags ( #5120 )
2024-01-27 16:24:03 +01:00
Mike Fährmann
1f7101d606
[archivedmoe] fix thebarchive webm URLs ( #5116 )
2024-01-27 00:24:41 +01:00
Mike Fährmann
34a4ddc399
[sankaku] add 'id-format' option ( #5073 )
2024-01-26 17:56:08 +01:00
Mike Fährmann
afd20ef42c
[kemonoparty] implement filtering duplicate revisions ( #5013 )
...
set 'revisions' to '"unique"' to have it ignore duplicate revisions
2024-01-26 14:44:15 +01:00
Mike Fährmann
c28475d325
[kemonoparty] fix deleting 'name' in original objects ( #5103 )
...
... when computing 'revision_hash'
regression caused by 3d68eda4
dict.copy() only creates a shallow copy
I know that and still managed to get it wrong ...
2024-01-25 23:46:19 +01:00
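Note on the commit above: a small sketch of the shallow-copy pitfall it describes. The 'post' layout and both function names are hypothetical and only loosely modeled on the description; this is not the extractor's code.

```python
import copy


def strip_name_shallow(post):
    """Buggy variant: dict.copy() shares the nested 'file' dict."""
    revision = post.copy()
    del revision["file"]["name"]  # also removes 'name' from post["file"]
    return revision


def strip_name_deep(post):
    """Safe variant: deepcopy() duplicates nested objects as well."""
    revision = copy.deepcopy(post)
    del revision["file"]["name"]  # the original object stays untouched
    return revision


original = {"title": "example", "file": {"name": "a.png", "path": "/data/a.png"}}
strip_name_deep(original)
print(original["file"])
```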
Mike Fährmann
beacfa7436
[bunkr] update domain to 'bunkr.sk' ( #5114 )
2024-01-25 23:45:41 +01:00
Mike Fährmann
0d3af0d35b
[tests] ignore 'ytdl' categories when import fails ( #5095 )
2024-01-21 15:31:12 +01:00
Mike Fährmann
f3ad91b44f
[bunkr] update domain ( #5088 )
2024-01-21 03:00:57 +01:00
Mike Fährmann
c7a42880ab
[wikimedia] support fandom wikis ( #1443 , #2677 , #3378 )
...
Wikis hosted on fandom.com are just MediaWiki instances
and support its API.
2024-01-21 00:52:02 +01:00
blankie
df718887c2
[webtoons] fix extracting comic and episode name with commas
2024-01-21 09:50:27 +11:00
Mike Fährmann
0d367ce1b9
[tests] update extractor results
2024-01-20 18:02:36 +01:00
Mike Fährmann
9ca6117c67
[hbrowse] remove module
...
website gone
2024-01-20 02:53:44 +01:00
Mike Fährmann
375eefb886
[chevereto] remove 'pixl.li'
...
"Pixl is closing down"
"All images will be deleted January 1st."
2024-01-20 02:21:40 +01:00
Mike Fährmann
b0a441f1e3
[nitter] remove 'nitter.lacontrevoie.fr'
...
"Fermeture de Nitter / Closing down Nitter"
2024-01-19 19:34:16 +01:00
Mike Fährmann
a1c1e80f67
[giantessbooru] update domain
2024-01-19 14:21:56 +01:00
Mike Fährmann
2007cb2f59
[tests] check extractor category values
2024-01-19 14:21:09 +01:00
Mike Fährmann
93b4120e77
[gelbooru] support 'all' and empty tag ( #5076 )
2024-01-18 21:49:33 +01:00
Mike Fährmann
a416d4c3d5
[sankaku] support post URLs with alphanumeric IDs ( #5073 )
2024-01-18 16:23:14 +01:00
Mike Fährmann
ea553a1d55
[wikimedia] generalize ( #1443 )
...
- support mediawiki.org
- support mariowiki.com (#3660 )
- combine code into a single extractor
(use prefix as subcategory)
- handle non-wiki instances
- unescape titles
2024-01-18 15:36:16 +01:00
Mike Fährmann
c3c1635ef3
[wikimedia] update
...
- rewrite using BaseExtractor
- support most Wiki* domains
- update docs/supportedsites
- add tests
2024-01-17 00:08:06 +01:00
Mike Fährmann
3d68eda4ab
[kemonoparty] add 'revision_hash' metadata ( #4706 , #4727 , #5013 )
...
A SHA1 hexdigest of other relevant metadata fields like
title, content, file and attachment URLs.
This value does NOT reflect which revisions are listed on the website.
Neither does 'edited' or any other metadata field (combinations).
2024-01-16 00:38:10 +01:00
Mike Fährmann
799a8206ad
merge #5061 : [webtoons] extract more metadata
...
- author_name
- comic_name
- episode_name
- username
2024-01-15 18:27:12 +01:00
Mike Fährmann
8ffa0cd3c8
[webtoons] small optimization
...
don't extract the entire 'author_area' and
avoid creating a second 'text.extract_from()' object
2024-01-15 18:24:47 +01:00
Mike Fährmann
68196589c4
[2ch] update
...
- simplify extractor code
- more metadata
- add tests
2024-01-15 04:09:05 +01:00
Mike Fährmann
69726fc82c
[tests] skip tests requiring auth when none is provided
2024-01-14 22:47:16 +01:00
blankie
bb446b1598
[webtoons] extract more metadata
2024-01-14 19:26:49 +11:00
Mike Fährmann
355b909f46
merge #5041 : [steamgriddb] add support ( #5033 )
2024-01-13 00:59:15 +01:00
Mike Fährmann
71e2c3e5a2
merge #5037 : [hatenablog] add support ( #5036 )
2024-01-13 00:57:21 +01:00
Mike Fährmann
b97af09e03
[tests] include URL in failure report
2024-01-12 03:23:21 +01:00
Mike Fährmann
58e0665fbc
[tests] load config from external file
2024-01-12 03:21:44 +01:00
Mike Fährmann
2dcfb012ea
[patreon] download 'm3u8' manifests with ytdl
2024-01-12 02:33:27 +01:00
Mike Fährmann
2191e29e14
[nijie] fix image URL for single image posts ( #5049 )
2024-01-11 05:07:38 +01:00
Mike Fährmann
39904c9e4e
[deviantart:avatar] add 'formats' option ( #4995 )
2024-01-10 17:13:34 +01:00
Mike Fährmann
887ade30a5
[batoto] support more mirror domains ( #5042 )
2024-01-09 18:02:49 +01:00
blankie
2ccb7d3bd3
[steamgriddb] add support
2024-01-09 17:12:56 +11:00
blankie
2cfe788f93
[hatenablog] fix extractor naming errors
2024-01-09 01:42:57 +11:00
blankie
61f3b2f820
[hatenablog] add support
2024-01-09 01:29:47 +11:00
Mike Fährmann
657ed93a22
[batoto] improve v2 manga URL pattern
...
and add tests
2024-01-07 22:23:30 +01:00
Mike Fährmann
33f228756a
[mangadex] add 'list' extractor ( #5025 )
...
supports listing manga and chapters from list feed
2024-01-07 02:59:35 +01:00
Mike Fährmann
c25bdbae91
[komikcast] fix 'manga' extractor ( #5027 )
2024-01-06 14:19:44 +01:00
Mike Fährmann
8e1a2b5446
[komikcast] update domain to 'komikcast.lol' ( #5027 )
2024-01-06 02:16:43 +01:00
Mike Fährmann
a441249ea2
merge #4979 : [batoto] add 'chapter' and 'manga' extractors ( #1434 , #2111 )
2024-01-06 01:53:26 +01:00
Mike Fährmann
b11c352d66
[bato] rename to 'batoto'
...
to use the same category name as the previous bato.to site
2024-01-06 01:49:34 +01:00
Mike Fährmann
3aa24c3744
[bato] simplify and update
2024-01-06 01:10:04 +01:00
Mike Fährmann
11150a7d72
[nudecollect] remove module
2024-01-05 21:32:04 +01:00
Mike Fährmann
c158927c38
merge #5016 : [zzup] add 'gallery' extractor ( #4517 , #4604 , #4659 , #4863 )
2024-01-05 21:25:46 +01:00
Mike Fährmann
217fa7f8a1
include 'test/results' in flake8 checks
2024-01-05 18:16:33 +01:00
Mike Fährmann
e61f016465
[szurubooru] support 'snootbooru.com' ( #5023 )
2024-01-05 17:56:39 +01:00
Mike Fährmann
b4bcf40278
[weibo] fix AttributeError in 'user' extractor ( #5022 )
...
yet another bug caused by a383eca7
2024-01-05 17:18:33 +01:00
Mike Fährmann
0ab0a10d2d
[jpgfish] update domain
2024-01-05 02:27:20 +01:00
enduser420
0f30136109
[zzup] add 'gallery' extractor
2024-01-04 21:38:59 +05:30
Mike Fährmann
7eaf648f2e
[fanbox] add 'metadata' option ( #4921 )
...
extracts 'plan' and extended 'user' metadata
2024-01-04 15:01:33 +01:00
Mike Fährmann
4f3671458e
[deviantart] add 'avatar' and 'background' extractors ( #4995 )
2024-01-03 00:07:55 +01:00
Mike Fährmann
63f649cd92
[idolcomplex] fix extraction & update URL patterns ( #5002 )
2024-01-01 17:38:32 +01:00
Mike Fährmann
7aa1c9671b
[tests] fix 'invalid escape sequence' warnings
2024-01-01 16:12:43 +01:00
Mike Fährmann
b6903a4c90
[nijie] add 'count' metadata field
...
https://github.com/mikf/gallery-dl/issues/146#issuecomment-1812849102
2023-12-30 22:25:59 +01:00
Mike Fährmann
b93b351db9
merge #4962 : [poringa] add support ( #4675 )
2023-12-30 20:39:35 +01:00
Mike Fährmann
9f21c839ad
[poringa] improvements and fixes
...
- add 'num' and 'count' metadata fields
- prevent crash for "private" posts
- prevent crash when there's no 'main-info'
- update tests
2023-12-30 20:37:09 +01:00
Mike Fährmann
caceb14fc2
[tests] fail when a results file contains syntax errors
...
or is otherwise not importable
2023-12-30 17:26:57 +01:00
Mike Fährmann
085411f3f1
[rule34] recognize URLs with 'www' subdomain ( #4984 )
2023-12-30 16:07:56 +01:00
Antonio
e348da7a06
[poringa] add support
2023-12-27 00:07:23 -06:00
bug-assassin
74c225f94e
[bato] add support
2023-12-26 22:33:33 -05:00
Mike Fährmann
f9544194c0
[paheal] restore 'extension' metadata ( #4976 )
2023-12-26 16:09:26 +01:00
Mike Fährmann
77d46e6f0c
[lynxchan] update 'bbw-chan' domain ( #4970 )
2023-12-25 15:29:05 +01:00
Mike Fährmann
108c978073
merge #4919 : [postmill] add support ( #4917 )
2023-12-23 15:23:56 +01:00
Mike Fährmann
2a60645095
[deviantart] set 'is_original' for intermediary URLs to 'false'
2023-12-22 14:49:10 +01:00
Mike Fährmann
01bb75f6cb
merge #4945 : [shimmie2] support 'rule34hentai.net' ( #861 , #4789 )
2023-12-22 00:10:26 +01:00
Mike Fährmann
79e4606893
[rule34hentai] cleanup
...
- fix using 'self._posts_rule34hentai'
- fix 'file_url' for posts
- update docs/supportedsites
- add tests
2023-12-22 00:01:36 +01:00
Mike Fährmann
627ed794a2
[danbooru] provide 'tags' as list ( #4942 )
...
keep the old 'tag_string' values around, similar to sankaku
a lot of repeat code ...
would be a lot less bad if "".split(" ") returned an empty list
2023-12-21 14:39:38 +01:00
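Note on the commit above: a short sketch of the str.split() behavior it complains about. The helper name 'split_tags' is hypothetical and the extractor may handle the empty case differently.

```python
def split_tags(tag_string):
    """Split a space-separated tag string, returning [] for an empty string."""
    # "".split(" ") returns [""], so the empty case needs its own branch
    return tag_string.split(" ") if tag_string else []


print("".split(" "))
print(split_tags(""))
print(split_tags("1girl solo"))
```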
Mike Fährmann
99aa923322
[inkbunny] improve '/submissionsviewall.php' patterns ( #4934 )
...
allow 'mode=…' to be in any position
don't require it to be somewhere in the middle
2023-12-16 19:21:20 +01:00
Mike Fährmann
3f9c113d78
[mastodon] Support non-numeric status IDs ( #4936 )
2023-12-16 01:52:31 +01:00
Mike Fährmann
2852404e49
[inkbunny] add 'unread' extractor ( #4934 )
2023-12-15 21:20:12 +01:00
Mike Fährmann
a37b7759bc
[myhentaigallery] recognize '/g/' URLs ( #4920 )
2023-12-12 20:02:28 +01:00
blankie
fbe14a2745
[postmill] add support
2023-12-12 21:36:52 +11:00
Mike Fährmann
bf74eb5c46
merge #4886 : [urlgalleries] add 'gallery' extractor ( #919 , #1184 , #2905 )
2023-12-08 22:55:58 +01:00
Mike Fährmann
ade93c5397
[urlgalleries] add tests
2023-12-08 22:55:16 +01:00
Mike Fährmann
4eb3590103
[nijie] fix image URLs of multi-image posts ( #4876 )
2023-12-05 17:48:50 +01:00
Mike Fährmann
c83fbe6c2d
merge #4855 : [nitter] fix video extraction ( #4853 )
2023-11-27 18:39:05 +01:00
Mike Fährmann
1137d72d48
[tests] skip test_init for BaseExtractor classes without instances
2023-11-27 18:36:15 +01:00
Mike Fährmann
625e94fa7d
update extractor test results
...
still not everything, but good enough for now
2023-11-27 18:30:53 +01:00
enduser420
1e9bacd169
[nitter] fix video extraction
2023-11-27 21:58:06 +05:30
Mike Fährmann
95c1dfb089
[tests] swap assertEqual argument order
...
before this, it would show test failures as
+ test value
- extracted value
when it should be the other way round
2023-11-27 01:06:13 +01:00
Mike Fährmann
bdb3ce7217
[foolslide] remove 'powermanga.org'
2023-11-26 23:19:05 +01:00
Mike Fährmann
f9dac43be9
[warosu] fix file URLs
2023-11-24 02:44:55 +01:00
Mike Fährmann
645b4627ef
[sankaku] update URL patterns
2023-11-24 02:41:52 +01:00
Mike Fährmann
119755a5a3
[tests] implement skipping/failing tests when pressing ctrl+c
2023-11-24 00:48:37 +01:00
Mike Fährmann
1ae43d8123
merge #4841 : [fapello] support '.su' TLD ( #4840 )
2023-11-22 20:18:32 +01:00
Mike Fährmann
e1404827a6
[pixeldrain] add 'file' and 'album' extractors ( #4839 )
2023-11-22 19:01:19 +01:00
enduser420
2402162e8a
[fapello] support '.su' TLD
2023-11-22 19:35:43 +05:30
Mike Fährmann
725c8dd55a
[tmohentai] 'categories' -> 'genres'
...
quite likely that the site meant 'genres' by "Genders"
2023-11-21 22:11:43 +01:00
Mike Fährmann
ce7c4cb544
merge #4832 : [tmohentai] add 'gallery' extractor ( #4808 )
2023-11-21 20:25:49 +01:00
Mike Fährmann
c4a201ed42
[tmohentai] simplify + tests
2023-11-21 20:24:07 +01:00
Mike Fährmann
e17a48fe56
[blogger] inherit from BaseExtractor
...
- support www.micmicidol.club (#4759 )
2023-11-21 16:52:25 +01:00
Mike Fährmann
0fa85360a0
merge #4812 : [erome] add 'count' metadata field
2023-11-20 22:42:02 +01:00
Mike Fährmann
a43cf78bb7
[erome] tests
2023-11-20 22:41:12 +01:00
Mike Fährmann
07cb584231
[behance] add 'modules' option ( #4799 )
2023-11-17 22:54:38 +01:00
Mike Fährmann
ea78f67860
[downloader:http] skip files not passing filesize-min/-max ( #4821 )
...
instead of failing the download
2023-11-17 22:54:20 +01:00
Mike Fährmann
3f591d5a4e
[mastodon] update test results
2023-11-11 21:24:07 +01:00
Mike Fährmann
6402f2950f
[pp:metadata] ignore non-string tag values ( #4764 )
2023-11-04 17:33:14 +01:00
Mike Fährmann
007c433677
[patreon] support 'id:<campaign_id>' in place of a user name
...
https://patreon.com/id:12345
… and remove 'campaign-id' config option
2023-11-04 00:17:41 +01:00
Mike Fährmann
43a3d93467
merge #4755 : [twitter] recognize fixupx.com URLs
2023-11-02 15:33:29 +01:00
Mike Fährmann
cdf77e326f
[twitter] add test for fixupx.com
2023-11-02 15:32:48 +01:00
Mike Fährmann
fc8f86bf24
[hitomi] recognize 'imageset' gallery URLs ( #4756 )
2023-11-02 15:29:44 +01:00
Mike Fährmann
72b18d701f
represent util.NONE as 'null' in JSON output
...
was '"None"' before
2023-11-02 15:23:28 +01:00
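Note on the commit above: a sketch of how a sentinel object can end up as 'null' rather than '"None"' in JSON output. The class '_NoneType', the 'NONE' stand-in, and 'json_default' are assumptions made for illustration and are not the actual 'util' module code.

```python
import json


class _NoneType:
    """Hypothetical falsy sentinel that prints as 'None'."""

    def __bool__(self):
        return False

    def __str__(self):
        return "None"


NONE = _NoneType()


def json_default(obj):
    """Fallback for objects the JSON encoder cannot serialize itself."""
    if obj is NONE:
        return None  # encoded as null
    return str(obj)


def dumps(obj):
    return json.dumps(obj, default=json_default)


print(dumps({"value": NONE}))
```

With a plain 'default=str' fallback, the same sentinel would be written as the string "None".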
Mike Fährmann
68e72a836c
[exhentai] fix extraction ( #4730 )
...
- update to new API response layout
- use proper API server URL
- fix 'filesize' metadata
2023-10-30 13:38:49 +01:00
Mike Fährmann
fd8f58ad76
[behance] unescape embed URLs ( #4742 )
2023-10-30 13:38:49 +01:00
Mike Fährmann
c9a2be36d4
[sankaku] support '/posts/' tag search URLs ( #4740 )
2023-10-29 13:48:42 +01:00
Mike Fährmann
218295a4c6
[twitter] fix avatars without 'date' information ( #4696 )
2023-10-27 17:58:02 +02:00
Mike Fährmann
d0effcae20
[kemonoparty] add 'revision_index' metadata field ( #4727 )
2023-10-26 22:26:38 +02:00
Mike Fährmann
3bbaa875f1
[kemonoparty] fix parsing of non-standard 'dates' ( #4676 )
2023-10-26 21:50:18 +02:00
Mike Fährmann
a09df34bcf
merge #4714 : [4archive] add 'thread' and 'board' extractors
...
(#1262 , #2418 , #4400 , #4710 )
2023-10-25 20:12:07 +02:00
enduser420
acb713b95a
[4archive] update
2023-10-25 23:08:45 +05:30
Mike Fährmann
6766877524
merge #4693 : [reddit] support Reddit Mobile share links
2023-10-25 17:54:32 +02:00
Mike Fährmann
1042278bec
[misskey] support 'misskey.design' ( #4713 )
2023-10-25 17:47:03 +02:00
enduser420
c0714d5585
[4archive] add 'thread' and 'board' extractors
2023-10-24 23:05:28 +05:30
inty
b68aad3dab
[reddit] implement Reddit Mobile share links
2023-10-22 10:38:05 +00:00
Mike Fährmann
7958ab1946
[newgrounds] support 'imageData' files ( #4642 )
2023-10-21 13:22:55 +02:00
Mike Fährmann
b52fd91ac6
[sankaku] support '/posts/' URLs ( #4688 )
2023-10-21 13:20:35 +02:00
Mike Fährmann
b2c3db3e24
[bunkr] add extractor for media URLs ( #4684 )
2023-10-20 15:22:44 +02:00
Mike Fährmann
6e830ffc9e
[kemonoparty] support post searches ( #3385 , #4057 )
2023-10-19 23:06:06 +02:00
Mike Fährmann
aaf539009b
[kemonoparty] initial support for post revisions ( #4498 , #4597 )
...
- single revision
https://kemono.party/SERVICE/user/12345/post/12345/revision/12345
- all revisions
https://kemono.party/SERVICE/user/12345/post/12345/revisions
2023-10-19 22:32:51 +02:00
Mike Fährmann
174191cb79
[kemonoparty] restore discord pagination ( #4676 )
2023-10-19 21:57:27 +02:00
Mike Fährmann
c9a976d8a6
[kemonoparty] various updates and fixes ( #4676 , #4681 )
...
- fix pagination
- fix 'date' metadata
- fix discord channel API endpoint
2023-10-19 17:36:16 +02:00
Mike Fährmann
bfdc07632a
[deviantart] expand nested comment replies ( #4653 )
2023-10-17 19:40:53 +02:00
Mike Fährmann
9bc5ad4784
[tests] implement 'len:'
2023-10-17 19:25:31 +02:00
Mike Fährmann
a1977a698e
[tests] fix spurious failures in '_assert_isotime()'
2023-10-16 18:16:48 +02:00
Mike Fährmann
390d14dbcc
[chevereto] support 'img.kiwi' and 'deltaporno.com' ( #4664 , #1381 )
2023-10-16 18:14:30 +02:00
Mike Fährmann
727c8eec6c
merge #4667 : [redgifs] fix 'niches' extraction ( #4666 )
2023-10-16 14:20:01 +02:00
Mike Fährmann
2911ed1240
[chevereto] add generic extractors ( #4664 )
...
- support jpgfish
- support pixl.li / pixl.is (#3179 , #4357 )
2023-10-16 14:15:39 +02:00
enduser420
db3363ac0b
[redgifs] fix 'niches' extraction
2023-10-16 16:51:30 +05:30
Mike Fährmann
ade8347ead
[kemonoparty] fix DM dates
2023-10-15 19:54:28 +02:00
Mike Fährmann
6dfe200ae4
[kemonoparty] support discord URLs with channel IDs ( #4662 )
2023-10-15 19:45:22 +02:00
Mike Fährmann
c6a3892210
[imgbb] update username extraction ( #4626 )
2023-10-14 20:55:39 +02:00
Mike Fährmann
13ce3a9acb
[warosu] fix extraction ( #4634 )
2023-10-13 23:03:39 +02:00
Mike Fährmann
c4c4e4d2f4
[newgrounds] improve 'art-image' extraction ( #4642 )
...
- download files in original resolution
- replace .webp with extension of first file
2023-10-13 20:10:55 +02:00
Mike Fährmann
833dce141f
[fantia] add 'content_count' and 'content_num' metadata fields ( #4627 )
2023-10-13 20:10:55 +02:00
Mike Fährmann
2d41702762
[deviantart] implement '"group": "skip"' ( #4630 )
2023-10-12 22:14:20 +02:00
Mike Fährmann
a9c3442d4e
[deviantart] add a couple 'deactivated account' test URLs
2023-10-12 21:40:10 +02:00
Mike Fährmann
2974b8e3c8
[moebooru] add 'metadata' option ( #4646 )
...
for extended 'pool' metadata
2023-10-12 21:34:25 +02:00
Mike Fährmann
67ba4ee842
[pp:exec] support more replacement fields for '--exec' ( #4633 )
...
- {_directory}
- {_filename}
- {_path} (alias for {})
2023-10-09 12:50:10 +02:00
Mike Fährmann
9a008523ac
[hentaifoundry] fix '.swf' file downloads ( #4641 )
2023-10-09 11:45:55 +02:00
Mike Fährmann
15f940819b
[newgrounds] support 'art-image' files ( #4642 )
2023-10-09 11:20:10 +02:00
Mike Fährmann
efaab4fbfa
[twitter] fix crash due to missing 'source' ( #4620 )
...
regression caused by 06aaedde
2023-10-04 23:01:04 +02:00
Mike Fährmann
84fbbd96aa
[shimmie2] remove 'meme.museum'
2023-10-02 20:41:25 +02:00
Mike Fährmann
0b150d45db
[tests] add 'msg' arguments to assert statements
2023-10-01 13:52:00 +02:00
Mike Fährmann
27da3f2958
[tests] re-implement filtering by basecategory
2023-10-01 13:31:23 +02:00
Mike Fährmann
c7bd9925d9
[tests] use fallback URLs for content tests ( #3163 )
2023-09-30 21:00:55 +02:00
Mike Fährmann
b92645cd37
[bunkr] fix extraction ( #4514 , #4532 , #4529 , #4540 )
2023-09-30 18:05:12 +02:00
Mike Fährmann
bd3f7a5bbc
[tests] support one regex per URL for #pattern
2023-09-28 21:56:09 +02:00
Mike Fährmann
0c5d8b1505
[deviantart] re-add 'quality' option and 'intermediary' transform
2023-09-24 17:36:05 +02:00
Mike Fährmann
dbd820d7c5
[tests] allow checking for exact URL results
2023-09-24 01:52:47 +02:00
Mike Fährmann
642998504d
[tests] support 'range()' for #count and metadata checks
2023-09-24 01:52:40 +02:00
Mike Fährmann
1e31fce37b
[pillowfort] support '/tagged/' URLs ( #4570 )
2023-09-23 00:11:01 +02:00
Mike Fährmann
1d2fd0b831
[pillowfort] extract 'b2_lg_url' media ( #4570 )
2023-09-23 00:05:26 +02:00
Mike Fährmann
50e2ebaff0
[danbooru] support 'donmai.moe' URLs
2023-09-22 20:58:38 +02:00
Mike Fährmann
918ba4f847
[redgifs] match gfycat image URLs ( #4558 )
2023-09-22 18:02:55 +02:00
Mike Fährmann
2cd801232b
fix --range causing crashes ( #4557 )
...
regression caused by a383eca7
2023-09-22 16:28:20 +02:00
Mike Fährmann
27ec653991
fix bug in test_init and update example URLs
2023-09-14 13:27:03 +02:00
Mike Fährmann
24a1d46391
[mastodon] support '/@USER/following' URLs
...
Previously, only '/users/USER/following' got matched.
2023-09-13 23:42:51 +02:00
Mike Fährmann
ac00d47a16
update test/test_results.py
2023-09-13 14:54:25 +02:00
Mike Fährmann
65b6011cc5
update test/test_extractor.py
2023-09-11 17:20:06 +02:00
Mike Fährmann
a833c244c8
add exported extractor results
2023-09-10 14:45:01 +02:00
Mike Fährmann
93a7a89cf6
[formatter] use value of last alternative ( #4492 )
...
fixes {fieldname|''} evaluating to the value of 'keywords-default'
instead of an empty string
2023-09-05 17:53:27 +02:00
Mike Fährmann
f2de70f254
[gfycat] remove module
2023-09-04 18:27:11 +02:00
Mike Fährmann
d319777a24
[tests] skip 'test_init_ytdl' on Python<3.6
...
It passes without error in a Python 3.4/3.5 venv on my own machine,
but fails for some inexplicable reason on Github Actions.
2023-08-10 23:34:49 +02:00
Mike Fährmann
0ef1fcab20
[postprocessor] update 'finalize' events
...
Add 'finalize-error' and 'finalize-success' events that trigger
depending on whether error(s) did or did not happen.
'finalize' itself now always triggers regardless of error status.
(was supposed to have the same behavior as the new 'finalize-success')
2023-08-10 19:46:37 +02:00
Mike Fährmann
d50c312ff0
prevent test failure when there's no 'ytdl' module ( #4364 )
...
split off ytdl into its own test function and
skip it when there's an ImportError similar to test_ytdl.py
2023-07-29 13:48:31 +02:00
Mike Fährmann
48ef062867
fix issues with 'Extractor.finalize()'
...
- prevent crash in InstagramUserExtractor (#4359 )
- call it at the end of every DownloadJob
- add it to tests
2023-07-29 13:43:27 +02:00
Mike Fährmann
255d08b79e
add test for 'Extractor.initialize()' ( #4359 )
2023-07-28 16:58:16 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside a Job;
previously, most of it had already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
f0203b7559
[postprocessor:python] add tests
2023-07-24 15:22:57 +02:00
Mike Fährmann
d97b8c2fba
consistent cookie-related names
...
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
c5565f79f7
merge #4096 : [danbooru] add support for booru.borvar.art instance
2023-07-18 18:33:08 +02:00
Mike Fährmann
63326e3168
[danbooru] add tests for booruvar
2023-07-18 18:29:57 +02:00
Mike Fährmann
5171d8975c
[E621] support 'e6ai.net' ( #4320 )
2023-07-18 18:16:30 +02:00
Mike Fährmann
7444fc125b
[gfycat] implement login support ( #3770 , #4271 )
...
For the record: '/webtoken' and '/weblogin' are not the same ...
2023-07-06 18:56:34 +02:00
Mike Fährmann
25c5a6ffcb
no f-strings
2023-06-25 14:01:26 +02:00
Mike Fährmann
ec64cbefeb
[postprocessor:exec] add tests
2023-06-21 23:54:35 +02:00
Mike Fährmann
ce93c460a6
[formatter] implement 'H' conversion ( #4164 )
...
to remove HTML tags and unescape HTML entities
2023-06-15 13:07:51 +02:00
Mike Fährmann
deff3b434d
[vipergirls] implement login support ( #4166 )
2023-06-13 21:05:09 +02:00
Mike Fährmann
69865dcc05
[formatter] implement slicing strings as bytes ( #4087 )
...
prefixing a slice '[10:30]' with a lowercase b '[b10:30]' encodes
the string to bytes in filesystem encoding before applying the slice
2023-05-22 18:30:45 +02:00
Mike Fährmann
df11214281
[ytdl] improve --xff/--geo-bypass detection ( #3989 )
...
check if --xff is supported in a try-except block
and select expected results accordingly
2023-05-01 18:26:37 +02:00
Mike Fährmann
aa731c4298
[ytdl] run yt-dlp tests with latest code from master ( #3989 )
...
Only use PyPI version for Python 3.6, since that's no longer supported
by the current codebase.
2023-05-01 16:42:57 +02:00
Mike Fährmann
43f4bd9faa
[ytdl] fix tests
...
tests pass with latest Git HEAD, but not with the current PyPI version
2023-04-29 18:05:45 +02:00
Mike Fährmann
61a65d5bb9
[ytdl] fix crash due to --geo-bypass deprecation ( #3975 )
2023-04-29 17:25:38 +02:00
Mike Fährmann
a96745368e
"fix" tests on Python 3.4 and 3.5
...
can't rely on dict insertion order
2023-04-26 19:31:27 +02:00
Mike Fährmann
3905f05f00
[postprocessor:metadata] support putting keys in quotes
...
for mode 'modify' and 'delete'
based on fe41a2b1
2023-04-25 14:30:18 +02:00
Mike Fährmann
7459e4abce
[postprocessor:metadata] fix traversing more than 1 level deep
...
for mode 'modify' and 'delete'
2023-04-25 14:17:25 +02:00
Mike Fährmann
2edcdee32f
[downloader:http] add MIME type and signature for .heic files
...
(#3915 )
https://github.com/strukturag/libheif/issues/83
2023-04-15 17:09:22 +02:00
Mike Fährmann
082d55de16
fix circular reference detection for -K
2023-03-21 23:46:36 +01:00
Mike Fährmann
2ab66ad899
update -K output to include quotes around keys
2023-03-21 22:28:04 +01:00
Mike Fährmann
fe41a2b159
[formatter] support putting keys in quotes
...
i.e. obj["key"] or obj['key']
as in f-strings
2023-03-21 22:06:54 +01:00
Mike Fährmann
46fdf46f21
[formatter] support loading an f-string from a template file
...
"\fTF ~/path/to/file.txt"
2023-03-20 22:05:33 +01:00
Mike Fährmann
1a4d4a799b
[formatter] support filesystem paths for \fM
2023-03-20 22:01:33 +01:00
Mike Fährmann
00f0233b28
[postprocessor:metadata] add 'skip' option ( #3786 )
2023-03-17 23:30:11 +01:00
Mike Fährmann
8f8b4de0e8
[ytdl] fix '--parse-metadata' ( #3663 )
2023-03-05 19:57:23 +01:00
Mike Fährmann
7610d9cf82
merge #3675 : [pixiv] fix --write-tags for '"tags": "original"'
2023-03-02 21:48:31 +01:00
Mike Fährmann
83e7a25b6b
extend OAuth tests
2023-03-02 17:26:51 +01:00
Mike Fährmann
d788e6c60c
implement 'globals' option
2023-02-28 18:18:55 +01:00
Mike Fährmann
56039d2456
add 'hash_md5' and 'hash_sha1' functions ( #3679 )
...
... to global eval namespace
2023-02-22 10:58:44 +01:00
Mike Fährmann
e1df7f73b1
[deviantart] add 'search' extractor
...
(#538 , #1264 , #2954 , #2970 , #3577 )
Requires login to fetch any results, since the API endpoint raises an
error for requests that are not logged in.
TODO: parse HTML search results
2023-02-20 20:54:46 +01:00
Gray Manley
38a6389e2c
Fix lint.
2023-02-20 00:33:30 -06:00
Gray Manley
56cbae92ec
Use more pythony naming.
2023-02-19 06:14:34 -06:00
Gray Manley
8e2ba4f32e
Add test.
2023-02-19 06:13:21 -06:00
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2023-02-09 15:22:00 +01:00
Mike Fährmann
b7337d810e
[postprocessor:metadata] add 'sort' and 'separators' options
2023-02-07 18:28:14 +01:00
Mike Fährmann
3436c6b117
[postprocessor:metadata] speed up JSON encoding
2023-02-06 12:35:28 +01:00
Mike Fährmann
925b467496
split e621 from danbooru module ( #3425 )
2023-02-03 19:24:31 +01:00