Mike Fährmann
a5be680596
[directlink] extend recognized file extensions ( #5924 )
...
bmp, svg, avif, heic, psd, pdf, m4v, mov, wav, mp3, zip, rar, 7z, swf
2024-08-02 16:13:09 +02:00
Mike Fährmann
9be7896f78
[koharu] fix 'count' for 'cbz' downloads ( #5893 )
2024-08-02 16:13:03 +02:00
Mike Fährmann
42388dc819
[tests] fix 'pattern' not being compiled before running a test
...
fixes regression introduced in 3fa74ca4
2024-08-01 12:35:01 +02:00
Mike Fährmann
c372242a06
[koharu] add 'favorite' extractor ( #5893 )
2024-08-01 12:33:18 +02:00
Mike Fährmann
2bf76461ce
[deviantart:following] use OAuth API endpoint ( #2511 )
2024-07-31 17:21:39 +02:00
Mike Fährmann
095f278d6f
[vsco] add 'include' option ( #5911 )
2024-07-31 12:32:04 +02:00
Mike Fährmann
ff58683b76
[koharu] send necessary headers for image downloads ( #5893 )
2024-07-30 19:39:50 +02:00
Mike Fährmann
399ba85841
[fallenangels] remove module
2024-07-30 17:33:16 +02:00
Mike Fährmann
84eefeebd6
[sankaku] match URLs with 'www' subdomain ( #5907 )
2024-07-30 17:05:22 +02:00
Mike Fährmann
279854cd9e
[agnph] implement 'tags' option ( #5284 )
2024-07-30 14:16:19 +02:00
Mike Fährmann
ca3e7a5e5a
[koharu] add 'date', simplify 'tags' ( #5893 )
2024-07-30 12:38:19 +02:00
Mike Fährmann
9c91935fe4
merge #5904 : [dynastyscans] extract chapter 'tags'
2024-07-29 21:17:13 +02:00
enduser420
5a36da2968
[dynastyscans] extract chapter 'tags'
2024-07-29 16:50:47 +05:30
Mike Fährmann
8a6e208605
[zerochan] fix 'Invalid control character' errors ( #5892 )
2024-07-29 11:24:17 +02:00
Mike Fährmann
aa6d00613f
[cien] initial support ( #2885 , #4103 , #5240 )
2024-07-28 19:27:12 +02:00
Mike Fährmann
65d7cccaf9
merge #5899 : [redgifs] support URLs with numeric IDs ( #5898 )
2024-07-28 12:35:57 +02:00
Mike Fährmann
4e245c94a8
[redgifs] add test for numeric ID
2024-07-28 12:33:28 +02:00
Mike Fährmann
c9aeedeafd
[koharu] add 'gallery' and 'search' extractors ( #5893 , #4707 )
2024-07-28 12:22:18 +02:00
Mike Fährmann
226ead728e
[agnph] add 'tag' and 'post' extractors ( #5284 , #5890 )
2024-07-27 12:17:47 +02:00
Mike Fährmann
d7a2c73274
[util] let a CustomNone instance be equal to itself
2024-07-26 20:56:01 +02:00
Mike Fährmann
b5e141ed6e
[sankakucomplex] update domain to 'news.sankakucomplex.com'
2024-07-26 20:39:55 +02:00
Mike Fährmann
f321272b7c
[ytdl] fix --cookies-from-browser option parsing
2024-07-25 17:45:25 +02:00
Mike Fährmann
5207a0c2e0
[zerochan] implement 'tags' option ( #5874 )
...
allow splitting tags into separate lists by category
2024-07-23 10:21:33 +02:00
Mike Fährmann
1aadc29c5b
[zerochan] fix 'source' extraction
2024-07-23 09:34:44 +02:00
Mike Fährmann
5f6d20c595
[tests] remove internal extractor check
...
revert 60a2fefedd
2024-07-22 18:40:02 +02:00
Mike Fährmann
3eba1f7c29
[tests] load results from ${GDL_TEST_RESULTS} ( #5262 )
2024-07-22 18:35:50 +02:00
Mike Fährmann
db9833c28a
merge #5870 : [aryion] add 'favorite' extractor ( #4511 )
2024-07-21 12:37:47 +02:00
Mike Fährmann
156a70bec0
[aryion] update favorite extractor
...
- add test case
- add docs/supportedsites entry
- add custom directory_fmt and archive_fmt
- remove constructor
- appease flake8
2024-07-21 12:34:06 +02:00
Mike Fährmann
727e53f513
[bunkr] support 'bunkr.fi' URLs ( #5872 )
2024-07-21 06:41:50 +02:00
Mike Fährmann
287a7d13cf
[sankaku] implement 'notes' extraction ( #5865 )
2024-07-18 20:44:49 +02:00
Mike Fährmann
026e0b97db
merge #5824 : [furaffinity] add 'folders' and 'thumbnail' ( #1284 )
2024-07-18 01:41:15 +02:00
Mike Fährmann
60a2fefedd
[tests] restrict 'test_unique_pattern_matches' to internal extractors
...
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2231649983
2024-07-17 23:35:41 +02:00
Mike Fährmann
3fa74ca4d7
[tests] enable test results for external extractors ( #5262 )
...
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2231649983
2024-07-17 22:29:09 +02:00
Mike Fährmann
f7383a56f8
wrap filters/conditionals in a try-except block
...
allows accessing undefined fields without exception or locals().get(…)
but hides mistakes/typos/etc by evaluating to False without feedback
performance loss compared to the previous version without try-except
is negligible (~20ns for me)
2024-07-12 22:51:11 +02:00
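A minimal sketch of the trade-off described in f7383a56f8, not the project's actual code: `build_filter` and `predicate` are hypothetical names, and the metadata dict is assumed to serve as the namespace of the expression. An undefined field then evaluates to False instead of raising, which also means a typo in a field name goes unnoticed.

```python
def build_filter(expression):
    """Compile a filter expression into a predicate over a metadata dict."""
    code = compile(expression, "<filter>", "eval")

    def predicate(metadata):
        try:
            # the metadata dict is used as the local namespace
            return bool(eval(code, {}, metadata))
        except Exception:
            # undefined fields (NameError) and any other error count as
            # "no match"; mistakes in the expression are silently hidden
            return False

    return predicate


is_wide = build_filter("width >= 1000")
print(is_wide({"width": 1920}))   # True
print(is_wide({"height": 1080}))  # False, 'width' is undefined
```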
Mike Fährmann
c83c812a1e
[instagram][twitter] rename 'profile' to 'info' ( #5262 , #3623 )
2024-07-11 00:22:39 +02:00
Mike Fährmann
21831eba1e
[tests] completely ignore '#auth' for 'only_matching' tests
2024-07-11 00:20:10 +02:00
Mike Fährmann
1527ad79e2
[tests] fix syntax for Python < 3.6
...
no f-strings
2024-07-06 18:39:59 +02:00
Mike Fährmann
da9916c01f
[pp:metadata] implement format strings for 'directory' ( #5728 )
2024-07-06 03:08:59 +02:00
Mike Fährmann
8f3f061daf
[hentainexus] fix error for spread pages ( #5827 )
2024-07-05 21:36:29 +02:00
Nicholas Bishop
f43bccb5be
[furaffinity] Add 'thumbnail' ( #1284 ) and 'folders' properties
...
Retrieve 'thumbnail' and 'folders' properties for each post.
'thumbnail' (#1284 ):
- Preview image used for search results, writing posts, music, etc.
- Filename format: <post_id>@600-<directory_containing_full_image>.jpg
'Folders' (related to #1817 ):
- A list of all gallery folders containing this post
- Folder name format: [<folder_category> - ]<folder_name>
- Only works on new layout; old layout does not show folders, so list will be empty
A test is included for each property.
2024-07-04 15:41:14 -04:00
Mike Fährmann
d10bfa9065
[vipergirls] improve 'thread' URL pattern
...
allow for query parameters and fragments at the end of URLs
2024-06-29 17:38:57 +02:00
Mike Fährmann
44896b0296
[instagram] add 'profile' extractor ( #5262 )
...
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2188915210
2024-06-28 22:54:07 +02:00
Mike Fährmann
51fdfbe6fc
[erome] extract 'date' metadata ( #5796 )
2024-06-27 21:07:23 +02:00
Mike Fährmann
8f50c04af2
[formatter] implement 'X' format specifier ( #5770 )
2024-06-21 20:56:19 +02:00
enduser420
5adbfe526d
[tcbscans] support other domains
2024-06-21 21:29:11 +05:30
Mike Fährmann
f58b0e6fc7
[twitter] ignore 'Unavailable' media ( #5736 )
...
… including geo-restricted content.
add 'unavailable' option to allow re-enabling them again
2024-06-21 00:15:10 +02:00
Mike Fährmann
8452d04a33
[fanbox] handle KeyError for no longer existing plans ( #5759 )
...
return the plan of the next higher tier instead
2024-06-21 00:13:25 +02:00
Mike Fährmann
ae3706286a
[speakerdeck] inherit from GalleryExtractor
2024-06-15 21:56:51 +02:00
Mike Fährmann
9c65db2a92
consistent 'with open(…) as fp:' syntax
2024-06-14 01:22:00 +02:00
Mike Fährmann
c6fc0281e8
[newgrounds] extend 'format' option ( #5709 )
...
- check more extensions for original formats (mp4, webm, m4v, mov, mkv)
- allow specifying which extensions and recoded formats to check
2024-06-12 20:46:45 +02:00
Mike Fährmann
86f0c3baaf
[szurubooru] support empty tag searches ( #5711 )
2024-06-11 20:25:06 +02:00
Mike Fährmann
2e11b6e756
[nijie] support downloading videos ( #5707 , #5617 )
2024-06-08 22:55:28 +02:00
Mike Fährmann
f160859c5c
[hitomi] extract 'title_jpn' metadata ( #5706 )
2024-06-08 00:05:19 +02:00
Mike Fährmann
9abeab5ecf
[shimmie2] support 'vidya.pics' ( #5632 )
2024-06-06 15:08:56 +02:00
Mike Fährmann
7614bc458e
[util] extend CustomNone with comparison operators
2024-06-05 16:49:30 +02:00
Mike Fährmann
1ce5de0290
[formatter] implement 'C' format specifier ( #5647 )
...
to apply a conversion after ':' or
to apply multiple conversions
for example {tags:CSl} or {tags:J - /Cl}
to convert list to string and lowercase it
2024-06-05 16:49:29 +02:00
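For illustration only, the effect of {tags:J - /Cl} from 1ce5de0290 in plain Python, without the formatter itself. It assumes that 'J - /' joins the list with " - " and that the 'l' conversion lowercases the result, as the message describes; `join_and_lower` is a hypothetical helper.

```python
def join_and_lower(values, separator=" - "):
    """Join a list into one string, then lowercase it."""
    return separator.join(values).lower()


tags = ["Original", "Long Hair", "Smile"]
print(join_and_lower(tags))  # original - long hair - smile
```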
Mike Fährmann
9b99d2c886
[philomena] support downloading SVG files ( #5643 )
2024-06-05 16:48:51 +02:00
Mike Fährmann
8fce9ea6d5
[hentainexus] restore module ( #5275 )
...
revert 97641cd151
2024-06-05 16:48:25 +02:00
Mike Fährmann
4d11cd9ffb
[vichan] remove 'wikieat.club'
...
redirects to some non-vichan site
2024-06-02 18:16:24 +02:00
Mike Fährmann
31133b97fb
[nitter] remove instances
2024-06-02 18:15:53 +02:00
Mike Fährmann
ce228ee163
[photobucket] remove module
...
had been broken for years and the new site is paid access only
2024-06-02 01:40:31 +02:00
Mike Fährmann
009aa90c3f
[tests] update extractor results
...
at least some of them
2024-06-01 20:28:04 +02:00
Mike Fährmann
020050ea8b
merge #5641 : [pixeldrain] add support for single file album download
2024-05-25 23:43:43 +02:00
Mike Fährmann
154a890399
[pixeldrain] integrate into 'album' extractor
2024-05-25 23:42:23 +02:00
HornyQT
24e70b956b
[pixeldrain] add support for single file album download
2024-05-25 16:06:50 +02:00
Mike Fährmann
0761b22a7f
[hiperdex] update domain to 'hiperdex.top' ( #5635 )
2024-05-24 17:13:10 +02:00
Mike Fährmann
f651b3b6ab
merge #5601 : [twitter] match '/video/' Tweet URLs
2024-05-17 22:49:12 +02:00
Mike Fährmann
7f1ed909d5
[imgur] match gallery/album/image URLs with title slugs ( #5593 )
2024-05-17 22:44:37 +02:00
Delphox
8ba73e2ec9
[twitter] match /video/ tweet urls
2024-05-17 16:50:51 -03:00
Mike Fährmann
2ee9ffeed6
merge #5568 : [furaffinity] match 'xfuraffinity' URLs
2024-05-09 19:20:12 +02:00
Delphox
11109d5bad
[furaffinity] match xfuraffinity.com
2024-05-08 12:15:47 -03:00
Mike Fährmann
699592498b
[tests] use random port number for local HTTP server
...
… and explicitly bind to 127.0.0.1 instead of all interfaces
2024-05-02 22:54:15 +02:00
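One possible way to get the behavior described in 699592498b, sketched with the standard library only. Passing port 0 lets the operating system pick a free port, which is an assumption here; the test suite may choose its port differently. `make_local_server` is a hypothetical name.

```python
import http.server


def make_local_server(handler=http.server.SimpleHTTPRequestHandler):
    """Create an HTTP server bound to the loopback interface only."""
    # "127.0.0.1" instead of "" avoids listening on all interfaces;
    # port 0 asks the OS for any currently unused port
    return http.server.HTTPServer(("127.0.0.1", 0), handler)


server = make_local_server()
host, port = server.server_address
print("bound to http://{}:{}/".format(host, port))
server.server_close()
```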
Mike Fährmann
bd8e4797e5
[vsco] add 'avatar' extractor ( #5341 )
2024-05-02 18:12:19 +02:00
Mike Fährmann
d0cead105b
[formatter] allow dots etc in '…' literals ( #5539 )
...
don't parse fields starting with '
this disables the ability to directly apply […] to '…' literals,
but that's not really useful anyway and can still be done with _lit
2024-05-02 17:43:24 +02:00
Mike Fährmann
8ed70b3256
[tests] mark tests with missing auth as 'only_matching'
...
… instead of skipping them completely
2024-05-01 16:00:07 +02:00
Mike Fährmann
3cf5366143
[mastodon] add support for card images
2024-05-01 16:00:07 +02:00
Mike Fährmann
9b1995dda3
[mastodon] add 'favorite', 'list', and 'hashtag' extractors ( #5529 )
2024-05-01 15:59:34 +02:00
Mike Fährmann
6c57958806
merge #5511 : [twitter] [furaffinity] match fixvx.com and fxfuraffinity/fxraffinity.net URLs
2024-04-25 22:00:19 +02:00
Delphox
1886721d82
update tests
2024-04-25 13:28:30 -03:00
Mike Fährmann
cd241bea0a
[downloader:http] add MIME type and signature for .m4v files ( #5505 )
2024-04-25 01:01:35 +02:00
Mike Fährmann
068ccfe0b3
[tests] allow filtering extractor result tests by URL or comment
...
python test_results.py twitter:+/i/web/
python test_results.py twitter:~twitpic
2024-04-19 23:02:55 +02:00
Mike Fährmann
c9d3b5e5d9
[pixiv] change 'sanity_level' debug message to a warning ( #5180 )
2024-04-19 16:41:31 +02:00
Mike Fährmann
257e9fb435
[gelbooru] improve pagination logic for meta tags ( #5478 )
...
similar to 494acabd38
2024-04-15 23:14:48 +02:00
Mike Fährmann
e02d2ff45d
[tapas] add 'creator' extractor ( #5306 )
2024-04-11 23:41:50 +02:00
Mike Fährmann
35d4a706ae
[pixiv:novel] add 'covers' option ( #5373 )
2024-04-11 22:27:49 +02:00
Mike Fährmann
b57051719f
[wikimedia] support wiki.gg wikis
2024-04-09 19:24:01 +02:00
Mike Fährmann
40c1a8e471
[wikimedia] fix exception for files with empty 'metadata'
2024-04-09 19:12:15 +02:00
Mike Fährmann
0e730ba980
[pp:mtime] do not overwrite '_mtime' for None values ( #5439 )
2024-04-07 02:33:19 +02:00
Mike Fährmann
647a87d17c
[twitter] match '/photo/' Tweet URLs ( #5443 )
...
fixes regression introduced in 40c05535
2024-04-06 17:56:21 +02:00
Mike Fährmann
40bd145637
remove 'contextlib' imports
2024-04-06 16:59:09 +02:00
Mike Fährmann
9a8403917a
restore LD_LIBRARY_PATH for PyInstaller builds ( #5421 )
2024-04-06 16:58:33 +02:00
Mike Fährmann
095e5ded6f
[reddit] support comment embeds ( #5366 )
2024-04-01 23:35:42 +02:00
Mike Fährmann
64948f2c09
[foolfuuka] improve 'board' pattern & support pages ( #5408 )
2024-04-01 22:31:25 +02:00
Mike Fährmann
ef0c90414c
[wikimedia] suppress exception for entries without 'imageinfo' ( #5384 )
2024-03-26 15:33:26 +01:00
Mike Fährmann
9cce461627
[kemonoparty] add 'announcements' option ( #5262 )
...
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2015919188
2024-03-26 15:20:14 +01:00
Mike Fährmann
72ac2c750d
[kemonoparty:favorite] support 'sort' and 'order' query params ( #5375 )
2024-03-26 02:27:36 +01:00
Mike Fährmann
d1d017ab5d
merge #5372 : [twitter] match Tweet URLs with query parameters ( #5371 )
...
fixes regression introduced in 40c05535
2024-03-25 22:01:46 +01:00
fireattack
423599ce95
[twitter] fix pattern for single tweet ( #5371 )
...
- Add optional slash
- Update tests to include some non-standard tweet URLs
2024-03-25 21:57:35 +01:00
Mike Fährmann
15a4bc2584
[kemonoparty] fix KeyError for empty files ( #5368 )
2024-03-24 02:21:38 +01:00
Mike Fährmann
31e7ca73b6
[gelbooru] add 'order-posts' option for favorites ( #5220 )
2024-03-23 13:30:09 +01:00
Mike Fährmann
55e8fdad29
[tests] use 'datetime.timezone.utc' instead of 'datetime.UTC'
...
'datetime.UTC' was added in Python 3.11
and is not defined in older versions.
2024-03-22 18:16:24 +01:00
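A short example of the portable spelling from 55e8fdad29. 'datetime.timezone.utc' is available in every Python 3 release, while the 'datetime.UTC' alias only exists from 3.11 on; `utc_now` is a hypothetical helper.

```python
from datetime import datetime, timezone


def utc_now():
    """Return the current time as a timezone-aware UTC datetime."""
    # 'timezone.utc' works on old and new Python versions alike
    return datetime.now(timezone.utc)


print(utc_now().isoformat())
```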
Mike Fährmann
4b6f47e571
[pornhub:gif] extract 'viewkey' and 'timestamp' metadata ( #4463 )
...
https://github.com/mikf/gallery-dl/issues/4463#issuecomment-2014550302
2024-03-22 18:00:20 +01:00
Mike Fährmann
7a7dc442a0
[tests] update extractor results
2024-03-22 17:57:04 +01:00
Mike Fährmann
1d6260f456
[bunkr] remove 'description' metadata
...
album descriptions are no longer available on album pages
and the previous code erroneously returned just '0'
2024-03-22 02:14:41 +01:00
Mike Fährmann
32262a048b
[idolcomplex] fix metadata extraction
...
- replace legacy 'id' values with alphanumeric ones, since the former are
no longer available
- approximate 'vote_average', since the real value is no longer
available
- fix 'vote_count'
2024-03-22 01:43:05 +01:00
Mike Fährmann
ddb2edfd32
[formatter] fix local DST datetime offsets for ':O'
...
'O' would get the *current* local UTC offset and apply it to all
'datetime' objects it gets applied to.
This would result in a wrong offset if the current offset includes
DST and the target 'datetime' does not or vice-versa.
'O' now determines the correct local UTC offset while respecting DST for
each individual 'datetime'.
2024-03-21 20:45:46 +01:00
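A sketch of the idea behind ddb2edfd32, assuming naive 'datetime' objects that represent UTC; the formatter's real implementation may differ. The local offset is looked up for each value separately, so a winter and a summer date can yield different results in a timezone with DST. `local_utc_offset` is a hypothetical name.

```python
from datetime import datetime, timezone


def local_utc_offset(dt):
    """Return the local UTC offset in effect at the naive UTC datetime 'dt'."""
    aware = dt.replace(tzinfo=timezone.utc)
    # astimezone() without arguments converts to the system's local
    # timezone and applies the DST rules valid at that moment
    return aware.astimezone().utcoffset()


winter = datetime(2024, 1, 15, 12, 0)
summer = datetime(2024, 7, 15, 12, 0)
print(local_utc_offset(winter), local_utc_offset(summer))
```

The printed values depend on the timezone of the machine running the code.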
Mike Fährmann
da6ba60331
[bluesky] add 'instance' metadata field ( #4438 )
2024-03-18 17:36:16 +01:00
Mike Fährmann
718c870430
[tests] show full path for nested values
...
'user.name' instead of just 'name' when testing for
"user": { … , "name": "…", … }
2024-03-18 17:36:16 +01:00
Mike Fährmann
26bc2d55f4
[hiperdex] update URL patterns & fix 'manga' metadata ( #5340 )
2024-03-18 17:36:16 +01:00
Mike Fährmann
8e694d85c4
[twitter] add 'birdwatch' metadata field ( #5317 )
...
should probably get a better name,
but this is what it's called internally by Twitter
2024-03-18 17:36:02 +01:00
Mike Fährmann
b8e7be225c
merge #5333 : [imagefap] fix folder extractor
2024-03-15 23:46:43 +01:00
Herp
99c53f7fa8
Fix imagefap extractor
2024-03-15 23:44:25 +01:00
Mike Fährmann
1418c0ce38
[kemonoparty] add 'revision_count' metadata field ( #5334 )
2024-03-15 22:28:15 +01:00
Mike Fährmann
5716430c35
[deviantart:stash] recognize 'deviantart.com/stash/…' URLs
2024-03-15 18:14:55 +01:00
Mike Fährmann
76683c5f5c
[deviantart:stash] fix 'index' metadata ( #5335 )
2024-03-15 18:10:59 +01:00
Mike Fährmann
108abab537
[twitter] add 'protected' metadata field ( #5327 )
...
for 'author' and 'user'
2024-03-13 14:46:03 +01:00
blankie
225d849139
[mastodon] fix handling null 'moved' account field
2024-03-12 11:44:25 +11:00
Mike Fährmann
ac4e29f70a
[lensdump] support more direct link formats ( #5293 )
2024-03-09 23:33:58 +01:00
Mike Fährmann
d3003f8531
merge #5270 : [imagefap] add 'folder' metadata
2024-03-07 01:31:40 +01:00
Mike Fährmann
05331f9cf1
[imagefap] flake8, cleanup, tests
2024-03-07 01:29:19 +01:00
Mike Fährmann
40c0553523
[twitter] add 'quotes' extractor ( #5262 )
...
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-1981571924
It's implemented as a search for 'quoted_tweet_id:…' on Twitter.
2024-03-07 00:52:50 +01:00
Mike Fährmann
790c0ffb8d
[lensdump] recognize direct image links ( #5293 )
2024-03-06 22:56:57 +01:00
Mike Fährmann
6d9e3c0eb1
[skeb] add extractor for followed users ( #5290 )
...
needs 'Authorization' header from browser session
-o headers.Authorization="Bearer ey…"
2024-03-06 22:43:01 +01:00
Mike Fährmann
ace16f00f5
[weibo] fix retweets ( #2825 , #3874 , #5263 )
...
- handle 快转 retweets
- disable 'retweets' by default
- skip all retweet media when 'retweets' are disabled
- extract all retweet media when 'retweets' is set to "original"
2024-03-06 19:36:53 +01:00
Mike Fährmann
0676a9d6ec
[weibo] fix 'livephoto' filename extensions ( #5287 )
2024-03-06 19:36:32 +01:00
Mike Fährmann
a8027745e3
[downloader:http] add MIME type and signature for .mov files ( #5287 )
2024-03-06 14:00:24 +01:00
Mike Fährmann
24873c2724
[warosu] fix crash for threads with deleted posts ( #5289 )
2024-03-06 01:27:45 +01:00
Mike Fährmann
f296067797
[naver] unescape post 'title' and 'description'
2024-03-06 00:46:19 +01:00
Mike Fährmann
a71cdab53e
merge #5126 : [naver] fix EUC-KR encoding issue in old image URLs
2024-03-06 00:22:33 +01:00
Mike Fährmann
a8d3efbb99
[naver] simplify code + add test
2024-03-06 00:21:23 +01:00
Mike Fährmann
7b28418f69
[naver] recognize '.naver' URLs
...
https://blog.naver.com/PostView.naver?…
2024-03-05 22:30:29 +01:00
Mike Fährmann
a767832332
[deviantart:avatar] ignore default avatars ( #5276 )
2024-03-04 23:11:30 +01:00
Mike Fährmann
6482bbc525
[bluesky] handle different 'embed' structure
2024-03-03 20:41:01 +01:00
Mike Fährmann
1a9b9aa310
[artstation] support video clips ( #2566 , #3309 , #3911 )
...
- add 'videos' and 'previews' options
- fix 403 errors for video previews
2024-03-03 18:00:45 +01:00
Mike Fährmann
25d2854272
[deviantart] add 'comments-avatars' option ( #4995 )
2024-03-02 21:59:16 +01:00
Mike Fährmann
cf9e99c07b
[artstation] support collections ( #146 )
...
https://github.com/mikf/gallery-dl/issues/146#issuecomment-1972101003
2024-03-01 20:21:21 +01:00
Mike Fährmann
32ec695195
merge #5256 : [wikimedia] add azurlane.koumakan.jp
2024-02-29 21:50:24 +01:00
Mike Fährmann
5d9ec91896
[azurlanewiki] supportedsites + test
2024-02-29 21:49:13 +01:00
Mike Fährmann
76581c13f7
handle URLs without '/' after their TLD ( #5252 )
2024-02-29 15:05:46 +01:00
Mike Fährmann
ba062712ad
[tests] '__main__' -> "__main__"
2024-02-27 02:10:05 +01:00
Mike Fährmann
8a11b72253
remove extractor/test.py ( #4504 )
2024-02-27 01:37:57 +01:00
Mike Fährmann
fde9e25c9f
[tests:kemonoparty] '.party' -> '.su'
2024-02-26 22:25:04 +01:00
Mike Fährmann
13443f40a3
[xvideos] support '/channels/' URLs ( #5244 )
2024-02-26 00:08:37 +01:00
Mike Fährmann
cc6b9e4c18
[zerochan] use API by default ( #3669 )
...
add 'pagination' option
2024-02-25 00:36:14 +01:00
blankie
962f55cc68
[artstation] fix handling usernames with dashes
2024-02-21 17:39:37 +11:00
Mike Fährmann
741fd00cec
[deviantart] extend 'metadata' option ( #5175 )
...
allow fetching extended metadata in addition to the usual
'description', 'tags', etc by setting 'metadata' to a list of
'camera', 'stats', 'submission', 'collection', and 'gallery'
for example "metadata": "stats,submission"
2024-02-18 23:14:14 +01:00
Mike Fährmann
8a63801311
[vsco] add 'spaces' extractor ( #5202 )
...
for spaces listed on a user page
2024-02-17 18:20:48 +01:00
Mike Fährmann
ccb413df71
[wikimedia] support 'pidgi.net' and 'bulbapedia.bulbagarden.net' ( #5205 , #5206 )
2024-02-17 17:35:10 +01:00
Mike Fährmann
7033cc14e9
[vsco] add 'space' extractor ( #5202 )
2024-02-17 01:54:05 +01:00
Mike Fährmann
c9efccc959
[tests] update extractor results
2024-02-16 22:42:06 +01:00
Mike Fährmann
c413834dfc
[bluesky] extend tests
2024-02-16 16:30:02 +01:00
Mike Fährmann
24c1317e0d
[batoto] fix crash when manga/chapter contains a '-' ( #5200 )
2024-02-16 00:10:08 +01:00
Mike Fährmann
0abd9723af
[bluesky] add 'metadata' option ( #4438 )
...
allow extracting 'user' metadata and
make 'facets' extraction optional
2024-02-15 23:30:16 +01:00
Mike Fährmann
c97b92cc35
[fanbox] add 'home' and 'supporting' extractors ( #5138 )
2024-02-14 23:25:39 +01:00
Mike Fährmann
04e4ffc64c
[deviantart] combine 'png' option with 'quality' ( #4846 )
...
"quality": "png" to download PNGs instead of JPEGs
2024-02-14 22:07:29 +01:00
Mike Fährmann
9cc4ec2c58
[deviantart] add 'png' option ( #4846 )
2024-02-14 01:03:15 +01:00
Mike Fährmann
1d1ffe3317
[pornpics] update 'channel' extraction & add test
...
change 'channel' to a list, since extracting both 'channel' and
'channels' does not really work with text.extract_from()
2024-02-13 23:48:46 +01:00
Mike Fährmann
139ff3f6ab
[kemonoparty] add 'posts' extractor ( #5194 )
2024-02-13 15:41:34 +01:00
Mike Fährmann
814ad9321e
[deviantart] skip locked/blurred posts ( #4567 , #5193 )
2024-02-13 14:15:12 +01:00
Mike Fährmann
f7f8ef8684
[twitter] support communities ( #4913 )
2024-02-13 01:30:23 +01:00
Mike Fährmann
8f27f43d4d
[tests] implement explicitly disabling auth
2024-02-13 00:08:27 +01:00
Mike Fährmann
06cb518d97
[bunkr] fix extraction ( #5088 , #5151 , #5153 )
...
- remove legacy code
- map legacy domains to bunkr.sk
- use input URL domain for newer domains
- update tests (some files got slightly modified or deleted)
2024-02-11 22:36:03 +01:00
Mike Fährmann
4641937ca3
[imagetwist] add 'gallery' extractor ( #5190 )
2024-02-11 18:41:02 +01:00
Mike Fährmann
fde82ab0ce
[imagechest] add 'user' extractor ( #5143 )
2024-02-11 18:38:33 +01:00
Mike Fährmann
4474cea31b
merge #5187 : [skeb] add 'num' and 'count' metadata fields
2024-02-10 19:36:59 +01:00
Mike Fährmann
4cfceb23cb
[skeb] rename 'data' -> 'file' & add tests
2024-02-10 19:35:50 +01:00
Mike Fährmann
c83d0a1596
[weibo] add 'gifs' option ( #5183 )
2024-02-10 18:17:07 +01:00
Mike Fährmann
af61d2b037
[wikimedia] combine most wikimedia.org sites ( #1443 )
...
add wikidata.org and wikivoyage.org
2024-02-10 03:00:58 +01:00
Mike Fährmann
c7d17f1111
[bluesky] extract 'hashtags', 'mentions', and 'uris' metadata ( #4438 )
2024-02-10 00:01:55 +01:00
Mike Fährmann
55bbd49a0e
[bluesky] download images in original resolution ( #4438 )
...
at least up to 2000 px
2024-02-09 21:33:33 +01:00
Mike Fährmann
aee5580c62
[idolcomplex] extract 'id_alnum' metadata ( #5171 )
2024-02-08 18:29:54 +01:00
Mike Fährmann
cf7d6be2d4
[bluesky] initial support ( #4438 , #4708 , #4722 , #5047 )
2024-02-07 19:09:33 +01:00
Mike Fährmann
6ef143ea31
[idolcomplex] support alphanumeric post IDs ( #5171 )
2024-02-07 14:57:13 +01:00
Mike Fährmann
d7823b9f81
[pinterest] fix section URLs for boards with /?# in name ( #5104 )
2024-02-05 15:54:06 +01:00
Mike Fährmann
de752eb7b1
[naverwebtoon] support '/webtoon/' paths for all comics ( #5123 )
2024-02-04 21:38:46 +01:00
Mike Fährmann
22647c2626
[naverwebtoon] fix 'title' for comics with empty tags ( #5120 )
2024-01-27 16:24:03 +01:00
Mike Fährmann
1f7101d606
[archivedmoe] fix thebarchive webm URLs ( #5116 )
2024-01-27 00:24:41 +01:00
Mike Fährmann
34a4ddc399
[sankaku] add 'id-format' option ( #5073 )
2024-01-26 17:56:08 +01:00
Mike Fährmann
afd20ef42c
[kemonoparty] implement filtering duplicate revisions ( #5013 )
...
set 'revisions' to '"unique"' to have it ignore duplicate revisions
2024-01-26 14:44:15 +01:00
Mike Fährmann
c28475d325
[kemonoparty] fix deleting 'name' in original objects ( #5103 )
...
... when computing 'revision_hash'
regression caused by 3d68eda4
dict.copy() only creates a shallow copy
I know that and still managed to get it wrong ...
2024-01-25 23:46:19 +01:00
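The pitfall named in c28475d325, reduced to a small example. The post layout and the helper names are hypothetical; only the 'name' key comes from the commit message. dict.copy() duplicates the outer dict but keeps references to nested objects, so deleting a key in a nested dict also changes the original.

```python
import copy


def strip_name_shallow(post):
    rev = post.copy()               # outer dict copied, 'file' still shared
    rev["file"].pop("name", None)   # also removes 'name' from post["file"]
    return rev


def strip_name_deep(post):
    rev = copy.deepcopy(post)       # nested dicts are copied as well
    rev["file"].pop("name", None)   # original stays untouched
    return rev


post = {"title": "example", "file": {"name": "a.png", "path": "/data/a.png"}}
strip_name_deep(post)
print("name" in post["file"])  # True
strip_name_shallow(post)
print("name" in post["file"])  # False
```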
Mike Fährmann
beacfa7436
[bunkr] update domain to 'bunkr.sk' ( #5114 )
2024-01-25 23:45:41 +01:00
Mike Fährmann
0d3af0d35b
[tests] ignore 'ytdl' categories when import fails ( #5095 )
2024-01-21 15:31:12 +01:00
Mike Fährmann
f3ad91b44f
[bunkr] update domain ( #5088 )
2024-01-21 03:00:57 +01:00
Mike Fährmann
c7a42880ab
[wikimedia] support fandom wikis ( #1443 , #2677 , #3378 )
...
Wikis hosted on fandom.com are just wikimedia instances
and support its API.
2024-01-21 00:52:02 +01:00
blankie
df718887c2
[webtoons] fix extracting comic and episode name with commas
2024-01-21 09:50:27 +11:00
Mike Fährmann
0d367ce1b9
[tests] update extractor results
2024-01-20 18:02:36 +01:00
Mike Fährmann
9ca6117c67
[hbrowse] remove module
...
website gone
2024-01-20 02:53:44 +01:00
Mike Fährmann
375eefb886
[chevereto] remove 'pixl.li'
...
"Pixl is closing down"
"All images will be deleted January 1st."
2024-01-20 02:21:40 +01:00
Mike Fährmann
b0a441f1e3
[nitter] remove 'nitter.lacontrevoie.fr'
...
"Fermeture de Nitter / Closing down Nitter"
2024-01-19 19:34:16 +01:00
Mike Fährmann
a1c1e80f67
[giantessbooru] update domain
2024-01-19 14:21:56 +01:00
Mike Fährmann
2007cb2f59
[tests] check extractor category values
2024-01-19 14:21:09 +01:00
Mike Fährmann
93b4120e77
[gelbooru] support 'all' and empty tag ( #5076 )
2024-01-18 21:49:33 +01:00
Mike Fährmann
a416d4c3d5
[sankaku] support post URLs with alphanumeric IDs ( #5073 )
2024-01-18 16:23:14 +01:00
Mike Fährmann
ea553a1d55
[wikimedia] generalize ( #1443 )
...
- support mediawiki.org
- support mariowiki.com (#3660 )
- combine code into a single extractor
(use prefix as subcategory)
- handle non-wiki instances
- unescape titles
2024-01-18 15:36:16 +01:00
Mike Fährmann
c3c1635ef3
[wikimedia] update
...
- rewrite using BaseExtractor
- support most Wiki* domains
- update docs/supportedsites
- add tests
2024-01-17 00:08:06 +01:00
Mike Fährmann
3d68eda4ab
[kemonoparty] add 'revision_hash' metadata ( #4706 , #4727 , #5013 )
...
A SHA1 hexdigest of other relevant metadata fields like
title, content, file and attachment URLs.
This value does NOT reflect which revisions are listed on the website.
Neither does 'edited' or any other metadata field (combinations).
2024-01-16 00:38:10 +01:00
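A rough sketch of how a hash like the one described in 3d68eda4ab could be computed. The field selection, the JSON serialization, and the name `revision_hash` as a function are assumptions for illustration, not the extractor's actual code.

```python
import hashlib
import json


def revision_hash(post, fields=("title", "content", "file", "attachments")):
    """Return a SHA1 hexdigest over a selected subset of a post's fields."""
    relevant = {key: post.get(key) for key in fields}
    # sorted keys and fixed separators make the serialization stable
    payload = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


post = {"title": "example", "content": "text", "edited": "2024-01-01"}
print(revision_hash(post))
```

Fields outside the selection, such as 'edited' in this example, do not influence the result, so two revisions with identical content get the same hash.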
Mike Fährmann
799a8206ad
merge #5061 : [webtoons] extract more metadata
...
- author_name
- comic_name
- episode_name
- username
2024-01-15 18:27:12 +01:00
Mike Fährmann
8ffa0cd3c8
[webtoons] small optimization
...
don't extract the entire 'author_area' and
avoid creating a second 'text.extract_from()' object
2024-01-15 18:24:47 +01:00
Mike Fährmann
68196589c4
[2ch] update
...
- simplify extractor code
- more metadata
- add tests
2024-01-15 04:09:05 +01:00
Mike Fährmann
69726fc82c
[tests] skip tests requiring auth when none is provided
2024-01-14 22:47:16 +01:00