Mike Fährmann
f7246f025f
[weibo] simplify 'livephoto' extraction (#6471)
...
continuation of 396b52aef7
fixes wrong 'filename' and 'extension' values
when 'ssig' query parameter contains "%2F"
2024-11-16 08:19:02 +01:00
Mike Fährmann
cb09273670
[koharu] implement 'tags' option
2024-11-15 23:49:58 +01:00
Mike Fährmann
ddd325b435
merge #6432: [koharu] update domain (#6430)
2024-11-15 22:41:46 +01:00
Mike Fährmann
e5c2882320
[koharu] cleanup
...
- update BASE_PATTERN formatting
- fix groups indices
- add tests for new domains
- update docs/supportedsites
2024-11-15 22:41:40 +01:00
K0ng2
a09d9edaa6
[koharu] update root and root_api
2024-11-15 22:14:33 +01:00
Mike Fährmann
0d1469f229
[exhentai] implement 'tags' option (#2117)
...
allow splitting tags into categories,
e.g. 'tags_parody', 'tags_group', etc.
2024-11-15 21:47:13 +01:00
Mike Fährmann
c82f3db098
[common] add 'proxy-env' option
...
(#6134, #6455)
disable using environment proxies by default
2024-11-15 18:03:56 +01:00
Mike Fährmann
0a72a5009c
[common] disable Authorization header injection from .netrc auth
...
(#6134, #6455)
2024-11-15 17:37:04 +01:00
Mike Fährmann
a3dbc58172
[pillowfort] provide 'count' metadata field (#6478)
2024-11-15 08:27:52 +01:00
Mike Fährmann
9821503226
[misc] 'api_root' -> 'root_api'
2024-11-14 23:44:15 +01:00
Mike Fährmann
e763efd36c
[bilibili] add workarounds for getting rate-limited (#6443)
...
- set 3-6 second request_interval by default
- retry request after waiting 5 minutes
2024-11-14 23:06:26 +01:00
Mike Fährmann
cfe24a9e31
[twitter] make 'source' metadata extraction non-fatal (#6472)
2024-11-14 18:59:01 +01:00
Mike Fährmann
396b52aef7
[weibo] fix livephoto 'filename' & 'extension' (#6471)
2024-11-14 18:56:18 +01:00
Mike Fährmann
a3276e3b5d
[hentaifoundry] add 'tag' extractor (#6465)
2024-11-13 20:56:37 +01:00
Mike Fährmann
b62c466c14
[flickr] fix video download URLs (#6464)
...
continuation of 0e18fa395d
fix video detection in '_file_url'
2024-11-13 20:56:37 +01:00
Mike Fährmann
2b96d638dc
[bunkr] support 'bunkr.cr' URLs
2024-11-10 20:43:33 +01:00
Mike Fährmann
096b9f1d26
[bunkr] fix album names containing <>&
...
unescaping HTML entities once is not good enough
2024-11-10 20:38:21 +01:00
Mike Fährmann
c61c0461a9
[urlgalleries] fix 'root' and update 'request_interval'
2024-11-10 20:28:55 +01:00
Mike Fährmann
73d6e56a8f
merge #6443: [bilibili] add support for articles (#2824)
2024-11-10 18:01:51 +01:00
Mike Fährmann
82d561e825
[bilibili] update
...
- use self.groups[…] to access matched values
- extract more metadata (count, width, height, size)
- remove type hint
- add tests
- update docs/supportedsites
2024-11-10 17:59:24 +01:00
hdk5
fc59e0fb14
[bilibili] support large articles
2024-11-10 15:18:03 +02:00
Mike Fährmann
74f1e9a1ac
[poipiku] return 'count' as proper number (#6445)
2024-11-10 08:26:43 +01:00
hdk5
6eef3e3495
[bilibili] initial support (#2824)
2024-11-10 00:21:27 +02:00
Mike Fährmann
7916c8bf77
allow passing cookies to OAuth extractors
...
partially revert ce54b8c04c
2024-11-09 18:06:27 +01:00
Mike Fährmann
0e18fa395d
[flickr] use "download" URLs (#6360)
2024-11-09 17:33:27 +01:00
Mike Fährmann
1ddbcda58b
[nhentai] support '.webp' files (#6442)
2024-11-08 17:46:38 +01:00
Mike Fährmann
b6cf348658
[webtoons] extract 'episode_no' for comic results (#6439)
2024-11-08 14:19:17 +01:00
Mike Fährmann
77f761d320
merge #6437: [philomena:ponybooru] switch default filter
...
… to get everything by default
2024-11-08 08:20:10 +01:00
Mike Fährmann
6205e255f4
merge #6394: [tumblr] add 'search' extractor
2024-11-08 08:17:46 +01:00
Mike Fährmann
33778d35ba
[tumblr] update
...
- simplify
- fix search pagination
- support custom search mode and post types
2024-11-08 08:15:13 +01:00
Shelvacu
f8e707b92c
[philomena] switch default ponybooru filter to get everything by default
...
The system filter mislabeled "Everything" hides 4 tags https://ponybooru.org/filters/2
There are [many public filters that don't hide anything](https://ponybooru.org/filters?fq=spoilered_count%3A0%2C+hidden_count%3A0), I just picked [the oldest one](https://ponybooru.org/filters/3).
2024-11-07 20:08:42 -08:00
Mike Fährmann
ce90566c56
[pinterest] detect video/audio by block content (#6421)
...
story blocks from search/board results do not always contain a 'type'
2024-11-05 15:55:24 +01:00
Mike Fährmann
a9a9f3a180
[pinterest] support 'story_pin_music_block' blocks (#6421)
2024-11-05 15:55:24 +01:00
Mike Fährmann
0b3ddd01af
[hiperdex] update domain to 'hipertoon.com' (#6420)
...
and fix 'description' extraction
2024-11-05 15:54:42 +01:00
Mike Fährmann
9afbe91f82
[rule34xyz] add 'format' option (#1078)
2024-11-05 15:45:52 +01:00
Mike Fährmann
51b16d078b
[rule34xyz] ensure 'files' keys are strings (#1078)
...
fixes -K/--list-keywords
2024-11-05 09:34:17 +01:00
Mike Fährmann
390b8ddd3e
[common] emit logging messages for --write-pages files
2024-11-03 20:38:33 +01:00
Mike Fährmann
cb0d8cae77
merge #6227: [everia] add support (#1067, #2472, #4091)
2024-11-03 17:52:17 +01:00
Mike Fährmann
cea062ffc5
[everia] update
...
- implement general _pagination method
- simplify code
- adjust URL patterns
- update test results
2024-11-03 17:51:04 +01:00
missionfloyd
d31a3b5da3
[everia.club] Add support
...
- Unescape title and URL
- Add tags and categories metadata
- Lookup tag id with API instead of downloading tag page
- Add category extractor
- Add tests
- Rename EveriaExtractor to EveriaPostExtractor
- Fix EveriaPostExtractor example
- Lookup tags/categories by post id
- Add date extractor
- Remove leftover pages parameter
- Add error handling for invalid dates.
- Add filename numbering
- Parse date
- Rename extract() to images()
- Remove html import
- Fix search/date URLs with page number
- Fix tag/category search
- Fix post extractor
- Fix tag, category extractors
- Fix search extractor
- Only load first page once
- Fix date extractor
- Fix tests
- Clean up search extractor
2024-11-03 14:09:07 +01:00
Mike Fährmann
9b59af8d8d
[instagram] fix using numeric cursor values (#6414)
2024-11-03 12:03:01 +01:00
Mike Fährmann
d787c0c4ea
[rule34xyz] add support (#1078, #4960)
2024-11-03 10:12:26 +01:00
Mike Fährmann
7c0d2ca07d
[rule34vault] update
...
- implement 'tags' categorization
- don't use 'totalCount' for pagination end
- update tests
2024-11-03 09:59:25 +01:00
Mike Fährmann
d5fa1d6aba
[sankaku] improve tag categorization code
...
translate tag type ID to name for each category
instead of for each tag
2024-11-03 09:21:39 +01:00
Delphox
565dc5b43b
[bluesky] match fxbsky.app and vxbsky.app
2024-11-02 16:00:43 -03:00
Mike Fährmann
93adfbe935
merge #6410: [bluesky] match common bluesky embed fixes
2024-11-02 18:28:07 +01:00
Mike Fährmann
cd47425ccc
[bluesky] fix downloads from non-bsky PDSs (#6406)
2024-11-02 18:22:34 +01:00
Mike Fährmann
9deed87340
[bluesky] add 'author["instance"]' metadata (#4438)
2024-11-02 17:37:11 +01:00
Delphox
80c7246732
[bluesky] match cbsky.app, bskye.app, bskyx.app and bsyy.app urls
2024-11-02 13:04:32 -03:00
Mike Fährmann
99fe2b1f55
[bluesky] support 'main.bsky.dev' URLs (#4438)
2024-11-02 15:33:31 +01:00