Mike Fährmann
9cc4ec2c58
[deviantart] add 'png' option ( #4846 )
2024-02-14 01:03:15 +01:00
Mike Fährmann
966c8608e6
[deviantart] move image content extraction into separate function
2024-02-14 00:30:06 +01:00
Mike Fährmann
1d1ffe3317
[pornpics] update 'channel' extraction & add test
change 'channel' to a list, since extracting both 'channel' and
'channels' does not really work with text.extract_from()
2024-02-13 23:48:46 +01:00
cc1234
32472d7d6c
Add support for multiple channels
2024-02-13 18:34:04 +00:00
Mike Fährmann
139ff3f6ab
[kemonoparty] add 'posts' extractor ( #5194 )
2024-02-13 15:41:34 +01:00
Mike Fährmann
814ad9321e
[deviantart] skip locked/blurred posts ( #4567 , #5193 )
2024-02-13 14:15:12 +01:00
Mike Fährmann
f7f8ef8684
[twitter] support communities ( #4913 )
2024-02-13 01:30:23 +01:00
Mike Fährmann
cae77e85f8
[twitter] update query hashes
... as well as 'variables' and 'features' values
also remove unused legacy API code
2024-02-12 23:19:13 +01:00
Mike Fährmann
06cb518d97
[bunkr] fix extraction ( #5088 , #5151 , #5153 )
- remove legacy code
- map legacy domains to bunkr.sk
- use input URL domain for newer domains
- update tests (some files got slightly modified or deleted)
2024-02-11 22:36:03 +01:00
Mike Fährmann
dcc6e3f65c
merge #5134 : [bunkr] add new bunkr domains ( #5130 )
2024-02-11 21:10:06 +01:00
Mike Fährmann
4641937ca3
[imagetwist] add 'gallery' extractor ( #5190 )
2024-02-11 18:41:02 +01:00
Mike Fährmann
fde82ab0ce
[imagechest] add 'user' extractor ( #5143 )
2024-02-11 18:38:33 +01:00
Mike Fährmann
4474cea31b
merge #5187 : [skeb] add 'num' and 'count' metadata fields
2024-02-10 19:36:59 +01:00
Mike Fährmann
4cfceb23cb
[skeb] rename 'data' -> 'file' & add tests
2024-02-10 19:35:50 +01:00
Mike Fährmann
c83d0a1596
[weibo] add 'gifs' option ( #5183 )
2024-02-10 18:17:07 +01:00
blankie
f9a8e8cacf
[skeb] add 'num' and 'count' metadata fields
2024-02-10 21:51:23 +11:00
Mike Fährmann
af61d2b037
[wikimedia] combine most wikimedia.org sites ( #1443 )
add wikidata.org and wikivoyage.org
2024-02-10 03:00:58 +01:00
Mike Fährmann
c7d17f1111
[bluesky] extract 'hashtags', 'mentions', and 'uris' metadata ( #4438 )
2024-02-10 00:01:55 +01:00
Mike Fährmann
55bbd49a0e
[bluesky] download images in original resolution ( #4438 )
at least up to 2000 px
2024-02-09 21:33:33 +01:00
Mike Fährmann
6414dc6bca
[idolcomplex] fix pagination for tags containing ':' ( #5171 )
2024-02-09 17:51:08 +01:00
Mike Fährmann
5c2a2321a2
[bluesky] update refresh token after using it ( #4438 )
2024-02-08 22:33:34 +01:00
Mike Fährmann
9c10be54fb
[bluesky] add 'following' extractor ( #4438 )
2024-02-08 21:58:17 +01:00
Mike Fährmann
86ce35d6a1
[bluesky] simplify 'pattern'
2024-02-08 21:28:21 +01:00
Mike Fährmann
da292ded4e
[bluesky] add 'list' extractor ( #4438 )
2024-02-08 21:24:07 +01:00
Mike Fährmann
004bf7bb38
[bluesky] add 'feed' extractor ( #4438 )
2024-02-08 21:01:44 +01:00
Mike Fährmann
6aea818d4e
[bluesky] allow using DIDs as user handles ( #4438 )
2024-02-08 20:15:54 +01:00
Mike Fährmann
aee5580c62
[idolcomplex] extract 'id_alnum' metadata ( #5171 )
2024-02-08 18:29:54 +01:00
Mike Fährmann
cf7d6be2d4
[bluesky] initial support ( #4438 , #4708 , #4722 , #5047 )
2024-02-07 19:09:33 +01:00
Mike Fährmann
6ef143ea31
[idolcomplex] support alphanumeric post IDs ( #5171 )
2024-02-07 14:57:13 +01:00
Mike Fährmann
6e928300bc
[flickr] handle non-JSON errors ( #5131 )
2024-02-06 21:22:10 +01:00
Mike Fährmann
90ac6d7375
[wikimedia] use '/api.php' as default API path
2024-02-06 00:36:51 +01:00
Mike Fährmann
d7823b9f81
[pinterest] fix section URLs for boards with /?# in name ( #5104 )
2024-02-05 15:54:06 +01:00
Mike Fährmann
de752eb7b1
[naverwebtoon] support '/webtoon/' paths for all comics ( #5123 )
2024-02-04 21:38:46 +01:00
Jeff Mercado
d9d0601ab1
break up line to fit 80 char
2024-01-29 20:31:58 -08:00
Jeff Mercado
6bcd3c9380
[bunkr] add new bunkr domains ( #5130 )
2024-01-29 20:25:33 -08:00
Mike Fährmann
62d6f5f8d2
[luscious] fix IndexError for files without thumbnail ( #5122 )
2024-01-28 01:43:29 +01:00
Mike Fährmann
22647c2626
[naverwebtoon] fix 'title' for comics with empty tags ( #5120 )
2024-01-27 16:24:03 +01:00
Mike Fährmann
3433481dd2
[gofile] update 'website_token' extraction
2024-01-27 01:10:14 +01:00
Mike Fährmann
1f7101d606
[archivedmoe] fix thebarchive webm URLs ( #5116 )
2024-01-27 00:24:41 +01:00
Mike Fährmann
34a4ddc399
[sankaku] add 'id-format' option ( #5073 )
2024-01-26 17:56:08 +01:00
Mike Fährmann
afd20ef42c
[kemonoparty] implement filtering duplicate revisions ( #5013 )
set 'revisions' to '"unique"' to have it ignore duplicate revisions
2024-01-26 14:44:15 +01:00
Mike Fährmann
c28475d325
[kemonoparty] fix deleting 'name' in original objects ( #5103 )
... when computing 'revision_hash'
regression caused by 3d68eda4
dict.copy() only creates a shallow copy
I know that and still managed to get it wrong ...
2024-01-25 23:46:19 +01:00
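The shallow-copy pitfall named in the commit above can be reproduced with a minimal sketch. The post layout and both helper functions here are hypothetical illustrations, not the extractor's actual code; only dict.copy() and copy.deepcopy() are real standard-library calls.

```python
import copy


def strip_name_shallow(post):
    """Buggy variant: dict.copy() duplicates only the outer dict."""
    rev = post.copy()
    # 'rev["file"]' is the very same nested dict as 'post["file"]',
    # so this deletion also affects the caller's object.
    rev["file"].pop("name", None)
    return rev


def strip_name_deep(post):
    """Safe variant: copy.deepcopy() duplicates nested objects as well."""
    rev = copy.deepcopy(post)
    rev["file"].pop("name", None)
    return rev


# Hypothetical post object, for illustration only
post = {"id": 1, "file": {"name": "a.png", "path": "/data/a.png"}}

strip_name_deep(post)
print("name" in post["file"])   # original is untouched by the deep copy

strip_name_shallow(post)
print("name" in post["file"])   # original lost its 'name' entry
```

Copying only the nested dict that is about to be modified would avoid the cost of a full deep copy, at the price of having to know which parts get mutated.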
Mike Fährmann
beacfa7436
[bunkr] update domain to 'bunkr.sk' ( #5114 )
2024-01-25 23:45:41 +01:00
Mike Fährmann
67c99b1366
[patreon] prevent HttpError for stream.mux.com URLs
2024-01-21 22:50:40 +01:00
Mike Fährmann
f3ad91b44f
[bunkr] update domain ( #5088 )
2024-01-21 03:00:57 +01:00
Mike Fährmann
c7a42880ab
[wikimedia] support fandom wikis ( #1443 , #2677 , #3378 )
Wikis hosted on fandom.com are just MediaWiki instances
and support its API.
2024-01-21 00:52:02 +01:00
Mike Fährmann
5bf156f0b1
merge #5094 : [webtoons] fix extracting comic and episode name with commas
2024-01-21 00:47:26 +01:00
blankie
df718887c2
[webtoons] fix extracting comic and episode name with commas
2024-01-21 09:50:27 +11:00
Wiiplay123
6eb62f2140
Combine lh*(-**).googleusercontent.com URL regex into one line.
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2024-01-20 15:53:11 -06:00
Wiiplay123
a6fed628dd
[blogger] Fix lh*.googleusercontent.com forward slash bug, add support for lh*-**.googleusercontent.com
Some URLs use "lh(number)-(locale).googleusercontent.com" format, so I added support for those.
Also, "lh(number).googleusercontent.com" formats were broken because the regex was looking for a second forward slash.
Examples:
lh7.googleusercontent.com
lh7-us.googleusercontent.com
2024-01-20 15:07:52 -06:00
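The two host formats described in the commit above can be matched by a single expression with an optional locale suffix. This is only a sketch: the pattern and the helper name are assumptions made for illustration, not the regex used in the blogger extractor.

```python
import re

# Hypothetical pattern: 'lh', a number, an optional '-<locale>' suffix,
# then the fixed domain. Not the extractor's actual expression.
HOST_RE = re.compile(r"lh\d+(?:-[a-z]+)?\.googleusercontent\.com")


def is_googleusercontent_host(host):
    """Return True if 'host' has the lh<N> or lh<N>-<locale> form."""
    return HOST_RE.fullmatch(host) is not None


for host in ("lh7.googleusercontent.com", "lh7-us.googleusercontent.com"):
    print(host, is_googleusercontent_host(host))
```

Making the locale group optional is what lets one pattern cover both forms instead of keeping two near-identical expressions.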