mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-26 04:32:51 +01:00
Commit Graph

4074 Commits

Author SHA1 Message Date
Mike Fährmann
ac4e29f70a
[lensdump] support more direct link formats (#5293) 2024-03-09 23:33:58 +01:00
Mike Fährmann
146459056c
[reddit] provide 'fallback_url' as video fallback (#5296) 2024-03-07 15:58:01 +01:00
Mike Fährmann
d3003f8531
merge #5270: [imagefap] add 'folder' metadata 2024-03-07 01:31:40 +01:00
Mike Fährmann
05331f9cf1
[imagefap] flake8, cleanup, tests 2024-03-07 01:29:19 +01:00
Mike Fährmann
40c0553523
[twitter] add 'quotes' extractor (#5262)
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-1981571924

It's implemented as a search for 'quoted_tweet_id:…' on Twitter.
2024-03-07 00:52:50 +01:00
Mike Fährmann
790c0ffb8d
[lensdump] recognize direct image links (#5293) 2024-03-06 22:56:57 +01:00
Mike Fährmann
6d9e3c0eb1
[skeb] add extractor for followed users (#5290)
needs 'Authorization' header from browser session
-o headers.Authorization="Bearer ey…"
2024-03-06 22:43:01 +01:00
Mike Fährmann
ace16f00f5
[weibo] fix retweets (#2825, #3874, #5263)
- handle 快转 retweets
- disable 'retweets' by default
- skip all retweet media when 'retweets' are disabled
- extract all retweet media when 'retweets' is set to "original"
2024-03-06 19:36:53 +01:00
Mike Fährmann
0676a9d6ec
[weibo] fix 'livephoto' filename extensions (#5287) 2024-03-06 19:36:32 +01:00
Mike Fährmann
db507e30c7
[pixiv] fix novel text extraction (#5285)
change to '/webview/v2/novel'
since '/v1/novel/text' does not work anymore
2024-03-06 02:31:26 +01:00
Mike Fährmann
296f20e630
[warosu] fix 'board_name' metadata 2024-03-06 01:28:47 +01:00
Mike Fährmann
24873c2724
[warosu] fix crash for threads with deleted posts (#5289) 2024-03-06 01:27:45 +01:00
Mike Fährmann
f296067797
[naver] unescape post 'title' and 'description' 2024-03-06 00:46:19 +01:00
Mike Fährmann
a71cdab53e
merge #5126: [naver] fix EUC-KR encoding issue in old image URLs 2024-03-06 00:22:33 +01:00
Mike Fährmann
a8d3efbb99
[naver] simplify code + add test 2024-03-06 00:21:23 +01:00
Johann Hong
f64fb8f239
[naver] fix EUC-KR encoding issue in old image URLs
Around October 2010, the image server URL format and file name
encoding changed from EUC-KR to UTF-8.
Detect the old URL format and decode such image URLs as EUC-KR.

- (lint with flake8) Customize conditions
  Wrap lines smaller than 79 characters

- (lint with flake8) Customize conditions (2nd try)
  - One import per line
  - Indent on consecutive lines

- (lint with flake8) Customize conditions (3rd try)
  - E128 continuation line under-indented for visual indent
  - E123 closing bracket does not match indentation of opening bracket's line

- Update naver.py
  Check encoding for all image URLs
2024-03-06 00:21:23 +01:00
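The encoding problem described in the commit above can be illustrated with a minimal sketch. It is not the code from naver.py: the commit detects old URLs by their format, whereas this sketch assumes a simpler try-UTF-8-then-EUC-KR fallback, and the function name `decode_old_filename` is hypothetical.

```python
from urllib.parse import quote, unquote


def decode_old_filename(encoded: str) -> str:
    """Decode a percent-encoded file name (hypothetical helper).

    Assumption: strict UTF-8 decoding is tried first; if the bytes are
    not valid UTF-8, they are treated as EUC-KR (the pre-2010 encoding).
    """
    try:
        return unquote(encoded, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return unquote(encoded, encoding="euc-kr", errors="strict")


# Build sample inputs by encoding the same name both ways.
old_style = quote("한글", encoding="euc-kr")
new_style = quote("한글", encoding="utf-8")
print(decode_old_filename(old_style) == decode_old_filename(new_style))
```

A fallback like this can misread byte sequences that happen to be valid in both encodings, which is one reason to prefer detecting the old URL format, as the commit does.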
Mike Fährmann
7b28418f69
[naver] recognize '.naver' URLs
https://blog.naver.com/PostView.naver?…
2024-03-05 22:30:29 +01:00
Mike Fährmann
a767832332
[deviantart:avatar] ignore default avatars (#5276) 2024-03-04 23:11:30 +01:00
Mike Fährmann
0cbc910905
[deviantart:avatar] fix 'index' for avatars without '?' (#5276) 2024-03-04 22:31:35 +01:00
Mike Fährmann
6482bbc525
[bluesky] handle different 'embed' structure 2024-03-03 20:41:01 +01:00
Mike Fährmann
1115dccd0d
[bluesky] fix feeds and lists
bug introduced in 495c9ee1
2024-03-03 20:22:34 +01:00
Mike Fährmann
1a9b9aa310
[artstation] support video clips (#2566, #3309, #3911)
- add 'videos' and 'previews' options
- fix 403 errors for video previews
2024-03-03 18:00:45 +01:00
termvacycurtocs
f8b037ed40
[Imagefap] Add folder metadata
[Imagefap] Add "folder" metadata when downloading a folder or user profile.
No additional request is made to the server.

Use, for example, with the following configuration:
"parent-metadata": true
"directory":["{category}", "{uploader}", "{folder}", "{gallery_id} {title}"]
2024-03-02 22:15:45 +01:00
Mike Fährmann
982880615d
[deviantart] prevent unnecessary API requests (#4995)
… when using 'comments-avatars'

This also has the added benefit of making it possible to download
comment avatars from users without a valid user profile entry,
like deleted users.
2024-03-02 21:59:16 +01:00
Mike Fährmann
25d2854272
[deviantart] add 'comments-avatars' option (#4995) 2024-03-02 21:59:16 +01:00
Mike Fährmann
218ec1a9ee
[instagram] raise proper error for missing 'reels_media' (#5257) 2024-03-02 21:58:59 +01:00
Mike Fährmann
82c73c77b0
[redgifs] make 'date' available for directories (#5262)
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-1973975415
2024-03-01 23:39:16 +01:00
Mike Fährmann
cf9e99c07b
[artstation] support collections (#146)
https://github.com/mikf/gallery-dl/issues/146#issuecomment-1972101003
2024-03-01 20:21:21 +01:00
Mike Fährmann
32ec695195
merge #5256: [wikimedia] add azurlane.koumakan.jp 2024-02-29 21:50:24 +01:00
thatfuckingbird
88a06df165
[wikimedia] add azurlane.koumakan.jp to presets 2024-02-29 19:28:50 +01:00
Mike Fährmann
1db0a587f3
[nitter] ignore invalid Tweets (#5253)
like "Load newest"
2024-02-29 16:31:37 +01:00
Mike Fährmann
a00b171d4e
[bluesky] wait until 'RateLimit-Reset' on 429 responses 2024-02-28 18:13:16 +01:00
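A rough sketch of the waiting logic named in the commit above. It assumes the 'RateLimit-Reset' header carries a Unix timestamp in seconds. The function name `seconds_until_reset` and its `now` parameter are hypothetical and not taken from gallery-dl.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to sleep before retrying a 429 response.

    Assumption: headers["RateLimit-Reset"] is a Unix timestamp.
    Returns 0.0 if the header is missing, malformed, or in the past.
    """
    now = time.time() if now is None else now
    try:
        reset = float(headers["RateLimit-Reset"])
    except (KeyError, ValueError):
        return 0.0
    return max(0.0, reset - now)


# Usage: sleep for the computed duration, then retry the request.
time.sleep(seconds_until_reset({"RateLimit-Reset": "0"}))
```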
Mike Fährmann
7d874e2497
[bluesky] improve API error messages 2024-02-28 14:45:36 +01:00
Mike Fährmann
495c9ee126
[bluesky] add 'reposts' option (#4438, #5248) 2024-02-27 18:42:29 +01:00
Mike Fährmann
c8b591303f
[paheal] cleanup 2024-02-27 02:27:20 +01:00
Mike Fährmann
8a11b72253
remove extractor/test.py (#4504) 2024-02-27 01:37:57 +01:00
Mike Fährmann
fde9e25c9f
[tests:kemonoparty] '.party' -> '.su' 2024-02-26 22:25:04 +01:00
Mike Fährmann
311a21bfb2
[bluesky] fix '/follows' not spawning child extractors (#5246) 2024-02-26 15:38:31 +01:00
Mike Fährmann
d3dca68225
[xvideos] fix galleries with more than 500 images (#5244) 2024-02-26 15:36:41 +01:00
Mike Fährmann
13443f40a3
[xvideos] support '/channels/' URLs (#5244) 2024-02-26 00:08:37 +01:00
Mike Fährmann
c60ebc6519
[deviantart] improve fetching extended metadata (#5175)
use multiple metadata API calls per chunk of deviations if necessary
2024-02-25 03:36:00 +01:00
Mike Fährmann
cc6b9e4c18
[zerochan] use API by default (#3669)
add 'pagination' option
2024-02-25 00:36:14 +01:00
Mike Fährmann
a2b55d5dde
[skeb] retry 429 responses containing a 'request_key' cookie (#5210) 2024-02-24 00:54:15 +01:00
Mike Fährmann
b4c46de4b8
merge #5224: [artstation] update URL patterns to recognize usernames with dashes 2024-02-21 14:41:02 +01:00
blankie
962f55cc68
[artstation] fix handling usernames with dashes 2024-02-21 17:39:37 +11:00
Mike Fährmann
fe7e2281ac
[nijie] increase default delay between requests (#5221)
1-2s is not enough
2024-02-20 18:19:49 +01:00
Mike Fährmann
a34312e3ac
[instagram] make accessing 'like_count' non-fatal (#5218) 2024-02-19 19:24:51 +01:00
Mike Fährmann
741fd00cec
[deviantart] extend 'metadata' option (#5175)
allow fetching extended metadata in addition to the usual
'description', 'tags', etc. by setting 'metadata' to a list of
'camera', 'stats', 'submission', 'collection', and 'gallery'

for example "metadata": "stats,submission"
2024-02-18 23:14:14 +01:00
Mike Fährmann
8a63801311
[vsco] add 'spaces' extractor (#5202)
for spaces listed on a user page
2024-02-17 18:20:48 +01:00
Mike Fährmann
ccb413df71
[wikimedia] support 'pidgi.net' and 'bulbapedia.bulbagarden.net' (#5205, #5206) 2024-02-17 17:35:10 +01:00
Mike Fährmann
7033cc14e9
[vsco] add 'space' extractor (#5202) 2024-02-17 01:54:05 +01:00
Mike Fährmann
770aec922d
[fapachi] ignore empty entries 2024-02-16 22:43:37 +01:00
Mike Fährmann
ee7c054855
[bluesky] add 'search' extractor (#4438)
Both https://bsky.app/search?q=QUERY and https://bsky.app/search/QUERY
are recognized as search URLs, where QUERY gets forwarded unmodified as
'q' parameter for app.bsky.feed.searchPosts.

User searches are not supported yet.
2024-02-16 15:58:47 +01:00
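The two search URL shapes from the commit above can be matched with a single pattern, sketched below. The regular expression and the name `search_query` are illustrative assumptions, not the extractor's actual pattern. As in the commit message, QUERY is returned unmodified.

```python
import re

# Hypothetical pattern: accepts /search?q=QUERY and /search/QUERY.
SEARCH_URL = re.compile(r"https?://bsky\.app/search(?:/|\?q=)(.+)")


def search_query(url):
    """Return the raw QUERY part of a search URL, or None if no match."""
    match = SEARCH_URL.match(url)
    return match.group(1) if match else None


print(search_query("https://bsky.app/search?q=QUERY"))
print(search_query("https://bsky.app/search/QUERY"))
```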
Mike Fährmann
91e5c4fdfe
[bluesky] add 'avatar' and 'background' extractors (#4438) 2024-02-16 15:41:19 +01:00
Mike Fährmann
24c1317e0d
[batoto] fix crash when manga/chapter contains a '-' (#5200) 2024-02-16 00:10:08 +01:00
Mike Fährmann
0abd9723af
[bluesky] add 'metadata' option (#4438)
allow extracting 'user' metadata and
make 'facets' extraction optional
2024-02-15 23:30:16 +01:00
Mike Fährmann
7e036ea290
[bluesky] add 'depth' option (#4438)
and reduce default depth and parentHeight values
2024-02-15 22:26:05 +01:00
Mike Fährmann
42335ea880
[zerochan] fix skipping every other post 2024-02-15 02:51:01 +01:00
Mike Fährmann
c97b92cc35
[fanbox] add 'home' and 'supporting' extractors (#5138) 2024-02-14 23:25:39 +01:00
Mike Fährmann
04e4ffc64c
[deviantart] combine 'png' option with 'quality' (#4846)
"quality": "png" to download PNGs instead of JPEGs
2024-02-14 22:07:29 +01:00
Mike Fährmann
9cc4ec2c58
[deviantart] add 'png' option (#4846) 2024-02-14 01:03:15 +01:00
Mike Fährmann
966c8608e6
[deviantart] move image content extraction into separate function 2024-02-14 00:30:06 +01:00
Mike Fährmann
1d1ffe3317
[pornpics] update 'channel' extraction & add test
change 'channel' to a list, since extracting both 'channel' and
'channels' does not really work with text.extract_from()
2024-02-13 23:48:46 +01:00
cc1234
32472d7d6c
Add support for multi channels 2024-02-13 18:34:04 +00:00
Mike Fährmann
139ff3f6ab
[kemonoparty] add 'posts' extractor (#5194) 2024-02-13 15:41:34 +01:00
Mike Fährmann
814ad9321e
[deviantart] skip locked/blurred posts (#4567, #5193) 2024-02-13 14:15:12 +01:00
Mike Fährmann
f7f8ef8684
[twitter] support communities (#4913) 2024-02-13 01:30:23 +01:00
Mike Fährmann
cae77e85f8
[twitter] update query hashes
... as well as 'variables' and 'features' values
also remove unused legacy API code
2024-02-12 23:19:13 +01:00
Mike Fährmann
06cb518d97
[bunkr] fix extraction (#5088, #5151, #5153)
- remove legacy code
- map legacy domains to bunkr.sk
- use input URL domain for newer domains
- update tests (some files got slightly modified or deleted)
2024-02-11 22:36:03 +01:00
Mike Fährmann
dcc6e3f65c
merge #5134: [bunkr] add new bunkr domains (#5130) 2024-02-11 21:10:06 +01:00
Mike Fährmann
4641937ca3
[imagetwist] add 'gallery' extractor (#5190) 2024-02-11 18:41:02 +01:00
Mike Fährmann
fde82ab0ce
[imagechest] add 'user' extractor (#5143) 2024-02-11 18:38:33 +01:00
Mike Fährmann
4474cea31b
merge #5187: [skeb] add 'num' and 'count' metadata fields 2024-02-10 19:36:59 +01:00
Mike Fährmann
4cfceb23cb
[skeb] rename 'data' -> 'file' & add tests 2024-02-10 19:35:50 +01:00
Mike Fährmann
c83d0a1596
[weibo] add 'gifs' option (#5183) 2024-02-10 18:17:07 +01:00
blankie
f9a8e8cacf
[skeb] add 'num' and 'count' metadata fields 2024-02-10 21:51:23 +11:00
Mike Fährmann
af61d2b037
[wikimedia] combine most wikimedia.org sites (#1443)
add wikidata.org and wikivoyage.org
2024-02-10 03:00:58 +01:00
Mike Fährmann
c7d17f1111
[bluesky] extract 'hashtags', 'mentions', and 'uris' metadata (#4438) 2024-02-10 00:01:55 +01:00
Mike Fährmann
55bbd49a0e
[bluesky] download images in original resolution (#4438)
at least up to 2000 px
2024-02-09 21:33:33 +01:00
Mike Fährmann
6414dc6bca
[idolcomplex] fix pagination for tags containing ':' (#5171) 2024-02-09 17:51:08 +01:00
Mike Fährmann
5c2a2321a2
[bluesky] update refresh token after using it (#4438) 2024-02-08 22:33:34 +01:00
Mike Fährmann
9c10be54fb
[bluesky] add 'following' extractor (#4438) 2024-02-08 21:58:17 +01:00
Mike Fährmann
86ce35d6a1
[bluesky] simplify 'pattern' 2024-02-08 21:28:21 +01:00
Mike Fährmann
da292ded4e
[bluesky] add 'list' extractor (#4438) 2024-02-08 21:24:07 +01:00
Mike Fährmann
004bf7bb38
[bluesky] add 'feed' extractor (#4438) 2024-02-08 21:01:44 +01:00
Mike Fährmann
6aea818d4e
[bluesky] allow using DIDs as user handles (#4438) 2024-02-08 20:15:54 +01:00
Mike Fährmann
aee5580c62
[idolcomplex] extract 'id_alnum' metadata (#5171) 2024-02-08 18:29:54 +01:00
Mike Fährmann
cf7d6be2d4
[bluesky] initial support (#4438, #4708, #4722, #5047) 2024-02-07 19:09:33 +01:00
Mike Fährmann
6ef143ea31
[idolcomplex] support alphanumeric post IDs (#5171) 2024-02-07 14:57:13 +01:00
Mike Fährmann
6e928300bc
[flickr] handle non-JSON errors (#5131) 2024-02-06 21:22:10 +01:00
Mike Fährmann
90ac6d7375
[wikimedia] use '/api.php' as default API path 2024-02-06 00:36:51 +01:00
Mike Fährmann
d7823b9f81
[pinterest] fix section URLs for boards with /?# in name (#5104) 2024-02-05 15:54:06 +01:00
Mike Fährmann
de752eb7b1
[naverwebtoon] support '/webtoon/' paths for all comics (#5123) 2024-02-04 21:38:46 +01:00
Jeff Mercado
d9d0601ab1
break up line to fit 80 char 2024-01-29 20:31:58 -08:00
Jeff Mercado
6bcd3c9380
[bunkr] add new bunkr domains (#5130) 2024-01-29 20:25:33 -08:00
Mike Fährmann
62d6f5f8d2
[luscious] fix IndexError for files without thumbnail (#5122) 2024-01-28 01:43:29 +01:00
Mike Fährmann
22647c2626
[naverwebtoon] fix 'title' for comics with empty tags (#5120) 2024-01-27 16:24:03 +01:00
Mike Fährmann
3433481dd2
[gofile] update 'website_token' extraction 2024-01-27 01:10:14 +01:00
Mike Fährmann
1f7101d606
[archivedmoe] fix thebarchive webm URLs (#5116) 2024-01-27 00:24:41 +01:00
Mike Fährmann
34a4ddc399
[sankaku] add 'id-format' option (#5073) 2024-01-26 17:56:08 +01:00
Mike Fährmann
afd20ef42c
[kemonoparty] implement filtering duplicate revisions (#5013)
set 'revisions' to '"unique"' to have it ignore duplicate revisions
2024-01-26 14:44:15 +01:00
Mike Fährmann
c28475d325
[kemonoparty] fix deleting 'name' in original objects (#5103)
... when computing 'revision_hash'

regression caused by 3d68eda4

dict.copy() only creates a shallow copy
I know that and still managed to get it wrong ...
2024-01-25 23:46:19 +01:00
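The shallow-copy pitfall mentioned in the commit above can be reproduced in a few lines. This is a generic illustration, not the kemonoparty code. The post layout and the function names are made up for the example.

```python
import copy


def hash_input_shallow(post):
    """Buggy variant: the nested 'file' dict is shared with the original."""
    data = post.copy()              # copies only the top-level mapping
    data["file"].pop("name", None)  # also removes 'name' from post["file"]
    return data


def hash_input_deep(post):
    """Safe variant: nested objects are copied before being modified."""
    data = copy.deepcopy(post)
    data["file"].pop("name", None)
    return data


post = {"title": "example", "file": {"name": "a.png", "path": "/a.png"}}
hash_input_deep(post)
print("name" in post["file"])   # True: the original is untouched
hash_input_shallow(post)
print("name" in post["file"])   # False: the original lost its 'name'
```

Copying only the nested objects that will be modified is a cheaper alternative to `copy.deepcopy()` when the structure is known.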
Mike Fährmann
beacfa7436
[bunkr] update domain to 'bunkr.sk' (#5114) 2024-01-25 23:45:41 +01:00
Mike Fährmann
67c99b1366
[patreon] prevent HttpError for stream.mux.com URLs 2024-01-21 22:50:40 +01:00
Mike Fährmann
f3ad91b44f
[bunkr] update domain (#5088) 2024-01-21 03:00:57 +01:00
Mike Fährmann
c7a42880ab
[wikimedia] support fandom wikis (#1443, #2677, #3378)
Wikis hosted on fandom.com are just wikimedia instances
and support its API.
2024-01-21 00:52:02 +01:00
Mike Fährmann
5bf156f0b1
merge #5094: [webtoons] fix extracting comic and episode name with commas 2024-01-21 00:47:26 +01:00
blankie
df718887c2
[webtoons] fix extracting comic and episode name with commas 2024-01-21 09:50:27 +11:00
Wiiplay123
6eb62f2140
Combine lh*(-**).googleusercontent.com URL regex into one line.
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2024-01-20 15:53:11 -06:00
Wiiplay123
a6fed628dd
[blogger] Fix lh*.googleusercontent.com forward slash bug, add support for lh*-**.googleusercontent.com
Some URLs use "lh(number)-(locale).googleusercontent.com" format, so I added support for those.

Also, "lh(number).googleusercontent.com" formats were broken because the regex was looking for a second forward slash.

Examples:
lh7.googleusercontent.com
lh7-us.googleusercontent.com
2024-01-20 15:07:52 -06:00
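Both host name formats from the commit above can be covered by one pattern with an optional locale suffix. The sketch below is an assumption about what such a pattern could look like, not the regex used in the blogger extractor. `is_blogger_image_host` is a hypothetical helper.

```python
import re

# Hypothetical pattern: "lh<number>" with an optional "-<locale>" part.
HOST = re.compile(r"lh\d+(?:-[a-z]+)?\.googleusercontent\.com")


def is_blogger_image_host(host):
    """Return True if the whole host name matches the pattern."""
    return HOST.fullmatch(host) is not None


for host in ("lh7.googleusercontent.com", "lh7-us.googleusercontent.com"):
    print(host, is_blogger_image_host(host))
```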
Mike Fährmann
6f8592eaff
[hbrowse] remove from modules list 2024-01-20 18:25:38 +01:00
Mike Fährmann
acc94ac187
[realbooru] fix extraction
revert ac97aca99c
2024-01-20 17:56:07 +01:00
Mike Fährmann
9599151118
[issuu] fix extraction 2024-01-20 16:44:48 +01:00
Mike Fährmann
9ca6117c67
[hbrowse] remove module
website gone
2024-01-20 02:53:44 +01:00
Mike Fährmann
375eefb886
[chevereto] remove 'pixl.li'
"Pixl is closing down"
"All images will be deleted January 1st."
2024-01-20 02:21:40 +01:00
Mike Fährmann
321861af7e
[erome] fix 'count' metadata 2024-01-20 00:26:41 +01:00
Mike Fährmann
b41d9bf616
[paheal] fix 'source' metadata 2024-01-19 22:24:39 +01:00
Mike Fährmann
b0a441f1e3
[nitter] remove 'nitter.lacontrevoie.fr'
"Fermeture de Nitter / Closing down Nitter"
2024-01-19 19:34:16 +01:00
Mike Fährmann
a1c1e80f67
[giantessbooru] update domain 2024-01-19 14:21:56 +01:00
Mike Fährmann
2007cb2f59
[tests] check extractor category values 2024-01-19 14:21:09 +01:00
Mike Fährmann
fc4e737f67
[wikimedia] include 'sha1' in default filenames 2024-01-19 03:08:43 +01:00
Mike Fährmann
44f2c15a04
[wikimedia] handle 'File:' paths 2024-01-19 03:05:45 +01:00
Mike Fährmann
93b4120e77
[gelbooru] support 'all' and empty tag (#5076) 2024-01-18 21:49:33 +01:00
Mike Fährmann
a416d4c3d5
[sankaku] support post URLs with alphanumeric IDs (#5073) 2024-01-18 16:23:14 +01:00
Mike Fährmann
ea553a1d55
[wikimedia] generalize (#1443)
- support mediawiki.org
- support mariowiki.com (#3660)

- combine code into a single extractor
  (use prefix as subcategory)
- handle non-wiki instances
- unescape titles
2024-01-18 15:36:16 +01:00
Mike Fährmann
89066844f4
add 'config_instance' method
to allow for a more streamlined access to BaseExtractor instance options
2024-01-18 03:20:36 +01:00
Mike Fährmann
c3c1635ef3
[wikimedia] update
- rewrite using BaseExtractor
- support most Wiki* domains
- update docs/supportedsites
- add tests
2024-01-17 00:08:06 +01:00
Ailothaen
221f54309c
[wikimedia] Improved archive identifiers 2024-01-16 02:32:32 +01:00
Ailothaen
e33056adcd
[wikimedia] Add Wikipedia/Wikimedia extractor 2024-01-16 02:32:25 +01:00
Mike Fährmann
3d68eda4ab
[kemonoparty] add 'revision_hash' metadata (#4706, #4727, #5013)
A SHA1 hexdigest of other relevant metadata fields like
title, content, file and attachment URLs.

This value does NOT reflect which revisions are listed on the website.
Neither does 'edited' or any other metadata field (combinations).
2024-01-16 00:38:10 +01:00
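One way to compute a hash like the 'revision_hash' described above is sketched here. The selected fields follow the commit message, but the key names, the JSON serialization, and the function itself are assumptions for illustration and may differ from the extractor's code.

```python
import hashlib
import json


def revision_hash(post):
    """SHA1 hexdigest over a subset of a post's fields (hypothetical layout)."""
    relevant = {
        "title": post.get("title"),
        "content": post.get("content"),
        "file": (post.get("file") or {}).get("path"),
        "attachments": [a.get("path") for a in post.get("attachments") or []],
    }
    # Sorted keys and fixed separators keep the serialization deterministic.
    data = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


post = {"title": "t", "content": "c", "file": {"path": "/f.png"},
        "attachments": [], "edited": "2024-01-01"}
print(revision_hash(post))
```

Fields outside the selected subset, such as 'edited' in this example, do not influence the result, so two revisions that differ only in those fields produce the same hash.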
Mike Fährmann
799a8206ad
merge #5061: [webtoons] extract more metadata
- author_name
- comic_name
- episode_name
- username
2024-01-15 18:27:12 +01:00
Mike Fährmann
8ffa0cd3c8
[webtoons] small optimization
don't extract the entire 'author_area' and
avoid creating a second 'text.extract_from()' object
2024-01-15 18:24:47 +01:00
Mike Fährmann
59cf4b3884
merge #4444: [2ch] add 'thread' and 'board' extractors (#1009, #3540) 2024-01-15 17:50:34 +01:00
Mike Fährmann
90b382304a
[deviantart] fix KeyError: 'premium_folder_data' (#5063) 2024-01-15 17:30:03 +01:00
Mike Fährmann
4cedf378d5
[deviantart] fix AttributeError for URLs without username (#5065)
caused by 4f367145
2024-01-15 16:28:57 +01:00
Mike Fährmann
68196589c4
[2ch] update
- simplify extractor code
- more metadata
- add tests
2024-01-15 04:09:05 +01:00
hunter-gatherer8
6c4abc982e
[2ch] add 'thread' and 'board' extractors
- [2ch] add thread extractor
- [2ch] add board extractor
- [2ch] add new entry to supported sites
2024-01-15 03:51:03 +01:00
blankie
bb446b1598
[webtoons] extract more metadata 2024-01-14 19:26:49 +11:00
Mike Fährmann
355b909f46
merge #5041: [steamgriddb] add support (#5033) 2024-01-13 00:59:15 +01:00
Mike Fährmann
71e2c3e5a2
merge #5037: [hatenablog] add support (#5036) 2024-01-13 00:57:21 +01:00
blankie
9f53daabb8
[hatenablog] implement additional suggestion 2024-01-13 10:43:25 +11:00
blankie
293f1559df
[hatenablog] implement suggestions 2024-01-13 10:42:22 +11:00
blankie
65f42442f5
[steamgriddb] implement another suggestion 2024-01-13 10:12:15 +11:00
blankie
8995fd5f01
[steamgriddb] implement suggestions 2024-01-13 09:55:39 +11:00
Mike Fährmann
2dcfb012ea
[patreon] download 'm3u8' manifests with ytdl 2024-01-12 02:33:27 +01:00
Mike Fährmann
1c68b7df01
[patreon] fix KeyError (#5048) 2024-01-11 17:56:47 +01:00
Mike Fährmann
2191e29e14
[nijie] fix image URL for single image posts (#5049) 2024-01-11 05:07:38 +01:00
Mike Fährmann
bbf96753e2
[gelbooru] only log "Incomplete API response" for favorites (#5045) 2024-01-10 17:27:46 +01:00
Mike Fährmann
39904c9e4e
[deviantart:avatar] add 'formats' option (#4995) 2024-01-10 17:13:34 +01:00
Mike Fährmann
5c43098a1a
[twitter] revert to using 'media' timeline by default (#4953)
This reverts commit a94f944148.
2024-01-09 23:19:39 +01:00