Mike Fährmann
7eadcbea70
[4chanarchives] add end condition for 'board' extractor (#4012)
2023-05-06 20:52:45 +02:00
Mike Fährmann
1406f7125f
[4chanarchives] add 'thread' and 'board' extractors (#4012)
2023-05-06 20:45:57 +02:00
Mike Fährmann
d12dd3813c
[imgur] fix internal image/album URLs
...
URLs from "link" attributes of newer images/albums were all returned
as 'https://imgur.com/gallery/ ...' instead of the expected format,
causing them to be ignored.
2023-05-06 15:13:38 +02:00
Mike Fährmann
8520de57f0
[imgur] add 'favorite-folder' extractor (#4016)
2023-05-06 15:10:13 +02:00
Mike Fährmann
3ca5dac8b6
extend 'cookies-update' functionality
...
Allow writing cookies to a different file than a given cookies.txt,
making it possible to export cookies imported with --cookies-from-browser.
To convert browser cookies to cookies.txt format:
gallery-dl --cookies-from-browser chromium \
-o cookies-update=cookies.txt \
--no-download \
http://example.org/file.jpg
2023-05-04 15:10:47 +02:00
Mike Fährmann
bc6d65d203
implement 'Extractor.config_deprecated()'
...
a version of 'Extractor.config()'
that logs a warning when using a deprecated option name
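A minimal sketch of the idea, not the actual gallery-dl code. The class layout, the plain-dict option store, the method signature, and the rule that the new name wins over the deprecated one are all assumptions made for illustration:

```python
import logging


class Extractor:
    """Hypothetical stand-in that only illustrates the deprecated-name lookup."""

    def __init__(self, options):
        self._options = options  # assumed: plain dict of user-supplied options
        self.log = logging.getLogger("extractor")

    def config(self, key, default=None):
        return self._options.get(key, default)

    def config_deprecated(self, key, deprecated, default=None):
        """Look up 'key'; accept 'deprecated' as a fallback and warn about it."""
        sentinel = object()  # distinguishes "not set" from a stored None

        value = self.config(deprecated, sentinel)
        if value is not sentinel:
            self.log.warning(
                "'%s' is deprecated. Use '%s' instead.", deprecated, key)
            default = value  # old name only supplies the fallback value

        return self.config(key, default)
```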
2023-05-04 10:49:14 +02:00
Mike Fährmann
850df34c31
remove '&' from URL patterns part 2
...
follow-up on 968d3e8465
2023-05-03 20:26:25 +02:00
Mike Fährmann
4d415376d1
[pinterest] fix 'pin.it' extractor
...
it really was just the single '/' at the end of the url_shortener URL
2023-05-03 20:05:10 +02:00
Mike Fährmann
657b6a9100
[pinterest] update endpoint for related board pins
2023-05-03 18:41:09 +02:00
Mike Fährmann
79f47f98dd
[nana] remove module
...
permanently gone since 2023-03-13
2023-05-03 18:05:53 +02:00
Mike Fährmann
0e74df1de8
[420chan] remove module
...
offline since 2022-06-01
2023-05-03 17:46:21 +02:00
Mike Fährmann
7499fa7075
[exhentai] remove and update sad panda check
...
there hasn't been a sad panda in several years
2023-05-03 17:39:49 +02:00
Mike Fährmann
076380e079
remove '*' indicating keyword-only arguments
...
they are kind of unnecessary and
cause a noticeable function call overhead (~10%)
2023-05-02 22:23:33 +02:00
Mike Fährmann
0c46758a93
[foolslide] remove 'sensescans.com'
...
group moved to mangadex
https://mangadex.org/group/1071e71d-cc55-4fa6-81d1-4b5913a2fde5/sense-scans
2023-05-02 20:09:04 +02:00
Mike Fährmann
a08fdfac6e
[foolfuuka] add 'archive.palanq.win'
2023-05-02 19:58:55 +02:00
Mike Fährmann
1870df8b23
[foolfuuka] remove 'tokyochronos.net'
2023-05-02 19:25:50 +02:00
Mike Fährmann
ef4e2d8178
[foolfuuka] remove 'archive.alice.al'
2023-05-02 19:23:26 +02:00
Mike Fährmann
b12dad8df5
[pixiv] fix 'pixivision' extraction
2023-04-30 15:35:32 +02:00
Mike Fährmann
5fb7107f2b
[imxto] fix 'gallery' extraction
...
support both single and double quotes
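As an illustration only (the attribute name and the HTML below are made up, and this is not the pattern used by the extractor), a regular expression can accept either quote style by capturing the opening quote and requiring the same character to close the value:

```python
import re

# Hypothetical pattern: group 1 captures the opening quote character,
# and the backreference \1 requires the same character as the closing quote.
HREF = re.compile(r"""href=(["'])(.*?)\1""")


def find_links(html):
    """Return all href values, regardless of the quote style used."""
    return [match.group(2) for match in HREF.finditer(html)]
```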
2023-04-30 15:23:13 +02:00
Mike Fährmann
15d7c5a199
[behance] 'items()' -> 'values()'
...
we only need 'size', 'name' is unnecessary
2023-04-30 13:53:51 +02:00
Mike Fährmann
0fb580135d
[behance] fix extraction (#3980)
2023-04-29 16:18:35 +02:00
Alexandru Vasilescu
d4f8b2fe22
fix: linter issues
2023-04-28 13:45:23 +03:00
Alexandru Vasilescu
1b918bd937
fix(extractor): fix extraction for cross-posted reddit videos and galleries
2023-04-28 13:13:25 +03:00
Mike Fährmann
215028a462
[manganelo] match more minor version separators (#3972)
2023-04-27 13:12:11 +02:00
thatfuckingbird
9f76783ac0
[pixiv] allow sorting by popularity (requires pixiv premium)
2023-04-26 22:49:29 +02:00
Mike Fährmann
7865067d19
[shimmie2] add generic extractors for Shimmie2 sites (#3734)
...
add support for
- loudbooru.com (#3734)
- booru.cavemanon.xyz (#3734)
- giantessbooru.com (#943)
- tentaclerape.net
2023-04-26 19:20:44 +02:00
Mike Fährmann
28419bf45a
[itchio] add 'game' extractor (#3923)
2023-04-26 19:20:43 +02:00
Mike Fährmann
5297ee0cd9
[tumblr] add 'day' extractor (#3951)
2023-04-24 22:01:47 +02:00
Mike Fährmann
de670bd7de
[tumblr] update pagination logic (#2191)
2023-04-24 20:07:10 +02:00
Mike Fährmann
98c9fdb414
[deviantart] revert e9353c63; retry downloads with private token
2023-04-23 21:10:16 +02:00
Mike Fährmann
5d7435e803
[nitter] extract user IDs from encoded banner URLs
...
still requires a banner to be present to begin with
2023-04-23 19:13:27 +02:00
Mike Fährmann
7f25cab56e
[sankaku] support post URLs with MD5 hashes (#3952)
2023-04-23 16:46:40 +02:00
Mike Fährmann
a05120412a
[oauth] catch exception from 'webbrowser.get()' (#3947)
...
It raises an exception instead of returning None
when no runnable browser is available.
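A short sketch of the guard this describes. The helper name is an assumption; 'webbrowser.get()' and 'webbrowser.Error' are part of the Python standard library:

```python
import webbrowser


def get_browser():
    """Return a browser controller, or None if no runnable browser exists."""
    try:
        return webbrowser.get()
    except webbrowser.Error:
        # raised instead of returning None when no browser can be located
        return None
```

A caller can then check for None and, for example, print the URL for the user to open manually.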
2023-04-23 15:00:09 +02:00
Mike Fährmann
3fc2223893
merge #3935: [reddit] match 'preview.redd.it' URLs
2023-04-21 20:09:20 +02:00
Mike Fährmann
1d505b39f8
[twitter] support 'profile-conversation' entries (#3938)
2023-04-21 15:08:50 +02:00
Mike Fährmann
aaf58a1259
[imgur] document 'client-id' option (#3937)
2023-04-21 15:08:50 +02:00
Mike Fährmann
202f5d86a7
[reddit] ignore 'id-max' value "zik0zj"/2147483647
...
(#3939, #3862, #3697, #3606, #3546, #3521, #3412)
2023-04-21 15:08:50 +02:00
Mike Fährmann
8586ee81be
[nana] fix 'keyword' tests
2023-04-21 15:08:50 +02:00
ClosedPort22
cd4bfb0dd1
[reddit] match 'preview.redd.it' URLs
2023-04-20 15:54:09 +08:00
Mike Fährmann
faca32a850
[sankaku] sanitize 'date:…' tags (#1790)
2023-04-19 20:09:11 +02:00
Mike Fährmann
6f1e34ec69
[vipergirls] add 'thread' and 'post' extractors
...
(#731, #2720, #3812)
2023-04-19 15:28:26 +02:00
Mike Fährmann
81bd2af83e
[2chen] update domain to sturdychan.help
2023-04-19 13:54:44 +02:00
Mike Fährmann
f500b45b5e
[twitter] improve 480bc34e
...
only check for double user assignment where necessary
2023-04-18 20:50:23 +02:00
Mike Fährmann
5b635f2317
[imxto] add 'gallery' extractor (#1289)
2023-04-17 20:49:09 +02:00
Mike Fährmann
359e31e462
[nozomi] update file URLs (#3925)
...
Static images are now only available in WebP format over the 'w'
subdomain. GIFs also got their own 'g' subdomain.
2023-04-17 15:42:42 +02:00
Mike Fährmann
2dfd4a3de2
[imagefap] extract 'categories' metadata and fix empty 'tags'
2023-04-17 14:49:50 +02:00
Mike Fährmann
480bc34e54
[twitter] do not overwrite previously assigned users (#3922)
2023-04-16 17:30:43 +02:00
Mike Fährmann
02ec5bb8e5
[imagefap] extract 'description' metadata (#3905)
2023-04-16 17:02:16 +02:00
Mike Fährmann
d253a3c542
merge #3841: [urlshortener] add support for bit.ly & t.co
2023-04-15 18:08:21 +02:00
Mike Fährmann
5e63942b37
[urlshortener] update
2023-04-15 18:06:06 +02:00
Mike Fährmann
c45f09d2a8
[imagechest] fix extraction (#3914)
2023-04-14 20:06:59 +02:00
Mike Fährmann
2cd4411ff8
[nitter] extract videos from 'source' elements (#3912)
2023-04-14 19:00:56 +02:00
Mike Fährmann
9501579279
[sexcom] fix fetching HD videos
2023-04-13 15:40:53 +02:00
Mike Fährmann
a2f7274eae
[sexcom] fix pagination (#3906)
2023-04-13 15:39:15 +02:00
Mike Fährmann
e9353c63d6
[deviantart] keep using private access tokens
...
for deviations returned from a private API call
also fixes a bug from 0a7eee3e
where '_pagination()'
would never switch from unspecified (None) to private access token
2023-04-13 14:46:06 +02:00
Mike Fährmann
e70af6a550
[hentaifoundry] do not update filters when cookies are provided
2023-04-13 14:16:53 +02:00
Mike Fährmann
9c29c904c7
[mastodon] try to get account IDs without access token
...
Try to query the public '/api/v1/accounts/lookup' endpoint
and fall back to '/api/v1/accounts/search' if it returns an error.
'/api/v1/accounts/lookup' is available since Mastodon v3.4.0.
The version of an instance can be found at '/api/v1/instance'.
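The lookup-then-search fallback could be sketched as follows. The function name and the 'request' callable (path and query parameters in, parsed JSON out, raising on HTTP errors) are hypothetical; the 'acct' and 'q' parameters follow the Mastodon API documentation, and the search path is written here with its '/api' prefix:

```python
def resolve_account_id(request, username):
    """Return the account ID for 'username', or None if it cannot be found.

    'request' is a hypothetical callable(path, params) -> parsed JSON
    that raises an exception on HTTP errors.
    """
    try:
        # public endpoint, available since Mastodon v3.4.0
        return request("/api/v1/accounts/lookup", {"acct": username})["id"]
    except Exception:
        # assumed failure mode: older instance or lookup rejected
        pass

    # fallback: search returns a list of candidate accounts
    for account in request("/api/v1/accounts/search", {"q": username}):
        if account["acct"] == username:
            return account["id"]
    return None
```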
2023-04-13 14:03:23 +02:00
Mike Fährmann
1614c5c4bf
[generic] write regular expressions without 'x' flags
2023-04-10 20:45:23 +02:00
Mike Fährmann
d84a617273
[hentaifoundry] fix setting content filters (#3887)
2023-04-09 18:04:49 +02:00
ClosedPort22
875485313f
[urlshortener] force HTTPS
2023-04-09 18:19:52 +08:00
Mike Fährmann
0a7eee3ee0
[deviantart] add 'public' option
2023-04-08 23:04:34 +02:00
Mike Fährmann
f5a59c4170
[twitter] add 'date_bookmarked' metadata (#3816)
2023-04-06 20:16:25 +02:00
Mike Fährmann
1c1f6fdc80
[twitter] fix regression from 160335ad
...
Tweets from 'homeConversation' or 'conversationthread' entries do not
contain a 'sortIndex' field. Accessing it raises a KeyError and would
erroneously get them labeled as 'deleted'.
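The pitfall can be illustrated with a hypothetical entry parser (the dictionary layout and function name are assumptions, not the extractor's real data model). Only a genuinely missing Tweet should count as deleted, so the optional field is read with 'dict.get()' outside the 'try' block:

```python
def parse_entry(entry):
    """Return (tweet, sort_index), or None if the Tweet itself is missing."""
    try:
        tweet = entry["tweet"]  # hypothetical key; absent means deleted
    except KeyError:
        return None

    # 'sortIndex' is optional on some entry types; indexing it directly
    # inside the 'try' above would turn its absence into a false "deleted".
    return tweet, entry.get("sortIndex")
```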
2023-04-06 19:22:48 +02:00
Mike Fährmann
160335ad44
[twitter] add 'date_liked' metadata for liked Tweets (#3816)
2023-04-06 18:33:45 +02:00
Mike Fährmann
6d850ce629
[twitter] calculate 'date' from Tweet IDs
...
20 times faster than parsing 'created_at'
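A sketch of how a timestamp can be derived from a Snowflake-style Tweet ID, assuming the commonly documented layout (milliseconds since the Twitter epoch in the bits above the lowest 22). The function name is made up, and IDs issued before the Snowflake scheme do not encode a time:

```python
from datetime import datetime, timedelta, timezone

# Snowflake epoch in Unix milliseconds (2010-11-04T01:42:54.657Z)
TWITTER_EPOCH_MS = 1288834974657
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_from_tweet_id(tweet_id):
    """Decode the creation time stored in the upper bits of a Tweet ID."""
    milliseconds = (int(tweet_id) >> 22) + TWITTER_EPOCH_MS
    # timedelta arithmetic keeps the millisecond part exact
    return UNIX_EPOCH + timedelta(milliseconds=milliseconds)
```

This needs one shift and one addition per Tweet, with no string parsing, which is consistent with the speedup noted above.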
2023-04-05 22:29:14 +02:00
Mike Fährmann
25949bd767
merge #3871: [hotleak] Fix downloading of creators whose name starts with a category name
2023-04-04 16:24:20 +02:00
Mike Fährmann
dbe06cdba1
[twitter] warn about 'withheld' Tweets and users (#3864)
2023-04-04 16:15:08 +02:00
Mike Fährmann
3cc1dd1572
[twitter] update query hashes
2023-04-03 23:20:20 +02:00
Mike Fährmann
3846ce0de5
[twitter] update to bookmark timeline v2 (#3859)
2023-04-03 22:46:12 +02:00
Mike Fährmann
34699fbf64
[deviantart:search] detect login redirects (#3860)
2023-04-03 19:37:12 +02:00
Mike Fährmann
e6cb92864a
[twitter] allow setting custom features per API endpoint
2023-04-03 16:18:31 +02:00
Balgden
4b141cce66
Fix indentation
2023-04-03 13:44:14 +00:00
Balgden
bbc5977121
Fix line length
2023-04-03 13:38:42 +00:00
Balgden
ffd30abcb3
[hotleak] Fix downloading of creators whose name starts with a category name
...
E.g. `hot4lexi` would start downloading the `hot` section by mistake.
This happened because the regex had a negative lookahead for the category names but did not ensure that they were followed by either end-of-string or a slash.
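The difference can be reproduced with two simplified patterns. The category list and both expressions below are illustrative assumptions, not the extractor's real pattern:

```python
import re

# Hypothetical category names, used only for this illustration.
# Broken: rejects any path segment that merely starts with a category name.
BROKEN = re.compile(r"/(?!(?:hot|creators|videos|photos))([^/?#]+)")

# Fixed: rejects a segment only if the category name is the whole segment,
# i.e. it is followed by end-of-string or a slash.
FIXED = re.compile(r"/(?!(?:hot|creators|videos|photos)(?:$|/))([^/?#]+)")


def creator_name(pattern, path):
    """Return the creator name matched at the start of 'path', or None."""
    match = pattern.match(path)
    return match.group(1) if match else None
```

With these patterns, 'creator_name(BROKEN, "/hot4lexi")' finds nothing, while 'creator_name(FIXED, "/hot4lexi")' yields 'hot4lexi' and '/hot' is still rejected.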
2023-04-03 13:30:27 +00:00
Mike Fährmann
5ca9d55595
merge #3870: [blogger] update 'sub' regex to get the highest resolution url
2023-04-03 14:47:18 +02:00
Mike Fährmann
fd7ce4c081
merge #3868: [shopify] fix 'collection' extractor
2023-04-03 14:44:46 +02:00
Mike Fährmann
135ac9c302
merge #3854: [twitter] fix: graphql_timeline_v2_bookmark_timeline cannot be null
2023-04-03 14:37:42 +02:00
enduser420
bbb1e34c34
[blogger] update sub regex
2023-04-03 12:43:58 +05:30
enduser420
96e3dd2128
[shopify] fix 'collection' extractor
2023-04-03 12:19:09 +05:30
Mike Fährmann
ac97aca99c
[realbooru] fix extraction
...
get file URLs from HTML pages
2023-04-02 20:45:16 +02:00
Mike Fährmann
75666cf9c3
[danbooru] reduce API requests for fetching extended 'metadata'
...
Instead of using one additional API request per post object (N+1),
this requires only one request per 200-post batch.
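The batching idea, sketched under assumptions: 'fetch_batch' is a hypothetical callable that takes a list of post IDs and returns a mapping from ID to extended metadata in a single request. It is not the extractor's real API:

```python
def fetch_metadata(fetch_batch, post_ids, batch_size=200):
    """Collect metadata for all posts with one request per batch of IDs.

    'fetch_batch' is a hypothetical callable(list_of_ids) -> {id: metadata}.
    """
    metadata = {}
    for start in range(0, len(post_ids), batch_size):
        batch = post_ids[start:start + batch_size]
        metadata.update(fetch_batch(batch))  # one request covers the batch
    return metadata
```

For 450 posts this issues 3 requests instead of 450.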
2023-04-02 20:11:52 +02:00
Amer Jazaerli
bebbff6578
fix: graphql_timeline_v2_bookmark_timeline cannot be null
...
twitter: 400 Bad Request (The following features cannot be null: graphql_timeline_v2_bookmark_timeline)
2023-03-31 00:06:49 +02:00
ClosedPort22
71b26adb9b
[urlshortener] add tinyurl.com as an example
2023-03-29 13:37:26 +08:00
Mike Fährmann
421db26aff
[bunkr] update domain to 'bunkr.la'
2023-03-28 20:10:36 +02:00
ClosedPort22
9e2a945013
[urlshortener] add support for bit.ly & t.co
2023-03-29 00:06:41 +08:00
Mike Fährmann
9b5e7ce8b9
[hiperdex] fix extraction
2023-03-25 18:18:27 +01:00
Mike Fährmann
89a67c45e0
[nitter] support nitter.it (#3819)
2023-03-25 13:29:22 +01:00
Mike Fährmann
88f29a751d
[nitter] skip broadcasts
...
instead of downloading an "Unsupported feature" HTML page
2023-03-25 13:09:24 +01:00
Mike Fährmann
1e013eba5a
[nitter] fix extraction for instances without user banners
2023-03-25 12:50:40 +01:00
Mike Fährmann
d94aa1ee02
[gelbooru] fix --range for favorites (#3704)
2023-03-23 22:58:13 +01:00
Mike Fährmann
1f82b00b8f
[gelbooru] fix and improve --range for pools
2023-03-23 18:22:46 +01:00
Mike Fährmann
197882cf12
[twitter] add 'hashtag' extractor (#3783)
2023-03-22 22:20:40 +01:00
Mike Fährmann
9789ebac52
[naverwebtoon] fix extraction (#3729)
2023-03-19 17:08:58 +01:00
Mike Fährmann
72f1f16eb2
[weibo] support 'mix_media_info' entries (#3793)
2023-03-18 15:19:25 +01:00
ClosedPort22
d4fb4ff47f
[twitter] extract TwitPic URLs in text (#3792)
...
also ignore previously seen URLs
2023-03-18 21:19:24 +08:00
Mike Fährmann
2bb937014f
[twitter] fall back to legacy /media endpoint when not logged in
2023-03-17 20:54:35 +01:00
Mike Fährmann
b68094d326
[twitter] support 'note_tweet's
2023-03-17 19:36:07 +01:00
Mike Fährmann
3dcabc97ed
[twitter] update API endpoints and parameters
2023-03-17 19:25:53 +01:00
Mike Fährmann
dcb8af659a
[gelbooru] extract favorites without needing cookies (#3704)
...
TODO: fix --range
2023-03-15 19:21:35 +01:00
Mike Fährmann
b756dc13aa
[gelbooru] warn about missing cookies for favorites (#3704)
...
and add docstring so it shows up in --list-extractors
2023-03-15 14:58:55 +01:00