mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-23 19:22:32 +01:00
191 Commits

Author SHA1 Message Date
Mike Fährmann
9156e90f1f
[twitter] add 'pinned' option 2021-10-29 22:10:58 +02:00
Mike Fährmann
cd66c3c415
[twitter] add 'size' option (#1881) 2021-10-05 19:14:54 +02:00
Mike Fährmann
94143eb86c
[twitter] add 'quote_by' metadata field (#1481)
Only present for tweets quoted by another tweet.
Represents the tweet_id of said tweet quoting this one.
2021-09-25 18:15:14 +02:00
Mike Fährmann
da16eabb82
[twitter] ensure card entries have a 'url' (#1868) 2021-09-23 18:02:19 +02:00
Mike Fährmann
0fd959a2a7
[twitter] support '/with_replies' URLs (closes #1833) 2021-09-10 20:44:26 +02:00
Mike Fährmann
6651da27e9
[twitter] fix 'url' extraction for users without 'expanded_url'
(#1532, #1787)
2021-08-27 18:41:16 +02:00
Mike Fährmann
ae78d95a5f
[twitter] fix issue when filtering quote tweets (#1792)
When a user quotes his own Tweet and that Tweet gets filtered by
'"quoted": false', it could also get filtered when it appeared later
as regular Tweet.
2021-08-25 20:04:22 +02:00
Mike Fährmann
0817f468ef
[twitter] expand t.co links in user descriptions (#1532, #1787) 2021-08-23 23:34:59 +02:00
Mike Fährmann
7c0ae88185
[twitter] add 'url' to user objects (#1532, #1787) 2021-08-23 22:51:35 +02:00
Mike Fährmann
5919dc5b5a
[twitter] slightly improve '_transform_user()' 2021-08-23 22:28:09 +02:00
Mike Fährmann
6b56b3ebe1
[twitter] report API errors as generic StopExtraction exceptions
prevents duplicate logging messages for nonexistent users
(#1759)
2021-08-21 22:46:22 +02:00
Mike Fährmann
c866fcba48
[twitter] fix 'logout' (#1719)
delete 'auth_token' cookie and cookies.txt path
2021-08-16 01:36:34 +02:00
Mike Fährmann
52984f7e22
[twitter] add option to log out when blocked (#1719) 2021-08-12 19:11:41 +02:00
Mike Fährmann
e5a93e113f
[twitter] extend 'replies' option (#1254)
Allow setting 'replies' to '"self"' to only download from self-replies.
2021-08-10 22:14:00 +02:00
Mike Fährmann
229498b8aa
[twitter] warn about suspended accounts etc (closes #1759) 2021-08-09 02:58:27 +02:00
Mike Fährmann
414bdc95a3
[twitter] set 'retweet_id' for original retweets (#1481) 2021-07-02 21:50:37 +02:00
Mike Fährmann
5323c1c73a
[twitter] ensure guest tokens are returned as string (#1665) 2021-07-01 14:35:53 +02:00
Mike Fährmann
035562bd11
[twitter] remove old-style URLs from image fallback lists 2021-06-28 16:25:24 +02:00
Mike Fährmann
a751afdfb3
[twitter] change some defaults
- 'retweets' option: true -> false
- 'quoted' option  : true -> false

  i.e. disable downloading tweets from other users' timelines by default

- search directory:
    '["{category}", "Search", "{search}"]' ->
    '["{category}", "{user[name]}"]'

  i.e. change it to the same as other twitter extractors (#1308)
2021-06-11 21:26:11 +02:00
Mike Fährmann
b5affc62aa
[twitter] rename 'text-only' to 'text-tweets' (#570) 2021-05-22 21:41:12 +02:00
Mike Fährmann
724ca61f36
[twitter] add 'text-only' option (#570) 2021-05-22 17:01:49 +02:00
Mike Fährmann
394fbb5f56
[twitter] strip useless t.co links (#1532)
The 'full_text' of Tweets with media content usually ends with a t.co
link to itself. This commit removes those.
2021-05-17 00:20:29 +02:00
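
The stripping described in this commit can be sketched as follows. This is a minimal illustration, not gallery-dl's actual code: the helper name `strip_trailing_tco` and the regular expression are assumptions chosen for the example.

```python
import re

# Assumed pattern: one t.co link (plus surrounding whitespace) at the very
# end of the text. Links in the middle of the text are left untouched.
_TRAILING_TCO = re.compile(r"\s*https://t\.co/\w+\s*$")


def strip_trailing_tco(full_text: str) -> str:
    """Return 'full_text' without a trailing t.co link (hypothetical helper)."""
    return _TRAILING_TCO.sub("", full_text)
```

Anchoring the pattern at the end of the string keeps t.co links that appear inside the sentence, which is the distinction the commit message draws.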
Mike Fährmann
41457dbb1b
[twitter] resolve t.co URLs in 'content' (#1532) 2021-05-15 18:52:37 +02:00
Mike Fährmann
17b0ccb071
[twitter] add missing retweet media entities (fixes #1555)
from the original tweets
2021-05-14 22:51:01 +02:00
Mike Fährmann
fd858eed7b
[twitter] add 'user_likes' metadata field for liked tweets
i.e. the 'screen_name' of the user whose liked tweets get extracted.

Ideally this would replace 'user' or at least be in the same format,
but that would break backwards compatibility or be impossible/too
complicated thanks to API result differences.

(#1421)
2021-04-02 03:41:41 +02:00
Mike Fährmann
8d124a3766
[twitter] rename variables 2021-04-02 02:49:53 +02:00
Mike Fährmann
105f3c9666
[twitter] add extractor for direct image links (closes #1417) 2021-04-02 02:45:23 +02:00
Mike Fährmann
ebd142e2a8
[twitter] don't use youtube-dl for cards when videos are disabled
(#1416)
2021-04-01 14:26:08 +02:00
Mike Fährmann
ccfa5a8694
[twitter] better error message when logging in with 2FA (#1409) 2021-03-27 18:26:37 +01:00
Mike Fährmann
2846235669
[twitter] allow specifying a custom format for user results
(#1337)
2021-03-21 22:26:26 +01:00
Mike Fährmann
3378b39719
[twitter] implement 'users' option (#1337) 2021-03-16 00:51:05 +01:00
Mike Fährmann
5d69e437d0
[twitter] add option to download all media from a conversation
(fixes #1319)
2021-02-26 13:50:46 +01:00
Mike Fährmann
de0656941b
[twitter] add extractor for followed users (#1337)
https://twitter.com/USER/following or
https://twitter.com/id:USERID/following
2021-02-22 18:22:01 +01:00
Mike Fährmann
5542a11c46
[twitter] update GraphQL endpoints 2021-02-20 02:09:17 +01:00
Mike Fährmann
24e8e398e0
[twitter] skip login if 'auth_token' cookie is present 2021-01-25 15:03:59 +01:00
Mike Fährmann
95e5911895
[twitter] match '/i/user/ID' URLs 2021-01-20 00:33:57 +01:00
Mike Fährmann
069b113cbf
[twitter] improve and fix retry after hitting rate limit
- replace recursive call with infinite loop
- fix function arguments for recursive call
2021-01-19 23:50:07 +01:00
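
Replacing a recursive retry with a loop, as this commit describes, might look roughly like the sketch below. The function name `call_with_retry`, the `(status, headers, body)` tuple shape, and the 60-second fallback wait are all assumptions for illustration and are not taken from gallery-dl.

```python
import time


def call_with_retry(request, sleep=time.sleep):
    """Call 'request()' in a loop until it returns a non-429 response.

    'request' is a hypothetical callable returning (status, headers, body).
    """
    while True:
        status, headers, body = request()
        if status != 429:
            return body
        # Wait until the advertised reset time; the 60 s fallback for a
        # missing header is an assumed value, not the project's.
        reset = headers.get("x-rate-limit-reset")
        delay = max(float(reset) - time.time(), 0.0) if reset else 60.0
        sleep(delay)
```

A loop keeps the call stack flat no matter how often the rate limit is hit, and the arguments of the original call cannot be lost or mangled on the way into a nested call.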
Mike Fährmann
780b6adb91
rename 'generate_csrf_token()' to just 'generate_token()'
and add a 'size' argument
2021-01-11 22:12:40 +01:00
Mike Fährmann
25074aec47
[twitter] fetch media from pinned tweets (#1203) 2020-12-29 16:27:43 +01:00
Mike Fährmann
2475176d99
[twitter] fetch tweets from 'homeConversation' entries
When logged in, some entries returned by Twitter's API are so-called
'homeConversation's (they would be regular tweet entries otherwise).

Those weren't picked up before and resulted in missing files compared
to accessing a timeline as guest.

('/media' timelines and search results were not affected)
2020-12-29 00:42:46 +01:00
Mike Fährmann
3af9350648
[twitter] update API calls
- use 'https://twitter.com/i/api' for all requests
  except '/guest/activate.json'
- update (default) URL parameters
- update GraphQL endpoints
2020-12-28 22:05:48 +01:00
Mike Fährmann
b656b829db
[twitter] fix login with username & password
It is no longer possible to get an 'authenticity_token' from Twitter's
Javascript-free login form, which was disabled a few days ago.

Generating a random 16 byte hex string client-side and sending that as
a cookie alongside the regular login form works just as well.
2020-12-28 16:10:19 +01:00
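
Generating such a token client-side can be sketched with the standard library. The function below only illustrates the idea of a random 16-byte hex string; it is not necessarily how gallery-dl implements it, and the use of the `secrets` module is an assumption.

```python
import secrets


def generate_token(size: int = 16) -> str:
    """Return 'size' random bytes encoded as a hex string.

    16 bytes yield 32 hexadecimal characters.
    """
    return secrets.token_hex(size)
```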
Mike Fährmann
a00b60fbe7
[twitter] update 'x-csrf-token' header (fixes #1170)
Twitter started using a bigger (80 instead of 16 bytes) CSRF token for
logged-in users, and expects those to be used as 'x-csrf-token' header
when sent via 'ct0' cookie.

Generating an 80 byte token ourselves doesn't work, and Twitter will
still insist on using its own.
2020-12-11 13:46:58 +01:00
Mike Fährmann
63e61a0932
[twitter] update image URL format (#1145)
use
'/<name>?format=<fmt>&name=<size>'
instead of the potentially deprecated
'/<name>.<fmt>:<size>'

but keep all of them as fallback URLs
2020-12-01 11:53:51 +01:00
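
The two URL formats named in this commit can be generated side by side. The sketch below is an assumption-laden illustration: the function name `image_url_candidates`, the placeholder host, and the ordering (all new-style URLs first, then old-style ones) are not taken from the source.

```python
def image_url_candidates(base: str, fmt: str, sizes):
    """Build candidate URLs for one image (hypothetical helper).

    'base' is the URL up to and including the image name.
    New-style URLs come first; old-style ones follow as fallbacks.
    """
    new_style = [f"{base}?format={fmt}&name={size}" for size in sizes]
    old_style = [f"{base}.{fmt}:{size}" for size in sizes]
    return new_style + old_style


# Placeholder host and name, used only for demonstration.
candidates = image_url_candidates(
    "https://example.org/media/NAME", "jpg", ["orig", "large"])
```

A downloader would try the first entry and move on to the next one only when a request fails.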
Mike Fährmann
ddfb4fd07a
[twitter] use 'https://twitter.com/i/api/' for logged in users
Doesn't seem to make a difference from what I can tell,
i.e. downloaded files are the same, but the website does it.
2020-11-16 11:26:37 +01:00
Mike Fährmann
de0c57886d
[twitter] add 'list-members' extractor (closes #1096) 2020-11-13 06:47:45 +01:00
Mike Fährmann
41d4968866
[twitter] add 'list' extractor (#1096) 2020-11-05 22:55:38 +01:00
Mike Fährmann
5d10520f4c
[twitter] update GraphQL endpoint & fix width/height entries 2020-11-05 22:53:29 +01:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'

According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
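
The effect of treating only '/', '?', and '#' as delimiters can be shown with a small pattern. The regular expression below is an illustrative assumption and not one of gallery-dl's actual extractor patterns.

```python
import re

# A path segment ends at '/', '?', or '#'; '&' is an ordinary character here.
USER_PATTERN = re.compile(r"https?://(?:www\.)?twitter\.com/([^/?#]+)")


def first_segment(url: str):
    """Return the first path segment of a matching URL, or None."""
    match = USER_PATTERN.match(url)
    return match.group(1) if match else None
```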
Mike Fährmann
1686dc1757
[twitter] support media from Cards (#1005, #937)
Can be enabled with 'extractor.twitter.cards', but for now disabled by
default because cards can redirect to rather large videos from YouTube
or Twitch.
2020-10-22 21:33:53 +02:00
Mike Fährmann
a3ca2f6080
update fallback URL handling
remove Message.Urllist and use a '_fallback' field inside a kwdict
2020-10-16 01:09:55 +02:00
Mike Fährmann
1b1cf01d0d
add a general 'generate_csrf_token()' function 2020-10-15 15:14:18 +02:00
Mike Fährmann
844502cad5
update extractor test results 2020-10-03 19:24:19 +02:00
Mike Fährmann
430b6d6e2e
[twitter] extend 'retweets' option (closes #1026)
Setting 'retweets' to '"original"' will use metadata from the
original retweeted Tweets, and not from the Retweet entry.
2020-09-28 23:03:35 +02:00
Mike Fährmann
aeb0d32333
[twitter] improve twitpic extraction (fixes #1019)
- ignore twitpic.com/photos/… URLs
- ignore empty image URLs
2020-09-22 22:22:35 +02:00
Mike Fährmann
2b8d57f0ab
[twitter] support '/intent/user?user_id=…' URLs (#980) 2020-09-08 23:17:50 +02:00
Mike Fährmann
a3b473bd2f
[twitter] support specifying users by ID (#980)
by using 'id:…' as their screen name, i.e.
https://www.twitter.com/id:2976459548/media
instead of
https://twitter.com/supernaturepics/media

The user ID can, for example, be obtained from the output of
$ gallery-dl -j --range 1 https://twitter.com/<screen-name>
2020-09-08 22:56:52 +02:00
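
Distinguishing the 'id:…' form from a regular screen name could be sketched like this. The helper name `parse_user` and the returned dictionary keys are assumptions made for the example.

```python
def parse_user(segment: str) -> dict:
    """Interpret a URL path segment as a user ID or a screen name.

    Hypothetical helper: 'id:2976459548' selects a user by numeric ID,
    anything else is treated as a screen name.
    """
    if segment.startswith("id:"):
        return {"user_id": segment[3:]}
    return {"screen_name": segment}
```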
Mike Fährmann
8f64585ff2
[twitter] handle 429 responses without x-rate-limit-reset header 2020-07-23 22:38:17 +02:00
Mike Fährmann
2da71cb561
[twitter] raise proper exception if user doesn't exist (#891) 2020-07-16 15:00:31 +02:00
Leonardo Taccari
86e5a05e29
[twitter] add support for nitter.net URLs in pattern (#890)
Please note that URLs are only "translated"; all requests are still
always made via the Twitter API.
2020-07-13 23:48:42 +02:00
Mike Fährmann
3855d0dd3c
[twitter] add debug messages for all skipped Tweets (#867) 2020-07-11 00:41:50 +02:00
Mike Fährmann
6e2af9a8d8
[twitter] improve error message formatting 2020-07-06 23:13:05 +02:00
Mike Fährmann
9da2bc67f8
[twitter] add option to filter media from quoted tweets (#854) 2020-06-25 18:59:25 +02:00
Mike Fährmann
56ab5fb8f4
[twitter] improve handling of quoted tweets (#854)
Split each "quote" into two parts:
- the original tweet
- the tweet that quoted the original
2020-06-24 21:14:18 +02:00
Mike Fährmann
a8c2d997e8
[twitter] treat quoted tweets like retweets (#833)
- filter them when 'retweets' is disabled
- set 'author' to the creator of the quoted tweet

like it was before the rewrite
2020-06-21 19:14:12 +02:00
Mike Fährmann
aed1c63e51
[twitter] improve search results (fixes #847)
Adding 'tweet_search_mode=live' to the query parameters
is the most important part here.
2020-06-21 15:53:20 +02:00
Mike Fährmann
d81a8e6544
[twitter] update tests 2020-06-19 23:01:02 +02:00
Mike Fährmann
d39eedd9bb
[twitter] improve handling of deleted tweets (fixes #838) 2020-06-19 18:11:37 +02:00
Mike Fährmann
dc16f73965
[twitter] move '_guest_token()' into TwitterAPI class 2020-06-18 15:02:51 +02:00
Mike Fährmann
3561d1020a
[twitter] always provide an 'author' field (#831, #833)
The idea was to have less metadata clutter for most Tweets where
'author' and 'user' are the same (non-retweets), and only provide
a 'user' field.

The original Tweet author could be gotten with
{author[…]|user[…]}, but basically no one knows about that.
2020-06-18 15:02:51 +02:00
Mike Fährmann
c37a1c06c8
[twitter] add extractor for liked tweets (closes #837)
You need to be logged in to get access to anyone's liked tweets,
it seems.
2020-06-16 14:27:22 +02:00
Mike Fährmann
b94394104c
[twitter] don't download video previews (#833)
when 'videos' is set to False
2020-06-16 14:10:51 +02:00
Mike Fährmann
036a40943a
[twitter] don't cache results of 'user_by_screen_name()'
A 'keyarg=1' argument to the memcache decorator would have worked as
well, but keeping the user object in memory isn't useful for the vast
majority of use cases and only wastes space.

(closes #817)
2020-06-10 20:58:42 +02:00
Mike Fährmann
4442dfe7b8
[twitter] add 'reply_to' metadata to replies 2020-06-09 21:48:04 +02:00
Mike Fährmann
d769bb4b80
[twitter] improve pagination 2020-06-07 15:23:45 +02:00
Mike Fährmann
5bc1097f9d
[twitter] metadata cleanup #2
- remove useless clutter by creating new tweet-data dicts instead of
  reusing the original Tweet objects
- rename fields to how they were named before
  ('id_str' -> 'tweet_id', etc.)
- only include 'author' if it would differ from 'user'
- restore 'archive_fmt'
2020-06-07 02:25:29 +02:00
Mike Fährmann
3eed5f52d7
[twitter] small metadata cleanup
- add 'date' field
- remove 'entities' and 'extended_entities'
- don't include 'focus_fields' from 'original_info'
2020-06-04 18:21:54 +02:00
Mike Fährmann
655c98cbef
[twitter] skip unavailable tweets 2020-06-04 14:51:25 +02:00
Mike Fährmann
2132e5461a
[twitter] restore TwitPic support 2020-06-04 01:22:34 +02:00
Mike Fährmann
bd0f21478a
[twitter] login using the mobile nojs login page 2020-06-04 00:07:12 +02:00
Mike Fährmann
a10f31dde5
[twitter] rewrite; use new interface (#740, #806)
Everything except logging in with username & password and TwitPic
embeds should be working again.

Metadata per Tweet is massively different than before (mostly raw API
responses - might need some cleaning up) and the default 'archive_fmt'
changed.
2020-06-03 20:51:29 +02:00
Mike Fährmann
45baa13615
update extractor test results
- don't run Instagram tests on Travis anymore
- replace Twitter test because timeline was made private
- update Hiperdex domain to '.com' (again ...)
2020-05-28 02:18:06 +02:00
Mike Fährmann
9f638c2e01
[twitter] add 'replies' option (closes #705) 2020-04-29 23:20:06 +02:00
Mike Fährmann
d3b3b30107
update test results 2020-04-26 22:14:28 +02:00
Mike Fährmann
3eab07739f
[twitter] ensure videos have a 'filename'
This usually gets set when invoking the 'ytdl' downloader, but when
that fails, the error message would use 'None' as filename.
2020-04-24 22:34:19 +02:00
Mike Fährmann
c4371a6970
[twitter] add 'reply' metadata field (#705) 2020-04-24 22:31:24 +02:00
Mike Fährmann
d02f7c1118
improve Extractor.wait()
- allow 'until' to be a datetime object
- do "time calculations" with UTC timestamps
- set a default 'reason'
2020-04-05 21:23:05 +02:00
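
Accepting either a timestamp or a datetime object for 'until' and computing the wait in UTC could be sketched as below. The function name `seconds_until` and the choice to treat naive datetime objects as UTC are assumptions, not a description of Extractor.wait() itself.

```python
import time
from datetime import datetime, timezone


def seconds_until(until, now=None) -> float:
    """Return the non-negative number of seconds from 'now' until 'until'.

    'until' may be a UTC timestamp or a datetime object; naive datetime
    objects are assumed to be in UTC (an assumption of this sketch).
    """
    if isinstance(until, datetime):
        if until.tzinfo is None:
            until = until.replace(tzinfo=timezone.utc)
        until = until.timestamp()
    if now is None:
        now = time.time()
    return max(float(until) - now, 0.0)
```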
Mike Fährmann
b607d0ad7f
[twitter] fix typo in 'x-twitter-auth-type' header (#625) 2020-03-21 23:11:39 +01:00
Mike Fährmann
2d5703c493
[twitter] use a simpler data structure to store cookies in cache
Use a dict with name-value pairs instead of an entire
RequestsCookieJar object.
2020-03-12 22:02:12 +01:00
Mike Fährmann
32df8d06fe
[twitter] add 'bookmark' extractor (closes #625) 2020-03-06 01:20:04 +01:00
Mike Fährmann
19ae6f3fc4
update test results
- twitter:

    Don't test the whole kwdict, only the actual content, since the
    keyword hash changes whenever that user changes his display name.

- khinsider:

    Download host changed
2020-02-22 03:25:32 +01:00
Mike Fährmann
74e684e828
[twitter] change default value for 'videos' to 'true'
Every other 'videos' option defaulted to 'true', except Twitter.
2020-02-14 01:03:42 +01:00
Mike Fährmann
facc5daa6d
[twitter] force old login page layout (fixes #584, fixes #598) 2020-02-02 17:24:53 +01:00
Mike Fährmann
e0dd073ce0
[twitter] replace embedded tweet test
the old one was deleted
2020-01-31 12:51:55 +01:00
Mike Fährmann
25d5ec4ff3
[twitter] add option to extract TwitPic embeds (#579) 2020-01-18 21:31:29 +01:00
Alice
f498a9057f
[twitter] Fix stop before real end (#573)
* [twitter] Fix stop before real end

Fix for https://github.com/mikf/gallery-dl/issues/544. Makes sure that it really reached the end by checking that both "min_position" is null and "has_more_items" is false before stopping.

* [twitter] Fix stop before real end (update)
2020-01-14 12:24:30 +01:00
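
The stop condition from this commit, requiring both signals before ending pagination, can be written as a small predicate. The function name `reached_end` and the use of a plain dict for the response page are assumptions for illustration.

```python
def reached_end(page: dict) -> bool:
    """Return True only if both end-of-results signals agree.

    'page' stands for one decoded response: pagination stops when
    "min_position" is null AND "has_more_items" is false.
    """
    return page.get("min_position") is None and not page.get("has_more_items")
```

Checking only one of the two fields would stop early whenever the response reports one of them prematurely, which is the failure the commit addresses.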
Mike Fährmann
43ab9572b4
[twitter] handle API rate limits (#526) 2020-01-04 23:46:29 +01:00
Mike Fährmann
5532e9c158
[twitter] handle quoted tweets (#526)
… and categorize them as retweets
2020-01-04 21:26:55 +01:00
Mike Fährmann
896896a490
[twitter] fix URLs forwarded to youtube-dl (closes #540)
Since commit 3bba763 data["user"] is an entire dict object
and no longer just the user nickname …
2019-12-25 17:28:55 +01:00
Mike Fährmann
07dafad26d
[twitter] attempt to fix infinite loops (#499)
(Hopefully this doesn't break anything else)
2019-12-03 22:55:29 +01:00