Mike Fährmann
5158cbb4c1
[weibo] rework pagination logic ( #4168 )
don't automatically stop when receiving an empty status list
shouldn't improve 'tabtype=feed' results, but at least 'tabtype=album'
ones and others using cursors won't end prematurely
2024-03-14 00:06:25 +01:00
Mike Fährmann
ace16f00f5
[weibo] fix retweets ( #2825 , #3874 , #5263 )
- handle 快转 ("quick repost") retweets
- disable 'retweets' by default
- skip all retweet media when 'retweets' are disabled
- extract all retweet media when 'retweets' is set to "original"
2024-03-06 19:36:53 +01:00
Mike Fährmann
0676a9d6ec
[weibo] fix 'livephoto' filename extensions ( #5287 )
2024-03-06 19:36:32 +01:00
Mike Fährmann
c83d0a1596
[weibo] add 'gifs' option ( #5183 )
2024-02-10 18:17:07 +01:00
Mike Fährmann
b4bcf40278
[weibo] fix AttributeError in 'user' extractor ( #5022 )
yet another bug caused by a383eca7
2024-01-05 17:18:33 +01:00
Mike Fährmann
e8b5e59a08
[weibo] detect redirects to login page ( #4773 )
2023-11-10 19:35:29 +01:00
Mike Fährmann
56cd9d408d
[weibo] fix Sina Visitor request
2023-10-30 22:14:52 +01:00
Mike Fährmann
3ecb512722
send Referer headers by default
2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside a Job,
whereas previously most of it had already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
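The decoupling described in a383eca7f6 can be pictured with a minimal sketch. The class name `SketchExtractor`, the plain `config` dict, and the attribute names are hypothetical and are not gallery-dl's actual code; only the idea of an `initialize()` step separate from `__init__()` comes from the commit message.

```python
class SketchExtractor:
    """Hypothetical extractor whose constructor does no expensive setup."""

    def __init__(self, url):
        # Cheap step: only remember the input. No session, cookies,
        # or config options are touched here.
        self.url = url
        self.initialized = False
        self.options = {}

    def initialize(self, config):
        # The actual init happens here, so a caller can still adjust
        # `config` between construction and initialization.
        self.options = dict(config)
        self.initialized = True


config = {"retweets": False}
extractor = SketchExtractor("https://example.org/status/1")
config["retweets"] = "original"  # adjusted after the object already exists
extractor.initialize(config)
```

Because nothing is read from `config` in the constructor, the adjustment made after construction is the value the extractor ends up with.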
Mike Fährmann
d97b8c2fba
consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
1d4db83d49
[weibo] fix end of cursor based pagination
2023-07-04 17:41:22 +02:00
Mike Fährmann
654267a335
[weibo] fix 'json' extension for some videos
2023-06-15 13:49:17 +02:00
Mike Fährmann
0a9aaa7a8d
[weibo] prevent fatal exception due to missing video ( #4150 )
2023-06-08 22:22:43 +02:00
Mike Fährmann
6b6bb4be73
[weibo] require numeric IDs to have length >= 10 ( #4059 )
2023-05-14 18:45:37 +02:00
Mike Fährmann
72f1f16eb2
[weibo] support 'mix_media_info' entries ( #3793 )
2023-03-18 15:19:25 +01:00
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2023-02-09 15:22:00 +01:00
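As an illustration of the change named in dd884b02ee, the standard library allows calling a decoder's `decode` method directly instead of going through `json.loads`. The variable names and sample payload below are assumptions chosen for the example, and the stated motivation (skipping the per-call argument handling in `json.loads`) is a plausible reading, not something the commit message spells out.

```python
import json

# Create one decoder and keep a reference to its bound 'decode' method.
# For a plain str input this yields the same result as json.loads().
_decode = json.JSONDecoder().decode

payload = '{"status": {"count": 3}}'
data = _decode(payload)
```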
Mike Fährmann
7e277d0f7d
[weibo] add 'count' metadata field ( #3305 )
or '{status[count]}', as most metadata for weibo is inside 'status'
2022-11-30 11:36:46 +01:00
Mike Fährmann
c25905641e
[weibo] fix bug with empty 'playback_list' ( #3301 )
2022-11-26 12:00:17 +01:00
Mike Fährmann
e3abab8629
[weibo] send 'Referer' headers ( #3188 )
2022-11-10 17:11:57 +01:00
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2022-11-05 01:14:09 +01:00
Mike Fährmann
1c89ccb27d
[weibo] prevent errors when paginating over album entries ( #2817 )
2022-08-11 12:22:14 +02:00
Mike Fährmann
0f5826e884
[weibo] prevent exception for missing 'playback_list' ( #2792 )
2022-07-30 16:49:08 +02:00
Mike Fährmann
c6a9bab019
update extractor test results
2022-07-12 15:49:22 +02:00
Mike Fährmann
539e3bbed9
[weibo] handle invalid/broken status objects
2022-07-12 15:49:09 +02:00
Mike Fährmann
6db77d4656
[weibo] support '?tabtype=video' listings ( #2601 )
2022-06-12 17:55:23 +02:00
Mike Fährmann
45c980daf0
[weibo] fix retweets ( #2601 )
2022-06-11 15:30:26 +02:00
Mike Fährmann
61cbf8318c
[weibo] fix URLs generated by 'user' extractor ( #2601 )
2022-06-05 21:37:57 +02:00
Mike Fährmann
e59bcb8437
[weibo] ensure media URLs use https://
2022-06-03 17:37:57 +02:00
Mike Fährmann
73f673e3ca
[weibo] handle 'gif' pictures
2022-06-03 17:33:14 +02:00
Mike Fährmann
57508d3bb7
[weibo] support all different 'tabtype' listings ( #686 , #2601 )
2022-06-03 16:36:22 +02:00
Mike Fährmann
7a9cba9c10
[weibo] add support for usernames in URLs ( #1662 )
2022-05-31 22:48:34 +02:00
Mike Fährmann
4bf5bc2403
[weibo] support 'livephoto' entries ( #2146 )
2022-05-31 15:35:24 +02:00
Mike Fährmann
a0692818af
[weibo] switch to desktop API ( #2601 )
2022-05-31 12:46:35 +02:00
Mike Fährmann
afde76269c
[weibo] fix infinite retries for deleted accounts ( fixes #2521 )
2022-04-27 20:23:11 +02:00
Mike Fährmann
e670dc518e
[weibo] update pagination code ( fixes #2244 )
- send proper headers and query parameters
- use 'since_id' instead of page numbers
- set a 1-2 second delay between requests
2022-01-31 19:16:01 +01:00
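The pagination approach listed in e670dc518e (a 'since_id' cursor instead of page numbers, with a 1-2 second delay between requests) might look roughly like the sketch below. The function `paginate`, the `fetch_page` callback, and the response keys `"list"` and `"since_id"` are assumptions made for illustration; they are not taken from the extractor's source or from the Weibo API.

```python
import random
import time


def paginate(fetch_page, delay=(1.0, 2.0), sleep=time.sleep):
    """Yield statuses by following a 'since_id' cursor.

    An empty status list does not end the loop; only a missing
    cursor does.
    """
    since_id = None
    while True:
        page = fetch_page(since_id)
        yield from page["list"]
        since_id = page.get("since_id")
        if not since_id:
            return
        # Wait a random 1-2 seconds before the next request.
        sleep(random.uniform(*delay))


# Fake responses keyed by cursor, standing in for real HTTP requests.
pages = {
    None: {"list": [1, 2], "since_id": "a"},
    "a": {"list": [], "since_id": "b"},
    "b": {"list": [3], "since_id": None},
}
result = list(paginate(pages.__getitem__, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the delay out of the way when trying the sketch against fake data.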
Mike Fährmann
c80b18a477
[weibo] extend 'retweets' option ( closes #1542 )
Setting 'retweets' to "original" will use metadata from the
original posts, and not from the retweeted ones.
2021-05-27 23:09:42 +02:00
Mike Fährmann
73373c06ec
[weibo] handle posts with more than 9 images ( closes #926 )
Responses from '/api/container/getIndex' don't list more than
9 images per 'status' object, but the embedded JSON from a
'/detail/<ID>' page does.
2020-10-06 18:16:08 +02:00
Mike Fährmann
c51fbd72ba
update extractor test results
2020-07-13 22:57:48 +02:00
Mike Fährmann
7158bdd7c7
[weibo] improve extractor logic ( #829 )
2020-06-18 15:00:31 +02:00
Mike Fährmann
d5d90a0450
[weibo] add 'date' field to 'status' objects ( #829 )
2020-06-16 14:46:46 +02:00
Mike Fährmann
5e2974d699
[weibo] add 'videos' option
2020-04-30 00:00:30 +02:00
Mike Fährmann
699036ea0c
[weibo] accept status URLs with non-numeric IDs ( #664 )
2020-03-31 22:46:50 +02:00
Mike Fährmann
e35c2ea1a6
[weibo] use youtube-dl to download from m3u8 manifests
2020-01-24 23:39:34 +01:00
Mike Fährmann
922b8a9595
[weibo] raise NotFoundError for unavailable/deleted statuses
2019-12-14 22:10:02 +01:00
Mike Fährmann
d1ea08c67d
[weibo] fixes and improvements
- ignore unavailable videos (fixes #427 )
- handle empty 'geo' fields
- consistent metadata fields for images and videos
2019-09-26 14:57:35 +02:00
Mike Fährmann
17c11393f5
[weibo] allow user-ids in status URLs
2019-03-30 18:38:58 +01:00
Mike Fährmann
973a720a7a
[weibo] fix unit test URL patterns
2019-03-15 15:19:39 +01:00
Mike Fährmann
19860655a3
[weibo] add 'user' and 'status' extractors
2019-02-17 18:18:31 +01:00