Mike Fährmann
3ecb512722
send Referer headers by default
2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This allows a Job, for example, to adjust config access before it
takes place; previously, most of it had already happened by the time
'extractor.find()' was called.
2023-07-25 22:16:16 +02:00
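A minimal sketch of the constructor/initialize() split described in a383eca7f6, not the project's actual code. The class name 'SketchExtractor', the '_initialized' flag and the dict used as a session are all hypothetical stand-ins; only the idea of a separate 'initialize()' step comes from the commit message.

```python
class SketchExtractor:
    """Hypothetical stand-in, not the project's real base class."""

    def __init__(self, match):
        # Cheap: only remember what was matched, do no setup yet.
        self.match = match
        self.session = None
        self._initialized = False

    def initialize(self):
        # The actual setup (session, cookies, config options) lives here
        # and runs at most once, whenever the caller asks for it.
        if self._initialized:
            return
        self.session = {"headers": {}, "cookies": {}}  # placeholder session
        self._initialized = True


ext = SketchExtractor(match="https://example.org/pin/1")
assert ext.session is None   # constructing the object did no setup
ext.initialize()             # the caller decides when setup happens
```

With this shape, code that runs between construction and 'initialize()' can still change how options are looked up.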
Mike Fährmann
850df34c31
remove '&' from URL patterns part 2
...
follow-up on 968d3e8465
2023-05-03 20:26:25 +02:00
Mike Fährmann
4d415376d1
[pinterest] fix 'pin.it' extractor
...
it really was just the single '/' at the end of the url_shortener URL
2023-05-03 20:05:10 +02:00
Mike Fährmann
657b6a9100
[pinterest] update endpoint for related board pins
2023-05-03 18:41:09 +02:00
Mike Fährmann
0b93420a81
[pinterest] unescape search terms ( #3621 )
2023-02-15 15:44:20 +01:00
Mike Fährmann
5503ac4d5e
replace json.dumps with direct calls to JSONEncoder.encode
2023-02-09 15:51:40 +01:00
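A short illustration of the change named in 5503ac4d5e, using only the standard library. The payload keys and the compact 'separators' setting are assumptions chosen for the example, not taken from the commit.

```python
import json

# Hypothetical request payload.
data = {"options": {"board_id": "123", "page_size": 25}, "context": {}}

# Build one encoder up front and reuse its bound 'encode' method,
# instead of going through json.dumps() on every call.
encode = json.JSONEncoder(separators=(",", ":")).encode

compact = encode(data)
same = json.dumps(data, separators=(",", ":"))
```

Both calls produce the same string; the direct call skips the argument handling that 'json.dumps' repeats each time.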
Mike Fährmann
9116398c1c
[pinterest] add 'domain' option ( #3484 )
...
use input URL domain by default
2023-01-04 17:20:14 +01:00
Mike Fährmann
294108c90a
[pinterest] support 'All Pins' boards ( #2855 , #3484 )
2023-01-03 19:11:20 +01:00
Mike Fährmann
311e9383af
[pinterest] handle section pins with separate extractors ( #2684 )
2022-07-03 18:12:16 +02:00
Mike Fährmann
0b33435da5
[pinterest] support multiple files per pin ( closes #1619 , #2452 )
2022-04-06 21:21:33 +02:00
Mike Fährmann
9c5d2d7af3
[pinterest] add extractor for created pins ( #2452 )
2022-04-01 16:59:58 +02:00
Mike Fährmann
9313d4dc10
[pinterest] do not force 'm3u8_native' for video downloads ( #2436 )
2022-03-21 10:11:51 +01:00
Mike Fährmann
36291176bc
[pinterest] add 'search' extractor ( #1411 )
2021-03-29 01:41:28 +02:00
Mike Fährmann
780b6adb91
rename 'generate_csrf_token()' to just 'generate_token()'
...
and add a 'size' argument
2021-01-11 22:12:40 +01:00
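One way a 'generate_token()' helper with a 'size' argument, as named in 780b6adb91, could look. This is a sketch only: the default of 16 bytes, the hexadecimal output and the use of 'os.urandom' are assumptions, not the project's confirmed implementation.

```python
import binascii
import os


def generate_token(size=16):
    """Return a random token of 'size' bytes as 2 * size hex digits."""
    return binascii.hexlify(os.urandom(size)).decode()


token = generate_token()       # 32 hex digits
short = generate_token(8)      # 16 hex digits
```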
Mike Fährmann
8a88025dc4
[pinterest] support generic user URLs ( #1205 )
...
i.e. https://www.pinterest.com/USERNAME
also renames 'BoardsExtractor' to 'UserExtractor'
2021-01-02 02:36:53 +01:00
Mike Fährmann
6cdbab07b5
[pinterest] add support for getting all boards of a user
...
(#1205 )
2020-12-29 16:57:03 +01:00
Mike Fährmann
371e9ca6df
[pinterest] implement video support ( closes #1189 )
2020-12-21 16:09:06 +01:00
Mike Fährmann
b8daabc3ca
[pinterest] implement login support ( closes #1055 )
...
being logged in allows access to secret/protected boards

2020-10-15 15:14:18 +02:00
Mike Fährmann
26a967cbd4
[pinterest] match 'pinterest.co.uk' URLs ( fixes #914 )
2020-07-27 14:41:34 +02:00
Mike Fährmann
0e714b9a0e
[pinterest] add 'section' extractor ( #835 )
2020-06-21 00:08:14 +02:00
Mike Fährmann
5ba90f72ca
[pinterest] add support for sections ( closes #835 )
2020-06-16 14:41:05 +02:00
Mike Fährmann
32d7195d08
[pinterest] improve detection of invalid pin.it links
2020-01-18 21:06:44 +01:00
Mike Fährmann
1f2a69f3c5
add '_extractor' information to redirect results
2019-12-29 23:37:34 +01:00
Mike Fährmann
c4702ec9b6
simplify some logging calls
2019-12-10 21:30:08 +01:00
Mike Fährmann
da6789b2b0
disable unique archive id checks for some tests
...
- same image twice in a livedoor blog post
- unreliable results for related pinterest items
2019-11-10 17:04:51 +01:00
Mike Fährmann
4409d00141
embed error messages in StopExtraction exceptions
2019-10-28 16:39:49 +01:00
Mike Fährmann
fdec59f8e2
replace extractor.request() 'expect' argument
...
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
2019-07-05 00:42:16 +02:00
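A rough sketch of how the 'fatal' and 'notfound' arguments from fdec59f8e2 might interact. The helper 'check_status', both exception classes and the exact semantics (that 'fatal=False' tolerates 4xx codes, and that 'notfound' carries a label for the error) are assumptions made for illustration.

```python
class HttpError(Exception):
    """Hypothetical error for rejected status codes."""


class NotFoundError(HttpError):
    """Hypothetical error raised for a 404 when 'notfound' is set."""


def check_status(status, fatal=True, notfound=None):
    """Return True for success, False for a tolerated 4xx, else raise."""
    if status < 400:
        return True
    if notfound and status == 404:
        raise NotFoundError(notfound)
    if not fatal and status < 500:
        return False  # 4xx allowed; the caller inspects the response
    raise HttpError(status)
```

For example, 'check_status(403, fatal=False)' returns False instead of raising, while 'check_status(404, notfound="pin")' raises 'NotFoundError'.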
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107
simplify extractor constants
...
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
6126615698
update URLs for supportedsites.rst
2019-01-30 16:18:22 +01:00
Mike Fährmann
98c6520384
[pinterest] update root URL of API calls
2019-01-14 15:22:04 +01:00
Mike Fährmann
40e30694f3
[pinterest] fix pin.it redirects
2018-12-02 19:38:50 +01:00
Mike Fährmann
7f6a0be982
adjust some tests
2018-11-15 22:50:04 +01:00
Mike Fährmann
3bdfc15be1
[pinterest] don't crash on pins without image info
2018-11-14 11:46:14 +01:00
Mike Fährmann
1532d1b690
fix 'range' tests and update a few test results
2018-10-08 23:53:58 +02:00
Mike Fährmann
d3f1eed2a6
[pinterest] improvements
...
- add stop condition for pin-related pins
- improve URL patterns
- make Pylint happy
2018-08-16 18:11:39 +02:00
Mike Fährmann
63fa0b2006
[pinterest] add extractors for related pins
...
Related pins can now be accessed by adding a "#related" fragment
to the end of a Pinterest URL, for example:
- https://www.pinterest.com/pin/858146903966145189/#related
- https://www.pinterest.com/g1952849/test-/#related
There are no explicit real URLs for related pins,
using an option to enable them results in "clunky" code,
and a custom "related:<URL>" scheme doesn't feel right either.
2018-08-15 21:49:45 +02:00
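A sketch of how URL patterns could tell a plain pin URL from one carrying the "#related" fragment mentioned in 63fa0b2006. Both regular expressions and their names are hypothetical; the commit does not show the patterns actually used.

```python
import re

BASE = r"(?:https?://)?(?:www\.)?pinterest\.com"

# Hypothetical patterns: the second one requires the '#related' fragment.
PIN_PATTERN = re.compile(BASE + r"/pin/(\d+)/?$")
RELATED_PIN_PATTERN = re.compile(BASE + r"/pin/(\d+)/?#related$")

url = "https://www.pinterest.com/pin/858146903966145189/#related"
related = RELATED_PIN_PATTERN.match(url)
plain = PIN_PATTERN.match(url)   # None: the fragment is not allowed here
```

Keeping the fragment in its own pattern lets a separate extractor claim these URLs without an extra option or a custom scheme.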
Mike Fährmann
a86f2bfc80
[pinterest] update not-found redirects
2018-08-07 12:13:19 +02:00
Mike Fährmann
b8c97d2295
use 'extractor.request()' for more HTTP requests
2018-06-25 23:40:59 +02:00
Mike Fährmann
017188d268
improve extractor.request()
...
Replace the 'fatal' parameter with 'expect', which is a list/range
of HTTP status codes >= 400 that should also be accepted.
2018-06-18 16:29:56 +02:00
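The 'expect' parameter described in 017188d268 can be pictured as a simple membership test. The function name 'is_accepted' is hypothetical; only the idea of a list or range of additionally accepted status codes >= 400 comes from the commit message.

```python
def is_accepted(status, expect=()):
    """Accept any status below 400, plus the codes listed in 'expect'."""
    return status < 400 or status in expect


ok = is_accepted(200)
listed = is_accepted(404, expect=(404,))
ranged = is_accepted(451, expect=range(400, 500))
rejected = is_accepted(500, expect=range(400, 500))
```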
Mike Fährmann
e1e23165a0
[pinterest] catch JSON decode errors
2018-05-11 17:37:27 +02:00
Mike Fährmann
2ea0d1da42
[smugmug] improve API code; use data expansions
2018-04-30 18:22:44 +02:00
Mike Fährmann
2395d870dd
[pinterest] unquote board and user names, better errors
2018-04-26 16:38:12 +02:00
Mike Fährmann
0f1e07f627
[pinterest] scrap OAuth implementation; code improvements
...
OAuth authentication isn't needed anymore and other tools
like Postman are better suited for this job anyway.
2018-04-25 16:04:30 +02:00
Mike Fährmann
55d4d23860
[pinterest] use Pinterest's "Web" API ( #83 )
...
no access tokens, no user credentials of any kind ...
2018-04-24 22:28:10 +02:00
Mike Fährmann
d10579edb5
[pinterest] improve PinterestAPI code; remove OAuth mentions
...
on another note: access_tokens have been set to only allow for
10 requests per hour (from 200 yesterday)
2018-04-17 17:12:42 +02:00
Mike Fährmann
4bd182c107
[pinterest] implement oauth:pinterest ( #83 )
...
Pinterest access tokens are rate limited at 200 requests per
hour (or maybe per 2 or 3 hours?) so having just one access token
for all users isn't going to work in the long run.
2018-04-16 20:03:28 +02:00
Mike Fährmann
9651f3fce0
[pinterest] improve error messages ( #83 )
2018-04-16 19:36:54 +02:00