Mike Fährmann
3ecb512722
send Referer headers by default
2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside a
Job before it takes place; previously, most of it had already happened
when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
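
The decoupling described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `CONFIG` dict, the `overrides` parameter, and the bodies of `Extractor` and `Job` are hypothetical stand-ins; only the idea of a cheap constructor plus a separate `initialize()` comes from the commit message.

```python
# Hypothetical stand-ins; not the project's actual classes or config API.
CONFIG = {"timeout": 30}


class Extractor:
    def __init__(self, url):
        # Cheap: only remember what was matched, read no config yet.
        self.url = url
        self.initialized = False

    def initialize(self):
        # The actual init: read config options, set up cookie state.
        self.timeout = CONFIG["timeout"]
        self.cookies = {}
        self.initialized = True


class Job:
    def __init__(self, extractor, overrides=None):
        self.extractor = extractor
        # Config can still be adjusted here, because nothing has read it yet.
        CONFIG.update(overrides or {})

    def run(self):
        self.extractor.initialize()
        return self.extractor.timeout


# Constructing the extractor (as a find()-style lookup would) does no init.
extractor = Extractor("https://example.org/photo/1")
job = Job(extractor, overrides={"timeout": 5})
print(job.run())
```

Because the constructor no longer reads any settings, the override applied in `Job.__init__` is the value that `initialize()` sees.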
Mike Fährmann
d97b8c2fba
consistent cookie-related names
...
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
5503ac4d5e
replace json.dumps with direct calls to JSONEncoder.encode
2023-02-09 15:51:40 +01:00
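
A small sketch of the kind of change the commit above names. The commit does not state its motivation; a plausible one (an assumption here) is that `json.dumps()` with non-default arguments builds a new `JSONEncoder` on every call, whereas a prebuilt encoder can be reused. The name `json_compact` and the chosen options are illustrative only.

```python
import json

# Build the encoder once and reuse its bound encode() method.
_encoder = json.JSONEncoder(separators=(",", ":"), sort_keys=True)
json_compact = _encoder.encode  # illustrative name, not from the source

payload = {"b": [1, 2], "a": "x"}

# Same output as the equivalent json.dumps() call.
print(json_compact(payload))
print(json.dumps(payload, separators=(",", ":"), sort_keys=True))
```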
Mike Fährmann
c6a9bab019
update extractor test results
2022-07-12 15:49:22 +02:00
Mike Fährmann
49a50fb2eb
[500px] create directories per photo
2021-12-25 17:16:45 +01:00
Mike Fährmann
89bebe1bef
[500px] add 'favorite' extractor (closes #1927)
2021-12-25 17:16:45 +01:00
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
...
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
21c2da454f
update extractor test results
2021-07-04 22:00:32 +02:00
Mike Fährmann
0d2961ae81
[500px] remove last query hash entry
...
forgot to include this in b56e2450
2021-06-16 23:00:45 +02:00
Mike Fährmann
b56e245094
[500px] update GraphQL queries
...
500px changed its method from query hashes to sending the entire query
string for every request.
2021-06-14 16:13:08 +02:00
Mike Fährmann
532ac79fb0
update extractor test results
2021-05-21 02:28:53 +02:00
Mike Fährmann
d7bc4a2b8b
[500px] update query hashes
2021-05-21 01:20:31 +02:00
Mike Fährmann
b3ee10a7fb
[500px] update query hashes
2021-05-06 17:28:26 +02:00
Mike Fährmann
82c32d25af
[500px] update query hashes
2021-04-15 17:28:31 +02:00
Mike Fährmann
9785c551bc
[500px] skip unavailable photos (#1335)
...
instead of crashing with a KeyError exception
2021-03-04 20:26:26 +01:00
Mike Fährmann
e88d5bede8
[500px] update query hash
2021-02-08 22:40:02 +01:00
Mike Fährmann
a46561bc16
[500px] update query hashes
2020-11-13 06:36:11 +01:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
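
The effect of dropping '&' from such character classes can be illustrated as follows. The `example.org` URL and both patterns are made up for the illustration and are not taken from any extractor module.

```python
import re
from urllib.parse import urlsplit

url = "https://example.org/user/a&b?x=1#top"

# Hypothetical patterns: the old style also stops at '&', the new one does not.
old = re.compile(r"https?://example\.org/user/([^/?&#]+)")
new = re.compile(r"https?://example\.org/user/([^/?#]+)")

old_name = old.match(url).group(1)
new_name = new.match(url).group(1)

print(old_name)            # stops at '&'
print(new_name)            # keeps '&' as part of the path segment
print(urlsplit(url).path)  # standard URL splitting also keeps '&' in the path
```

Only '/', '?' and '#' delimit URL components, so an '&' inside a path segment belongs to that segment; the shorter character class matches what `urlsplit()` reports.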
Mike Fährmann
93e04bf9a9
[500px] update query hashes
2020-10-03 19:25:28 +02:00
Mike Fährmann
cc1fb0b4ea
[500px] update query hash
2020-09-16 01:26:31 +02:00
Mike Fährmann
84e04cc23b
[500px] fix extraction and update URL patterns (fixes #956)
...
- rewrite most API calls to GraphQL queries
- match '500px.com/p/<user>' URLs
2020-08-24 18:25:31 +02:00
Mike Fährmann
38b6bd66b0
[500px] match 'web.500px.com' subdomains
2020-04-26 22:17:20 +02:00
Mike Fährmann
a3c736fedc
[500px] fix extraction
...
Maximum available image dimensions have been reduced from 5000px
to 4096px on the longest edge.
A few (unimportant) metadata fields are no longer available or have
been changed to 'null'.
2019-07-19 17:23:03 +02:00
Mike Fährmann
8d96a8ce4c
[500px] add user-, gallery-, and image-extractors (#185)
2019-03-20 17:32:36 +01:00