Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside a Job
before it takes place. Previously, most of it had already happened
during the call to 'extractor.find()'.
2023-07-25 22:16:16 +02:00
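A minimal sketch of the two-phase setup this commit describes, assuming a simplified stand-in class. The 'Extractor' class, its attributes, and the 'find()' factory below are hypothetical illustrations, not the project's actual code; only the split between '__init__()' and 'initialize()' is taken from the commit message.

```python
class Extractor:
    """Hypothetical stand-in; not the project's real extractor class."""

    def __init__(self, url):
        # The constructor only records cheap state and touches no config.
        self.url = url
        self.initialized = False
        self.session = None
        self.cookies = None
        self.options = {}

    def initialize(self, config=None):
        # Config-dependent setup (session, cookies, options) happens here,
        # and only once, no matter how often the method is called.
        if self.initialized:
            return
        self.options = dict(config or {})
        self.session = object()  # placeholder for an HTTP session object
        self.cookies = {}
        self.initialized = True


def find(url):
    """Hypothetical factory: builds an extractor without initializing it."""
    return Extractor(url)


extr = find("https://example.org/chapter/1")
# The caller (a "Job" in the commit's terms) can still adjust config here,
# because nothing has read it yet.
config = {"user-agent": "custom"}
extr.initialize(config)
```

Keeping the constructor free of side effects means the caller decides when, and with which settings, the expensive part of the setup runs.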
Mike Fährmann
d97b8c2fba
consistent cookie-related names
...
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
260ff55e19
[senmanga] ensure download URLs have a scheme ( #4235 )
2023-06-27 13:49:33 +02:00
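One way such a fix could look, as a standalone sketch: URLs scraped from a page sometimes arrive without a scheme (for example "//host/path"), and a downloader cannot use them until one is added. The helper name 'ensure_scheme' and the example URLs are assumptions made for illustration; the commit message does not say how the fix was implemented.

```python
def ensure_scheme(url, scheme="https://"):
    """Return 'url' unchanged if it has an HTTP(S) scheme, else prepend one."""
    if url.startswith(("http://", "https://")):
        return url
    # Strip leading '/' and ':' so that protocol-relative URLs
    # ("//host/path") do not end up with doubled slashes.
    return scheme + url.lstrip("/:")


print(ensure_scheme("//img.example.org/page1.jpg"))
print(ensure_scheme("http://img.example.org/page1.jpg"))
```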
Mike Fährmann
ac651c604c
[senmanga] fix and update ( #4160 )
2023-06-08 22:18:43 +02:00
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
...
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
978cb03f81
update misc test results
...
- Livedoor now uses https:// for its image URLs
- Instagram image URLs got simplified
2019-11-20 21:45:48 +01:00
Mike Fährmann
580baef72c
change Chapter and MangaExtractor classes
...
- unify and simplify constructors
- rename get_metadata and get_images to just metadata() and images()
- rename self.url to chapter_url and manga_url
2019-02-11 18:38:47 +01:00
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107
simplify extractor constants
...
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
966a9ca3a0
update test results
2018-11-10 19:14:54 +01:00
Mike Fährmann
542a25c389
[ngomik] fix extraction
2018-09-09 13:45:40 +02:00
Mike Fährmann
b4eca2633e
[tumblr] support /archive URLs
2018-09-06 11:09:13 +02:00
Mike Fährmann
9e3415886c
[senmanga] fix/update tests
2018-06-27 20:05:22 +02:00
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module
2018-04-20 14:53:21 +02:00
Mike Fährmann
34873dbd90
set 'archive_fmt' values
...
These are going to be used to create a unique ID for each image.
2018-02-01 15:30:49 +01:00
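A rough sketch of how a format string can yield a unique ID per image, assuming the ID is built from a category name plus the formatted metadata. The field names, the 'archive_id()' helper, and the way the category is prefixed are hypothetical; the commit only states that 'archive_fmt' values will be used to create such IDs.

```python
# Hypothetical format string; the real field names may differ.
archive_fmt = "{manga}_{chapter}_{page}"


def archive_id(category, fmt, metadata):
    """Build an ID by filling the format string with an image's metadata."""
    return category + fmt.format_map(metadata)


image = {"manga": "Example", "chapter": 3, "page": 7}
key = archive_id("senmanga", archive_fmt, image)

# IDs that were already seen can be kept in a set to skip repeated images.
seen = set()
is_new = key not in seen
seen.add(key)
```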
Mike Fährmann
b8cdd42cab
[senmanga] fix extraction (again)
...
this is basically a re-revert of 2ace5c7
2017-11-18 17:23:32 +01:00
Mike Fährmann
e6814aebe2
add 'extractor.*.user-agent' config option
2017-11-15 14:01:33 +01:00
Mike Fährmann
92027f67f9
use consistent names for URL constants
...
root := <scheme>://<host>
base_url := <root>/<common path>
2017-11-06 20:56:49 +01:00
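The naming convention above, shown with made-up values. The host, the common path, and the derived 'chapter_url' are assumptions for illustration only; the two definitions of 'root' and 'base_url' come from the commit message.

```python
# root := <scheme>://<host>
root = "https://example.org"

# base_url := <root>/<common path>
base_url = root + "/api/v1"

# Hypothetical use: concrete URLs are then built on top of base_url.
chapter_url = "{}/chapters/{}".format(base_url, 42)
print(chapter_url)
```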
Mike Fährmann
c4fcdf2691
Revert "[senmanga] fix extraction and download"
...
This reverts commit 2ace5c7b3c.
2017-10-24 00:22:05 +02:00
Mike Fährmann
2ace5c7b3c
[senmanga] fix extraction and download
2017-10-19 18:25:31 +02:00
Mike Fährmann
a1c8b21cfd
[senmanga] improve metadata
2017-10-04 18:54:39 +02:00
Mike Fährmann
58e95a7487
share extractor and downloader sessions
...
There was never any "good" reason for the strict separation
between extractors and downloaders. This change allows for
reduced resource usage (probably unnoticeable) and fewer lines
of code at the "cost" of tighter coupling.
2017-06-30 19:38:14 +02:00
Mike Fährmann
94e10f249a
code adjustments according to pep8 nr2
2017-02-01 00:53:19 +01:00
Mike Fährmann
12c99293b6
allow extension by Content-Type for exhentai, seiga, senmanga
2016-09-30 16:43:43 +02:00
Mike Fährmann
56d810c896
update keyword hashes for tests
2016-09-25 17:28:46 +02:00
Mike Fährmann
19c2d4ff6f
remove explicit (sub)category keywords
2016-09-25 14:22:07 +02:00
Mike Fährmann
d7e168799d
consistent extractor naming scheme + docstrings
2016-09-12 10:34:31 +02:00
Mike Fährmann
b6a68775d4
[senmanga] add chapter extractor
2016-08-02 17:42:22 +02:00