Mike Fährmann
b2151f3928
[seiga] support mobile URLs ( closes #401 )
2019-08-28 22:56:43 +02:00
Mike Fährmann
fdec59f8e2
replace extractor.request() 'expect' argument
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
2019-07-05 00:42:16 +02:00
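The two arguments named in the commit above can be illustrated with a minimal sketch. It is not the project's actual code: the function name `check_status`, both exception classes, and the exact precedence between the two flags are assumptions made for illustration only.

```python
class HttpError(Exception):
    """Hypothetical error for an unacceptable HTTP status code."""


class NotFoundError(HttpError):
    """Hypothetical error for a missing resource (HTTP 404)."""


def check_status(code, fatal=True, notfound=None):
    """Return 'code' if it is acceptable, otherwise raise.

    Assumed semantics (not confirmed by the commit message):
    - 2xx and 3xx codes are always accepted
    - a truthy 'notfound' turns a 404 into NotFoundError
    - 'fatal=False' lets the remaining 4xx codes through
    """
    if 200 <= code < 400:
        return code
    if notfound and code == 404:
        raise NotFoundError(notfound)
    if not fatal and 400 <= code < 500:
        return code
    raise HttpError("unexpected status code {}".format(code))
```

With these assumptions, `check_status(403, fatal=False)` returns the code to the caller, while `check_status(404, notfound="image")` raises `NotFoundError`.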
Mike Fährmann
a2af2d2965
adjust cache maxage values
2019-03-14 22:21:49 +01:00
Mike Fährmann
3159dd79d5
[seiga] use HTTPS
2019-02-21 22:51:11 +01:00
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
2019-02-11 13:31:10 +01:00
Mike Fährmann
3a0b4af744
[seiga] recognize /thumb/ URLs
https://lohas.nicoseiga.jp/thumb/5977527i
2019-02-09 16:53:27 +01:00
Mike Fährmann
6284731107
simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
34bab080ae
rewrite URL patterns to use only 1 per extractor
2019-02-08 12:03:10 +01:00
Mike Fährmann
dd358b4564
improve cookie handling during logins
2019-01-30 17:09:32 +01:00
Mike Fährmann
6126615698
update URLs for supportedsites.rst
2019-01-30 16:18:22 +01:00
Mike Fährmann
b8c97d2295
use 'extractor.request()' for more HTTP requests
2018-06-25 23:40:59 +02:00
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module
2018-04-20 14:53:21 +02:00
Mike Fährmann
3905474805
[booru] call update_page() with correct dict ( closes #82 )
2018-03-19 11:33:19 +01:00
Mike Fährmann
32bbd12f08
update extractor tests
2018-03-08 18:04:34 +01:00
Mike Fährmann
34873dbd90
set 'archive_fmt' values
These are going to be used to create a unique ID for each image.
2018-02-01 15:30:49 +01:00
Mike Fährmann
0876541e43
[seiga] update tests
2017-12-30 19:19:36 +01:00
Mike Fährmann
93482a1f88
implement 'util.advance()'
2017-12-03 01:38:24 +01:00
Mike Fährmann
3c576d10c0
[seiga] better metadata + 'skip()' support
2017-11-15 13:58:35 +01:00
Mike Fährmann
f72318e593
[seiga] support more than 200 images
Due to API restrictions and/or missing knowledge about and
documentation of API usage, it was only possible to retrieve the
latest 200 images of a niconico seiga user with said API.
The new approach manually visits each HTML page and gets its
information from there.
2017-11-13 20:46:24 +01:00
Mike Fährmann
6f30cf4c64
change keyword names to valid Python identifiers
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.
(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems a bit too convoluted if
just 'NAME' could be enough)
2017-09-10 22:20:47 +02:00
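Why the rename matters for filter expressions can be shown with a small sketch. The helper `matches` is hypothetical and only assumes that a filter expression is evaluated as a Python expression with the keywords available as names; the project's real filter code may differ.

```python
def matches(expression, keywords):
    """Evaluate a filter expression against a keyword dictionary.

    Hypothetical helper: the keywords are exposed as local names,
    and builtins are disabled to keep the example small.
    """
    code = compile(expression, "<filter>", "eval")
    return bool(eval(code, {"__builtins__": {}}, dict(keywords)))
```

With the new name, `matches("gallery_id > 10000", {"gallery_id": 12345})` evaluates as intended. With the old name, Python reads `gallery-id > 10000` as the subtraction `gallery - id`, so the same call with `{"gallery-id": 12345}` raises `NameError` instead of consulting the keyword.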
Mike Fährmann
cfa479fab5
update error message for unspecified exceptions
- ask user to report unexpected errors, which usually indicate
extractor failure
- handle OSErrors separately (permissions, disk full, etc)
- revert 30eef52
2017-08-10 16:35:46 +02:00
Mike Fährmann
915a0137de
improve 'extractor.request'
- add 'fatal' argument
- improve internal logic and flow
- raise known exception on error
- update exception hierarchy
2017-08-05 16:11:46 +02:00
Mike Fährmann
7aa9fa796a
code cleanup and fixes
2017-07-25 14:59:41 +02:00
Mike Fährmann
808f67ba7d
use 'cookiedomain' for cookies set by object-config-values
otherwise these cookies would not be picked up by the
_check_cookies() method.
2017-07-22 15:43:35 +02:00
Mike Fährmann
0610ae5000
skip login if cookies are present
2017-07-17 10:33:36 +02:00
Mike Fährmann
d3b04076f7
add .netrc support ( #22 )
Use the '--netrc' cmdline option or set the 'netrc' config option
to 'true' to enable the use of .netrc authentication data.
The 'machine' names for the .netrc info are the lowercase extractor
names (or categories): batoto, exhentai, nijie, pixiv, seiga.
2017-06-24 12:17:26 +02:00
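A lookup keyed on the extractor category, as described in the commit above, could look roughly like the following sketch. It uses Python's standard `netrc` module; the function name `lookup_credentials` and its `path` parameter are assumptions for illustration, not the project's API.

```python
import netrc


def lookup_credentials(category, path=None):
    """Return (username, password) for a .netrc 'machine' entry.

    'category' is the lowercase extractor name, e.g. "seiga".
    'path=None' lets the netrc module fall back to ~/.netrc.
    Returns (None, None) if the file or the entry is missing.
    """
    try:
        auth = netrc.netrc(path).authenticators(category)
    except (OSError, netrc.NetrcParseError):
        return None, None
    if auth is None:
        return None, None
    login, _account, password = auth
    return login, password
```

A matching .netrc entry would then be a line such as `machine seiga login USERNAME password PASSWORD`.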
Mike Fährmann
c184e47ee3
put common directory- and filename formats in base classes
2017-05-30 12:10:16 +02:00
Mike Fährmann
4b967fa189
implement and use extractor.config() method
2017-04-25 17:12:48 +02:00
Mike Fährmann
82ab1fca07
[seiga] reduce cache maxage to one week
2017-04-24 15:25:20 +02:00
Mike Fährmann
1d46be545c
add login notifications
2017-03-17 09:42:59 +01:00
Mike Fährmann
619c74159a
[seiga] fix file extension and xml parsing
- The file extension of the first image had been used for all further
images
- API responses can contain invalid characters, which cause the XML
parser to fail (http://seiga.nicovideo.jp/user/illust/26377934
contains several \x08 characters)
2017-03-14 09:09:04 +01:00
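The XML problem described in the commit above can be reproduced and worked around with a short sketch. Stripping the offending control characters before parsing is one possible approach and is an assumption here; the commit message does not say how the fix was implemented. The name `parse_xml` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that are not allowed in XML 1.0 documents
# (everything below 0x20 except tab, line feed and carriage return).
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def parse_xml(text):
    """Parse 'text' as XML after removing invalid control characters."""
    return ET.fromstring(_INVALID_XML_CHARS.sub("", text))
```

Passing a response that contains `\x08` directly to `ET.fromstring()` raises `ParseError`, whereas `parse_xml()` drops the character and returns the parsed element.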
Mike Fährmann
94e10f249a
code adjustments according to pep8 nr2
2017-02-01 00:53:19 +01:00
Mike Fährmann
95986ed566
[seiga] add user extractor
2017-01-04 14:21:36 +01:00
Mike Fährmann
7952b8d18d
add a few tests expecting exceptions
2016-12-30 01:46:42 +01:00
Mike Fährmann
12c99293b6
allow extension by Content-Type for exhentai, seiga, senmanga
2016-09-30 16:43:43 +02:00
Mike Fährmann
56d810c896
update keyword hashes for tests
2016-09-25 17:28:46 +02:00
Mike Fährmann
19c2d4ff6f
remove explicit (sub)category keywords
2016-09-25 14:22:07 +02:00
Mike Fährmann
d7e168799d
consistent extractor naming scheme + docstrings
2016-09-12 10:34:31 +02:00
Mike Fährmann
98877a45fb
[seiga] raise NotFoundError
2016-08-29 17:02:53 +02:00
Mike Fährmann
813320d7db
[seiga] match direct-links to images
2016-08-26 23:32:59 +02:00
Mike Fährmann
6792c68254
[seiga] add extractor
2016-08-09 16:36:30 +02:00