mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-23 11:12:40 +01:00
74 Commits

Author SHA1 Message Date
Mike Fährmann
16e276fca4
[nijie] fix image URLs for single image posts (#5842)
fixes regression introduced in 2e11b6e7
2024-07-10 15:06:25 +02:00
Mike Fährmann
2e11b6e756
[nijie] support downloading videos (#5707, #5617) 2024-06-08 22:55:28 +02:00
Mike Fährmann
fe7e2281ac
[nijie] increase default delay between requests (#5221)
1-2s is not enough
2024-02-20 18:19:49 +01:00
Mike Fährmann
2191e29e14
[nijie] fix image URL for single image posts (#5049) 2024-01-11 05:07:38 +01:00
Mike Fährmann
b6903a4c90
[nijie] add 'count' metadata field
https://github.com/mikf/gallery-dl/issues/146#issuecomment-1812849102
2023-12-30 22:25:59 +01:00
Mike Fährmann
a30a3e44d5
[nijie] move 'username required' out of _login_impl 2023-12-18 23:57:44 +01:00
Mike Fährmann
57fc6fcf83
replace '24*3600' with '86400'
and generalize cache maxage values
2023-12-18 23:57:22 +01:00
Mike Fährmann
4eb3590103
[nijie] fix image URLs of multi-image posts (#4876) 2023-12-05 17:48:50 +01:00
Mike Fährmann
3984a49abf
[nijie] set 1-2s delay between requests to avoid 429 errors 2023-11-03 23:44:47 +01:00
Mike Fährmann
3ecb512722
send Referer headers by default 2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened during the 'extractor.find()' call.
2023-07-25 22:16:16 +02:00
Mike Fährmann
d97b8c2fba
consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
3b369ce3d1
[nijie] add 'followed' extractor (#3048) 2022-10-14 14:59:18 +02:00
Mike Fährmann
c4a62a48ae
[nijie] add 'feed' extractor (#3048) 2022-10-14 12:03:00 +02:00
Mike Fährmann
636d03df95
[nijie] reduce cache maxage to 90 days 2022-08-27 21:57:45 +02:00
Mike Fährmann
241e82e18d
[horne] add support for horne.red (#2700) 2022-06-25 16:52:16 +02:00
Mike Fährmann
d11e2191ae
[nijie] support /history_nuita.php listings (closes #2541) 2022-05-02 09:03:34 +02:00
Mike Fährmann
1f9a0e2fd8
update extractor test results 2022-04-18 17:24:00 +02:00
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
b58e605dc7
raise error when required username or password is missing
do not try to login as 'None' (#1192)
2020-12-22 14:40:18 +01:00
Mike Fährmann
6514312126
[nijie] add 'include' option (closes #1018) 2020-09-25 18:18:35 +02:00
Mike Fährmann
e62c209ca0
[nijie] fix 'date' parsing 2019-11-30 23:08:21 +01:00
Mike Fährmann
94dbdbf506
[nijie] change default filename format
… to be consistent with Pixiv filenames
2019-11-04 20:47:38 +01:00
Mike Fährmann
1faec285d1
[nijie] further improvements (closes #423)
- provide a 'user_name' metadata field
  - usually the same as 'artist_id', except for favorite downloads
- extract the whole description text and properly escape HTML entities
- fixed an issue with titles or tags containing double quotes
2019-09-27 23:14:32 +02:00
Mike Fährmann
20eb6c401f
[nijie] improvements and fixes (#423)
- ignore unavailable image pages
- more metadata fields: artist_name, date, tags
- rename 'index' to 'num'
- improved code structure
2019-09-26 21:45:01 +02:00
Mike Fährmann
12da6bd0c9
[simplyhentai] fix/improve extraction 2019-07-06 20:25:53 +02:00
Mike Fährmann
fdec59f8e2
replace extractor.request() 'expect' argument
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
2019-07-05 00:42:16 +02:00
Mike Fährmann
b89f0d8d3c
update extractor result tests 2019-07-01 20:02:47 +02:00
Mike Fährmann
a2af2d2965
adjust cache maxage values 2019-03-14 22:21:49 +01:00
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)

Example: "https://example.org/path/filename.ext"

before:
- filename : filename.ext
- name     : filename
- extension: ext

now:
- filename : filename
- extension: ext
2019-02-14 16:07:17 +01:00
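The before/after behaviour described in this commit can be illustrated with a minimal sketch. The helper name `split_filename` is hypothetical and this is not the project's actual `text.nameext_from_url()` code; it only shows the new result shape, where 'filename' no longer includes the extension.

```python
from urllib.parse import unquote, urlsplit


def split_filename(url):
    """Hypothetical helper: return 'filename' (no extension) and 'extension'."""
    # Take the last path segment of the URL and decode percent-escapes.
    name = unquote(urlsplit(url).path.rpartition("/")[2])
    # Split on the last dot; a name without a dot has no extension.
    stem, dot, ext = name.rpartition(".")
    if dot and stem:
        return {"filename": stem, "extension": ext}
    return {"filename": name, "extension": ""}


result = split_filename("https://example.org/path/filename.ext")
```

With the example URL from the commit message, `result` holds 'filename' and 'ext' under the two keys, matching the "now" listing above.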
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor 2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107
simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
00dc37ccbf
replace AsynchronousMixin Extractor with a Mixin 2019-02-04 14:21:19 +01:00
Mike Fährmann
dd358b4564
improve cookie handling during logins 2019-01-30 17:09:32 +01:00
Mike Fährmann
173add6935
[nijie] fix artist_id extraction
view_popup.php pages for older images or dojins either have the
artist_id value at a different place or not at all.
2018-07-10 12:30:53 +02:00
Mike Fährmann
017188d268
improve extractor.request()
Replace the 'fatal' parameter with 'expect', which is a list/range
of HTTP status codes >= 400 that should also be accepted.
2018-06-18 16:29:56 +02:00
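As a rough illustration of the 'expect' idea in this commit, the sketch below accepts any status code below 400 plus the codes listed in 'expect'. The names `validate_status` and `HttpStatusError` are assumptions made for this example and are not taken from the project.

```python
class HttpStatusError(Exception):
    """Hypothetical exception type used only in this sketch."""


def validate_status(status_code, expect=()):
    """Return the status code if it is below 400 or explicitly expected."""
    # 'expect' may be any container of codes, e.g. a list or a range.
    if status_code < 400 or status_code in expect:
        return status_code
    raise HttpStatusError(f"unexpected HTTP status {status_code}")


# Accept every 4xx response, but still fail on 5xx responses.
accepted = validate_status(404, expect=range(400, 500))
```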
Mike Fährmann
2d17a9e07f
improve extractor.request()
- better retry behavior
- exponential back-off
- removed 'allow_empty' argument
2018-04-23 18:45:59 +02:00
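Exponential back-off, as mentioned in this commit, can be sketched as follows. This is a generic illustration and not the project's implementation: the function name, the `fetch` callable, the retry count and the base delay are all assumptions.

```python
import time


def request_with_backoff(fetch, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(retries + 1):
        try:
            return fetch()
        except ConnectionError:
            # Give up once the retry budget is exhausted.
            if attempt == retries:
                raise
            sleep(delay)
            delay *= 2  # 1s, 2s, 4s, ...
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.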
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module 2018-04-20 14:53:21 +02:00
Mike Fährmann
7b562907c3
[nijie] add favorites extractor
adds support for 'https://nijie.info/user_like_illust_view.php?id=...'
2018-03-31 18:54:25 +02:00
Mike Fährmann
445db75955
[nijie] improve extraction and metadata
- add 'title' and 'description'
- split 'artist_id' into 'user_id' and 'artist_id'
  - 'user_id' is the ID of the user from which the image entry
    originates
  - 'artist_id' is the ID of the actual image artist
- improve pagination and URL patterns
2018-03-31 18:48:41 +02:00
Mike Fährmann
a112e3f2a0
[nijie] add doujin extractor
adds support for "https://nijie.info/members_dojin.php?id=<artist_id>"
2018-03-31 18:17:41 +02:00
Mike Fährmann
3cec533c28
Merge branch 'archive' 2018-02-12 18:07:58 +01:00
Mike Fährmann
f5f2d29f56
[nijie] fix dojin extraction
- correctly extract artist_id
- set extension to "jpg" if it was empty and let filetype checks do
  the rest
2018-02-09 22:06:26 +01:00
Mike Fährmann
34873dbd90
set 'archive_fmt' values
These are going to be used to create a unique ID for each image.
2018-02-01 15:30:49 +01:00
Mike Fährmann
9c138dfc1f
[common] detect empty HTTP response bodies 2017-09-26 16:49:58 +02:00
Mike Fährmann
6f30cf4c64
change keyword names to valid Python identifiers
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.

(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems a bit too convoluted if
just 'NAME' could be enough)
2017-09-10 22:20:47 +02:00
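The motivation for this rename can be shown with a small sketch. The helpers `normalize_keys` and `matches` are hypothetical and only assume that a filter expression is evaluated as Python with the metadata as its local names; `eval()` appears here purely for illustration and should not be used on untrusted input.

```python
def normalize_keys(metadata):
    """Replace '-' with '_' so that every key is a valid Python identifier."""
    return {key.replace("-", "_"): value for key, value in metadata.items()}


def matches(expression, metadata):
    """Evaluate a filter expression with the metadata as local variables."""
    return bool(eval(expression, {}, metadata))


old = {"gallery-id": 123}
new = normalize_keys(old)

# 'gallery_id' is a plain name, so the expression works as expected.
keep = matches("gallery_id > 100", new)
```

With the old key, the expression `gallery-id > 100` would be parsed as `gallery` minus `id`, which is why a plain name lookup cannot reach it.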
Mike Fährmann
915a0137de
improve 'extractor.request'
- add 'fatal' argument
- improve internal logic and flow
- raise known exception on error
- update exception hierarchy
2017-08-05 16:11:46 +02:00
Mike Fährmann
7aa9fa796a
code cleanup and fixes 2017-07-25 14:59:41 +02:00