mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-23 03:02:50 +01:00
55 Commits

Author SHA1 Message Date
Herp
99c53f7fa8
Fix imagefap extractor 2024-03-15 23:44:25 +01:00
Mike Fährmann
05331f9cf1
[imagefap] flake8, cleanup, tests 2024-03-07 01:29:19 +01:00
termvacycurtocs
f8b037ed40
[Imagefap] Add folder metadata
[Imagefap] Add "folder" metadata when downloading a folder or user profile.
No additional request is made to the server.

Use, for example, with the following configuration:
"parent-metadata": true
"directory":["{category}", "{uploader}", "{folder}", "{gallery_id} {title}"]
2024-03-02 22:15:45 +01:00
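As a rough illustration of how the "directory" setting above turns metadata into a path, the following sketch expands each segment with Python's built-in string formatting. The metadata values are invented for the example, and gallery-dl's own formatter is not used here; this only shows the general idea for plain `{name}` fields.

```python
# Directory format from the commit message above.
directory = ["{category}", "{uploader}", "{folder}", "{gallery_id} {title}"]

# Hypothetical metadata; real values come from the extractor.
metadata = {
    "category": "imagefap",
    "uploader": "someuser",
    "folder": "Some Folder",
    "gallery_id": 1234,
    "title": "Example Title",
}

# Expand every path segment separately, then join them.
segments = [segment.format_map(metadata) for segment in directory]
path = "/".join(segments)
print(path)
```

Without the "folder" field in the metadata, the third segment could not be expanded, which is the gap this commit closes for folder and user downloads.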
Mike Fährmann
d119507037
[imagefap] fix single image resolution
Downloading from a single image page like
https://www.imagefap.com/photo/123456789/
returned only the thumbnail URL.
2023-11-26 00:30:52 +01:00
Mike Fährmann
3ecb512722
send Referer headers by default 2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened in the call to 'extractor.find()'.
2023-07-25 22:16:16 +02:00
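The split between construction and initialization described above can be sketched as follows. This is a minimal illustration under assumed names (`SketchExtractor`, the `config` dict, the "sleep-request" key as used here); it is not gallery-dl's actual class.

```python
class SketchExtractor:
    """Hypothetical extractor used only to illustrate the pattern."""

    def __init__(self, url):
        # The constructor records cheap state only.
        self.url = url
        self.initialized = False
        self.session = None
        self.options = {}

    def initialize(self, config=None):
        # The actual setup (session, config options) happens here,
        # so a caller can still change `config` after construction.
        if self.initialized:
            return
        self.options = dict(config or {})
        self.session = object()  # stand-in for an HTTP session
        self.initialized = True


extractor = SketchExtractor("https://example.org/gallery/1")
config = {"sleep-request": 1.0}
config["sleep-request"] = 2.0  # adjusted after the extractor exists
extractor.initialize(config)
print(extractor.options)
```

Because nothing is read from the configuration in `__init__()`, the adjustment made between construction and `initialize()` is the value the extractor ends up with.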
Mike Fährmann
a996d936d2
[imagefap] fix pagination (#3013) 2023-07-18 17:56:33 +02:00
Mike Fährmann
2dfd4a3de2
[imagefap] extract 'categories' metadata and fix empty 'tags' 2023-04-17 14:49:50 +02:00
Mike Fährmann
02ec5bb8e5
[imagefap] extract 'description' metadata (#3905) 2023-04-16 17:02:16 +02:00
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode 2023-02-09 15:22:00 +01:00
Mike Fährmann
137a395ae0
[imagefap] fix infinite pagination loop (#3594) 2023-01-31 19:21:43 +01:00
Mike Fährmann
3c708ade8f
[imagefap] fix metadata extraction 2023-01-31 15:38:55 +01:00
Mike Fährmann
17e24eacf0
[imagefap] update 'gallery' URLs (#3595) 2023-01-31 15:33:35 +01:00
Mike Fährmann
4833ec323e
[imagefap] add 'folder' extractor (#3504) 2023-01-08 16:57:31 +01:00
Mike Fährmann
cbaeee9533
[imagefap] warn about redirects to '/human-verification' (#1140) 2023-01-07 13:04:42 +01:00
Mike Fährmann
435de1329a
[imagefap] use default delay between requests (#1140) 2023-01-07 12:59:09 +01:00
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
bc9d291c13
[imagefap] fix and improve folder extraction (#3013) 2022-10-08 15:41:39 +02:00
Mike Fährmann
55fca5fe4b
[imagefap] fix and improve gallery pagination (#3013) 2022-10-08 15:41:39 +02:00
Mike Fährmann
c6a9bab019
update extractor test results 2022-07-12 15:49:22 +02:00
Mike Fährmann
47a780942c
update extractor test results 2021-09-03 19:36:12 +02:00
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'

According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
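The effect of this change can be seen in a small sketch, assuming the quoted strings appear inside negated character classes such as `[^/?#]+`. The `example.org` patterns below are made up for illustration and are not taken from any extractor.

```python
import re

# Old style: '&' also terminates the captured component.
OLD = re.compile(r"https?://example\.org/gallery/([^/?&#]+)")
# New style: only '/', '?' and '#' delimit components.
NEW = re.compile(r"https?://example\.org/gallery/([^/?#]+)")

url = "https://example.org/gallery/a&b?page=2#top"
old_id = OLD.match(url).group(1)
new_id = NEW.match(url).group(1)
print(old_id, new_id)
```

With the old class the capture stops at the ampersand, while the new class keeps it as part of the path segment.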
Mike Fährmann
2ecf1efb16
update extractor test results
- tumblr: remove deleted post
- jaiminisbox: replace removed manga/chapters
- smugmug: one inconsequential field got removed
2020-07-18 15:12:28 +02:00
Mike Fährmann
1afb91363c
[imagefap] generalize URL patterns and add tests (#552) 2020-01-02 14:26:18 +01:00
Xope Totec
f701e9f33a
Handle beta.imagefap.com URLs (#552) 2020-01-02 14:22:00 +01:00
Mike Fährmann
dcaa3d01bd
[imagefap] adapt to new image URL format 2019-11-30 23:48:02 +01:00
Mike Fährmann
108963d138
[imagefap] include Referer headers 2019-06-24 21:31:29 +02:00
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)

Example: "https://example.org/path/filename.ext"

before:
- filename : filename.ext
- name     : filename
- extension: ext

now:
- filename : filename
- extension: ext
2019-02-14 16:07:17 +01:00
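The "now" behaviour can be approximated with the standard library. The function below is a hypothetical stand-in named `nameext_sketch`, not the real `text.nameext_from_url()`, and it only covers the simple case shown in the example.

```python
from urllib.parse import unquote, urlsplit


def nameext_sketch(url):
    """Split the last path component of `url` into filename and extension."""
    name = unquote(urlsplit(url).path.rpartition("/")[2])
    filename, _, extension = name.rpartition(".")
    if not filename:
        # No usable dot in the name: treat everything as the filename.
        return {"filename": name, "extension": ""}
    return {"filename": filename, "extension": extension}


result = nameext_sketch("https://example.org/path/filename.ext")
print(result)
```

The returned dict has no key for the complete "filename.ext" string, matching the "now" listing above.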
Mike Fährmann
61741d7333
provide type information for Queue messages
Child extractors are now directly constructed with Extractor.from_url()
if the extractor class is known beforehand, instead of using
extractor.find() and searching through all possible extractor classes.
2019-02-12 21:32:32 +01:00
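The difference between the two construction paths can be sketched like this. All class names, patterns, and the module-level `find()` below are illustrative assumptions and do not reproduce gallery-dl's code.

```python
import re


class BaseSketch:
    pattern = r""

    def __init__(self, match):
        self.match = match

    @classmethod
    def from_url(cls, url):
        # Only this class's pattern is tried.
        match = re.match(cls.pattern, url)
        return cls(match) if match else None


class GallerySketch(BaseSketch):
    pattern = r"https?://example\.org/gallery/(\d+)"


class UserSketch(BaseSketch):
    pattern = r"https?://example\.org/user/(\w+)"


ALL_CLASSES = [UserSketch, GallerySketch]


def find(url):
    # Generic lookup: try every known class until one matches.
    for cls in ALL_CLASSES:
        extractor = cls.from_url(url)
        if extractor is not None:
            return extractor
    return None


url = "https://example.org/gallery/42"
direct = GallerySketch.from_url(url)  # class known beforehand
searched = find(url)                  # class discovered by search
print(type(direct).__name__, type(searched).__name__)
```

Both paths yield the same kind of object; the direct call simply skips the loop over all candidate classes.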
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor 2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107
simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
34bab080ae
rewrite URL patterns to use only 1 per extractor 2019-02-08 12:03:10 +01:00
Mike Fährmann
7f6a0be982
adjust some tests 2018-11-15 22:50:04 +01:00
Mike Fährmann
c69150f715
[imagefap] fix extraction
also adds tags to gallery-metadata and converts suitable values to int
2018-10-20 18:32:25 +02:00
Mike Fährmann
34b556922d
update/restore tests 2018-08-23 15:47:40 +02:00
Mike Fährmann
188e956c4e
[imagefap] use HTTPS + update test results 2018-06-30 19:40:46 +02:00
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module 2018-04-20 14:53:21 +02:00
Mike Fährmann
34873dbd90
set 'archive_fmt' values
These are going to be used to create a unique ID for each image.
2018-02-01 15:30:49 +01:00
Mike Fährmann
035ef655f1
[imagefap] update unit tests
old gallery/image has been deleted
2017-10-27 12:22:16 +02:00
Mike Fährmann
81a7788b40
replace space characters in unit test URLs 2017-10-23 17:00:53 +02:00
Mike Fährmann
26a866e7d8
implement (sub)category-transfer between extractors (#41)
ImageFap- and all Manga-Extractors will transfer their (sub)category
values to other extractors instantiated by them, which will in turn
allow those to use options set for their parents.

Example:
ImagefapGalleryExtractors will use options set under
extractor.imagefap.user, if (and only if) they have been instantiated by
an ImagefapUserExtractor; and options from extractor.imagefap.gallery
otherwise.
2017-09-26 21:05:11 +02:00
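The lookup rule in the example above can be sketched as follows. The config layout mirrors the `extractor.imagefap.user` / `extractor.imagefap.gallery` paths from the commit message, while the option name "sleep", the class, and the helper are hypothetical.

```python
# Hypothetical option values under the two config paths.
CONFIG = {
    "extractor": {
        "imagefap": {
            "user": {"sleep": 5},
            "gallery": {"sleep": 1},
        },
    },
}


def lookup(category, subcategory, key, default=None):
    section = CONFIG["extractor"].get(category, {}).get(subcategory, {})
    return section.get(key, default)


class GallerySketch:
    category = "imagefap"
    subcategory = "gallery"

    def option(self, key):
        return lookup(self.category, self.subcategory, key)


standalone = GallerySketch()

child = GallerySketch()
child.subcategory = "user"  # value transferred from the parent extractor

print(standalone.option("sleep"), child.option("sleep"))
```

A gallery extractor created on its own reads the "gallery" section, while one that inherited its subcategory from a user extractor reads the "user" section.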
Mike Fährmann
9fc1d0c901
implement and use 'util.safe_int()'
same as Python's 'int()', except it doesn't raise any exceptions and
accepts a default value
2017-09-24 15:59:25 +02:00
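A function with the behaviour described above could look like the following sketch. The name `safe_int_sketch` and the choice of caught exceptions are assumptions; the actual `util.safe_int()` may differ.

```python
def safe_int_sketch(value, default=0):
    """Convert `value` with int(), returning `default` on failure."""
    try:
        return int(value)
    except (ValueError, TypeError, OverflowError):
        # int() raises these for bad strings, None, and infinities.
        return default


print(safe_int_sketch("42"), safe_int_sketch("abc"), safe_int_sketch(None, -1))
```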
Mike Fährmann
0dedbe759c
enable '--chapter-filter'
The same filter infrastructure that can be applied to image URLs now
also works for manga chapters and other delegated URLs.

TODO: actually provide any metadata (currently only deviantart and
imagefap are supported).
2017-09-12 16:19:00 +02:00
Mike Fährmann
6f30cf4c64
change keyword names to valid Python identifiers
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.

(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems a bit too convoluted if
just 'NAME' could be enough)
2017-09-10 22:20:47 +02:00
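Why the rename matters for filter expressions can be shown with a small sketch. It assumes filters are evaluated as Python expressions with the keywords as local names; the `evaluate_filter` helper is hypothetical and is not gallery-dl's implementation.

```python
def evaluate_filter(expression, keywords):
    # Evaluate `expression` with the keywords as local variables
    # and without access to builtins.
    return bool(eval(expression, {"__builtins__": {}}, dict(keywords)))


# New name: a valid identifier, usable directly in the expression.
new_result = evaluate_filter("gallery_id > 1000", {"gallery_id": 1234})

# Old name: Python parses 'gallery-id' as 'gallery' minus 'id',
# so the lookup fails even though the key exists.
try:
    evaluate_filter("gallery-id > 1000", {"gallery-id": 1234})
    old_failed = False
except NameError:
    old_failed = True

print(new_result, old_failed)
```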
Mike Fährmann
43e3bb24ae
[imagefap] don't rely on image-count
(fixes #9)
2017-03-09 20:34:39 +01:00
Mike Fährmann
94e10f249a
code adjustments according to pep8 nr2 2017-02-01 00:53:19 +01:00
Mike Fährmann
56d810c896
update keyword hashes for tests 2016-09-25 17:28:46 +02:00
Mike Fährmann
19c2d4ff6f
remove explicit (sub)category keywords 2016-09-25 14:22:07 +02:00