Mike Fährmann
4b1cda4cf7
[paheal] fix metadata extraction
2021-02-14 15:43:39 +01:00
Mike Fährmann
43120407cc
[paheal] create directory for each post (closes #1147)
2020-12-01 12:14:55 +01:00
Mike Fährmann
1e3dd7330e
merge SharedConfigMixin functionality into Extractor
2020-11-17 00:34:07 +01:00
Mike Fährmann
558cde139c
[paheal] fix extraction (fixes #1088)
2020-10-28 21:51:31 +01:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
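The effect of dropping '&' from a negated character class can be sketched as follows. This is an illustrative example only: the `example.org` URL and both patterns are assumptions made up for the demonstration and are not copied from the project's extractors.

```python
import re

# Hypothetical patterns for illustration; not taken from the project.
# OLD treats '&' as a delimiter, NEW only '/', '?' and '#' (RFC 3986).
OLD = re.compile(r"https?://example\.org/post/list/([^/?&#]+)")
NEW = re.compile(r"https?://example\.org/post/list/([^/?#]+)")

url = "https://example.org/post/list/tom&jerry/1"

# The old class stops the capture at '&', truncating the path segment.
old_tag = OLD.match(url).group(1)
# The new class lets '&' be part of the segment.
new_tag = NEW.match(url).group(1)

print(old_tag)  # tom
print(new_tag)  # tom&jerry
```

With the narrower class, a path segment containing '&' is captured whole instead of being cut off at the first '&'.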
Mike Fährmann
844793847c
update extractor test results
2020-10-11 18:15:41 +02:00
Mike Fährmann
19bf76bcf8
update extractor test results
2020-08-03 21:57:00 +02:00
Mike Fährmann
1d4a369ea2
update extractor test results
2020-02-27 22:15:40 +01:00
Mike Fährmann
e6cd49e78b
update extractor test results
2020-02-16 21:48:46 +01:00
Mike Fährmann
2852691d78
[paheal] replace test URL
...
searching for 'k-on' doesn't yield any results anymore
2020-01-27 22:19:41 +01:00
Mike Fährmann
62335b9015
[paheal] adjust test results
2019-06-05 11:42:01 +02:00
Mike Fährmann
6a34f4b0c1
skip tests on read timeouts; print list of skipped tests
2019-06-01 20:47:31 +02:00
Mike Fährmann
d6ddb74cde
update test results
...
- deviantart: 'index' is now an integer
- flickr: image file with lower quality
- paheal: image server name changed
- rule34: post got deleted
2019-04-12 09:59:48 +02:00
Mike Fährmann
f8782c05f2
[paheal] rename "tags" to "search_tags"
...
to better match field names of other booru extractors
2019-02-17 18:18:09 +01:00
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
...
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)
Example: "https://example.org/path/filename.ext"
before:
- filename : filename.ext
- name : filename
- extension: ext
now:
- filename : filename
- extension: ext
2019-02-14 16:07:17 +01:00
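The new result shape described in the commit above might look like the sketch below. `split_name_ext` is a hypothetical stand-in written for this illustration; it is not the project's `text.nameext_from_url()` and makes no claim about how that function handles edge cases.

```python
from urllib.parse import urlsplit, unquote


def split_name_ext(url):
    """Hypothetical helper: return 'filename' (without extension)
    and 'extension' for the last path segment of a URL."""
    path = urlsplit(url).path
    basename = unquote(path.rpartition("/")[2])
    name, dot, ext = basename.rpartition(".")
    if dot and name:
        return {"filename": name, "extension": ext}
    # No dot, or a leading dot only: treat everything as the name.
    return {"filename": basename, "extension": ""}


result = split_name_ext("https://example.org/path/filename.ext")
print(result)  # {'filename': 'filename', 'extension': 'ext'}

# The complete file name is rebuilt from both fields when needed.
print("{filename}.{extension}".format(**result))  # filename.ext
```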
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107
simplify extractor constants
...
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
4d656a81ca
replace SharedConfigExtractor class with a Mixin
2019-02-04 13:46:02 +01:00
Mike Fährmann
4d73cc785d
update test results
2018-12-14 16:07:32 +01:00
Mike Fährmann
c9f70e0a19
[paheal] use HTTPS
2018-07-17 21:25:03 +02:00
Mike Fährmann
7a58151566
fix util.parse_bytes invocations
...
(should be text.parse_bytes)
2018-05-10 22:07:55 +02:00
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module
2018-04-20 14:53:21 +02:00
Mike Fährmann
34873dbd90
set 'archive_fmt' values
...
These are going to be used to create a unique ID for each image.
2018-02-01 15:30:49 +01:00
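One way a format string can yield a unique ID per image is sketched below. Everything here is an assumption for illustration: the `"{id}"` format value, the metadata fields, the `archive_key` helper, and the choice to prefix the key with the category are not confirmed by the commit message.

```python
# Assumed format string and metadata; the real values are not shown
# in the commit above.
archive_fmt = "{id}"


def archive_key(category, fmt, metadata):
    """Hypothetical helper: build a per-image key from a format string."""
    return category + fmt.format_map(metadata)


seen = set()
posts = [
    {"id": 101, "filename": "a"},
    {"id": 102, "filename": "b"},
    {"id": 101, "filename": "a"},  # same image encountered again
]

new_keys = []
for post in posts:
    key = archive_key("paheal", archive_fmt, post)
    if key in seen:
        continue  # already recorded, skip it
    seen.add(key)
    new_keys.append(key)

print(new_keys)  # ['paheal101', 'paheal102']
```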
Mike Fährmann
40d35c87bc
[paheal] add tag- and post-extractors (closes #69)
2018-01-15 16:39:05 +01:00