Mike Fährmann
b788712844
[fallenangels] fix extraction of '.5' chapters
2020-10-23 16:56:08 +02:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
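A minimal sketch of what the narrower delimiter class means in practice. The
pattern, the example.org domain and the extract_slug() helper are hypothetical
and only illustrate the '/?#' character class; they are not taken from any
actual extractor.

```python
import re

# Hypothetical URL pattern: a path segment ends at '/', '?' or '#',
# but no longer at '&', which is not a component delimiter.
PATTERN = re.compile(r"(?:https?://)?example\.org/manga/([^/?#]+)")


def extract_slug(url):
    """Return the path segment after '/manga/', or None if 'url' does not match."""
    match = PATTERN.match(url)
    return match.group(1) if match else None
```

With this class, an '&' inside a path segment stays part of the captured
value, while a query string or fragment is still cut off.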
2020-10-22 23:31:25 +02:00
Mike Fährmann
844502cad5
update extractor test results
2020-10-03 19:24:19 +02:00
Mike Fährmann
aa8e366b90
[luscious] fix tag extraction
2019-05-14 17:35:52 +02:00
Mike Fährmann
f2cf1c1d73
use 'text.extract_from()' in a few places
2019-04-21 15:19:20 +02:00
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
...
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)
Example: "https://example.org/path/filename.ext"
before:
- filename : filename.ext
- name : filename
- extension: ext
now:
- filename : filename
- extension: ext
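The new result shape can be sketched roughly as follows. nameext_sketch() is
a hypothetical stand-in written for illustration; it only reproduces the
'filename'/'extension' split described above and is not the actual
text.nameext_from_url() implementation.

```python
import os.path
from urllib.parse import unquote, urlsplit


def nameext_sketch(url):
    """Split the last path segment of 'url' into 'filename' and 'extension'."""
    path = urlsplit(url).path                  # drops query string and fragment
    basename = unquote(path.rpartition("/")[2])
    name, ext = os.path.splitext(basename)     # ("filename", ".ext")
    return {"filename": name, "extension": ext[1:]}
```

For the example URL this yields 'filename' = "filename" and
'extension' = "ext", with no separate 'name' key.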
2019-02-14 16:07:17 +01:00
Mike Fährmann
32edf4fc7b
add '_extractor' info to manga extractor results
2019-02-13 13:23:36 +01:00
Mike Fährmann
580baef72c
change Chapter and MangaExtractor classes
...
- unify and simplify constructors
- rename get_metadata and get_images to just metadata() and images()
- rename self.url to chapter_url and manga_url
2019-02-11 18:38:47 +01:00
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107
simplify extractor constants
...
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
966a9ca3a0
update test results
2018-11-10 19:14:54 +01:00
Mike Fährmann
d1f3d32eec
[fallenangels] unescape chapter titles
2018-10-20 18:31:26 +02:00
Mike Fährmann
1532d1b690
fix 'range' tests and update a few test results
2018-10-08 23:53:58 +02:00
Mike Fährmann
d69db60e2a
update unit test results
2018-10-02 20:37:46 +02:00
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module
2018-04-20 14:53:21 +02:00
Mike Fährmann
e7525b1b0e
[artstation] add challenge extractor (#80)
2018-03-23 15:06:09 +01:00
Mike Fährmann
7a412f5c32
implement generic manga-chapter extractor
2018-02-04 22:02:04 +01:00
Mike Fährmann
0dd48d644f
update test results
...
nothing broke, but things got updated or changed
2018-01-23 21:38:29 +01:00
Mike Fährmann
633b376f35
improve/adjust default filename formats for manga sites
2017-10-02 19:06:24 +02:00
Mike Fährmann
82ea6c0cd3
adjust format strings with optional titles
...
... except for anything manga/comic related
2017-09-28 18:00:19 +02:00
Mike Fährmann
9fc1d0c901
implement and use 'util.safe_int()'
...
same as Python's 'int()', except it doesn't raise any exceptions and
accepts a default value
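A minimal sketch of the described behavior. The signature and the default of
0 are assumptions, since the log does not show the actual definition of
util.safe_int().

```python
def safe_int(value, default=0):
    """Convert 'value' to int; return 'default' if the conversion fails.

    The default of 0 is an assumption made for this sketch.
    """
    try:
        return int(value)
    except (ValueError, TypeError, OverflowError):
        return default
```

This lets callers pass through values scraped from a page without wrapping
every conversion in its own try/except.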
2017-09-24 15:59:25 +02:00
Mike Fährmann
84d4450410
[fallenangels] extract manga metadata
2017-09-15 20:51:40 +02:00
Mike Fährmann
a13eb6010f
[fallenangels] fix extraction of chapter URLs
2017-07-20 14:58:47 +02:00
Mike Fährmann
b6fffa9e26
[directlink] update filename format and metadata
2017-05-30 17:33:09 +02:00
Mike Fährmann
832a4a8ee9
[fallenangels] add manga extractor
2017-05-21 10:37:38 +02:00
Mike Fährmann
b0131ea402
[fallenangels] support this site's Vietnamese version
...
- https://truyen.fascans.com/
2017-05-18 15:22:25 +02:00
Mike Fährmann
fece09d326
[fallenangels] update to new domain and site-layout
2017-04-09 11:37:21 +02:00
Mike Fährmann
9a08f8a097
improved foolslide-based extractors
...
- this includes dokireader, fallenangels, jaiminisbox, powermanga,
sensescans, worldthree, yonkouprod, gomanga, yomanga
- added 'chapter_string', 'chapter_id', 'chapter_minor' and 'count'
keywords
- changed the 'chapter' keyword to always be just a number
- changed the default directory format
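One way to picture the 'chapter' / 'chapter_minor' split, as a rough sketch.
split_chapter() is hypothetical, and the exact format of 'chapter_minor'
(here the fractional part including its dot) is an assumption that the log
does not confirm.

```python
def split_chapter(chapter_string):
    """Split a chapter string like "12.5" into a numeric part and a minor part."""
    major, sep, minor = chapter_string.partition(".")
    return {
        "chapter": int(major),         # always just a number
        "chapter_minor": sep + minor,  # assumed format: ".5", or "" if absent
    }
```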
2017-02-16 23:42:30 +01:00
Mike Fährmann
52104b2bb6
[fallenangels] add chapter extractor
2017-02-06 20:05:58 +01:00