Mike Fährmann
28cd78aae0
[kissmanga] extend chapter-string regex (closes #58)
2017-12-24 22:53:10 +01:00
Mike Fährmann
fc7d165c97
[deviantart] add support for OAuth2 authentication
Some user galleries [*] require you to be either logged in or
authenticated via OAuth2 to access their deviations.
[*] e.g. https://polinaegorussia.deviantart.com/gallery/
--------------
known issue:
A deviantart 'refresh_token' can only be used once and gets updated
whenever it is used to request a new 'access_token', so storing its
initial value in a config file and reusing it again and again is not
possible.
2017-12-18 01:16:46 +01:00
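The known issue in fc7d165c97 implies that whichever 'refresh_token' comes back from a refresh has to be persisted before the next run. A minimal sketch of that idea, assuming a JSON file as storage and a hypothetical `request_tokens` callable that wraps the actual HTTP request (neither is taken from the commit):

```python
import json


class RefreshTokenStore:
    """Keeps the most recent refresh_token in a small JSON file (assumed layout)."""

    def __init__(self, path):
        self.path = path

    def load(self):
        with open(self.path) as fp:
            return json.load(fp)["refresh_token"]

    def save(self, token):
        with open(self.path, "w") as fp:
            json.dump({"refresh_token": token}, fp)


def refresh_access_token(store, request_tokens):
    """Trade the stored refresh_token for a new access_token.

    `request_tokens` is a hypothetical callable: it takes the current
    refresh_token and returns a dict with 'access_token' and the
    replacement 'refresh_token'.
    """
    data = request_tokens(store.load())
    # The old refresh_token is now used up, so store the new one right away.
    store.save(data["refresh_token"])
    return data["access_token"]
```

Storing the replacement token immediately after each exchange is the point of the sketch; a value written once into a static config file would be stale after its first use.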
Mike Fährmann
0a9a07a6e1
[slideshare] improve metadata; flake8
- added 'views' and 'published' keywords
- fixed longer titles and descriptions
2017-12-13 21:16:49 +01:00
Mike Fährmann
291369eab2
various smaller changes/additions
2017-12-06 21:45:56 +01:00
Mike Fährmann
300346ecdf
[mangazuki] remove extractors
This site has been in "rebuild"-mode for a fairly long time and the
current extractor code isn't going to work for the new version either.
2017-12-04 13:36:04 +01:00
Mike Fährmann
214972bc9a
[gelbooru] use manual extraction
... to compensate for their disabled API.
(https://gelbooru.com/index.php?page=forum&s=view&id=3875)
This also adds an extractor for image-pools.
2017-11-29 20:48:17 +01:00
Mike Fährmann
b14de6ffc2
[tumblr] small improvements
- don't transform inline GIF URLs
- set 'type' parameter for API calls if there is only
  one post type selected
2017-11-24 16:51:07 +01:00
Mike Fährmann
b8cdd42cab
[senmanga] fix extraction (again)
this is basically a re-revert of 2ace5c7
2017-11-18 17:23:32 +01:00
Mike Fährmann
6913eeaa40
[powermanga] replace manga extractor unit test
My Hero Academia is gone
2017-11-15 14:01:24 +01:00
Mike Fährmann
f72318e593
[seiga] support more than 200 images
Due to API restrictions and/or missing knowledge about and
documentation of API usage, it was only possible to retrieve the
latest 200 images of a niconico seiga user with said API.
The new approach manually visits each HTML page and gets its
information from there.
2017-11-13 20:46:24 +01:00
Mike Fährmann
2457b71633
skip tests on 5xx status codes
2017-11-12 20:51:12 +01:00
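One way to skip a test on a 5xx response with the standard `unittest` module, as a sketch only; the helper name is hypothetical and the commit does not show how the status code is obtained:

```python
import unittest


def skip_on_server_error(status_code):
    """Skip the running test if the site answered with a 5xx status."""
    if 500 <= status_code < 600:
        # SkipTest marks the test as skipped instead of failed.
        raise unittest.SkipTest(
            "server error: HTTP {}".format(status_code))
```

Called at the start of a test with the status of the first response, this treats a temporarily broken site as "skipped" rather than as a regression in the extractor.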
Mike Fährmann
305da540c3
[mangahere] fix metadata extraction
2017-11-03 14:54:46 +01:00
Mike Fährmann
035ef655f1
[imagefap] update unit tests
old gallery/image has been deleted
2017-10-27 12:22:16 +02:00
Mike Fährmann
caf26412dd
add option to set alternate location of .part files (#29)
Note: The path set for 'downloader.*.part-directory' needs to point to an
already existing directory.
2017-10-26 00:16:48 +02:00
Mike Fährmann
27c026543f
re-enable download unit tests
2017-10-25 12:55:36 +02:00
Mike Fährmann
b0353aa02d
rewrite download modules (#29)
- use '.part' files during file-download
- implement continuation of incomplete downloads
- check if file size matches the one reported by server
2017-10-24 12:53:03 +02:00
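The three points of b0353aa02d can be illustrated without any network code. The sketch below is not the project's downloader; the function names are hypothetical, and it assumes the server honours the standard HTTP `Range` request header:

```python
import os


def resume_headers(partpath):
    """Build request headers that continue after the bytes already on disk."""
    try:
        offset = os.path.getsize(partpath)
    except OSError:
        offset = 0  # no .part file yet: start from the beginning
    return {"Range": "bytes={}-".format(offset)} if offset else {}


def finalize(partpath, path, expected_size=None):
    """Move the finished .part file into place if its size checks out.

    Returns True on success and False if the size on disk differs from
    the size reported by the server (the .part file is kept in that case).
    """
    if expected_size is not None:
        if os.path.getsize(partpath) != expected_size:
            return False
    os.replace(partpath, path)
    return True
```

A download loop would append the response body to `path + ".part"`, then call `finalize()` with the size derived from the response headers; an interrupted run leaves the `.part` file behind so that `resume_headers()` can pick up where it stopped.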
Mike Fährmann
75d3a1f72f
[deviantart] always download original images
Deviation-objects returned by the DeviantArt API don't always contain
the URL and metadata of the original image ([1]). Getting this
information requires an additional API call [2], which is indicated by
the 'is_downloadable' and 'download_filesize' metadata within a
deviation-object.
[1] https://myria-moon.deviantart.com/art/Aime-Moi-part-en-vadrouille-261986576
[2] https://www.deviantart.com/developers/http/v1/20160316/deviation_download/bed6982b88949bdb08b52cd6763fcafd
2017-10-07 13:07:34 +02:00
Mike Fährmann
0386503c80
fix (sub)category-transfer for DownloadJob instances (#41)
... and extend "parent" parameters to TestJob- and DataJob-classes
as well.
2017-10-06 15:38:35 +02:00
Mike Fährmann
41adb99e9c
[pawoo] fix extraction
- changed access_token
- use account-search instead of general search
2017-10-02 18:33:52 +02:00
Mike Fährmann
b319f4bab3
smaller code and text changes
2017-10-01 18:23:40 +02:00
Mike Fährmann
85a2b2ae59
[khinsider] fix extraction
2017-09-28 11:47:26 +02:00
Mike Fährmann
8e14714c2b
[imgspice] fix extraction
2017-09-26 21:04:48 +02:00
Mike Fährmann
a85f06d2d1
[foolslide] restructure; convert suitable values to int
2017-09-24 16:57:47 +02:00
Mike Fährmann
a9e7145651
[hbrowse] extract hmanga metadata & general maintenance
2017-09-20 16:25:25 +02:00
Mike Fährmann
84d4450410
[fallenangels] extract manga metadata
2017-09-15 20:51:40 +02:00
Mike Fährmann
f32b1a0292
[imgyt] fix extraction
2017-09-14 15:04:32 +02:00
Mike Fährmann
31cd5b1c1d
[luscious] detect high-load responses
2017-09-12 15:46:21 +02:00
Mike Fährmann
81877bb5f6
add '-K' as shortcut for '--list-keywords'
2017-09-09 18:48:28 +02:00
Mike Fährmann
f98e3e8002
[luscious] fix tag extraction
2017-09-01 16:29:52 +02:00
Mike Fährmann
65997d835b
replace popular/ranking tests with older ones
Metadata of several year old lists shouldn't change as much as it
would for newer ones, which makes metadata-comparisons of the output
of build_testresult_db.py easier.
2017-08-31 15:09:18 +02:00
Mike Fährmann
c0755a4d5e
[exhentai] revert login-method to its old version (#37)
Additional cookies don't seem to help and have to be manually set
anyway. The older method is more likely to succeed, so I'd rather
use this one.
2017-08-29 22:10:38 +02:00
Mike Fährmann
3ee39ffd93
[exhentai] update login procedure (#37)
This new version behaves pretty much exactly like a browser would and
caches all cookies sent to it and not just "ipb_member_id" and
"ipb_pass_hash".
2017-08-28 21:03:32 +02:00
Mike Fährmann
07214f4007
[booru] place subcategories into base classes
2017-08-26 22:27:55 +02:00
Mike Fährmann
47bcf53ec1
implement support for additional unit test result types
- "pattern" matches all resulting URLs against the given regex
- "count" specifies the expected number of returned URLs
2017-08-25 22:01:14 +02:00
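How the "pattern" and "count" result types from 47bcf53ec1 might be checked, as a rough sketch; the function name and signature are hypothetical and not taken from the test suite:

```python
import re


def check_result(urls, pattern=None, count=None):
    """Validate a list of extracted URLs against optional expectations."""
    if pattern is not None:
        regex = re.compile(pattern)
        # re.match anchors at the start of each URL only.
        if not all(regex.match(url) for url in urls):
            return False
    if count is not None and len(urls) != count:
        return False
    return True
```

For example, `check_result(urls, pattern=r"https://example\.org/img/\d+\.jpg", count=3)` passes only if exactly three URLs were returned and each one matches the regex (`example.org` being a placeholder).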
Mike Fährmann
c7ec103e15
[batoto] fix extraction of chapter URLs
2017-08-25 16:34:42 +02:00
Mike Fährmann
f7cdfd4c25
add a simplified version of 'parse_qs'
This version only returns a dict of plain string to string key-value
pairs and ignores multiple values for the same query variable.
2017-08-24 20:55:58 +02:00
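A function with the behaviour described in f7cdfd4c25 could look roughly like this. The name `parse_query` and the choice to keep the first of several values are assumptions; the commit only states that multiple values for the same variable are ignored:

```python
from urllib.parse import unquote_plus


def parse_query(qs):
    """Parse a query string into a flat dict of str -> str."""
    result = {}
    for part in qs.split("&"):
        if not part:
            continue  # skip empty segments such as in "a=1&&b=2"
        key, _, value = part.partition("=")
        key = unquote_plus(key)
        # Assumption: the first value wins; later duplicates are dropped.
        if key not in result:
            result[key] = unquote_plus(value)
    return result
```

Unlike `urllib.parse.parse_qs`, which maps every key to a list of values, this returns plain strings, which is more convenient when each parameter is expected only once.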
Mike Fährmann
d70c66c516
fix "text:" downloader
2017-08-16 12:11:47 +02:00
Mike Fährmann
abd7c559cd
[yonkouprod] remove module
Every manga chapter on this site has been removed.
2017-08-07 18:32:14 +02:00
Mike Fährmann
852e7acd31
[twitter] ignore "Promoted Tweets"
2017-08-06 13:43:08 +02:00
Mike Fährmann
6950708e52
[hentaicdn] use HTTPS
2017-08-02 18:31:21 +02:00
Mike Fährmann
493bd235cf
workaround for missing 'assert_called_once' method
this method was introduced in Python 3.6, but calling it still
works (i.e. it doesn't cause the test to fail) on Python 3.3/3.4
2017-07-26 10:33:15 +02:00
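A version-independent stand-in for 'assert_called_once' can rely on `call_count`, which mock objects have provided in every Python 3 release. This is only a sketch of one possible workaround, not necessarily the one used in 493bd235cf:

```python
from unittest import mock


def assert_called_once(m):
    """Fail unless the mock `m` was called exactly once."""
    if m.call_count != 1:
        raise AssertionError(
            "expected exactly one call, got {}".format(m.call_count))
```

The explicit check matters because, as the commit message notes, calling the missing method on older versions does not make the test fail, so it would verify nothing there.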
Mike Fährmann
7aa9fa796a
code cleanup and fixes
2017-07-25 14:59:41 +02:00
Mike Fährmann
d7cb3c668a
update supportedsites.rst
2017-07-24 10:50:40 +02:00
Mike Fährmann
a13eb6010f
[fallenangels] fix extraction of chapter URLs
2017-07-20 14:58:47 +02:00
Mike Fährmann
1cb1d2e0a3
[mangazuki] add chapter extractor
2017-07-19 17:20:03 +02:00
Mike Fährmann
3460dc8950
update gallery-dl.conf
2017-07-14 08:23:11 +02:00
Mike Fährmann
af9bd17b19
[deviantart] adjust default paths
- user.deviantart.com/(gallery|favourites|journal)/ images go into
    * <user>/
    * <user>/Favourites/
    * <user>/Journal/
  (having an extra "Gallery" folder for a user's gallery-images seems
  a bit too much if these are all you want to download, which is
  probably the default use-case)
- single "deviations" (user.deviantart.com/(art|journal)/name-123) go
  into their owner's directory:
    * <user>/
  (putting them into their own directory seems weird in practice)
2017-07-10 18:54:10 +02:00
Mike Fährmann
8c16cbe7ea
fix tests
2017-07-04 19:31:32 +02:00
Mike Fährmann
ce55ec6490
enable extractor tests without filters
$ python test_extractors.py all
2017-07-02 08:20:04 +02:00
Mike Fährmann
44d98e562b
[pixiv] support pixiv.me URLs (#23)
2017-06-25 20:21:01 +02:00