Mike Fährmann
c7ec103e15
[batoto] fix extraction of chapter URLs
2017-08-25 16:34:42 +02:00
Mike Fährmann
18e6ed1c7e
[booru] add extractors for "Popular" images
2017-08-24 21:29:22 +02:00
Mike Fährmann
f7cdfd4c25
add a simplified version of 'parse_qs'
...
This version only returns a dict of plain string to string key-value
pairs and ignores multiple values for the same query variable.
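
A minimal sketch of what such a simplified parser could look like. The name `parse_query` and the decision to keep the first value of a repeated key are assumptions made for illustration; this is not the code from the commit.

```python
from urllib.parse import unquote_plus


def parse_query(qs):
    """Return a flat {str: str} dict for a query string (hypothetical helper)."""
    result = {}
    for pair in qs.split("&"):
        if not pair:
            continue  # skip empty segments such as in "a=1&&b=2"
        key, _, value = pair.partition("=")
        key = unquote_plus(key)
        # Assumption: the first occurrence of a key wins; later ones are ignored.
        if key not in result:
            result[key] = unquote_plus(value)
    return result
```

Unlike `urllib.parse.parse_qs`, which maps every key to a list of values, this variant yields plain strings, so callers do not need to index into a list.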
2017-08-24 20:55:58 +02:00
Mike Fährmann
3b21e0703c
[deviantart] allow distinction between users and groups ( #26 )
...
This is done by prepending "group-" to an extractor's subcategory
if the URL belongs to a group ("folder" becomes "group-folder" and
so on). This changes the configuration-path being used and is also
reflected in the output of '--list-keywords'.
2017-08-22 20:15:13 +02:00
Mike Fährmann
e61a3a56d1
[hentai2read] fix and update keywords
...
Added the "author" keyword and changed the name of a few others to be
consistent with other manga/chapter extractors.
2017-08-22 15:01:47 +02:00
Mike Fährmann
c45770331a
use 'str.partition()'
...
The (r)partition method is always faster than split() or any other
method that has been replaced in this commit.
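
As an illustration of the pattern, here is a small sketch of how `str.partition()` and `str.rpartition()` can stand in for `split()`. The helper names and the example URL are made up and do not come from the commit.

```python
def split_query(url):
    """Split a URL into (path, query) at the first '?'."""
    # partition() always returns a 3-tuple, even if '?' is missing,
    # so no length check is needed as with url.split("?", 1).
    path, _, query = url.partition("?")
    return path, query


def last_segment(path):
    """Return the text after the last '/' (the whole string if there is none)."""
    return path.rpartition("/")[2]
```

Both methods split at most once and return a fixed-size tuple, which is what makes unpacking safe when the separator is absent.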
2017-08-21 18:29:50 +02:00
Mike Fährmann
017a72f448
[pixiv] improve input validation
2017-08-21 17:53:27 +02:00
Mike Fährmann
dcf42c5e89
[pixiv] add extractor for ranking lists
2017-08-20 20:21:52 +02:00
Mike Fährmann
4ea82ea556
[warosu] add thread extractor
2017-08-18 19:54:07 +02:00
Mike Fährmann
6078ec5908
restructure the output of --help
...
Using argument groups is a definite improvement over how things looked
previously, but general group membership of individual items might be
a thing to reconsider.
2017-08-16 19:56:50 +02:00
Mike Fährmann
9aa95fba8c
[deviantart] adapt download URLs to use https
...
Even though DeviantArt is "completely switching over to HTTPS"[1],
every URL contained in an API response is still using HTTP
[1] https://danlev.deviantart.com/journal/DeviantArt-Is-Switching-To-HTTPS-697996906
2017-08-16 12:17:50 +02:00
Mike Fährmann
d70c66c516
fix "text:" downloader
2017-08-16 12:11:47 +02:00
Mike Fährmann
f7de048980
add additional debug output
2017-08-13 20:35:44 +02:00
Mike Fährmann
9bf9d64ad8
update unittests for util.py
2017-08-13 14:31:22 +02:00
Mike Fährmann
02e89700fc
[foolfuuka] ensure sorted posts
2017-08-13 14:29:26 +02:00
Mike Fährmann
8bcf88bff7
[flickr] fix extraction
...
This issue was only noticeable with older Python versions, as these
don't exhibit a consistent ordering of dict keys.
2017-08-12 21:41:10 +02:00
Mike Fährmann
e3bfb8325a
fix circular dependency
...
- util.py imported config.py and vice versa
- Python < 3.5 doesn't like this
2017-08-12 21:32:24 +02:00
Mike Fährmann
004456d5d5
properly update the config-dictionary
...
When using 2 or more config files, the values of the second would
improperly overwrite nested dictionaries of the first one.
The new method properly combines these nested dictionaries as well.
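
The difference between overwriting and combining can be sketched as follows. The function name `combine_dict` and the sample keys in the usage note are assumptions for illustration, not necessarily what the commit uses.

```python
def combine_dict(a, b):
    """Recursively merge the values of 'b' into 'a' (in place) and return 'a'."""
    for key, value in b.items():
        if key in a and isinstance(value, dict) and isinstance(a[key], dict):
            # Both sides hold a dictionary: merge them instead of replacing.
            combine_dict(a[key], value)
        else:
            # Plain values (or mismatched types): the second file wins.
            a[key] = value
    return a
```

With a plain `dict.update()`, loading `{"extractor": {"pixiv": {"username": "u"}}}` followed by `{"extractor": {"reddit": {"morecomments": True}}}` would drop the `pixiv` block; the recursive version keeps both.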
2017-08-12 20:07:27 +02:00
Mike Fährmann
ae2d61e5b3
handle format string exceptions separately
2017-08-11 21:48:37 +02:00
Mike Fährmann
3c9f190757
extend output of --list-keywords
2017-08-10 17:36:21 +02:00
Mike Fährmann
cfa479fab5
update error message for unspecified exceptions
...
- ask user to report unexpected errors, which usually indicate
extractor failure
- handle OSErrors separately (permissions, disk full, etc)
- revert 30eef52
2017-08-10 16:35:46 +02:00
Mike Fährmann
7e936e9c06
[luscious] simplify and remove dead code
2017-08-08 19:26:13 +02:00
Mike Fährmann
d74a635e41
[util] update 'default' values and improve test coverage
...
for 'code_to_language()' and 'language_to_code()'
2017-08-08 19:22:04 +02:00
Mike Fährmann
0245a0ba5f
fix extraction and update test results
...
- fixes for hbrowse, imgyt, imgcandy, hosturimage
- test updates for deviantart, gfycat
2017-08-08 19:11:13 +02:00
Mike Fährmann
abd7c559cd
[yonkouprod] remove module
...
Every manga chapter on this site has been removed.
2017-08-07 18:32:14 +02:00
Mike Fährmann
da7219ba74
[kisscomic] remove module
...
Image links on this site are dead.
2017-08-07 18:28:35 +02:00
Mike Fährmann
852e7acd31
[twitter] ignore "Promoted Tweets"
2017-08-06 13:43:08 +02:00
Mike Fährmann
915a0137de
improve 'extractor.request'
...
- add 'fatal' argument
- improve internal logic and flow
- raise known exception on error
- update exception hierarchy
2017-08-05 16:11:46 +02:00
rachmadani haryono
dcd573806e
chg: dev: fix error ( #32 )
...
* fix: dev: error
* fix: dev: AttributeError when getting artist
* fix: dev: typo on luscious parser
2017-08-04 15:01:10 +02:00
Mike Fährmann
c4713404c8
[directlink] improve URL pattern
2017-08-02 21:06:49 +02:00
Mike Fährmann
d443822fdb
[luscious] get correct image URLs ( fixes #33 )
...
Instead of using thumbnail URLs and modifying them the extractor now
goes through every single image-page and gets its download URL from
there.
2017-08-02 19:58:13 +02:00
Mike Fährmann
6950708e52
[hentaicdn] use HTTPS
2017-08-02 18:31:21 +02:00
Mike Fährmann
4f1e6c109f
[deviantart] remove 'invalid escape sequence' warning
...
- use r"\w" or "\\w" instead of "\w"
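
A short sketch of the two spellings. Python 3.6 and later warn about unrecognized escape sequences such as "\w" in non-raw string literals; the pattern below is a made-up example, not the one from the extractor.

```python
import re

# Raw string: the backslash reaches the regex engine unchanged.
WORD_RAW = re.compile(r"\w+")

# Regular string with an escaped backslash: the same pattern text.
WORD_ESCAPED = re.compile("\\w+")
```

Both literals produce the identical two-character sequence `\w` followed by `+`, so the compiled patterns behave the same.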
2017-07-27 20:50:33 +02:00
Mike Fährmann
c864be479e
[directlink] update URL pattern & PEP 8
...
- combine some file extensions
- don't match '.je'
- line length < 80
2017-07-27 20:46:15 +02:00
H R X N
45f9d64c23
Update directlink.py with additional file exts. ( #30 )
...
Add WebP, still not that common, but it's increasing.
Add 3rd JPEG variant (https://en.wikipedia.org/wiki/JPEG#JPEG_filename_extensions )
Never seen JFIF in the wild, would probably be overkill.
Extend Ogg formats (https://en.wikipedia.org/wiki/Ogg ; https://wiki.xiph.org/MIME_Types_and_File_Extensions )
2017-07-27 20:40:00 +02:00
Mike Fährmann
4357966a70
[kissmanga] make URL pattern case-insensitive ( fixes #28 )
2017-07-26 10:36:59 +02:00
Mike Fährmann
493bd235cf
workaround for missing 'assert_called_once' method
...
this method was introduced in Python 3.6, but calling it still
works (i.e. it doesn't cause the test to fail) on Python 3.3/3.4
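
One version-independent way to express the same check is to compare `call_count` directly. This is only a sketch of the idea; the helper name is hypothetical and the commit's actual workaround may differ.

```python
from unittest import mock


def called_exactly_once(m):
    """Stand-in for Mock.assert_called_once() that works on older Pythons."""
    # 'call_count' has been part of Mock since it was added to the
    # standard library, so this does not depend on Python 3.6.
    return m.call_count == 1


m = mock.Mock()
m("first call")
print(called_exactly_once(m))
```

In a `unittest.TestCase` the same comparison could be written as `self.assertEqual(m.call_count, 1)`.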
2017-07-26 10:33:15 +02:00
Mike Fährmann
7aa9fa796a
code cleanup and fixes
2017-07-25 14:59:41 +02:00
Mike Fährmann
f08af03845
Merge branch 'cookies'
2017-07-25 14:04:53 +02:00
Mike Fährmann
55f048d02b
ignore case of cookiejar magic strings
2017-07-24 18:33:42 +02:00
Mike Fährmann
de68cf84a8
release version 0.9.1
2017-07-24 11:36:21 +02:00
Mike Fährmann
f53bf1a323
[thebarchive] add thread extractor
2017-07-23 15:45:17 +02:00
Mike Fährmann
b8cf434bb0
[rebeccablacktech] add thread extractor
2017-07-23 15:41:56 +02:00
Mike Fährmann
808f67ba7d
use 'cookiedomain' for cookies set by object-config-values
...
otherwise these cookies would not be picked up by the
_check_cookies() method.
2017-07-22 15:43:35 +02:00
Mike Fährmann
390eeded4c
[mangazuki] support 'raws.…' subdomain
2017-07-21 16:25:56 +02:00
Mike Fährmann
4a60f6068a
[mangazuki] add manga extractor
2017-07-20 16:02:09 +02:00
Mike Fährmann
394241cd6f
[2chan] fix extraction
2017-07-20 15:01:47 +02:00
Mike Fährmann
a13eb6010f
[fallenangels] fix extraction of chapter URLs
2017-07-20 14:58:47 +02:00
Mike Fährmann
1cb1d2e0a3
[mangazuki] add chapter extractor
2017-07-19 17:20:03 +02:00
Mike Fährmann
2f2e363c97
[imgur] use /a/<key>/all as album-url
2017-07-18 21:06:31 +02:00
Mike Fährmann
1cec03c9c6
[imgur] fix extraction of large albums
2017-07-18 12:42:19 +02:00
Mike Fährmann
0610ae5000
skip login if cookies are present
2017-07-17 10:33:36 +02:00
Mike Fährmann
f105782435
[fireden] add thread extractor
2017-07-15 14:51:58 +02:00
Mike Fährmann
c93f7d7496
[archiveofsins] add thread extractor
2017-07-15 13:23:04 +02:00
Mike Fährmann
96e13604da
[archivedmoe] add thread extractor
2017-07-14 13:25:53 +02:00
Mike Fährmann
30d3a5f9b2
support redirects on 4chan archives
2017-07-14 13:24:09 +02:00
Mike Fährmann
98464d1f1b
[loveisover] add thread extractor
2017-07-14 11:17:47 +02:00
Mike Fährmann
47692f28da
[2chan] add thread extractor
2017-07-14 08:44:31 +02:00
Mike Fährmann
3460dc8950
update gallery-dl.conf
2017-07-14 08:23:11 +02:00
Mike Fährmann
9be8f7e106
[deviantart] add "extractor.deviantart.flat" option
...
Setting this to 'false' downloads images into individual subdirectories
for each gallery-folder or favourite-collection, otherwise it is just
creating a flat list of images.
2017-07-12 17:05:31 +02:00
Mike Fährmann
d075627fd9
[deviantart] support group galleries ( #26 )
...
For groups the 'GalleryExtractor' collects all gallery-folder URLs
and defers its work to the 'FolderExtractor'.
2017-07-12 09:47:01 +02:00
Mike Fährmann
b37a62501b
[pixiv] unquote tags
2017-07-12 08:21:29 +02:00
Mike Fährmann
fbd7dcdfdb
[desuarchive] add thread extractor
2017-07-11 17:48:22 +02:00
Mike Fährmann
af9bd17b19
[deviantart] adjust default paths
...
- user.deviantart.com/(gallery|favourites|journal)/ images go into
* <user>/
* <user>/Favourites/
* <user>/Journal/
(having an extra "Gallery" folder for a user's gallery-images seems
a bit too much if these are all you want to download, which is
probably the default use-case)
- single "deviations" (user.deviantart.com/(art|journal)/name-123) go
into their owner's directory:
* <user>/
(putting them into their own directory seems weird in practice)
2017-07-10 18:54:10 +02:00
Mike Fährmann
eb64fb267c
[nyafuu] add thread extractor ( #18 )
2017-07-08 17:16:41 +02:00
Mike Fährmann
726c6f01ae
allow 'cookies' config option to be a dictionary
2017-07-07 18:01:46 +02:00
Mike Fährmann
4877ef6314
[deviantart] support '?catpath=/' URLs ( #26 )
...
They previously weren't supported for galleries and journals.
This also increases the 'limit' parameter for API calls to its
respective maximum.
2017-07-06 20:45:22 +02:00
Mike Fährmann
8c16cbe7ea
fix tests
2017-07-04 19:31:32 +02:00
Mike Fährmann
a6f689e01a
[deviantart] add gallery-folder extractor ( #26 )
...
The code for this and the available metadata is probably going
to change again. This extractor is very similar to the favorite-
extractor, so they might be "combined" or something like that.
2017-07-03 21:57:10 +02:00
Mike Fährmann
474e9c1aec
[4plebs] add thread extractor ( #18 )
2017-07-03 16:43:04 +02:00
Mike Fährmann
a804a42e23
add '--cookies' command-line option
2017-07-03 15:02:19 +02:00
Mike Fährmann
dcc1d3b2ea
[hentaifoundry] fix infinite loop for multiple of 25 images
2017-07-03 14:16:08 +02:00
Mike Fährmann
34e6e1099e
[batoto] adapt to https chapter URLs
2017-07-02 08:20:04 +02:00
Mike Fährmann
85696d0b3b
[reddit] fix issue with datetime errors
2017-07-02 08:19:45 +02:00
Mike Fährmann
80c2e03aaa
[reddit] allow 'date-min/max' to be human readable dates
...
If the date-min/max config value is a string, try parsing it using
datetime.strptime [1] with 'date-format' as format string [2]
(default: "%Y-%m-%dT%H:%M:%S")
Example: get all submissions posted in 2016
$ gallery-dl reddit.com/r/... \
-o date-format=%Y \
-o date-min=\"2016\" \
-o date-max=\"2017\"
[1] https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime
[2] https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
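
The conversion described above can be sketched like this. The function name `parse_timestamp` and the choice to interpret parsed dates as UTC are assumptions for illustration; the commit itself may handle time zones differently.

```python
from datetime import datetime, timezone


def parse_timestamp(value, fmt="%Y-%m-%dT%H:%M:%S"):
    """Turn a human-readable date string into a Unix timestamp.

    Non-string values (already numeric timestamps) are returned unchanged.
    """
    if isinstance(value, str):
        dt = datetime.strptime(value, fmt)
        # Assumption: treat the parsed date as UTC.
        return dt.replace(tzinfo=timezone.utc).timestamp()
    return value
```

With `fmt="%Y"`, the strings "2016" and "2017" map to the first second of each year, which brackets all of 2016 as in the command-line example.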
2017-07-01 18:46:38 +02:00
Mike Fährmann
58e95a7487
share extractor and downloader sessions
...
There was never any "good" reason for the strict separation
between extractors and downloaders. This change allows for
reduced resource usage (probably unnoticeable) and fewer lines
of code at the "cost" of tighter coupling.
2017-06-30 19:38:14 +02:00
Mike Fährmann
4414aefe97
small fix for aes_cbc_decrypt_text
2017-06-30 15:21:04 +02:00
Mike Fährmann
21064146c1
fix test
2017-06-29 17:57:53 +02:00
Mike Fährmann
f3d0373120
[reddit] add ability to filter by submission id
...
'extractor.reddit.id-min' and '….id-max' specify the lowest and
highest submission-/post-id to consider, similar to 'date-min' and
'date-max'
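
A rough sketch of how such an ID range check could work, assuming Reddit's base-36 submission IDs are compared numerically. The function name and signature are hypothetical and not taken from the extractor.

```python
def id_in_range(post_id, id_min=None, id_max=None):
    """Return True if a base-36 ID lies within the optional bounds (inclusive)."""
    value = int(post_id, 36)  # Reddit IDs use digits 0-9 and letters a-z
    if id_min is not None and value < int(id_min, 36):
        return False
    if id_max is not None and value > int(id_max, 36):
        return False
    return True
```

Comparing the decoded integers rather than the raw strings avoids wrong results when two IDs differ in length.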
2017-06-29 17:39:22 +02:00
Mike Fährmann
06c4cae05b
extend the output of '--list-extractors'
...
It now includes category and subcategory values for
each extractor class.
2017-06-28 18:51:47 +02:00
Mike Fährmann
1dac76fd1c
update extractor docstrings
2017-06-28 17:39:07 +02:00
Mike Fährmann
e217e23e29
release version 0.9.0
2017-06-28 09:46:34 +02:00
Mike Fährmann
92a11528d1
smaller changes
2017-06-28 09:42:49 +02:00
Mike Fährmann
44d98e562b
[pixiv] support pixiv.me URLs ( #23 )
2017-06-25 20:21:01 +02:00
Mike Fährmann
b373fe0eea
[pixiv] support shortened URLs and other variants ( #23 )
2017-06-25 17:49:24 +02:00
Mike Fährmann
c951d6276c
[imagetwist] use https
2017-06-24 16:21:00 +02:00
Mike Fährmann
d3b04076f7
add .netrc support ( #22 )
...
Use the '--netrc' cmdline option or set the 'netrc' config option
to 'true' to enable the use of .netrc authentication data.
The 'machine' names for the .netrc info are the lowercase extractor
names (or categories): batoto, exhentai, nijie, pixiv, seiga.
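
A minimal sketch of looking up credentials with Python's standard `netrc` module, using the extractor category as the 'machine' name. The helper `lookup_credentials` is hypothetical and only illustrates the lookup; it is not the code added by this commit.

```python
import netrc


def lookup_credentials(path, machine):
    """Return (login, password) for 'machine', or (None, None) if unavailable."""
    try:
        info = netrc.netrc(path).authenticators(machine)
    except (OSError, netrc.NetrcParseError):
        # Missing, unreadable, or malformed .netrc file
        return None, None
    if info is None:
        # No entry for this machine and no 'default' entry
        return None, None
    login, _account, password = info
    return login, password
```

A matching .netrc line would look like `machine pixiv login <username> password <password>`.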
2017-06-24 12:17:26 +02:00
Mike Fährmann
e1d82af5e0
small fixes
2017-06-22 18:46:42 +02:00
Mike Fährmann
719d45f89e
[flickr] allow the use of Flickr's specifiers for format selection
...
- renamed the 'width-max' option to 'size-max'
- filter by both width and height
2017-06-20 16:09:25 +02:00
Mike Fährmann
b4c438c9ad
[oauth] add the 'extractor.oauth.browser' option
...
enables/disables the use of webbrowser.open() during OAuth authorization
2017-06-20 16:06:14 +02:00
Mike Fährmann
2633337833
[kissmanga] update regex ( fixes #20 )
2017-06-19 09:55:02 +02:00
Mike Fährmann
fac6c02224
[downloader] fix extension from content-type
2017-06-19 09:24:00 +02:00
Mike Fährmann
e68af4febe
[flickr] add 'width-max' option ( #16 )
...
This option allows for simple format selection by
specifying a maximum image width.
2017-06-18 22:20:15 +02:00
Mike Fährmann
2993206c4b
smaller fixes and "security" measures
...
- move the OAuthSession class into util.py
- block special extractors for reddit and recursive
- ignore 'only matching' tests for testresults script
2017-06-16 21:01:40 +02:00
Mike Fährmann
8d5e92f641
resolve cyclic dependency between oauth and flickr
2017-06-14 16:11:18 +02:00
Mike Fährmann
d60781de7b
[oauth] workaround for ctrl+c on Windows
2017-06-14 15:29:40 +02:00
Mike Fährmann
9759fe8c6b
allow 'only_matching' tests
2017-06-14 08:43:05 +02:00
Mike Fährmann
56bec79e6a
[reddit] add ability to load more comments ( #15 )
...
The 'extractor.reddit.morecomments' option enables the use of
the '/api/morechildren' API endpoint (1) to load even more
comments than the usual submission-request provides.
Possible values are the booleans 'true' and 'false' (default).
Note: this feature comes at the cost of 1 extra API call towards
the rate limit for every 100 extra comments.
(1) https://www.reddit.com/dev/api/#GET_api_morechildren
2017-06-13 18:49:07 +02:00
Mike Fährmann
05ed95e5b0
[flickr] add search extractor
2017-06-13 08:01:32 +02:00
Mike Fährmann
5f55c854b9
[flickr] replace getPublic... API call with regular ones
2017-06-12 16:37:06 +02:00