mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-26 04:32:51 +01:00
75 Commits

Author SHA1 Message Date
Mike Fährmann
b62c466c14
[flickr] fix video download URLs (#6464)
continuation of 0e18fa395d
fix video detection in '_file_url'
2024-11-13 20:56:37 +01:00
Mike Fährmann
7916c8bf77
allow passing cookies to OAuth extractors
partially revert ce54b8c04c
2024-11-09 18:06:27 +01:00
Mike Fährmann
0e18fa395d
[flickr] use "download" URLs (#6360) 2024-11-09 17:33:27 +01:00
Mike Fährmann
7c43f9e152
[flickr] update default API credentials (#6300) 2024-10-10 11:50:34 +02:00
Mike Fährmann
9a0acbe7c4
[flickr] remove debug remains (#6252)
fixes regression introduced in a051e1c9
2024-09-29 13:01:51 +02:00
Mike Fährmann
a051e1c955
directly pass exception instances as 'exc_info' logger argument 2024-09-19 14:50:08 +02:00
Mike Fährmann
58113b73d1
[flickr] make album metadata extraction non-fatal (#3441)
https://github.com/mikf/gallery-dl/issues/3441#issuecomment-2313679156
2024-08-30 10:24:03 +02:00
Pedro Cunha
dcd44cf423
[flickr] reference the correct function 2024-08-22 17:00:35 +01:00
Mike Fährmann
e92a9ae343
[flickr] make exif and context metadata extraction non-fatal (#6002) 2024-08-14 09:44:04 +02:00
Mike Fährmann
5c1f5861b6
[flickr] add 'contexts' option (#5324) 2024-03-18 17:36:16 +01:00
Mike Fährmann
6e928300bc
[flickr] handle non-JSON errors (#5131) 2024-02-06 21:22:10 +01:00
Mike Fährmann
4cdab8074e
update/fix --list-extractors 2023-09-11 17:32:59 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This allows, for example, adjusting config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
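The decoupling described in this commit message can be illustrated with a minimal sketch. Everything in it is hypothetical: the class name `SketchExtractor`, the plain-dict `config` and `session`, and the `_initialized` guard are assumptions made for illustration and are not taken from the gallery-dl code base. It only shows the general idea that a cheap constructor followed by a separate `initialize()` call leaves room to change configuration in between.

```python
class SketchExtractor:
    """Hypothetical extractor with a two-step setup (not gallery-dl code)."""

    def __init__(self, url):
        # The constructor only stores cheap state; no config is read here.
        self.url = url
        self.config = {}
        self.session = None
        self._initialized = False  # assumed guard, for illustration only

    def initialize(self):
        # The actual setup step: reads config and builds the session.
        if self._initialized:
            return
        user_agent = self.config.get("user-agent", "default-agent")
        self.session = {"headers": {"User-Agent": user_agent}}
        self._initialized = True


extractor = SketchExtractor("https://example.org/")
# A caller (a Job, in the commit's terms) can still adjust config here,
# because nothing has read it yet.
extractor.config["user-agent"] = "custom-agent"
extractor.initialize()
print(extractor.session["headers"]["User-Agent"])
```

With a constructor that did all of the setup itself, the config change in the middle would come too late to affect the session.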
Mike Fährmann
7da954f810
[flickr] update default API credentials (#4332)
and add a delay between API requests
2023-07-22 15:38:33 +02:00
Mike Fährmann
d97b8c2fba
consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
c45a913bfd
[flickr] add 'exif' option 2023-07-01 19:19:39 +02:00
Mike Fährmann
ccbc1a1d55
[flickr] add 'metadata' option (#4227) 2023-06-26 16:49:48 +02:00
Mike Fährmann
d0b73fec14
[flickr] add support for secure.flickr.com (#2910) 2022-09-14 16:19:27 +02:00
Vrihub
96fcff182c
generic extractor (#735)
* Generic extractor, see issue #683

* Fix failed test_names test, no subcategory needed

* Prefix directory_fmt with "generic"

* Relax regex (would break some urls)

* Flake8 compliance

* pattern: don't require a scheme

This fixes a bug when we force the generic extractor on urls without a
scheme (that are allowed by all other extractors).

* Fix using g: and r: on urls without http(s) scheme

Almost all extractors accept urls without an initial http(s) scheme.

Many extractors also allow for generic subdomains in their "pattern"
variable; some of them implement this with the regex character class
"[^.]+" (everything but a dot).

This leads to a problem when the extractor is given a url starting
with g: or r: (to force using the generic or recursive extractor)
and without the http(s) scheme: e.g. with "r:foobar.tumblr.com"
the "r:" is wrongly considered part of the subdomain.

This commit fixes the bug, replacing the too generic "[^.]+" with the
more specific "[\w-]+" (letters, digits and "-", the only characters
allowed in domain names), which is already used by some extractors.

* Relax imageurl_pattern_ext: allow relative urls

* First round of small suggested changes

* Support image urls starting with "//"

* self.baseurl: remove trailing slash

* Relax regexp (didn't catch some image urls)

* Some fixes and cleanup

* Fix domain pattern; option to enable extractor

Fixed the domain section for "pattern", to pass "test_add" and
"test_add_module" tests.
Added the "enabled" configuration option (default False) to enable the
generic extractor. Using "g(eneric):URL" forces using the extractor.
2021-12-29 22:39:29 +01:00
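The subdomain problem described in this commit message can be reproduced with a small sketch. The two patterns below are simplified, hypothetical stand-ins built from the character classes quoted in the message; they are not the actual `pattern` values of any gallery-dl extractor.

```python
import re

# Hypothetical, simplified patterns; only the subdomain character class differs.
LOOSE = re.compile(r"(?:https?://)?([^.]+)\.tumblr\.com")
STRICT = re.compile(r"(?:https?://)?([\w-]+)\.tumblr\.com")

url = "r:foobar.tumblr.com"

loose_match = LOOSE.match(url)
strict_match = STRICT.match(url)

# "[^.]+" accepts the colon, so the "r:" prefix leaks into the subdomain group.
print(loose_match.group(1))  # r:foobar

# "[\w-]+" rejects the colon, so the prefixed URL does not match this pattern.
print(strict_match)  # None

# A URL without the prefix still matches the stricter pattern.
print(STRICT.match("foobar.tumblr.com").group(1))  # foobar
```

Because the stricter pattern no longer claims the prefixed URL, the "r:" or "g:" prefix stays available for whatever handles those prefixes.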
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
ca44111726
[flickr] update
- ensure every photo has an 'owner' (#828)
- change default directories to a more consistent schema
- create directory for each photo
2020-11-15 10:44:29 +01:00
Mike Fährmann
e6cd49e78b
update extractor test results 2020-02-16 21:48:46 +01:00
Mike Fährmann
ce54b8c04c
let extractors opt-out of cookie option usage
useful to avoid sending unnecessary cookies when all authentication
is done through OAuth tokens
2020-01-01 21:12:37 +01:00
Mike Fährmann
abfcb356fc
[flickr] support 3k, 4k, 5k, and 6k photo sizes (closes #472) 2019-11-10 17:52:51 +01:00
Mike Fährmann
4409d00141
embed error messages in StopExtraction exceptions 2019-10-28 16:39:49 +01:00
Mike Fährmann
20fd2d8450
[flickr] skip unavailable images/videos (fixes #398) 2019-08-27 23:26:49 +02:00
Mike Fährmann
5499934ae2
[ngomik] fix extraction 2019-05-30 20:18:36 +02:00
Mike Fährmann
9890bfdf23
[flickr] improve code and metadata
- simplify pagination
- add more metadata and slightly change its structure
  - convert suitable values to int or list
  - move keys from ["photo"] to the base level
- proper video support (#246)
- rename method and variable names to better fit with other extractors
2019-05-14 22:10:50 +02:00
Mike Fährmann
d6ddb74cde
update test results
- deviantart: 'index' is now an integer
- flickr: image file with lower quality
- paheal: image server name changed
- rule34: post got deleted
2019-04-12 09:59:48 +02:00
Mike Fährmann
87b0929bec
Revert "[flickr] restore image quality"
This reverts commit 3f513f1056.

Both live.staticflickr and farmN.staticflickr servers now produce the
same image file with a lower overall quality than before this change on
Flickr's end.
2019-04-11 20:31:05 +02:00
Mike Fährmann
3f513f1056
[flickr] restore image quality
Flickr started serving images from live.staticflickr.com (see ec88ff1),
but the old farmN.staticflickr.com URLs still work - at least for the
time being.
Filesize (and most likely quality as well) for images from live.…  is
severely reduced compared to images from farmN.… for non-original files,
so all live URLs are replaced to point to a randomly chosen farm server.
2019-04-06 11:26:10 +02:00
Mike Fährmann
ec88ff1562
[flickr] relax unit test results
Images are now randomly served from the 'live.staticflickr.com' domain
instead of the "old" 'farmN.staticflickr.com' one, making it impossible
to use static 'url' and 'keyword' hashes as results.

Image quality doesn't appear to be affected by which image-server is
used. Files from 'farmN' and 'live' are the same.
2019-03-30 18:31:59 +01:00
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)

Example: "https://example.org/path/filename.ext"

before:
- filename : filename.ext
- name     : filename
- extension: ext

now:
- filename : filename
- extension: ext
2019-02-14 16:07:17 +01:00
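The "now" behaviour listed in this commit message can be sketched as follows. The function `nameext_sketch` is a hypothetical stand-in written for illustration; it is not the implementation of `text.nameext_from_url()`, and its handling of names without a dot is an assumption.

```python
import posixpath
from urllib.parse import urlsplit


def nameext_sketch(url):
    """Hypothetical helper: split the last path segment into name and extension."""
    basename = posixpath.basename(urlsplit(url).path)
    name, _, ext = basename.rpartition(".")
    if not name:
        # Assumption: without a usable dot, the whole segment is the filename.
        name, ext = basename, ""
    # Only 'filename' (without extension) and 'extension' are returned;
    # there is no key holding the complete "filename.ext" string.
    return {"filename": name, "extension": ext}


result = nameext_sketch("https://example.org/path/filename.ext")
print(result)  # {'filename': 'filename', 'extension': 'ext'}
```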
Mike Fährmann
89ee8cd7e4
filter "private" kwdict entries 2019-02-13 13:22:11 +01:00
Mike Fährmann
61741d7333
provide type information for Queue messages
Child extractors are now directly constructed with Extractor.from_url()
if the extractor class is known beforehand, instead of using
extractor.find() and searching through all possible extractor classes.
2019-02-12 21:32:32 +01:00
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor 2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107
simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
34bab080ae
rewrite URL patterns to use only 1 per extractor 2019-02-08 12:03:10 +01:00
Mike Fährmann
9a98b6769d
use extractor.request for API calls (#130)
... at least for OAuth1.0 based APIs (flickr, smugmug, tumblr)
2018-12-04 21:29:06 +01:00
Mike Fährmann
59bb434ba5
[flickr] add ability to download all albums of a user
for example with 'https://www.flickr.com/photos/shona_s/albums'
2018-11-23 09:09:37 +01:00
Mike Fährmann
8080071174
[flickr] improve album metadata (closes #109) 2018-09-29 16:21:55 +02:00
Mike Fährmann
26cbcb3a72
[flickr] improve error handling (#109) 2018-09-17 10:12:14 +02:00
Mike Fährmann
f3793660ef
update tests 2018-08-02 14:57:28 +02:00
Mike Fährmann
212130b048
[deviantart] improve public-private token switching
- rename option to `prefer-public`
- now also works for galleries with fewer than 24 items
2018-07-25 12:52:36 +02:00
Mike Fährmann
1c1e086d01
use common base class for OAuth1.0 based API interfaces 2018-05-10 21:57:45 +02:00
Mike Fährmann
6a31ada9e3
re-implement OAuth1.0 code
OAuth support for SmugMug needs some additional features
(auth-rebuild on redirect, query parameters in URL, ...)
and fixing this in the old code wouldn't work all that well.
2018-05-10 18:47:05 +02:00
Mike Fährmann
d11fcf4804
smaller changes and fixes
- fix the cloudflare challenge result if the last decimal places
  are zero (JS's toFixed() removes trailing zeroes)
- fix downloading of kissmanga chapter-pages hosted on blogspot
  (accessing blogspot with "kissmanga.com" as referrer yields a 401)
- disable certificate validation for 'mangahere' tests
- update flickr test result
2018-04-06 15:30:09 +02:00
Mike Fährmann
a112e3f2a0
[nijie] add doujin extractor
adds support for "https://nijie.info/members_dojin.php?id=<artist_id>"
2018-03-31 18:17:41 +02:00
Mike Fährmann
5008e105ee
update archive IDs
... to behave in a more straightforward way when dealing with
bookmarks/favourites/etc.

specific IDs are now grouped by their owner, album-id, ... to
allow for duplicates when it would be expected.
2018-03-01 18:20:50 +01:00