mirror of https://github.com/mikf/gallery-dl.git synced 2024-11-22 18:53:21 +01:00
Commit Graph

619 Commits

Author SHA1 Message Date
Mike Fährmann
56039d2456
add 'hash_md5' and 'hash_sha1' functions (#3679)
... to global eval namespace
2023-02-22 10:58:44 +01:00
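A minimal sketch of what hash helpers like the ones named in the commit above might look like, built on the standard `hashlib` module. Accepting a `str` and encoding it as UTF-8 is an assumption; the actual signatures in the project may differ.

```python
import hashlib


def hash_md5(value):
    """Return the hexadecimal MD5 digest of a string (assumed UTF-8)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def hash_sha1(value):
    """Return the hexadecimal SHA-1 digest of a string (assumed UTF-8)."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()
```

With functions like these in the eval namespace, a filter expression could compare a digest of a metadata field against a known value.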
Mike Fährmann
e1df7f73b1
[deviantart] add 'search' extractor
(#538, #1264, #2954, #2970, #3577)

Requires login to fetch any results, since the API endpoint raises an
error for not logged in requests.

TODO: parse HTML search results
2023-02-20 20:54:46 +01:00
Gray Manley
38a6389e2c Fix lint. 2023-02-20 00:33:30 -06:00
Gray Manley
56cbae92ec Use more pythony naming. 2023-02-19 06:14:34 -06:00
Gray Manley
8e2ba4f32e Add test. 2023-02-19 06:13:21 -06:00
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode 2023-02-09 15:22:00 +01:00
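For illustration, the pattern described in the commit above can be sketched with the standard library alone. The names `_decoder` and `json_loads` are hypothetical; the idea is to create one `json.JSONDecoder` and call its `decode` method directly instead of going through the `json.loads` wrapper on every call.

```python
import json

# Create the decoder once and reuse its bound 'decode' method.
_decoder = json.JSONDecoder()
json_loads = _decoder.decode

data = json_loads('{"id": 1, "tags": ["a", "b"]}')
```

Unlike `json.loads`, `JSONDecoder.decode` only accepts `str` input, so this shortcut assumes the caller never passes `bytes`.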
Mike Fährmann
b7337d810e
[postprocessor:metadata] add 'sort' and 'separators' options 2023-02-07 18:28:14 +01:00
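As a rough illustration of what sorting and separator settings mean for JSON output, here is a sketch using `json.dumps` from the standard library. How the post processor options from the commit above map onto these arguments is an assumption.

```python
import json

metadata = {"title": "example", "id": 1, "tags": ["b", "a"]}

# Default encoding: insertion order, ", " and ": " as separators.
default = json.dumps(metadata)

# Sorted keys with the most compact separators.
compact = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
```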
Mike Fährmann
3436c6b117
[postprocessor:metadata] speed up JSON encoding 2023-02-06 12:35:28 +01:00
Mike Fährmann
925b467496
split e621 from danbooru module (#3425) 2023-02-03 19:24:31 +01:00
Mike Fährmann
c2bc70593e
implement ability to load external extractor classes
- -X/--extractors
- extractor.module-sources
2023-01-30 23:10:10 +01:00
ClosedPort22
b6706b373a
[downloader:http] add signature checks for some formats
also add the MIME type for .obj files
2023-01-15 23:40:55 +08:00
Mike Fährmann
71d3143c35
fix bug in test_extractors.py
pattern matching tests would succeed
if there is exactly one match
but for the wrong extractor
2023-01-08 15:35:05 +01:00
Mike Fährmann
fa144f38ed
[ytdl] fix dfe4f00c for legacy yt-dlp 2023-01-04 21:42:22 +01:00
Mike Fährmann
dfe4f00ca2
[ytdl] update for yt-dlp changes 2023-01-04 13:12:24 +01:00
Mike Fährmann
d651d45239
implement specifying ranges in slice notation (#918, #2865)
e.g.
- '1:101'   or ':101' or ':101:'  for files 1 to 100
- '1::2'    or '::2'              for every second file
- '1:101:5' or ':101:5'           for files 1, 6, 11, ..., 91, 96

(the second argument specifies the first index NOT included)
2022-12-27 18:21:12 +01:00
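The slice notation in the commit above maps naturally onto Python's `range`. The following is only a sketch under stated assumptions: the helper name `slice_to_range` is hypothetical, a missing start defaults to 1 (file numbering starts at 1), and a missing stop is treated as unbounded.

```python
import sys


def slice_to_range(spec):
    """Convert 'start:stop:step' into a range over 1-based file numbers."""
    parts = spec.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError("expected 'start:stop' or 'start:stop:step'")
    parts += [""] * (3 - len(parts))

    start = int(parts[0]) if parts[0] else 1
    stop = int(parts[1]) if parts[1] else sys.maxsize  # exclusive upper bound
    step = int(parts[2]) if parts[2] else 1
    return range(start, stop, step)
```

For example, `slice_to_range("1:101:5")` yields 1, 6, 11, and so on up to 96, matching the list in the commit message.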
Mike Fährmann
3616adfc75
implement '--range' with Python ranges 2022-12-26 18:32:34 +01:00
Mike Fährmann
1800bd7d14
allow '*-filter' options to be a list of expressions 2022-12-23 22:20:21 +01:00
Mike Fährmann
43c211f1a7
extend and rename util.CustomNone 2022-12-06 22:08:51 +01:00
Mike Fährmann
42481aed59
[formatter] implement 'S' format specifier (#3266)
to Sort lists
2022-11-21 21:44:42 +01:00
Mike Fährmann
6e08ad26f7
update downloader tests 2022-11-16 22:59:18 +01:00
Mike Fährmann
05255f5be0
add 'default' argument to 'text.extr()' 2022-11-09 11:00:32 +01:00
Mike Fährmann
8124c16a50
split 'build_path' from 'set_filename' and 'set_extension'
Do not automatically build a new path
when setting file metadata or updating its extension.
2022-11-08 17:03:24 +01:00
Mike Fährmann
eb33e6cf2d
add 'text.extr()'
a stripped-down version of text.extract() that
- always returns a string (like 'extract_from')
- only returns a string
- does not deal with 'pos' arguments
- is ~20% faster
2022-11-04 21:37:36 +01:00
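To make the description above concrete, here is a sketch of a helper with the listed properties: it returns only a string, never raises for missing delimiters, and takes no position arguments. It is an illustration, not necessarily the project's actual code.

```python
def extr(txt, begin, end):
    """Return the text between 'begin' and 'end', or "" if either is missing."""
    try:
        first = txt.index(begin) + len(begin)
        return txt[first:txt.index(end, first)]
    except ValueError:
        # str.index() raises ValueError when a delimiter is not found
        return ""
```

Usage: `extr("<b>bold</b>", "<b>", "</b>")` gives `"bold"`, while a missing delimiter gives an empty string.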
Mike Fährmann
460095adca
update downloader tests 2022-11-01 18:48:35 +01:00
Mike Fährmann
f037429fa4
attempt to improve '-K' output for lists
- use [N] instead of [] to indicate a Number needs to be placed there
- enumerate list items
2022-10-28 12:04:58 +02:00
thatfuckingbird
062ef238a6
add support for aibooru (using danbooru extractor) (#3075) 2022-10-19 11:53:59 +02:00
Mike Fährmann
b57015cf0a
[postprocessor:metadata] assume 'mode: custom' when format is set
{"name": "metadata", "format": "foobar"}
will now implicitly use mode:custom and no longer mode:json like before
2022-10-04 22:35:26 +02:00
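The behavior change described above can be sketched as a small decision function. The name `resolve_mode` is hypothetical, and treating "json" as the fallback is inferred from the commit message only.

```python
def resolve_mode(options):
    """Pick the metadata mode: an explicit 'mode' wins, 'format' implies custom."""
    mode = options.get("mode")
    if mode is not None:
        return mode
    return "custom" if "format" in options else "json"
```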
enduser420
f7ba19a1c0
[nana] add 'nana' extractors (#2967) 2022-10-04 09:23:24 +02:00
Mike Fährmann
b36125333f
[postprocessor:zip] implement 'files' option (#2872) 2022-09-09 11:41:27 +02:00
Mike Fährmann
67bad04dda
[formatter] add 'g' conversion to sluGify a string (#2410) 2022-08-26 17:57:17 +02:00
Mike Fährmann
6990ad0ba8
[formatter] do NOT apply :J to strings (#2833) 2022-08-16 16:41:19 +02:00
Mike Fährmann
c0051d7d4c
fix test 2022-08-01 21:40:35 +02:00
Mike Fährmann
dd3a6a9fd1
make 'enumerate_reversed()' work with generators (#2795) 2022-08-01 14:08:44 +02:00
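One way to support generators in such a helper is to materialize them before reversing. This is a sketch under that assumption, not necessarily how the project does it.

```python
def enumerate_reversed(iterable, start=0):
    """Yield (index, item) pairs from the last item to the first."""
    try:
        length = len(iterable)
    except TypeError:
        # Generators have no len(); collect their items first.
        iterable = list(iterable)
        length = len(iterable)

    indices = range(start + length - 1, start - 1, -1)
    return zip(indices, reversed(iterable))
```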
Mike Fährmann
0c73914848
[postprocessor:metadata] implement 'mode: modify' (#2640) 2022-07-19 12:24:26 +02:00
Mike Fährmann
f3de6b7a87
[postprocessor:metadata] implement 'mode: delete' (#2640) 2022-07-19 00:57:29 +02:00
Mike Fährmann
9704c04172
[postprocessor:zip] ensure target directory exists (#2758) 2022-07-14 11:55:39 +02:00
Mike Fährmann
74865adae5
implement 'format-separator' option (#2737)
a global option that serves as a workaround for shortcomings due to
lack of a proper format string parser
2022-07-10 13:31:43 +02:00
bradenhilton
117eeefda0
[postprocessor:mtime] add 'value' option (#2739) 2022-07-08 20:56:01 +02:00
Mike Fährmann
90ae48c40c
[formatter] implement 'O' format specifier (#2736)
to apply a UTC offset to 'date' values and other datetime objects
2022-07-08 12:51:03 +02:00
Mike Fährmann
04bed1eba3
[formatter] allow for custom "format" functions (#2721) 2022-07-05 12:22:01 +02:00
Mike Fährmann
54525d2e21
[formatter] implement slice operator as format specifier
this allows using a slice operator alongside other (special) format
specifiers like J, to first join list elements into a string and then
trim that with a slice.

{tags:J, /[:50]}
2022-06-25 16:52:58 +02:00
Mike Fährmann
241e82e18d
[horne] add support for horne.red (#2700) 2022-06-25 16:52:16 +02:00
Mike Fährmann
42525cfe8d
fix '{…!j}' for otherwise non-serializable types (#2624)
like 'datetime'
2022-06-07 17:47:07 +02:00
Mike Fährmann
5b43faffed
[postprocessor:metadata] write to stdout by setting filename to "-"
(#2624)
2022-05-30 21:17:31 +02:00
Mike Fährmann
6ad39f2b68
add ytdl tests
they only run when youtube-dl or yt-dlp are installed,
i.e. if __import__("<ytdl-package>") succeeds
2022-05-23 18:30:26 +02:00
Mike Fährmann
688d6553b4
replace calls to print() with stdout_write() (#2529) 2022-05-19 17:09:24 +02:00
Mike Fährmann
f3408a9d92
implement string literals in replacement fields
- either {_lit[foo]} or {'foo'}
- useful as alternative for empty metadata fields: {title|'no title'}
- due to using '_string.formatter_field_name_split()' to parse format
  strings, using certain characters will result in an error: [].:!
2022-05-09 23:49:33 +02:00
Mike Fährmann
c4b9f7bab8
update functions working with cookies.txt files
- rename
  - load_cookiestxt -> cookiestxt_load
  - save_cookiestxt -> cookiestxt_store
- in cookiestxt_load, add cookies directly to a cookie jar
  instead of storing them in a list first
- other unnoticeable performance increases
2022-05-06 13:21:29 +02:00
Mike Fährmann
ca3a364db7
fix build_duration_func() (#2533)
for extractors with request_interval_min > 0
2022-04-27 20:28:14 +02:00
Mike Fährmann
7fe54bab2a
attempt to fix some issues with 'contains()' (#2446)
add a third argument that gets used
when the values to search are given as a string
2022-04-08 14:40:26 +02:00
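A sketch of how a `contains()` helper with a third argument could behave. The parameter name `separator` and its default are assumptions made for illustration: when the values to search are a string, it is split on the separator before testing membership.

```python
def contains(values, elements, separator=" "):
    """Return True if any of 'elements' occurs in 'values'."""
    if isinstance(values, str):
        # Split a string of values into individual items first.
        values = values.split(separator)

    if not isinstance(elements, (tuple, list)):
        return elements in values

    return any(element in values for element in elements)
```

For instance, `contains("a,b,c", ["x", "c"], ",")` is true because "c" is one of the comma-separated values.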
Mike Fährmann
d78a2c7163
re.escape() arguments for 'contains()' (#2446) 2022-04-07 15:35:54 +02:00
Mike Fährmann
413b77757b
implement 'contains()' (#2446)
and add it to globals() in compiled expressions for --filter etc
2022-03-30 16:18:33 +02:00
Mike Fährmann
e7b30866d0
[postprocessor:mtime] fix timestamps from datetime objects (#2307)
'datetime.timestamp()', which got used to convert datetime objects to
POSIX timestamps, assumes naive datetimes represent LOCAL time, while
datetimes in 'date' metadata fields represent UTC time.

Ref: https://docs.python.org/3/library/datetime.html#datetime.datetime.timestamp
> Naive datetime instances are assumed to represent local time
> you can obtain the POSIX timestamp by … calculating the timestamp directly
2022-03-23 23:05:14 +01:00
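The pitfall described above can be avoided by computing the timestamp directly, as the Python documentation suggests. A minimal sketch, with the hypothetical helper name `utc_naive_to_timestamp`, for naive datetimes that represent UTC:

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1)


def utc_naive_to_timestamp(dt):
    """Convert a naive datetime that represents UTC to a POSIX timestamp."""
    return (dt - EPOCH).total_seconds()


# Same result as attaching UTC explicitly before calling timestamp()
dt = datetime(2010, 1, 1)
explicit = dt.replace(tzinfo=timezone.utc).timestamp()
```

Calling `dt.timestamp()` on the naive value instead would shift the result by the local UTC offset.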
Mike Fährmann
29db716a63
implement 'datetime_to_timestamp()'
and rename 'to_timestamp()'
to the more descriptive 'datetime_to_timestamp_string()'
2022-03-23 22:36:01 +01:00
Mike Fährmann
8295bc6d97
fix loading/storing cookies without domain 2022-03-19 15:14:55 +01:00
Mike Fährmann
500a479026
fix a third(!) bug in _check_cookies() (#2372)
turns out tests are worthless if you get em wrong ...
2022-03-18 19:52:37 +01:00
Mike Fährmann
cf44aba333
[formatter] allow evaluating f-string literals
by starting a format string with '\fF'.

This was technically already possible with '\fE',
but this makes it a bit more convenient.
2022-03-18 13:31:01 +01:00
Mike Fährmann
94452761ed
fix cookies tests 2022-03-11 18:16:00 +01:00
Mike Fährmann
bddcec49f1
implement 'text.root_from_url()'
use domain from input URL for kemono
2022-03-01 03:09:57 +01:00
Mike Fährmann
f5b2b9333f
fix another bug in _check_cookies() (#2160)
regression introduced in ed317bfc

Added a couple of tests to hopefully catch such bugs
before they land in a release.
2022-02-16 22:58:57 +01:00
Mike Fährmann
563bd0ecf4
[danbooru] inherit from BaseExtractor
- merge danbooru and e621 code
- support booru.allthefallen.moe (closes #2283)
- remove support for old e621 tag search URLs
2022-02-11 21:01:51 +01:00
Mike Fährmann
b5b4f5a168
use 'build_extractor_filter' in test_results.py 2021-12-28 17:25:07 +01:00
Mike Fährmann
64cf26eaf4
allow specifying sleep-* options as string
either as single value or as range: "3.5", "2.1 - 5.0"
2021-12-18 23:28:56 +01:00
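A sketch of parsing such values, assuming a string is either a single number or two numbers separated by a hyphen. The helper names are hypothetical, and negative numbers are not handled.

```python
import random


def parse_interval(value):
    """Return (lower, upper) bounds from a number, '3.5', or '2.1 - 5.0'."""
    if isinstance(value, str):
        lower, sep, upper = value.partition("-")
        lower = float(lower)  # float() ignores surrounding whitespace
        upper = float(upper) if sep else lower
    else:
        lower = upper = float(value)
    return lower, upper


def pick_duration(value):
    """Choose a sleep duration uniformly within the parsed bounds."""
    lower, upper = parse_interval(value)
    return random.uniform(lower, upper)
```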
Mike Fährmann
010d65dcec
extend blacklist/whitelist syntax (#2025)
Each entry in such a list can now also include a subcategory
'<category>:<subcategory>'
and it is possible to use '*' or an empty string as placeholder
'*:<subcategory>', ':<subcategory>', '<category>:*'

For example
  "blacklist": "imgur,*:tag,gfycat:user" or
  "blacklist": ["imgur", "*:tag", "gfycat:user"]
will filter all 'imgur' extractors, all extractors with a 'tag'
subcategory (e.g. https://danbooru.donmai.us/posts?tags=bonocho),
and all 'gfycat' user extractors.
2021-11-23 20:31:43 +01:00
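The matching rules described above can be sketched as follows. The function name `build_filter` and the rule representation are hypothetical; an entry without a subcategory is treated as matching every subcategory of that category.

```python
def build_filter(entries):
    """Return a function telling whether (category, subcategory) is listed."""
    if isinstance(entries, str):
        entries = entries.split(",")

    rules = []
    for entry in entries:
        category, _, subcategory = entry.strip().partition(":")
        # '*' and the empty string both act as placeholders
        rules.append((category or "*", subcategory or "*"))

    def matches(category, subcategory):
        return any(
            cat in ("*", category) and sub in ("*", subcategory)
            for cat, sub in rules
        )

    return matches
```

With `build_filter("imgur,*:tag,gfycat:user")`, any imgur extractor, any 'tag' extractor, and the gfycat user extractor match, while for example a gfycat image extractor does not.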
Mike Fährmann
af6424f398
allow testing metadata in list elements 2021-11-21 22:46:34 +01:00
Mike Fährmann
3842cdcd8f
[formatter] implement 'D' format specifier
To be able to parse any string into a 'datetime' object
and format it as necessary.

Example:

{created_at:D%Y-%m-%dT%H:%M:%S%z}
->
"2010-01-01 00:00:00"

{created_at:D%Y-%m-%dT%H:%M:%S%z/%b %d %Y %I:%M %p}
->
"Jan 01 2010 12:00 AM"

with 'created_at' == "2010-01-01T01:00:00+0100"
2021-11-20 23:04:34 +01:00
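The two examples above can be reproduced with the standard `datetime` module. This sketch assumes the parsed value is converted to a naive UTC datetime before formatting, which is inferred from the example output; the helper name `parse_and_format` is hypothetical.

```python
from datetime import datetime, timezone


def parse_and_format(value, parse_fmt, output_fmt=None):
    """Parse 'value' with strptime, normalize to UTC, then format it."""
    dt = datetime.strptime(value, parse_fmt)
    if dt.tzinfo is not None:
        # Convert aware datetimes to UTC and drop the timezone info
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt.strftime(output_fmt) if output_fmt else str(dt)


created_at = "2010-01-01T01:00:00+0100"
plain = parse_and_format(created_at, "%Y-%m-%dT%H:%M:%S%z")
custom = parse_and_format(
    created_at, "%Y-%m-%dT%H:%M:%S%z", "%b %d %Y %I:%M %p")
```

Month names and AM/PM markers from `strftime` depend on the active locale.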
Mike Fährmann
2ab190ce08
add tests for special format strings 2021-11-01 23:26:18 +01:00
Mike Fährmann
46e17c5e61
support accessing the current local datetime in format strings
{_now}, {_now:%Y-%m-%d}, etc
(#1968)
2021-10-30 21:41:09 +02:00
Mike Fährmann
38193dba46
support accessing environment variables in format strings (#1968)
{_env[HOME]} to get the value of $HOME
every other format string feature is supported as well
2021-10-28 19:18:55 +02:00
Mike Fährmann
f2d6b3e6b4
run tests without using 'nose'
run_tests.sh -> run_tests.py
2021-10-13 04:07:41 +02:00
Mike Fährmann
12fc646c53
fix filename formatting tests 2021-09-29 23:39:02 +02:00
Mike Fährmann
e0bdacd932
[fappic] add 'image' extractor (closes #1898) 2021-09-28 23:35:29 +02:00
Mike Fährmann
c22ff97743
remove 'unit' argument from 'util.format_value()' 2021-09-28 23:07:55 +02:00
Mike Fährmann
cad85640de
move 'util.PathFormat' into its own 'path' module
to prevent circular imports between 'formatter' and 'util'
2021-09-27 21:29:37 +02:00
Mike Fährmann
74145467dd
move 'util.Formatter' into its own 'formatter' module 2021-09-27 02:37:04 +02:00
Mike Fährmann
9377543162
[mastodon] add 'following' extractor (#1891) 2021-09-26 00:12:34 +02:00
Mike Fährmann
bd845303ad
implement a way to shorten filenames with east-asian characters
(#1377)

Setting 'output.shorten' to "eaw" (East-Asian Width) uses a slower
algorithm that also considers characters with a width > 1.
2021-09-13 21:38:33 +02:00
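A sketch of width-aware truncation using `unicodedata.east_asian_width` from the standard library. Counting wide ('W') and fullwidth ('F') characters as two columns is a common convention and an assumption here; the helper names are hypothetical.

```python
import unicodedata


def char_width(ch):
    """Return 2 for wide/fullwidth characters, 1 otherwise."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def display_width(text):
    """Return the number of terminal columns 'text' is expected to occupy."""
    return sum(char_width(ch) for ch in text)


def truncate_to_width(text, limit):
    """Cut 'text' so that its display width does not exceed 'limit'."""
    width = 0
    for index, ch in enumerate(text):
        width += char_width(ch)
        if width > limit:
            return text[:index]
    return text
```

Checking every character is slower than a plain `len()` comparison, which fits the commit's note about a slower algorithm.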
Mike Fährmann
292fffc83c
add 'j' format string conversion
to convert to a JSON formatted string
2021-08-28 01:19:36 +02:00
Mike Fährmann
bb6a130942
automatically set required DDoS-GUARD cookies (#1779)
for kemono.party and seiso.party
2021-08-16 17:40:29 +02:00
Mike Fährmann
2792ed6e4b
implement 'util.format_value()' 2021-07-26 02:11:22 +02:00
Mike Fährmann
9e42cd58ea
replace ChainPredicate class with 'functools.partial' 2021-07-20 20:21:32 +02:00
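As an illustration of the general technique, a class that only stores a list of predicates and calls them in turn can be replaced by a plain function bound with `functools.partial`. All names below are hypothetical.

```python
import functools


def chain_predicates(predicates, url, kwdict):
    """Return True only if every predicate accepts (url, kwdict)."""
    for pred in predicates:
        if not pred(url, kwdict):
            return False
    return True


def is_image(url, kwdict):
    return kwdict.get("extension") in ("jpg", "png")


def is_large(url, kwdict):
    return kwdict.get("width", 0) >= 1000


# Bind the predicate list once; the result is a plain callable.
predicate = functools.partial(chain_predicates, [is_image, is_large])
```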
Mike Fährmann
36ac2197db
[ytdl] add extractor for sites supported by youtube-dl
(#1680, #878)

Can be used by prefixing any URL with 'ytdl:',
or by setting 'extractor.ytdl.enabled' to 'true'.
2021-07-10 20:55:47 +02:00
Mike Fährmann
64240c8d42
[imagevenue] fix extraction
(closes #1677)
2021-07-09 20:13:18 +02:00
Mike Fährmann
0179581340
add 'T' format string conversion (#1646)
to convert 'date'/datetime to timestamp
2021-06-25 22:35:45 +02:00
Mike Fährmann
f74cf52e2b
[seisoparty] add 'user' and 'post' extractors (#1635) 2021-06-25 18:40:11 +02:00
Mike Fährmann
759735fb02
[kemonoparty] fix 'username' extraction (fixes #1652)
The site's <title> content changed from

<title>NAME | Kemono</title>

to

<title>
    NAME | Kemono
</title>
2021-06-25 15:35:20 +02:00
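A sketch of title parsing that tolerates both layouts shown above by stripping whitespace before splitting. The function name and the splitting on " | " are assumptions for illustration.

```python
def username_from_title(page):
    """Extract NAME from '<title>NAME | Kemono</title>', ignoring whitespace."""
    start = page.index("<title>") + len("<title>")
    end = page.index("</title>", start)
    title = page[start:end].strip()
    # Keep everything before the last ' | ' separator
    return title.rpartition(" | ")[0]
```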
Mike Fährmann
07c8adbd8b
[mangadex] implement login with username & password (#1535) 2021-06-08 02:12:57 +02:00
Mike Fährmann
4a747a31a3
[postprocessor:metadata] handle dicts in mode:tags (fixes #1598) 2021-06-04 22:37:43 +02:00
Mike Fährmann
3cbbefd4ed
support 'filter' option for post processors (#1460) 2021-06-04 18:23:32 +02:00
Mike Fährmann
0abad8bc12
implement 'compile_expression()' 2021-06-03 22:34:58 +02:00
Mike Fährmann
da6806a161
fix job tests for Python 3.4 and 3.5
assert_called() and assert_not_called() got added in Python 3.6
2021-05-22 21:40:52 +02:00
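On interpreters where those assertion helpers are unavailable, the same checks can be written against the `called` and `call_count` attributes of a mock. A sketch with a hypothetical `callback` mock:

```python
from unittest import mock

callback = mock.Mock()

# Equivalent of callback.assert_not_called()
assert not callback.called

callback("url")

# Equivalent of callback.assert_called()
assert callback.called
assert callback.call_count == 1
```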
Mike Fährmann
8fd8126117
fix ISO 639-1 code for Japanese
"jp" -> "ja"
2021-05-22 16:07:04 +02:00
Mike Fährmann
af9dba4684
add DataJob tests 2021-05-21 02:59:54 +02:00
Mike Fährmann
adf4d661b3
use '_extractor' info in UrlJobs 2021-05-19 15:52:30 +02:00
Mike Fährmann
1eabfa5c7a
[pillowfort] implement login with username & password (#846) 2021-05-19 02:59:16 +02:00
Mike Fährmann
559462789d
add some tests for job.py 2021-05-14 19:44:16 +02:00
Mike Fährmann
c5ca7905ce
add 'noop()' and 'identity()' functions 2021-05-04 19:27:17 +02:00
Mike Fährmann
bc868e7bb8
consider apparently long extensions as part of the filename
(#1516)
2021-05-02 21:15:50 +02:00
Mike Fährmann
bdfcc9c4b1
update extractor test results 2021-04-18 20:28:15 +02:00
Mike Fährmann
387fe415d5
unescape items in text.split_html() 2021-03-29 02:12:29 +02:00