Mike Fährmann
bfdc07632a
[deviantart] expand nested comment replies ( #4653 )
2023-10-17 19:40:53 +02:00
Mike Fährmann
9bc5ad4784
[tests] implement 'len:'
2023-10-17 19:25:31 +02:00
Mike Fährmann
a1977a698e
[tests] fix spurious failures in '_assert_isotime()'
2023-10-16 18:16:48 +02:00
Mike Fährmann
390d14dbcc
[chevereto] support 'img.kiwi' and 'deltaporno.com' ( #4664 , #1381 )
2023-10-16 18:14:30 +02:00
Mike Fährmann
727c8eec6c
merge #4667 : [redgifs] fix 'niches' extraction ( #4666 )
2023-10-16 14:20:01 +02:00
Mike Fährmann
2911ed1240
[chevereto] add generic extractors ( #4664 )
...
- support jpgfish
- support pixl.li / pixl.is (#3179 , #4357 )
2023-10-16 14:15:39 +02:00
enduser420
db3363ac0b
[redgifs] fix 'niches' extraction
2023-10-16 16:51:30 +05:30
Mike Fährmann
ade8347ead
[kemonoparty] fix DM dates
2023-10-15 19:54:28 +02:00
Mike Fährmann
6dfe200ae4
[kemonoparty] support discord URLs with channel IDs ( #4662 )
2023-10-15 19:45:22 +02:00
Mike Fährmann
c6a3892210
[imgbb] update username extraction ( #4626 )
2023-10-14 20:55:39 +02:00
Mike Fährmann
13ce3a9acb
[warosu] fix extraction ( #4634 )
2023-10-13 23:03:39 +02:00
Mike Fährmann
c4c4e4d2f4
[newgrounds] improve 'art-image' extraction ( #4642 )
...
- download files in original resolution
- replace .webp with extension of first file
2023-10-13 20:10:55 +02:00
Mike Fährmann
833dce141f
[fantia] add 'content_count' and 'content_num' metadata fields ( #4627 )
2023-10-13 20:10:55 +02:00
Mike Fährmann
2d41702762
[deviantart] implement '"group": "skip"' ( #4630 )
2023-10-12 22:14:20 +02:00
Mike Fährmann
a9c3442d4e
[deviantart] add a couple 'deactivated account' test URLs
2023-10-12 21:40:10 +02:00
Mike Fährmann
2974b8e3c8
[moebooru] add 'metadata' option ( #4646 )
...
for extended 'pool' metadata
2023-10-12 21:34:25 +02:00
Mike Fährmann
67ba4ee842
[pp:exec] support more replacement fields for '--exec' ( #4633 )
...
- {_directory}
- {_filename}
- {_path} (alias for {})
2023-10-09 12:50:10 +02:00
Mike Fährmann
9a008523ac
[hentaifoundry] fix '.swf' file downloads ( #4641 )
2023-10-09 11:45:55 +02:00
Mike Fährmann
15f940819b
[newgrounds] support 'art-image' files ( #4642 )
2023-10-09 11:20:10 +02:00
Mike Fährmann
efaab4fbfa
[twitter] fix crash due to missing 'source' ( #4620 )
...
regression caused by 06aaedde
2023-10-04 23:01:04 +02:00
Mike Fährmann
84fbbd96aa
[shimmie2] remove 'meme.museum'
2023-10-02 20:41:25 +02:00
Mike Fährmann
0b150d45db
[tests] add 'msg' arguments to assert statements
2023-10-01 13:52:00 +02:00
Mike Fährmann
27da3f2958
[tests] re-implement filtering by basecategory
2023-10-01 13:31:23 +02:00
Mike Fährmann
c7bd9925d9
[tests] use fallback URLs for content tests ( #3163 )
2023-09-30 21:00:55 +02:00
Mike Fährmann
b92645cd37
[bunkr] fix extraction ( #4514 , #4532 , #4529 , #4540 )
2023-09-30 18:05:12 +02:00
Mike Fährmann
bd3f7a5bbc
[tests] support one regex per URL for #pattern
2023-09-28 21:56:09 +02:00
Mike Fährmann
0c5d8b1505
[deviantart] re-add 'quality' option and 'intermediary' transform
2023-09-24 17:36:05 +02:00
Mike Fährmann
dbd820d7c5
[tests] allow checking for exact URL results
2023-09-24 01:52:47 +02:00
Mike Fährmann
642998504d
[tests] support 'range()' for #count and metadata checks
2023-09-24 01:52:40 +02:00
Mike Fährmann
1e31fce37b
[pillowfort] support '/tagged/' URLs ( #4570 )
2023-09-23 00:11:01 +02:00
Mike Fährmann
1d2fd0b831
[pillowfort] extract 'b2_lg_url' media ( #4570 )
2023-09-23 00:05:26 +02:00
Mike Fährmann
50e2ebaff0
[danbooru] support 'donmai.moe' URLs
2023-09-22 20:58:38 +02:00
Mike Fährmann
918ba4f847
[redgifs] match gfycat image URLs ( #4558 )
2023-09-22 18:02:55 +02:00
Mike Fährmann
2cd801232b
fix --range causing crashes ( #4557 )
...
regression caused by a383eca7
2023-09-22 16:28:20 +02:00
Mike Fährmann
27ec653991
fix bug in test_init and update example URLs
2023-09-14 13:27:03 +02:00
Mike Fährmann
24a1d46391
[mastodon] support '/@USER/following' URLs
...
Previously, only '/users/USER/following' got matched.
2023-09-13 23:42:51 +02:00
Mike Fährmann
ac00d47a16
update test/test_results.py
2023-09-13 14:54:25 +02:00
Mike Fährmann
65b6011cc5
update test/test_extractor.py
2023-09-11 17:20:06 +02:00
Mike Fährmann
a833c244c8
add exported extractor results
2023-09-10 14:45:01 +02:00
Mike Fährmann
93a7a89cf6
[formatter] use value of last alternative ( #4492 )
...
fixes {fieldname|''} evaluating to the value of 'keywords-default'
instead of an empty string
2023-09-05 17:53:27 +02:00
Mike Fährmann
f2de70f254
[gfycat] remove module
2023-09-04 18:27:11 +02:00
Mike Fährmann
d319777a24
[tests] skip 'test_init_ytdl' on Python<3.6
...
It passes without error in a Python 3.4/3.5 venv on my own machine,
but fails for some inexplicable reason on GitHub Actions.
2023-08-10 23:34:49 +02:00
Mike Fährmann
0ef1fcab20
[postprocessor] update 'finalize' events
...
Add 'finalize-error' and 'finalize-success' events that trigger
depending on whether error(s) did or did not happen.
'finalize' itself now always triggers regardless of error status.
(was supposed to have the same behavior as the new 'finalize-success')
2023-08-10 19:46:37 +02:00
Mike Fährmann
d50c312ff0
prevent test failure when there's no 'ytdl' module ( #4364 )
...
split off ytdl into its own test function and
skip it when there's an ImportError similar to test_ytdl.py
2023-07-29 13:48:31 +02:00
Mike Fährmann
48ef062867
fix issues with 'Extractor.finalize()'
...
- prevent crash in InstagramUserExtractor (#4359 )
- call it at the end of every DownloadJob
- add it to tests
2023-07-29 13:43:27 +02:00
Mike Fährmann
255d08b79e
add test for 'Extractor.initialize()' ( #4359 )
2023-07-28 16:58:16 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
f0203b7559
[postprocessor:python] add tests
2023-07-24 15:22:57 +02:00
Mike Fährmann
d97b8c2fba
consistent cookie-related names
...
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
c5565f79f7
merge #4096 : [danbooru] add support for booru.borvar.art instance
2023-07-18 18:33:08 +02:00
Mike Fährmann
63326e3168
[danbooru] add tests for booruvar
2023-07-18 18:29:57 +02:00
Mike Fährmann
5171d8975c
[E621] support 'e6ai.net' ( #4320 )
2023-07-18 18:16:30 +02:00
Mike Fährmann
7444fc125b
[gfycat] implement login support ( #3770 , #4271 )
...
For the record: '/webtoken' and '/weblogin' are not the same ...
2023-07-06 18:56:34 +02:00
Mike Fährmann
25c5a6ffcb
no f-strings
2023-06-25 14:01:26 +02:00
Mike Fährmann
ec64cbefeb
[postprocessor:exec] add tests
2023-06-21 23:54:35 +02:00
Mike Fährmann
ce93c460a6
[formatter] implement 'H' conversion ( #4164 )
...
to remove HTML tags and unescape HTML entities
2023-06-15 13:07:51 +02:00
Mike Fährmann
deff3b434d
[vipergirls] implement login support ( #4166 )
2023-06-13 21:05:09 +02:00
Mike Fährmann
69865dcc05
[formatter] implement slicing strings as bytes ( #4087 )
...
prefixing a slice '[10:30]' with a lowercase b '[b10:30]' encodes
the string to bytes in filesystem encoding before applying the slice
2023-05-22 18:30:45 +02:00
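A minimal sketch of the byte-based slicing described in 69865dcc05. The name `slice_as_bytes` is hypothetical, and decoding the sliced bytes back while dropping a character that was cut in half is an assumption; the commit message only states that the string is encoded before the slice is applied. The encoding is passed explicitly in the example so the result does not depend on the host system.

```python
import sys


def slice_as_bytes(text, start, stop, encoding=None):
    """Slice 'text' by byte offsets instead of character offsets."""
    # Default to the filesystem encoding, as the commit message describes.
    encoding = encoding or sys.getfilesystemencoding()
    raw = text.encode(encoding)[start:stop]
    # Assumption: silently drop a multi-byte character cut in half by the slice.
    return raw.decode(encoding, "ignore")


# Each of these characters takes 3 bytes in UTF-8, so 7 bytes hold two of them.
print(slice_as_bytes("あいうえお", 0, 7, "utf-8"))
```

Slicing by bytes rather than characters is mainly useful when a filesystem limits filename length in bytes.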
Mike Fährmann
df11214281
[ytdl] improve --xff/--geo-bypass detection ( #3989 )
...
check if --xff is supported in a try-except block
and select expected results accordingly
2023-05-01 18:26:37 +02:00
Mike Fährmann
aa731c4298
[ytdl] run yt-dlp tests with latest code from master ( #3989 )
...
Only use PyPI version for Python 3.6, since that's no longer supported
by the current codebase.
2023-05-01 16:42:57 +02:00
Mike Fährmann
43f4bd9faa
[ytdl] fix tests
...
tests pass with latest Git HEAD, but not with the current PyPI version
2023-04-29 18:05:45 +02:00
Mike Fährmann
61a65d5bb9
[ytdl] fix crash due to --geo-bypass deprecation ( #3975 )
2023-04-29 17:25:38 +02:00
Mike Fährmann
a96745368e
"fix" tests on Python 3.4 and 3.5
...
can't rely on dict insertion order
2023-04-26 19:31:27 +02:00
Mike Fährmann
3905f05f00
[postprocessor:metadata] support putting keys in quotes
...
for mode 'modify' and 'delete'
based on fe41a2b1
2023-04-25 14:30:18 +02:00
Mike Fährmann
7459e4abce
[postprocessor:metadata] fix traversing more than 1 level deep
...
for mode 'modify' and 'delete'
2023-04-25 14:17:25 +02:00
Mike Fährmann
2edcdee32f
[downloader:http] add MIME type and signature for .heic files
...
(#3915 )
https://github.com/strukturag/libheif/issues/83
2023-04-15 17:09:22 +02:00
Mike Fährmann
082d55de16
fix circular reference detection for -K
2023-03-21 23:46:36 +01:00
Mike Fährmann
2ab66ad899
update -K output to include quotes around keys
2023-03-21 22:28:04 +01:00
Mike Fährmann
fe41a2b159
[formatter] support putting keys in quotes
...
i.e. obj["key"] or obj['key']
as in f-strings
2023-03-21 22:06:54 +01:00
Mike Fährmann
46fdf46f21
[formatter] support loading an f-string from a template file
...
"\fTF ~/path/to/file.txt"
2023-03-20 22:05:33 +01:00
Mike Fährmann
1a4d4a799b
[formatter] support filesystem paths for \fM
2023-03-20 22:01:33 +01:00
Mike Fährmann
00f0233b28
[postprocessor:metadata] add 'skip' option ( #3786 )
2023-03-17 23:30:11 +01:00
Mike Fährmann
8f8b4de0e8
[ytdl] fix '--parse-metadata' ( #3663 )
2023-03-05 19:57:23 +01:00
Mike Fährmann
7610d9cf82
merge #3675 : [pixiv] fix --write-tags for '"tags": "original"'
2023-03-02 21:48:31 +01:00
Mike Fährmann
83e7a25b6b
extend OAuth tests
2023-03-02 17:26:51 +01:00
Mike Fährmann
d788e6c60c
implement 'globals' option
2023-02-28 18:18:55 +01:00
Mike Fährmann
56039d2456
add 'hash_md5' and 'hash_sha1' functions ( #3679 )
...
... to global eval namespace
2023-02-22 10:58:44 +01:00
Mike Fährmann
e1df7f73b1
[deviantart] add 'search' extractor
...
(#538 , #1264 , #2954 , #2970 , #3577 )
Requires login to fetch any results, since the API endpoint raises an
error for not logged in requests.
TODO: parse HTML search results
2023-02-20 20:54:46 +01:00
Gray Manley
38a6389e2c
Fix lint.
2023-02-20 00:33:30 -06:00
Gray Manley
56cbae92ec
Use more pythony naming.
2023-02-19 06:14:34 -06:00
Gray Manley
8e2ba4f32e
Add test.
2023-02-19 06:13:21 -06:00
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2023-02-09 15:22:00 +01:00
Mike Fährmann
b7337d810e
[postprocessor:metadata] add 'sort' and 'separators' options
2023-02-07 18:28:14 +01:00
Mike Fährmann
3436c6b117
[postprocessor:metadata] speed up JSON encoding
2023-02-06 12:35:28 +01:00
Mike Fährmann
925b467496
split e621 from danbooru module ( #3425 )
2023-02-03 19:24:31 +01:00
Mike Fährmann
c2bc70593e
implement ability to load external extractor classes
...
- -X/--extractors
- extractor.module-sources
2023-01-30 23:10:10 +01:00
ClosedPort22
b6706b373a
[downloader:http] add signature checks for some formats
...
also add the MIME type for .obj files
2023-01-15 23:40:55 +08:00
Mike Fährmann
71d3143c35
fix bug in test_extractors.py
...
pattern matching tests would succeed
if there is exactly one match
but for the wrong extractor
2023-01-08 15:35:05 +01:00
Mike Fährmann
fa144f38ed
[ytdl] fix dfe4f00c
for legacy yt-dlp
2023-01-04 21:42:22 +01:00
Mike Fährmann
dfe4f00ca2
[ytdl] update for yt-dlp changes
2023-01-04 13:12:24 +01:00
Mike Fährmann
d651d45239
implement specifying ranges in slice notation ( #918 , #2865 )
...
e.g.
- '1:101' or ':101' or ':101:' for files 1 to 100
- '1::2' or '::2' for every second file
- '1:101:5' or ':101:5' for files 1, 6, 11, ..., 91, 96
(the second argument specifies the first index NOT included)
2022-12-27 18:21:12 +01:00
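The slice notation from d651d45239 can be illustrated with a small parser. This is a sketch under the rules listed in that commit message (missing start means 1, missing stop means no upper bound, the stop index is excluded); `parse_slice_range` and `selected_indices` are hypothetical names, not the project's functions.

```python
import itertools


def parse_slice_range(spec):
    """Turn '1:101:5'-style text into (start, stop, step); stop may be None."""
    parts = spec.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError("expected 'start:stop' or 'start:stop:step'")
    parts += [""] * (3 - len(parts))
    start = int(parts[0]) if parts[0] else 1     # files are counted from 1
    stop = int(parts[1]) if parts[1] else None   # None: no upper bound
    step = int(parts[2]) if parts[2] else 1
    return start, stop, step


def selected_indices(spec):
    """Return an iterable of the file numbers selected by 'spec'."""
    start, stop, step = parse_slice_range(spec)
    if stop is None:
        return itertools.count(start, step)
    return range(start, stop, step)


print(list(selected_indices("1:101:5"))[:3])                # [1, 6, 11]
print(list(itertools.islice(selected_indices("::2"), 3)))   # [1, 3, 5]
```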
Mike Fährmann
3616adfc75
implement '--range' with Python ranges
2022-12-26 18:32:34 +01:00
Mike Fährmann
1800bd7d14
allow '*-filter' options to be a list of expressions
2022-12-23 22:20:21 +01:00
Mike Fährmann
43c211f1a7
extend and rename util.CustomNone
2022-12-06 22:08:51 +01:00
Mike Fährmann
42481aed59
[formatter] implement 'S' format specifier ( #3266 )
...
to Sort lists
2022-11-21 21:44:42 +01:00
Mike Fährmann
6e08ad26f7
update downloader tests
2022-11-16 22:59:18 +01:00
Mike Fährmann
05255f5be0
add 'default' argument to 'text.extr()'
2022-11-09 11:00:32 +01:00
Mike Fährmann
8124c16a50
split 'build_path' from 'set_filename' and 'set_extension'
...
Do not automatically build a new path
when setting file metadata or updating its extension.
2022-11-08 17:03:24 +01:00
Mike Fährmann
eb33e6cf2d
add 'text.extr()'
...
a stripped-down version of text.extract() that
- always returns a string (like 'extract_from')
- only returns a string
- does not deal with 'pos' arguments
- is ~20% faster
2022-11-04 21:37:36 +01:00
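A sketch of what a string-only extraction helper like the one described in eb33e6cf2d could look like. `extr_sketch` is a hypothetical name and this is not claimed to be the project's implementation; the `default` parameter mirrors the argument mentioned in the later commit 05255f5be0.

```python
def extr_sketch(txt, begin, end, default=""):
    """Return the text between 'begin' and 'end', or 'default' if not found."""
    try:
        first = txt.index(begin) + len(begin)
        return txt[first:txt.index(end, first)]
    except ValueError:
        # str.index() raises ValueError when a delimiter is missing.
        return default


print(extr_sketch("<title>NAME | Kemono</title>", "<title>", "</title>"))
```

Unlike a variant that also returns a position, this form has a single return type, which keeps call sites simple.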
Mike Fährmann
460095adca
update downloader tests
2022-11-01 18:48:35 +01:00
Mike Fährmann
f037429fa4
attempt to improve '-K' output for lists
...
- use [N] instead of [] to indicate a Number needs to be placed there
- enumerate list items
2022-10-28 12:04:58 +02:00
thatfuckingbird
062ef238a6
add support for aibooru (using danbooru extractor) ( #3075 )
2022-10-19 11:53:59 +02:00
Mike Fährmann
b57015cf0a
[postprocessor:metadata] assume 'mode: custom' when format is set
...
{"name": "metadata", "format": "foobar"}
will now implicitly use mode:custom and no longer mode:json like before
2022-10-04 22:35:26 +02:00
enduser420
f7ba19a1c0
[nana] add 'nana' extractors ( #2967 )
2022-10-04 09:23:24 +02:00
Mike Fährmann
b36125333f
[postprocessor:zip] implement 'files' option ( #2872 )
2022-09-09 11:41:27 +02:00
Mike Fährmann
67bad04dda
[formatter] add 'g' conversion to sluGify a string ( #2410 )
2022-08-26 17:57:17 +02:00
Mike Fährmann
6990ad0ba8
[formatter] do NOT apply :J to strings ( #2833 )
2022-08-16 16:41:19 +02:00
Mike Fährmann
c0051d7d4c
fix test
2022-08-01 21:40:35 +02:00
Mike Fährmann
dd3a6a9fd1
make 'enumerate_reversed()' work with generators ( #2795 )
2022-08-01 14:08:44 +02:00
Mike Fährmann
0c73914848
[postprocessor:metadata] implement 'mode: modify' ( #2640 )
2022-07-19 12:24:26 +02:00
Mike Fährmann
f3de6b7a87
[postprocessor:metadata] implement 'mode: delete' ( #2640 )
2022-07-19 00:57:29 +02:00
Mike Fährmann
9704c04172
[postprocessor:zip] ensure target directory exists ( #2758 )
2022-07-14 11:55:39 +02:00
Mike Fährmann
74865adae5
implement 'format-separator' option ( #2737 )
...
a global option that serves as a workaround for shortcomings due to
lack of a proper format string parser
2022-07-10 13:31:43 +02:00
bradenhilton
117eeefda0
[postprocessor:mtime] add 'value' option ( #2739 )
2022-07-08 20:56:01 +02:00
Mike Fährmann
90ae48c40c
[formatter] implement 'O' format specifier ( #2736 )
...
to apply a UTC offset to 'date' values and other datetime objects
2022-07-08 12:51:03 +02:00
Mike Fährmann
04bed1eba3
[formatter] allow for custom "format" functions ( #2721 )
2022-07-05 12:22:01 +02:00
Mike Fährmann
54525d2e21
[formatter] implement slice operator as format specifier
...
this allows using a slice operator alongside other (special) format
specifiers like J, to first join list elements to a string and then
trimming that with a slice.
{tags:J, /[:50]}
2022-06-25 16:52:58 +02:00
Mike Fährmann
241e82e18d
[horne] add support for horne.red ( #2700 )
2022-06-25 16:52:16 +02:00
Mike Fährmann
42525cfe8d
fix '{…!j}' for otherwise non-serializable types ( #2624 )
...
like 'datetime'
2022-06-07 17:47:07 +02:00
Mike Fährmann
5b43faffed
[postprocessor:metadata] write to stdout by setting filename to "-"
...
(#2624 )
2022-05-30 21:17:31 +02:00
Mike Fährmann
6ad39f2b68
add ytdl tests
...
they only run when youtube-dl or yt-dlp are installed,
i.e. if __import__("<ytdl-package>") succeeds
2022-05-23 18:30:26 +02:00
Mike Fährmann
688d6553b4
replace calls to print() with stdout_write() ( #2529 )
2022-05-19 17:09:24 +02:00
Mike Fährmann
f3408a9d92
implement string literals in replacement fields
...
- either {_lit[foo]} or {'foo'}
- useful as alternative for empty metadata fields: {title|'no title'}
- due to using '_string.formatter_field_name_split()' to parse format
strings, using certain characters will result in an error: [].:!
2022-05-09 23:49:33 +02:00
Mike Fährmann
c4b9f7bab8
update functions working with cookies.txt files
...
- rename
- load_cookiestxt -> cookiestxt_load
- save_cookiestxt -> cookiestxt_store
- in cookiestxt_load, add cookies directly to a cookie jar
instead of storing them in a list first
- other unnoticeable performance increases
2022-05-06 13:21:29 +02:00
Mike Fährmann
ca3a364db7
fix build_duration_func() ( #2533 )
...
for extractors with request_interval_min > 0
2022-04-27 20:28:14 +02:00
Mike Fährmann
7fe54bab2a
attempt to fix some issues with 'contains()' ( #2446 )
...
add a third argument that gets used
when the values to search are given as a string
2022-04-08 14:40:26 +02:00
Mike Fährmann
d78a2c7163
re.escape() arguments for 'contains()' ( #2446 )
2022-04-07 15:35:54 +02:00
Mike Fährmann
413b77757b
implement 'contains()' ( #2446 )
...
and add it to globals() in compiled expressions for --filter etc
2022-03-30 16:18:33 +02:00
Mike Fährmann
e7b30866d0
[postprocessor:mtime] fix timestamps from datetime objects ( #2307 )
...
'datetime.timestamp()', which got used to convert datetime objects to
POSIX timestamps, assumes naive datetimes represent LOCAL time, while
datetimes in 'date' metadata fields represent UTC time.
Ref: https://docs.python.org/3/library/datetime.html#datetime.datetime.timestamp
> Naive datetime instances are assumed to represent local time
> you can obtain the POSIX timestamp by … calculating the timestamp directly
2022-03-23 23:05:14 +01:00
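The "calculating the timestamp directly" approach quoted in e7b30866d0 comes from the Python documentation and can be sketched as follows. `utc_timestamp` is a hypothetical name used for illustration only.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def utc_timestamp(dt):
    """POSIX timestamp of a naive datetime that represents UTC."""
    # Unlike dt.timestamp(), this does not apply the local timezone.
    return (dt - EPOCH) / timedelta(seconds=1)


print(utc_timestamp(datetime(2010, 1, 1)))  # 1262304000.0
```

Calling `datetime(2010, 1, 1).timestamp()` instead would give a result shifted by the local UTC offset on any machine that is not set to UTC.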
Mike Fährmann
29db716a63
implement 'datetime_to_timestamp()'
...
and rename 'to_timestamp()'
to the more descriptive 'datetime_to_timestamp_string()'
2022-03-23 22:36:01 +01:00
Mike Fährmann
8295bc6d97
fix loading/storing cookies without domain
2022-03-19 15:14:55 +01:00
Mike Fährmann
500a479026
fix a third(!) bug in _check_cookies() ( #2372 )
...
turns out tests are worthless if you get em wrong ...
2022-03-18 19:52:37 +01:00
Mike Fährmann
cf44aba333
[formatter] allow evaluating f-string literals
...
by starting a format string with '\fF'.
This was technically already possible with '\fE',
but this makes it a bit more convenient.
2022-03-18 13:31:01 +01:00
Mike Fährmann
94452761ed
fix cookies tests
2022-03-11 18:16:00 +01:00
Mike Fährmann
bddcec49f1
implement 'text.root_from_url()'
...
use domain from input URL for kemono
2022-03-01 03:09:57 +01:00
Mike Fährmann
f5b2b9333f
fix another bug in _check_cookies() ( #2160 )
...
regression introduced in ed317bfc
Added a couple of tests to hopefully catch such bugs
before they land in a release.
2022-02-16 22:58:57 +01:00
Mike Fährmann
563bd0ecf4
[danbooru] inherit from BaseExtractor
...
- merge danbooru and e621 code
- support booru.allthefallen.moe (closes #2283 )
- remove support for old e621 tag search URLs
2022-02-11 21:01:51 +01:00
Mike Fährmann
b5b4f5a168
use 'build_extractor_filter' in test_results.py
2021-12-28 17:25:07 +01:00
Mike Fährmann
64cf26eaf4
allow specifying sleep-* options as string
...
either as single value or as range: "3.5", "2.1 - 5.0"
2021-12-18 23:28:56 +01:00
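A sketch of how a sleep option given as a number, a single-value string, or a range string (64cf26eaf4) could be turned into a duration generator. `parse_duration_spec` is a hypothetical name, and picking a uniformly distributed value from the range is an assumption.

```python
import random


def parse_duration_spec(value):
    """Return a zero-argument function producing a duration in seconds."""
    if isinstance(value, str):
        lower, sep, upper = value.partition("-")
        # float() tolerates the surrounding whitespace in "2.1 - 5.0".
        bounds = (float(lower), float(upper)) if sep else (float(lower),)
    elif isinstance(value, (int, float)):
        bounds = (float(value),)
    else:
        bounds = tuple(float(v) for v in value)

    if len(bounds) == 1:
        return lambda: bounds[0]
    # Assumption: a range means "pick a random value between the bounds".
    return lambda: random.uniform(bounds[0], bounds[1])


print(parse_duration_spec("3.5")())
print(parse_duration_spec("2.1 - 5.0")())
```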
Mike Fährmann
010d65dcec
extend blacklist/whitelist syntax ( #2025 )
...
Each entry in such a list can now also include a subcategory
'<category>:<subcategory>'
and it is possible to use '*' or an empty string as placeholder
'*:<subcategory>', ':<subcategory>', '<category>:*'
For example
"blacklist": "imgur,*:tag,gfycat:user" or
"blacklist": ["imgur", "*:tag", "gfycat:user"]
will filter all 'imgur' extractors, all extractors with a 'tag'
subcategory (e.g. https://danbooru.donmai.us/posts?tags=bonocho ),
and all 'gfycat' user extractors.
2021-11-23 20:31:43 +01:00
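The matching rules from 010d65dcec can be sketched as below. `parse_entries` and `is_listed` are hypothetical helpers written for illustration; they only follow the syntax given in the commit message and do not reproduce the project's filter code.

```python
def parse_entries(value):
    """Accept a comma-separated string or a list of 'category:subcategory'."""
    if isinstance(value, str):
        value = value.split(",")
    entries = []
    for item in value:
        category, _, subcategory = item.strip().partition(":")
        # '*' and the empty string both act as "match anything" (None here).
        entries.append((
            category if category not in ("", "*") else None,
            subcategory if subcategory not in ("", "*") else None,
        ))
    return entries


def is_listed(entries, category, subcategory):
    """Check whether an extractor's category pair matches any entry."""
    return any(
        (cat is None or cat == category) and (sub is None or sub == subcategory)
        for cat, sub in entries
    )


entries = parse_entries("imgur,*:tag,gfycat:user")
print(is_listed(entries, "danbooru", "tag"))   # True
print(is_listed(entries, "gfycat", "image"))   # False
```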
Mike Fährmann
af6424f398
allow testing metadata in list elements
2021-11-21 22:46:34 +01:00
Mike Fährmann
3842cdcd8f
[formatter] implement 'D' format specifier
...
To be able to parse any string into a 'datetime' object
and format it as necessary.
Example:
{created_at:D%Y-%m-%dT%H:%M:%S%z}
->
"2010-01-01 00:00:00"
{created_at:D%Y-%m-%dT%H:%M:%S%z/%b %d %Y %I:%M %p}
->
"Jan 01 2010 12:00 AM"
with 'created_at' == "2010-01-01T01:00:00+0100"
2021-11-20 23:04:34 +01:00
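The example in 3842cdcd8f can be reproduced with the standard library. This sketch assumes the parsed value is normalized to a naive UTC datetime, which is what the example output (01:00 at +0100 becoming 00:00) suggests; `parse_to_naive_utc` is a hypothetical name.

```python
from datetime import datetime


def parse_to_naive_utc(value, fmt):
    """Parse 'value' with 'fmt' and return a naive datetime in UTC."""
    dt = datetime.strptime(value, fmt)
    offset = dt.utcoffset()
    if offset is not None:
        # Assumption: shift to UTC and drop tzinfo, matching the example.
        dt = dt.replace(tzinfo=None) - offset
    return dt


dt = parse_to_naive_utc("2010-01-01T01:00:00+0100", "%Y-%m-%dT%H:%M:%S%z")
print(dt)  # 2010-01-01 00:00:00
# Month and AM/PM names depend on the active locale.
print(dt.strftime("%b %d %Y %I:%M %p"))
```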
Mike Fährmann
2ab190ce08
add tests for special format strings
2021-11-01 23:26:18 +01:00
Mike Fährmann
46e17c5e61
support accessing the current local datetime in format strings
...
{_now}, {_now:%Y-%m-%d}, etc
(#1968 )
2021-10-30 21:41:09 +02:00
Mike Fährmann
38193dba46
support accessing environment variables in format strings ( #1968 )
...
{_env[HOME]} to get the value of $HOME
every other format string feature is supported as well
2021-10-28 19:18:55 +02:00
Mike Fährmann
f2d6b3e6b4
run tests without using 'nose'
...
run_tests.sh -> run_tests.py
2021-10-13 04:07:41 +02:00
Mike Fährmann
12fc646c53
fix filename formatting tests
2021-09-29 23:39:02 +02:00
Mike Fährmann
e0bdacd932
[fappic] add 'image' extractor ( closes #1898 )
2021-09-28 23:35:29 +02:00
Mike Fährmann
c22ff97743
remove 'unit' argument from 'util.format_value()'
2021-09-28 23:07:55 +02:00
Mike Fährmann
cad85640de
move 'util.PathFormat' into its own 'path' module
...
to prevent circular imports between 'formatter' and 'util'
2021-09-27 21:29:37 +02:00
Mike Fährmann
74145467dd
move 'util.Formatter' into its own 'formatter' module
2021-09-27 02:37:04 +02:00
Mike Fährmann
9377543162
[mastodon] add 'following' extractor ( #1891 )
2021-09-26 00:12:34 +02:00
Mike Fährmann
bd845303ad
implement a way to shorten filenames with east-asian characters
...
(#1377 )
Setting 'output.shorten' to "eaw" (East-Asian Width) uses a slower
algorithm that also considers characters with a width > 1.
2021-09-13 21:38:33 +02:00
Mike Fährmann
292fffc83c
add 'j' format string conversion
...
to convert to a JSON formatted string
2021-08-28 01:19:36 +02:00
Mike Fährmann
bb6a130942
automatically set required DDoS-GUARD cookies ( #1779 )
...
for kemono.party and seiso.party
2021-08-16 17:40:29 +02:00
Mike Fährmann
2792ed6e4b
implement 'util.format_value()'
2021-07-26 02:11:22 +02:00
Mike Fährmann
9e42cd58ea
replace ChainPredicate class with 'functools.partial'
2021-07-20 20:21:32 +02:00
Mike Fährmann
36ac2197db
[ytdl] add extractor for sites supported by youtube-dl
...
(#1680 , #878 )
Can be used by prefixing any URL with 'ytdl:',
or by setting 'extractor.ytdl.enabled' to 'true'.
2021-07-10 20:55:47 +02:00
Mike Fährmann
64240c8d42
[imagevenue] fix extraction
...
(closes #1677 )
2021-07-09 20:13:18 +02:00
Mike Fährmann
0179581340
add 'T' format string conversion ( #1646 )
...
to convert 'date'/datetime to timestamp
2021-06-25 22:35:45 +02:00
Mike Fährmann
f74cf52e2b
[seisoparty] add 'user' and 'post' extractors ( #1635 )
2021-06-25 18:40:11 +02:00
Mike Fährmann
759735fb02
[kemonoparty] fix 'username' extraction ( fixes #1652 )
...
The site's <title> content changed from
<title>NAME | Kemono</title>
to
<title>
NAME | Kemono
</title>
2021-06-25 15:35:20 +02:00
Mike Fährmann
07c8adbd8b
[mangadex] implement login with username & password ( #1535 )
2021-06-08 02:12:57 +02:00
Mike Fährmann
4a747a31a3
[postprocessor:metadata] handle dicts in mode:tags ( fixes #1598 )
2021-06-04 22:37:43 +02:00
Mike Fährmann
3cbbefd4ed
support 'filter' option for post processors ( #1460 )
2021-06-04 18:23:32 +02:00
Mike Fährmann
0abad8bc12
implement 'compile_expression()'
2021-06-03 22:34:58 +02:00
Mike Fährmann
da6806a161
fix job tests for Python 3.4 and 3.5
...
assert_called() and assert_not_called() got added in Python 3.6
2021-05-22 21:40:52 +02:00
Mike Fährmann
8fd8126117
fix ISO 639-1 code for Japanese
...
"jp" -> "ja"
2021-05-22 16:07:04 +02:00
Mike Fährmann
af9dba4684
add DataJob tests
2021-05-21 02:59:54 +02:00
Mike Fährmann
adf4d661b3
use '_extractor' info in UrlJobs
2021-05-19 15:52:30 +02:00
Mike Fährmann
1eabfa5c7a
[pillowfort] implement login with username & password ( #846 )
2021-05-19 02:59:16 +02:00
Mike Fährmann
559462789d
add some tests for job.py
2021-05-14 19:44:16 +02:00
Mike Fährmann
c5ca7905ce
add 'noop()' and 'identity()' functions
2021-05-04 19:27:17 +02:00
Mike Fährmann
bc868e7bb8
consider apparently long extensions as part of the filename
...
(#1516 )
2021-05-02 21:15:50 +02:00
Mike Fährmann
bdfcc9c4b1
update extractor test results
2021-04-18 20:28:15 +02:00
Mike Fährmann
387fe415d5
unescape items in text.split_html()
2021-03-29 02:12:29 +02:00
Mike Fährmann
78fd63b8f0
remove 'text.clean_xml()'
...
was not used anywhere
2021-03-28 04:05:16 +02:00
Mike Fährmann
8553b218d9
replace calls to 'os.path.splitext()' with 'str.rpartition()'
...
Makes functions that used it more than twice as fast
and we can get rid of an import as well.
2021-03-28 04:01:27 +02:00
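The replacement described in 8553b218d9 can be illustrated as follows. `split_ext` is a hypothetical helper, and the handling of names without a dot is an assumption about what a caller would want.

```python
import os.path


def split_ext(filename):
    """Split 'filename' at its last dot using str.rpartition()."""
    name, sep, ext = filename.rpartition(".")
    # Without a dot, rpartition() puts the whole string in the last element.
    return (name, ext) if sep else (filename, "")


print(os.path.splitext("image.tar.gz"))  # ('image.tar', '.gz')
print(split_ext("image.tar.gz"))         # ('image.tar', 'gz')
```

The two approaches are not fully interchangeable: `os.path.splitext()` keeps the dot in the extension and treats a leading dot as part of the name, so `.bashrc` has no extension there, while `rpartition(".")` splits it into an empty name and `bashrc`.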
Mike Fährmann
bff71cde80
implement 'util.unique_sequence()'
2021-03-02 23:11:08 +01:00
Mike Fährmann
5f1a6ff6fa
remove unneeded 'TRAVIS_SKIP' from test_results.py
2021-03-01 01:38:18 +01:00
Mike Fährmann
8821dceb79
use __import__() to dynamically load modules
2021-03-01 01:27:02 +01:00
Mike Fährmann
36bf76fa44
update 'oauth:mastodon:<instance>' code
2021-01-28 02:20:12 +01:00
Mike Fährmann
91308140ec
make 'generate_token()' compatible with Python 3.4
2021-01-14 03:48:10 +01:00
Mike Fährmann
780b6adb91
rename 'generate_csrf_token()' to just 'generate_token()'
...
and add a 'size' argument
2021-01-11 22:12:40 +01:00
Mike Fährmann
0fdaea00a3
[postprocessor:metadata] sanitize filenames
2021-01-10 00:13:20 +01:00
Mike Fährmann
aac00a2024
add 'd' conversion for format strings
...
to convert a timestamp to a formattable 'datetime' object.
For example '{created_at!d:%Y-%m-%d}'
transforms the timestamp in 'created_at' into a 'datetime' object
and then formats its content using '%Y-%m-%d' as template.
1262304000 -> datetime(2010, 1, 1) -> "2010-01-01"
2021-01-09 01:58:44 +01:00
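The conversion chain from aac00a2024 (timestamp, then 'datetime', then formatted string) in plain Python. Interpreting the timestamp as UTC is an assumption that matches the commit's own example; `timestamp_to_datetime` is a hypothetical name.

```python
from datetime import datetime, timezone


def timestamp_to_datetime(ts):
    """Naive datetime in UTC for a POSIX timestamp."""
    return datetime.fromtimestamp(ts, timezone.utc).replace(tzinfo=None)


dt = timestamp_to_datetime(1262304000)
print("{:%Y-%m-%d}".format(dt))  # 2010-01-01
```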
Mike Fährmann
912eea29bc
update extractor test results
2020-12-27 17:41:08 +01:00
Mike Fährmann
1f9121fecb
release version 1.16.0
2020-12-12 23:08:25 +01:00
Mike Fährmann
b2c55f0a72
[sankaku] remove login support
...
The old login method for 'https://chan.sankakucomplex.com/user/login'
and the cookies it produces have no effect on the results from
'beta.sankakucomplex.com'.
2020-12-08 21:05:47 +01:00
Mike Fährmann
547107307e
fix 'Metadata' messages in result tests
2020-11-24 13:34:54 +01:00
Mike Fährmann
578dcf805c
[mangapanda] don't force https://
2020-11-21 20:24:37 +01:00
Mike Fährmann
ca59bd691c
[postprocessor:metadata] add 'event' and 'filename' options
2020-11-20 22:29:11 +01:00
Mike Fährmann
9fffa9c343
rework post processor callbacks
2020-11-19 02:29:06 +01:00
Mike Fährmann
1e3dd7330e
merge SharedConfigMixin functionality into Extractor
2020-11-17 00:34:07 +01:00
Mike Fährmann
e5438b8a29
release version 1.15.3
2020-11-13 15:50:05 +01:00
Mike Fährmann
b9bfa4c675
update extractor test results
2020-11-07 02:03:22 +01:00
Mike Fährmann
c3f01dc4e6
implement 'util.unique()'
2020-10-29 23:33:41 +01:00
Mike Fährmann
d83b95fd28
[postprocessor:metadata] accept a string-list for 'content-format'
...
(closes #1080 )
2020-10-27 20:09:58 +01:00
Mike Fährmann
350b1afe1c
speed up _list_classes() after iterating over all modules once
2020-10-26 22:18:15 +01:00
Mike Fährmann
18213dc5ba
release version 1.15.2
2020-10-24 18:57:29 +02:00
Mike Fährmann
ec61696316
add 't' format string conversion ( closes #1065 )
...
to Trim whitespace from the beginning and end of strings.
Example: '{field!t}' becomes 'foo' for 'field' == " \nfoo\t\r"
2020-10-16 00:37:22 +02:00
Mike Fährmann
07432d6262
[seiga] fix flake8 and cookie test ( #1063 )
2020-10-15 15:37:58 +02:00
Mike Fährmann
b8daabc3ca
[pinterest] implement login support ( closes #1055 )
...
being logged in allows access to secret/protected boards
2020-10-15 15:14:18 +02:00
kurumigi
7e0e872f4f
[seiga] Add metadata for single image downloads ( #1063 )
...
* [seiga] Support image metadata.
* [seiga] Update test data.
* [seiga] Fix cookie check.
* [test_cookies] [seiga] Fit test_cookies.py to the last commit.
2020-10-15 15:13:27 +02:00
Mike Fährmann
844793847c
update extractor test results
2020-10-11 18:15:41 +02:00
Mike Fährmann
c874071f5a
[kissmanga] remove module
2020-10-04 22:46:41 +02:00
Mike Fährmann
844502cad5
update extractor test results
2020-10-03 19:24:19 +02:00
Mike Fährmann
7cd383c0f9
update extractor test results
2020-09-20 21:54:39 +02:00
Mike Fährmann
65744a7a31
use alternative for all falsey values in format strings
...
… and not just None (#525 )
It would be better to consistently use None for all non-existent
fields and/or fields without a valid value, but this is a good
enough workaround for now.
2020-09-19 22:02:47 +02:00
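A sketch of the fallback rule from 65744a7a31, where an alternative field is used whenever the preceding value is falsey and not only when it is None. `first_truthy` is a hypothetical helper, and its `default` argument stands in for whatever the formatter returns when no alternative has a usable value.

```python
def first_truthy(kwdict, names, default=None):
    """Return the first non-falsey value among 'names', like '{a|b|c}'."""
    for name in names:
        value = kwdict.get(name)
        if value:  # skips None, "", 0, [] and other falsey values
            return value
    return default


metadata = {"title": "", "id": 42}
print(first_truthy(metadata, ["title", "id"]))  # 42
```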
Mike Fährmann
f5b7ae01c1
update extractor test results
2020-09-15 18:07:08 +02:00
Mike Fährmann
392d022b04
implement 'config.accumulate()' ( #994 )
2020-09-14 21:13:08 +02:00
Mike Fährmann
3108e85b89
[worldthree] remove extractors
...
http://www.slide.world-three.org/ hasn't been accessible for a long time.
2020-09-11 18:12:57 +02:00
Mike Fährmann
3918b69677
remove 'extractor.blacklist' context manager
2020-09-11 13:17:35 +02:00
Mike Fährmann
ac3036ef56
add 'filesize-min' and 'filesize-max' options ( closes #780 )
2020-09-03 18:21:04 +02:00
Mike Fährmann
fd0685d9b5
[postprocessor:zip] defer zip file creation ( fixes #968 )
...
don't try to create zip files on postprocessor construction,
wait until directory creation during file download,
2020-08-31 21:53:18 +02:00
Mike Fährmann
d50f3b333a
update extractor test results
2020-08-30 20:55:22 +02:00
Mike Fährmann
e33293fdd8
[hentaihand] update to new site layout
2020-08-30 00:41:03 +02:00
Mike Fährmann
69e4871005
update extractor test results
...
- sensescans: replace 404d chapters
- mangapark: replace 404d chapters
- subscribestar: update test for attached files
2020-08-28 22:32:32 +02:00
Mike Fährmann
688bd046fc
release version 1.14.4
2020-08-15 21:29:02 +02:00
Mike Fährmann
422e69f187
skip external OAuth tests ( closes #908 )
2020-07-30 19:26:09 +02:00
Mike Fährmann
8dbf827649
[bobx] remove module
2020-07-24 17:00:43 +02:00
Mike Fährmann
87202b8d74
[inkbunny] add 'user' and 'post' extractors ( #283 )
2020-07-22 22:21:30 +02:00
Mike Fährmann
2ecf1efb16
update extractor test results
...
- tumblr: remove deleted post
- jaiminisbox: replace removed manga/chapters
- smugmug: one inconsequential field got removed
2020-07-18 15:12:28 +02:00
Mike Fährmann
e62ebb4643
update CHANGELOG before building sdist and wheel packages
2020-06-27 19:45:09 +02:00
Mike Fährmann
0cac14c3bd
update extractor test results
2020-06-25 19:11:47 +02:00
Mike Fährmann
53cc498d9c
improve config lookup when there are multiple possible locations
...
This specifically applies to all Mastodon extractors and all
extractors with a 'basecategory', i.e. 'booru', 'foolslide', etc.
Values inside those general config locations wouldn't be recognized
when a value with the same name was set on the 'extractor' level.
For example 'extractor.mastodon.directory' should be used over
'extractor.directory' when both are set, but this was impossible
with the previous implementation.
(fixes #843 )
2020-06-21 00:07:10 +02:00
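The lookup order that 53cc498d9c asks for, most specific location first, can be sketched with nested dictionaries. `lookup` and the path tuples are hypothetical and only illustrate the precedence rule from the commit message.

```python
def lookup(config, paths, key, default=None):
    """Return 'key' from the first location in 'paths' that defines it."""
    for path in paths:
        node = config
        for part in path:
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        else:
            # Reached only when every part of the path exists.
            if isinstance(node, dict) and key in node:
                return node[key]
    return default


config = {
    "extractor": {
        "directory": ["generic"],
        "filename": "{id}.{extension}",
        "mastodon": {"directory": ["mastodon"]},
    }
}
# Most specific location first, general 'extractor' level last.
paths = [("extractor", "mastodon.social"), ("extractor", "mastodon"), ("extractor",)]
print(lookup(config, paths, "directory"))  # ['mastodon']
print(lookup(config, paths, "filename"))   # {id}.{extension}
```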
Mike Fährmann
d81a8e6544
[twitter] update tests
2020-06-19 23:01:02 +02:00
Mike Fährmann
37d71f6e09
strip microseconds in text.parse_datetime()
2020-06-17 21:40:16 +02:00
Mike Fährmann
6db7ed90cb
release version 1.14.1
2020-06-12 20:12:09 +02:00
Mike Fährmann
087e3184dc
use a non-twitter URL when testing snap creation
2020-06-12 18:31:14 +02:00
Mike Fährmann
7daef6ee70
update extractor test results
...
- certain posts on Instagram now return
https://static.cdninstagram.com/rsrc.php/null.jpg
for public users
- MangaDex is deploying its new MangaDex@Home network similar to
exhentai's Hentai@Home
- realbooru has a new site layout, but the underlying booru API still
works like before
2020-06-12 00:36:06 +02:00
Mike Fährmann
3bad1579ee
update extractor test results
2020-05-31 17:42:07 +02:00
Mike Fährmann
45baa13615
update extractor test results
...
- don't run Instagram tests on Travis anymore
- replace Twitter test because timeline was made private
- update Hiperdex domain to '.com' (again ...)
2020-05-28 02:18:06 +02:00
Mike Fährmann
dfcf2a2c91
write OAuth token to cache by default ( #616 )
2020-05-25 22:35:45 +02:00
Mike Fährmann
6294e2c540
add 'text.ensure_http_scheme()'
2020-05-19 22:32:53 +02:00
Mike Fährmann
ece73b5b2a
make 'path' and 'keywords' available in logging messages
...
Wrap all loggers used by job, extractor, downloader, and postprocessor
objects into a (custom) LoggerAdapter that provides access to the
underlying job, extractor, pathfmt, and kwdict objects and their
properties.
__init__() signatures for all downloader and postprocessor classes have
been changed to take the current Job object as their first argument,
instead of the current extractor or pathfmt.
(#574 , #575 )
2020-05-18 19:04:51 +02:00
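A sketch of the wrapping idea from ece73b5b2a using the standard library's `logging.LoggerAdapter`. `Job`, `JobLoggerAdapter`, `JobFormatter`, and `ListHandler` are hypothetical stand-ins; the project's adapter and the attributes it exposes may differ.

```python
import logging


class Job:
    """Hypothetical job object carrying the state a log line may refer to."""
    def __init__(self, url, path):
        self.url = url
        self.path = path


class JobLoggerAdapter(logging.LoggerAdapter):
    """Attach the current job to every record logged through this adapter."""
    def __init__(self, logger, job):
        super().__init__(logger, {"job": job})

    def process(self, msg, kwargs):
        kwargs.setdefault("extra", {}).update(self.extra)
        return msg, kwargs


class JobFormatter(logging.Formatter):
    """Prefix messages with the job's path when a job is attached."""
    def format(self, record):
        job = getattr(record, "job", None)
        prefix = "[{}] ".format(job.path) if job else ""
        return prefix + record.getMessage()


class ListHandler(logging.Handler):
    """Collect formatted messages in a list instead of printing them."""
    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


handler = ListHandler()
handler.setFormatter(JobFormatter())
logger = logging.getLogger("sketch.job")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

log = JobLoggerAdapter(logger, Job("https://example.org/post/1", "/tmp/file.jpg"))
log.info("download finished")
print(handler.lines)
```

Passing one job object to the adapter, rather than separate extractor and path objects, mirrors the constructor change the commit describes.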
Mike Fährmann
4b606b68e4
skip OAuth tests when server is unreachable
2020-05-10 00:33:00 +02:00
Mike Fährmann
8b60bd6a91
mock 'time()' in cache tests
...
instead of calling 'sleep()' to let time advance.
This shortens the time needed to run those tests,
and ensures consistent results.
(Tests would randomly fail when using 'sleep()')
2020-05-09 23:55:14 +02:00
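The testing technique from 8b60bd6a91, replacing `time()` with a mock instead of sleeping, can be sketched with `unittest.mock`. `ExpiringValue` is a hypothetical stand-in for a cache entry and is not the project's cache module.

```python
import time
from unittest import mock


class ExpiringValue:
    """Hypothetical cache entry that expires after 'maxage' seconds."""
    def __init__(self, value, maxage):
        self.value = value
        self.expires = time.time() + maxage

    def get(self):
        return self.value if time.time() < self.expires else None


with mock.patch("time.time") as fake_time:
    fake_time.return_value = 1000.0
    entry = ExpiringValue("token", maxage=2)
    fresh = entry.get()               # clock still reads 1000.0
    fake_time.return_value = 1003.0   # "advance" 3 seconds without sleep()
    stale = entry.get()

print(fresh, stale)  # token None
```

Because the clock only moves when the test says so, the outcome no longer depends on scheduler timing, which is the source of the random failures the commit mentions.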
Mike Fährmann
8f2c1da041
skip example config tests if files are not available ( #730 )
2020-05-08 22:56:00 +02:00
Mike Fährmann
5df8f2959b
insert local directory into PYTHONPATH when running tests
2020-05-02 01:15:50 +02:00
Mike Fährmann
ff47641b13
test whether default/example config files contain valid JSON
2020-04-30 00:00:41 +02:00
Mike Fährmann
d6facdee7b
[mastodon] add tests ( #701 )
2020-04-22 21:10:34 +02:00
Mike Fährmann
fd438f0d78
update extractor test results
2020-04-11 23:00:42 +02:00
Mike Fährmann
a0f4c295c0
add optional 'utcoffset' argument to 'parse_datetime()'
2020-04-11 02:05:00 +02:00
Mike Fährmann
406449b0d6
ensure keys for mastodon instances are available during tests
...
Calls to config.clear() from other tests are removing the API
credentials set when importing mastodon.py for the first time.
2020-04-08 21:56:14 +02:00
Mike Fährmann
9e7dfc0cfc
[myportfolio] fix extraction of galleries without title
2020-04-08 21:08:05 +02:00
Mike Fährmann
3b50c4f49d
add tests for "Extractors" in oauth.py ( #670 )
2020-04-07 20:26:12 +02:00
Mike Fährmann
04bd0472de
add tests for Extractor.wait()
2020-04-07 20:24:56 +02:00
Mike Fährmann
7499d71d02
[simplyhentai] ignore certificate errors in video test
2020-03-28 21:07:30 +01:00
Mike Fährmann
4203dc0bdc
[mangapark] fix metadata extraction
2020-03-28 03:00:26 +01:00