Mike Fährmann
afd20ef42c
[kemonoparty] implement filtering duplicate revisions ( #5013 )
...
set 'revisions' to '"unique"' to have it ignore duplicate revisions
2024-01-26 14:44:15 +01:00
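A rough illustration of what filtering duplicate revisions could look like, assuming each revision is a dict carrying the 'revision_hash' field mentioned in this log. The function name `unique_revisions` and the choice to keep the first occurrence are assumptions for this sketch, not the extractor's actual code.

```python
def unique_revisions(revisions):
    """Keep only the first revision seen for each 'revision_hash'.

    Hypothetical helper: the real extractor may order or select
    revisions differently.
    """
    seen = set()
    unique = []
    for rev in revisions:
        digest = rev["revision_hash"]
        if digest not in seen:
            seen.add(digest)
            unique.append(rev)
    return unique


# Made-up sample data: revisions 1 and 2 share the same hash.
revisions = [
    {"id": 1, "revision_hash": "aaa"},
    {"id": 2, "revision_hash": "aaa"},
    {"id": 3, "revision_hash": "bbb"},
]
kept_ids = [rev["id"] for rev in unique_revisions(revisions)]
print(kept_ids)
```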
Mike Fährmann
c28475d325
[kemonoparty] fix deleting 'name' in original objects ( #5103 )
...
... when computing 'revision_hash'
regression caused by 3d68eda4
dict.copy() only creates a shallow copy
I know that and still managed to get it wrong ...
2024-01-25 23:46:19 +01:00
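The shallow-copy pitfall named in this commit can be reproduced with plain dictionaries. This is a minimal sketch rather than the extractor's code; the `post` structure and its field names are hypothetical.

```python
import copy

# Hypothetical post structure; field names are illustrative only.
post = {"title": "example", "file": {"name": "a.png", "path": "/data/a.png"}}

shallow = post.copy()        # copies only the top-level dict
del shallow["file"]["name"]  # the nested dict is shared, so 'post' changes too
shallow_mutated_original = "name" not in post["file"]

post["file"]["name"] = "a.png"  # restore the original object
deep = copy.deepcopy(post)      # nested dicts are copied as well
del deep["file"]["name"]
deep_mutated_original = "name" not in post["file"]

print(shallow_mutated_original, deep_mutated_original)
```

Copying only the nested objects that are about to be modified is a cheaper alternative to `copy.deepcopy()` when the structure is large.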
Mike Fährmann
3d68eda4ab
[kemonoparty] add 'revision_hash' metadata ( #4706 , #4727 , #5013 )
...
A SHA1 hexdigest of other relevant metadata fields like
title, content, file and attachment URLs.
This value does NOT reflect which revisions are listed on the website.
Neither does 'edited' or any other metadata field (combinations).
2024-01-16 00:38:10 +01:00
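One way such a hash could be computed is sketched below. Only the use of a SHA1 hexdigest over title, content, file and attachment URLs comes from the commit message; the function name, the exact field layout (`path` keys, `attachments` list) and the JSON serialization are assumptions made for this example.

```python
import hashlib
import json


def revision_hash(post):
    """Return a SHA1 hexdigest over a hypothetical set of post fields."""
    relevant = {
        "title": post.get("title"),
        "content": post.get("content"),
        "file": (post.get("file") or {}).get("path"),
        "attachments": [
            att.get("path") for att in post.get("attachments") or ()
        ],
    }
    # Sorted keys and fixed separators keep the serialization stable.
    data = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


# Two made-up revisions that differ only in a field outside the hash input.
rev_a = {"id": 1, "title": "t", "content": "c", "file": {"path": "/f.png"}}
rev_b = {"id": 2, "title": "t", "content": "c", "file": {"path": "/f.png"}}
rev_c = {"id": 3, "title": "changed", "content": "c", "file": {"path": "/f.png"}}

same = revision_hash(rev_a) == revision_hash(rev_b)
different = revision_hash(rev_a) != revision_hash(rev_c)
print(same, different)
```

Because the input is built from a fresh dictionary, the original `post` object is never modified while hashing.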
Mike Fährmann
57fc6fcf83
replace '24*3600' with '86400'
...
and generalize cache maxage values
2023-12-18 23:57:22 +01:00
Mike Fährmann
caf31e751c
[kemonoparty] limit 'title' length ( #4741 )
2023-11-02 15:53:23 +01:00
Mike Fährmann
d0effcae20
[kemonoparty] add 'revision_index' metadata field ( #4727 )
2023-10-26 22:26:38 +02:00
Mike Fährmann
3bbaa875f1
[kemonoparty] fix parsing of non-standard 'dates' ( #4676 )
2023-10-26 21:50:18 +02:00
Mike Fährmann
0d52b775cb
[kemonoparty] add 'revisions' option ( #4498 , #4597 )
2023-10-20 15:20:49 +02:00
Mike Fährmann
6e830ffc9e
[kemonoparty] support post searches ( #3385 , #4057 )
2023-10-19 23:06:06 +02:00
Mike Fährmann
aaf539009b
[kemonoparty] initial support for post revisions ( #4498 , #4597 )
...
- single revision
https://kemono.party/SERVICE/user/12345/post/12345/revision/12345
- all revisions
https://kemono.party/SERVICE/user/12345/post/12345/revisions
2023-10-19 22:32:51 +02:00
Mike Fährmann
174191cb79
[kemonoparty] restore discord pagination ( #4676 )
2023-10-19 21:57:27 +02:00
Mike Fährmann
c9a976d8a6
[kemonoparty] various updates and fixes ( #4676 , #4681 )
...
- fix pagination
- fix 'date' metadata
- fix discord channel API endpoint
2023-10-19 17:36:16 +02:00
Klion Xu
dc1c2139b1
fix line too long
2023-10-19 10:54:08 +08:00
Klion Xu
6b22af9720
[kemonoparty] update API endpoint ( #4676 )
2023-10-19 10:32:59 +08:00
Mike Fährmann
ade8347ead
[kemonoparty] fix DM dates
2023-10-15 19:54:28 +02:00
Mike Fährmann
6dfe200ae4
[kemonoparty] support discord URLs with channel IDs ( #4662 )
2023-10-15 19:45:22 +02:00
Mike Fährmann
3ecb512722
send Referer headers by default
2023-09-19 00:02:04 +02:00
Mike Fährmann
d13c82eff1
[kemonoparty] update favorites API endpoint ( #4522 )
2023-09-14 14:57:01 +02:00
Mike Fährmann
27ec653991
fix bug in test_init and update example URLs
2023-09-14 13:27:03 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This allows, for example, adjusting config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
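The idea of splitting construction from initialization can be sketched as follows. This is an illustrative toy, assuming a lightweight constructor and a separate `initialize()` step; the class name `SketchExtractor`, its attributes and the `user-agent` option are hypothetical and do not mirror the project's real classes.

```python
class SketchExtractor:
    """Toy extractor whose expensive setup is deferred to initialize()."""

    def __init__(self, url):
        # The constructor only stores cheap state.
        self.url = url
        self.config = {}
        self.session = None
        self.initialized = False

    def initialize(self):
        # The actual init step: reads config values and builds the session.
        if self.initialized:
            return
        agent = self.config.get("user-agent", "default-agent")
        self.session = {"headers": {"User-Agent": agent}}
        self.initialized = True


extractor = SketchExtractor("https://example.org/post/1")
# A caller can still adjust config between construction and init.
extractor.config["user-agent"] = "custom-agent"
extractor.initialize()
print(extractor.session["headers"]["User-Agent"])
```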
Mike Fährmann
d97b8c2fba
consistent cookie-related names
...
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
4ae925c88f
[kemonoparty] support '.su' TLD ( #4139 )
2023-06-06 20:55:03 +02:00
Mike Fährmann
3516fdae74
[kemonoparty] fix kemono and coomer logins using the same cache
...
(#4098 )
2023-05-26 13:35:02 +02:00
Mike Fährmann
76b01b64cf
[kemonoparty] remove MD5 hash extraction ( #3531 )
...
This partially reverts commit 20d6194ffa.
2023-01-25 11:10:09 +01:00
ClosedPort22
20d6194ffa
[kemonoparty] improve hash extraction
...
- extract MD5 hash from URLs
- extract MD5 and SHA256 hash from Discord URLs (kemono.party only)
- minor optimization (do not call 'hashes.add' when 'duplicates' is
true)
- update tests accordingly
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2023-01-15 12:01:13 +08:00
Mike Fährmann
85bd1cbc89
[kemonoparty] fix regression from 473bd380 ( #3519 )
...
- do not access 'response.content' unless necessary
- only validate responses if filename extensions differ
2023-01-11 15:25:01 +01:00
Mike Fährmann
473bd380c8
[kemonoparty] reject invalid/empty files ( #3510 )
2023-01-10 19:04:47 +01:00
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2022-11-05 01:14:09 +01:00
Mike Fährmann
77173694d5
[kemonoparty] fix 'dms' extraction ( #3106 )
2022-10-26 14:25:43 +02:00
Mike Fährmann
94a2dfe205
[kemonoparty] update pagination offset
2022-10-17 10:22:12 +02:00
Mike Fährmann
78694a61bb
[kemonoparty] restore 'favorites' API endpoints ( #2994 )
2022-10-01 12:15:32 +02:00
Mike Fährmann
b84982b2f9
[kemonoparty] send Referer headers ( #2989 , #2990 )
2022-10-01 11:45:56 +02:00
Mike Fährmann
779e75c6f8
[kemonoparty] fix attachment IDs overwriting post IDs ( #2984 )
...
regression from 09a5cc61
2022-09-30 16:47:09 +02:00
Mike Fährmann
09a5cc6103
[kemonoparty] add 'count' metadata field ( #2952 )
2022-09-23 10:44:12 +02:00
enduser420
574e38a287
[kemonoparty] add 'favorites' option ( #2826 ) ( #2831 )
...
* [kemonoparty] add 'favorites' option (#2826 )
* [kemonoparty] add regex for the url parameter and fallback on the config
option
* [kemonoparty] simplify
2022-08-18 18:01:42 +02:00
Mike Fährmann
7c0505868c
[kemonoparty] ensure all files have an 'extension' ( #2740 )
2022-07-10 13:53:07 +02:00
Mike Fährmann
ba69fb669d
[kemonoparty] add 'duplicates' option ( closes #2440 )
2022-03-24 11:58:38 +01:00
Mike Fährmann
fac8047899
[kemonoparty] limit default filename length ( #2373 )
2022-03-08 21:14:47 +01:00
Mike Fährmann
bddcec49f1
implement 'text.root_from_url()'
...
use domain from input URL for kemono
2022-03-01 03:09:57 +01:00
Mike Fährmann
92c492dc09
[kemonoparty] match beta.kemono.party URLs ( #2348 )
2022-03-01 03:02:30 +01:00
Mike Fährmann
a57a44f510
[kemonoparty] handle files without 'name' ( fixes #2276 )
2022-02-08 18:27:05 +01:00
Mike Fährmann
d7b8e04b50
[kemonoparty] use 'Accept-Encoding: identity' for all downloads
...
(#2267 )
fixes issues when data sent with 'Content-Encoding: gzip' or other
encodings is larger than the actual file
2022-02-05 18:06:58 +01:00
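Requesting an unencoded response body can be illustrated with the standard library alone. This sketch only builds a request object and sends nothing; the helper name `build_download_request` and the example URL are made up, and the project itself may set the header through a different HTTP client.

```python
import urllib.request


def build_download_request(url):
    """Build a request that asks the server not to compress the body."""
    return urllib.request.Request(
        url, headers={"Accept-Encoding": "identity"})


request = build_download_request("https://example.org/data/file.bin")
# urllib stores header names capitalized as 'Accept-encoding'.
encoding = request.get_header("Accept-encoding")
print(encoding)
```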
Mike Fährmann
a2eecc6aa8
[kemonoparty] fix DMs extraction ( #2008 )
2022-01-25 23:16:13 +01:00
Mike Fährmann
6af8d71da6
[kemonoparty] use service as subcategory ( closes #2147 )
2021-12-29 22:46:17 +01:00
Mike Fährmann
8ed282f7f2
[kemonoparty] support coomer.party URLs ( #2100 )
2021-12-15 16:21:05 +01:00
Mike Fährmann
f1b142e993
[kemonoparty] change default 'files' order to attachments,file,inline
...
(#1991 )
2021-11-29 04:41:30 +01:00
Mike Fährmann
e298882acc
[kemonoparty] match URLs with www subdomain
2021-11-26 18:58:26 +01:00
Mike Fährmann
af6424f398
allow testing metadata in list elements
2021-11-21 22:46:34 +01:00
Mike Fährmann
c67756e187
[kemonoparty] add 'dms' option ( #2008 )
2021-11-20 23:36:16 +01:00