Mike Fährmann
3ecb512722
send Referer headers by default
2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
359e31e462
[nozomi] update file URLs (#3925)
Static images are now only available in WebP format over the 'w'
subdomain. GIFs also got their own 'g' subdomain.
2023-04-17 15:42:42 +02:00
Mike Fährmann
d1314df6e6
[nozomi] fix extraction (#3051)
2022-10-14 00:19:37 +02:00
Mike Fährmann
baf3815ebd
[nozomi] small code optimizations
2022-07-14 14:59:11 +02:00
Mike Fährmann
62cc47755b
[nozomi] reduce memory consumption during searches (#2754)
only load and use the entire 'index.nozomi' database
if there are only negative search terms
2022-07-13 17:16:10 +02:00
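
The search strategy described in the commit above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the extractor's actual code: `search_post_ids`, `load_tag` and `load_index` are hypothetical names, and the in-memory `TAGS` table stands in for the per-tag and full index files.

```python
def search_post_ids(positive, negative, load_tag, load_index):
    """Return the set of post IDs matching the given search terms.

    load_tag(tag) and load_index() are hypothetical callables that
    return iterables of post IDs.
    """
    if positive:
        # Start from the intersection of all positive terms;
        # the full index is never loaded on this path.
        result = None
        for tag in positive:
            ids = set(load_tag(tag))
            result = ids if result is None else result & ids
    else:
        # Only negative terms: fall back to the entire index.
        result = set(load_index())

    # Remove every post that carries a negated tag.
    for tag in negative:
        result -= set(load_tag(tag))
    return result


# Toy data standing in for the real index files (assumption).
TAGS = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [3]}
index_loads = []


def load_tag(tag):
    return TAGS.get(tag, [])


def load_index():
    index_loads.append(1)  # record each full-index load
    return [1, 2, 3, 4, 5]


mixed = search_post_ids(["a", "b"], ["c"], load_tag, load_index)
only_negative = search_post_ids([], ["c"], load_tag, load_index)
```

In this sketch the mixed search never touches the full index; only the all-negative search pays that cost.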
Mike Fährmann
2687ef6bd9
[nozomi] remove slashes from search terms (fixes #2653)
2022-06-02 22:17:15 +02:00
Mike Fährmann
fba95c3a9e
[nozomi] preserve case of search tags (fixes #1860)
2021-09-16 16:43:06 +02:00
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
577fffad5f
[nozomi] update 'archive_fmt' values for tag and search extractors
… so they actually work for posts with more than 1 file.
(fixes #1523)
2021-05-04 19:28:37 +02:00
HRXN
e13cae182b
[nozomi] Extend default archive-fmt for Tag and Search Extractor (#1529)
Closes #1523
2021-05-04 19:26:35 +02:00
Mike Fährmann
4be27ff0fe
[nozomi] support '/index-N.html' URLs (closes #1365)
and '/index-Popular-N.html'
2021-03-11 01:06:47 +01:00
Mike Fährmann
193dca2ce1
update extractor test results
2021-01-21 21:35:42 +01:00
Mike Fährmann
318876e4dd
[nozomi] add 'num' enumeration index (closes #1239)
2021-01-12 22:32:52 +01:00
Mike Fährmann
4225f12783
[nozomi] handle empty 'date' fields (fixes #1163)
2020-12-07 00:08:53 +01:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
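
The effect of dropping '&' from the delimiter character class can be illustrated with a small sketch. The patterns and the example.org URLs below are made up for illustration and are not taken from any extractor module.

```python
import re

# Hypothetical patterns: the same URL regex before and after the change.
OLD = re.compile(r"https?://example\.org/post/(\d+)(?:[/?&#].*)?$")
NEW = re.compile(r"https?://example\.org/post/(\d+)(?:[/?#].*)?$")

urls = [
    "https://example.org/post/12",
    "https://example.org/post/12?page=2&sort=new",
    "https://example.org/post/12#top",
    "https://example.org/post/12&x=1",  # '&' directly after the ID
]

# True/False per URL, in the order listed above.
old_matches = [bool(OLD.match(u)) for u in urls]
new_matches = [bool(NEW.match(u)) for u in urls]
```

Only the last URL behaves differently: '&' is not a component delimiter in RFC 3986, so the new pattern no longer accepts it right after the ID, while '&' inside a query string still matches.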
Mike Fährmann
844793847c
update extractor test results
2020-10-11 18:15:41 +02:00
Mike Fährmann
456f6e8d05
[nozomi] move '_unpack()' method to global scope
2020-04-20 21:44:16 +02:00
Mike Fährmann
2c3b9e1450
[nozomi] support multiple images per post (#646)
This changes the default filename format as well as archive IDs,
since those assumed that each post would only have one image.
2020-03-19 21:07:31 +01:00
Mike Fährmann
33b42dc847
[nozomi] sort search results (fixes #646)
2020-03-17 22:28:23 +01:00
Mike Fährmann
4e361b3008
add tests for specific datetime values
2020-02-23 16:48:30 +01:00
Mike Fährmann
fbc0a6a059
[nozomi] skip unavailable posts (#388)
2019-10-17 23:05:04 +02:00
Mike Fährmann
ae98dbcbb3
[nozomi] implement searching for negated terms (#388)
It's incredibly slow and resource intensive (> 1GB of memory),
but that is also how it is implemented on nozomi.la itself.
2019-10-17 22:53:37 +02:00
Mike Fährmann
91643ca54b
[nozomi] add search extractor (#388)
2019-10-14 23:49:46 +02:00
Mike Fährmann
6779512fc7
[nozomi] add post and tag extractors (#388)
2019-10-13 22:16:03 +02:00