Mike Fährmann
05255f5be0
add 'default' argument to 'text.extr()'
2022-11-09 11:00:32 +01:00
Mike Fährmann
eb33e6cf2d
add 'text.extr()'
...
a stripped-down version of text.extract() that
- always returns a string (like 'extract_from')
- only returns a string
- does not deal with 'pos' arguments
- is ~20% faster
2022-11-04 21:37:36 +01:00
Mike Fährmann
67bad04dda
[formatter] add 'g' conversion to sluGify a string ( #2410 )
2022-08-26 17:57:17 +02:00
Mike Fährmann
bddcec49f1
implement 'text.root_from_url()'
...
use domain from input URL for kemono
2022-03-01 03:09:57 +01:00
Mike Fährmann
bc868e7bb8
consider apparently long extensions as part of the filename
...
(#1516 )
2021-05-02 21:15:50 +02:00
Mike Fährmann
387fe415d5
unescape items in text.split_html()
2021-03-29 02:12:29 +02:00
Mike Fährmann
78fd63b8f0
remove 'text.clean_xml()'
...
was not used anywhere
2021-03-28 04:05:16 +02:00
Mike Fährmann
8553b218d9
replace calls to 'os.path.splitext()' with 'str.rpartition()'
...
Makes functions who used it more than twice as fast
and we can get rid of an import as well.
2021-03-28 04:01:27 +02:00
Mike Fährmann
37d71f6e09
strip microseconds in text.parse_datetime()
2020-06-17 21:40:16 +02:00
Mike Fährmann
6294e2c540
add 'text.ensure_http_scheme()'
2020-05-19 22:32:53 +02:00
Mike Fährmann
5df8f2959b
insert local directory into PYTHONPATH when running tests
2020-05-02 01:15:50 +02:00
Mike Fährmann
a0f4c295c0
add optional 'utcoffset' argument to 'parse_datetime()'
2020-04-11 02:05:00 +02:00
Mike Fährmann
b1bea8aaeb
add 'restrict-filenames' option ( #348 )
2019-07-23 17:41:24 +02:00
Mike Fährmann
b171befa87
implement 'parse_unicode_escapes()'
2019-06-16 21:47:24 +02:00
Mike Fährmann
2b1999476e
implement 'text.rextract()'
2019-05-28 21:03:41 +02:00
Mike Fährmann
2316e0ed3d
fix strptime workaround from b0e85a4
...
Don't return a modified version of 'date_time' if strptime fails.
2019-05-25 23:22:26 +02:00
Mike Fährmann
b0e85a42e3
apply workaround from 4736912
in parse_datetime() itself
2019-05-09 21:53:17 +02:00
Mike Fährmann
4736912d4e
[pixiv] work around strptime limitations in Python < 3.7
...
"%z" doesn't allow a colon separator in older Python versions:
- "+0900" is OK
- "+09:00" raises an exception
2019-05-08 18:08:03 +02:00
Mike Fährmann
d09864b581
implement text.parse_datetime()
2019-05-08 15:43:59 +02:00
Mike Fährmann
6264a46212
use 'utcfromtimestamp()'
...
'fromtimestamp()' converts its results to the local timezone and causes
problems when running tests on a different machine.
2019-04-21 16:22:53 +02:00
Mike Fährmann
d670de0344
implement 'text.parse_timestamp()'
2019-04-21 15:28:27 +02:00
Mike Fährmann
21a7e395a7
implement convenience wrapper for text.extract functionality
2019-04-19 22:30:11 +02:00
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
...
Instead of getting a complete 'filename' from an URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)
Example: "https://example.org/path/filename.ext "
before:
- filename : filename.ext
- name : filename
- extension: ext
now:
- filename : filename
- extension: ext
2019-02-14 16:07:17 +01:00
Mike Fährmann
e1d3e9a926
add 'ext_from_url' to text.py
2019-01-31 12:23:25 +01:00
Mike Fährmann
2d2953a5bf
add 'text.parse_float()' + cleanup in text.py
2019-01-29 16:46:21 +01:00
Mike Fährmann
ae9a37a528
implement text.split_html()
2018-05-27 15:00:41 +02:00
Mike Fährmann
cc36f88586
rename safe_int to parse_int; move parse_* to text module
2018-04-20 14:53:21 +02:00
Mike Fährmann
4ffa94f634
remove 'shorten_path()' and 'shorten_filename()'
2018-04-15 18:44:13 +02:00
Mike Fährmann
27eab4e467
rewrite text tests and improve functions
...
- test more edge cases
- consistently return an empty string for invalid arguments
- remove the ungreedy-flag in 'remove_html()'
2018-04-15 18:13:46 +02:00
Mike Fährmann
e3f2bd4087
add tests for 'text.clean_xml()' and improve it
2018-04-14 22:07:01 +02:00
Mike Fährmann
6d8b191ea7
improve 'parse_query()' and add tests
...
- another irrelevant micro-optimization !
- use urllib.parse.parse_qsl directly instead of parse_qs, which
just packs the results of parse_qsl in a different data structure
- reduced memory requirements since no additional dict and lists are
created
2018-04-13 19:21:32 +02:00
Mike Fährmann
4f123b8513
code adjustments according to pep8
2017-01-30 19:40:15 +01:00
Mike Fährmann
eeae580781
more tests
2015-11-28 01:47:17 +01:00
Mike Fährmann
ca523b9f64
add helper method to text module
2015-11-16 03:46:43 +01:00
Mike Fährmann
cba4b91b14
add tests
2015-10-31 00:39:10 +01:00
Mike Fährmann
2962bf36f6
add tests for text-module
2015-10-03 14:51:13 +02:00