Commit Graph

7698 Commits

Author SHA1 Message Date
aaaxx
e5c3157767 Update eng_OCRFixReplaceList.xml
Closes #1978

Edits
========================================

Should be spaced instead of hyphenated (probably joined by OCR):

- `<Word from="airstrike" to="air-strike" />`
- `<Word from="wallplant" to="wall-plant" />`

Typo in replacement:

- `<Word from="lfeelonelung" to="l feel one lung" />`
- `<Word from="lneed"        to="l need" />`
- `<Word from="lthink___"    to="l think..." />`
- `<Word from="ltold"        to="l told" />`
- `<Word from="lv\/asn't"    to="l wasen't" />`
- `<Word from="Voilé"        to="Voilá" />`
- `<Ending from="pshycol"    to="pshyco!" />`

Capital "i" is a more likely replacement:

- `<Word from="lt"      to="it" />`
- `<Word from="lt'II"   to="it'll" />`
- `<Word from="lt'Il"   to="it'll" />`
- `<Word from="lt'll"   to="it'll" />`
- `<Word from="lt's"    to="it's" />`
- `<Word from="lfstill" to="if still" />`

Vocative, always needs a comma:

- `<Word from="HeyJennifer" to="Hey Jennifer" />`

Removals
========================================

Spelling varies between dictionaries:

- `<Word from="kickflip"  to="kick-flip" />`
- `<Word from="voicemail" to="voice-mail" />`

British vs. American spelling:

- `<Word from="judgement"  to="judgment" />`
- `<Word from="fulfilment" to="fulfillment" />`

Typo, not an OCR error, so spellchecker should deal with it (it doesn't make sense to keep a list of all possible misspellings):

- `<Word from="Goddamit"     to="Goddammit" />`
- `<Word from="mischevious"  to="mischievous" />`
- `<Word from="perscribed"   to="prescribed" />`
- `<Word from="perscription" to="prescription" />`
- `<Word from="pshyco"       to="psycho" />`
- `<Word from="thoguht"      to="thought" />`

Spelling changes meaning:

- `<Word from="ahold"  to="a hold" />`
- `<Word from="google" to="Google" />`

Find and replace are the same:

- `<Word from="I thought" to="I thought" />`
- `<Word from="literally" to="literally" />`
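
As a rough illustration of the "find and replace are the same" check, here is a minimal Python sketch. It is not Subtitle Edit's actual code: the `<Rules>` wrapper element, the sample entries, and the function names are assumptions made for this example. Only the `<Word from="…" to="…" />` shape comes from the rules listed here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data; the <Rules> wrapper is an assumption,
# not the real layout of eng_OCRFixReplaceList.xml.
SAMPLE = """
<Rules>
  <Word from="literally" to="literally" />
  <Word from="lt's" to="It's" />
  <Word from="lneed" to="I need" />
</Rules>
"""


def load_rules(xml_text):
    """Return a (from, to) pair for every <Word> element."""
    root = ET.fromstring(xml_text)
    return [(w.get("from"), w.get("to")) for w in root.iter("Word")]


def find_noop_rules(rules):
    """Return rules whose replacement equals the search text."""
    return [(src, dst) for src, dst in rules if src == dst]


def apply_rules(text, rules):
    """Apply whole-word replacements, skipping rules that change nothing."""
    for src, dst in rules:
        if src == dst:
            continue
        # Match only when the search text is not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(src) + r"(?!\w)"
        # A function replacement keeps the target text literal.
        text = re.sub(pattern, lambda match, d=dst: d, text)
    return text


rules = load_rules(SAMPLE)
print(find_noop_rules(rules))
print(apply_rules("lt's what lneed, literally.", rules))
```

A check like `find_noop_rules` could run as part of list upkeep so that entries such as `literally` to `literally` are caught before they are committed.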

Resulting punctuation seems unlikely:

- `<Word from="'Qkay_"         to="- Okay!" />`
- `<Word from="_Qkay-"         to="- Okay!" />`
- `<Word from="'Qkay"          to="- Okay" />`
- `<Word from="JOEY-"          to="Joey!" />`
- `<Word from="_NO__"          to="No--" />`

Other reason:

Replacement rule                              | Comment
:---------------------------------------------|:-------------------
`<Word from="cp"          to="op" />`         | doesn't seem useful
`<Word from="lnte"        to="inte" />`       | doesn't seem useful
`<Word from="gothere"     to="go there" />`   | could also be "got here"
`<Word from="ridonculous" to="ridiculous" />` | intentional mispronunciation
`<Word from="I02"         to="Pops" />`       | seems really implausible, and it could mess up IDs, codes, etc.
2016-09-27 15:10:58 +02:00
Nikolaj Olsson
1914a4942f Merge pull request #1976 from ivandrofly/phxSub
[PhoenixSubtitle] - Minor fixes/Update.
2016-09-26 21:14:06 +02:00
Ivandro Ismael
3962ab9335 [PhoenixSubtitle] - Minor fixes/Update. 2016-09-26 17:40:33 +01:00
Nikolaj Olsson
f953e75a40 Merge pull request #1973 from ivandrofly/main-10
[Main] - Cache subtitle hash.
2016-09-25 18:14:37 +02:00
Ivandro Ismael
c349cfdfc3 [Main] - Cache subtitle hash. 2016-09-25 16:06:54 +01:00
aaaxx
74c5c0a29e Updated OcrFixReplaceList.cs
Added other common Latin ligatures present in Unicode.

Also added the acute accent, which I've often seen used instead of the
apostrophe, either as an OCR error or because people mistake it for the
curly apostrophe.

Closes #1961
2016-09-25 11:15:04 +02:00
Nikolaj Olsson
8aba3f8f17 Merge pull request #1970 from ghoogland/issue-1969-cavena890mapping
Fix issue-1969 cavena890 character mapping
2016-09-25 06:57:50 +02:00
Nikolaj Olsson
8b966c46a8 Merge pull request #1964 from ivandrofly/phoenix-sf
[PhoenixSubtitle] - Add new subtitle format.
2016-09-25 06:47:57 +02:00
Ivandro Ismael
efebaf8cdf [PhoenixSubtitle] - Encode/Decode to frame & Minor refact. 2016-09-24 20:33:32 +01:00
Kruno H
6792825dda Update hrv_OCRFixReplaceList.xml
Closes #1972
2016-09-24 19:17:58 +02:00
ghoogland
38932e796f Fix issue-1969 cavena890 character mapping
Added mapping for more diacritics/special characters in cavena890.cs, back and forth.
2016-09-23 16:42:15 +02:00
Ivandro Ismael
4a5a61e419 [PhoenixSubtitle] - Minor refact. 2016-09-23 15:28:13 +01:00
Waldi Ravens
c5cff7d0bc installer: added gpl.txt to the remove-old-files list 2016-09-23 13:29:46 +02:00
Ivandro Ismael
c1188fefc0 [PhoenixSubtitle] - Add new subtitle format. 2016-09-22 14:11:14 +01:00
Waldi Ravens
d44323f8df Updated hrv_OCRFixReplaceList.xml 2016-09-21 14:33:39 +02:00
Waldi Ravens
1b5ddde1aa translations: convert <SyntaxColorTextMoreThanXLines> to <SyntaxColorTextMoreThanMaxLines> 2016-09-21 13:43:16 +02:00
Waldi Ravens
f6e8956080 Formatting (whitespace only) 2016-09-21 12:40:08 +02:00
Waldi Ravens
e26c5acdf5 dictionaries: automated XML upkeep 2016-09-21 12:40:08 +02:00
Kruno H
dc01fe0b27 Update hrv_OCRFixReplaceList.xml
Closes #1965
2016-09-21 12:29:00 +02:00
Waldi Ravens
2e71b9f31b installer: gpl.txt ==> LICENSE.txt 2016-09-21 11:50:24 +02:00
Nikolaj Olsson
a8275ff459 Merge pull request #1963 from ivandrofly/license
[LICENSE] - Rename gpl => LICENSE.
2016-09-21 03:52:10 +02:00
Ivandro Ismael
aaae3d4277 [LICENSE] - Rename gpl => LICENSE. 2016-09-20 17:51:51 +01:00
Waldi Ravens
7a2fe71686 Updated German translation 2016-09-20 12:27:44 +02:00
Kruno H
6771480232 Update hrv_OCRFixReplaceList.xml
Closes #1959
2016-09-19 10:14:55 +02:00
xylographe
5573ae0ea9 Merge pull request #1957 from aaaxx/patch-1
Updated the English OCR fix replace list
2016-09-19 08:43:12 +02:00
xylographe
633e24972a Merge pull request #1962 from aaaxx/OcrFixUseHardcodedRules-2
minor comment edit in OcrFixReplaceList.cs
2016-09-19 08:34:52 +02:00
aaaxx
a02a82a599 minor comment edit in OcrFixReplaceList.cs 2016-09-19 03:00:26 +02:00
aaaxx
1443e279a6 Removed licence/license rule: it's not a typo
In British English "licence" is a noun and "license" a verb.
2016-09-16 06:40:36 +02:00
Nikolaj Olsson
822c917a4c Updated simplified Chinese translation - thx Leon :) 2016-09-15 17:54:47 +02:00
Nikolaj Olsson
8d2f6cbd25 New five seconds back/forward shortcut is now actually saved/loaded - thx Victor :) 2016-09-15 17:43:49 +02:00
Waldi Ravens
ec0ceb3131 Updated French translation 2016-09-15 13:15:25 +02:00
Waldi Ravens
e20dc78352 Updated Dutch translation 2016-09-14 15:05:57 +02:00
Nikolaj Olsson
14d94375b2 Shortcut "Set end, add new and go to new" now tries to use min gap - thx Grega :) 2016-09-13 22:17:38 +02:00
Nikolaj Olsson
a4df3306eb Updated change log 2016-09-13 17:16:35 +02:00
Nikolaj Olsson
89dc661454 A few fixes for format "SubViewer 2.0" - thx fox :) 2016-09-13 16:51:25 +02:00
Nikolaj Olsson
e4850c4c89 Added unit test for #1951 2016-09-13 16:03:38 +02:00
Nikolaj Olsson
246d95a4c4 Merge pull request #1951 from ivandrofly/ocr-02
[OcrFixEngine] - Fix accessing index outside of bounds.
2016-09-13 16:00:33 +02:00
Kruno H
9b7c9f3387 Update hrv_OCRFixReplaceList.xml
Closes #1947
2016-09-12 10:24:29 +02:00
Ivandro Ismael
15afa97655 [OcrFixEngine] - Fix accessing index outside of bounds. 2016-09-12 03:04:43 +01:00
Nikolaj Olsson
5e0d1bd5da Merge pull request #1945 from ivandrofly/optz-12
[HtmlUtil] - Make sure text contains '<'.
2016-09-07 16:57:38 +02:00
Nikolaj Olsson
8a4a1e04c2 Merge pull request #1943 from ivandrofly/engine-patch-1
[OcrEngine] - init _namesEtcListWithApostrophe and _namesEtcListUppercase in one run.
2016-09-07 16:54:38 +02:00
Mircea Voiculescu
3515286f86 Updated Romanian translation - thx Mircea :) 2016-09-07 16:40:10 +02:00
Ivandro Ismael
16c8b4d59c [HtmlUtil] - Make sure text contains '<'. 2016-09-07 13:19:59 +01:00
Nikolaj Olsson
e07a0340da Merge pull request #1944 from ivandrofly/ocrfrl
[OcrReplaceList] - Remove fruitless replaces.
2016-09-07 06:15:50 +02:00
Nikolaj Olsson
dba2e0a3a4 Merge pull request #1942 from ivandrofly/quick-fix
Minor fixes
2016-09-07 06:13:11 +02:00
Ivandro Ismael
e2ea2c87e2 [OcrReplaceList] - Remove fruitless replaces. 2016-09-07 04:29:32 +01:00
Ivandro Ismael
561b67a786 [Ebu] - Use pre-defined variable. 2016-09-07 03:39:07 +01:00
Ivandro Ismael
db4f7e6ab9 [ChangeLog/Subtitle] - Fix typos. 2016-09-07 03:38:47 +01:00
Ivandro Ismael
df72c62c57 [OcrEngine] - init hashsets in one run. 2016-09-07 03:18:54 +01:00
niksedk
ce0f2fdd37 Possible fix for #1938 2016-09-06 10:36:47 +02:00