Special characters in the input of hunspell are treated as space

别等时光非礼了梦想. Submitted on 2019-12-12 03:22:13

Question


This question was asked on Super User, but got only 8 views in 7 days. People who know hunspell seem to go to Stack Overflow, hence my re-asking the question here.


I am testing hunspell on the command line with a Swedish dictionary. In interactive mode, all special characters in the input (for example å, ä, ö) are replaced with blanks before spell checking.

Hunspell 1.3.2
sjögräs
& sj 15 0: SJ, aj, dj, sk, s, j, sej, sju, sjö, sjå, sa, se, ej, st, si
& gr 15 3: ge, g, r, ger, gir, gro, gör, grå, går, gry, er, nr, dr, go, kr
*

sj gr s
& sj 15 0: SJ, aj, dj, sk, s, j, sej, sju, sjö, sjå, sa, se, ej, st, si
& gr 15 3: ge, g, r, ger, gir, gro, gör, grå, går, gry, er, nr, dr, go, kr
*

As you can see, the prompt's encoding is working: å, ä and ö show up correctly in both the input and the output.

Piping gives the same result:

echo sjögräs | hunspell -d sv_SE

I have tried to give different options to hunspell, including -i UTF-8, -i UTF-16, and keeping the aff file's SET ISO8859-1. Nothing worked.
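One way to narrow this down is to take the console out of the picture and hand hunspell exact bytes. The sketch below is only an illustration, not a confirmed diagnosis. It assumes a POSIX-style shell (for example the MSYS shell that usually accompanies MinGW), and it assumes the Windows console uses an OEM code page such as 850, where ö is byte 0x94 instead of the 0xF6 that ISO8859-1 uses. Neither assumption has been verified for this setup. The helper name to_hex is made up for the example.

```shell
# Hypothetical helper: print the bytes read from stdin as one hex string.
to_hex() { od -An -tx1 | tr -d ' \n'; }

# "sjögräs" spelled out with octal escapes, so the bytes do not depend on
# the encoding of the terminal or of this script.
latin1=$(printf 'sj\366gr\344s' | to_hex)          # ISO8859-1: ö=f6, ä=e4
cp850=$(printf 'sj\224gr\204s' | to_hex)           # assumed code page 850: ö=94, ä=84
utf8=$(printf 'sj\303\266gr\303\244s' | to_hex)    # UTF-8: two bytes per letter

echo "ISO8859-1: $latin1"
echo "CP850:     $cp850"
echo "UTF-8:     $utf8"

# If hunspell is on the PATH, feed it the ISO8859-1 bytes directly,
# matching the SET ISO8859-1 line of the aff file.
if command -v hunspell >/dev/null 2>&1; then
    printf 'sj\366gr\344s\n' | hunspell -d sv_SE || echo "hunspell failed to run"
fi
```

If hunspell accepts the word when it arrives as ISO8859-1 bytes, the dictionary and its SET line are probably fine, and the mismatch is more likely in what the console sends. The suggestion offset of 3 for "gr" in the output above would be consistent with ö arriving as a single byte, though that is a guess.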

The same thing happens with French:

C:\Users\gauthier>echo résultat | hunspell -d fr-moderne
Hunspell 1.3.2
*
& sultat 2 2: sultan, rAcsultat

with, in addition, problems in the output.

I compiled hunspell in MinGW and moved the resulting needed files to somewhere in my path, but I don't think that this information is very relevant.

How do I make hunspell recognize special characters on its input?


Answer 1:


By echoing the variables $LC_ALL or $LANG you can see which language and locale configuration your terminal has.

Then you can try to change it to the charset hunspell expects by redefining those variables. For example, you can set

LC_ALL=en_US.ISO8859-15

or

LANG=ca_ES.cp1252

As I recall, the default character set is latin1, but I'm not sure (I'm not on Linux right now).

Try this approach instead of modifying the hunspell software.
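As a rough sketch of the suggestion above, assuming a POSIX-style shell: the locale name sv_SE.ISO8859-1 is a hypothetical example chosen to match the aff file's SET ISO8859-1 from the question, and it may not be installed on a given system (locale -a lists what is available). Prefixing the assignment to a command changes the variable for that one command only.

```shell
# Show the current locale-related variables (either may be empty).
echo "LC_ALL=${LC_ALL:-}"
echo "LANG=${LANG:-}"

# Hypothetical locale name; check `locale -a` for the ones actually installed.
target_locale='sv_SE.ISO8859-1'

# An assignment placed before a command applies to that command only.
# Here a child shell reports the value it received.
seen=$(LC_ALL="$target_locale" sh -c 'printf %s "$LC_ALL"')
echo "child process saw LC_ALL=$seen"

# If hunspell is on the PATH, run it with the overridden locale and with
# input given as ISO8859-1 bytes ("sjögräs" written with octal escapes).
if command -v hunspell >/dev/null 2>&1; then
    printf 'sj\366gr\344s\n' \
        | LC_ALL="$target_locale" hunspell -d sv_SE -i ISO8859-1 \
        || echo "hunspell failed to run"
fi
```

Whether this helps on Windows depends on how the MinGW build handles locales, which is not confirmed here.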



Source: https://stackoverflow.com/questions/9787648/special-characters-in-the-input-of-hunspell-are-treated-as-space
