cjk | 易学教程

Chinese encoding issue while listing files

Question: I am running a Java application on a Chinese-language Solaris 10 system. A directory contains some files with Chinese filenames. When I call files = new File(dir).list(), where "dir" is the parent directory containing such a file, the resulting filename files[0] comes back as ????? (junk characters). My program's file.encoding property is already set to GBK, and Charset.isSupported("GBK") returns true as well. So where could the problem be? I am running out of

Custom UITableViewCell With Core Text

Question: With help from @yaslam, I created a Core Text-based UILabel that shows Japanese text both horizontally and vertically, with furigana via CTRubyAnnotation. Unfortunately, I have a problem: I need to use this label inside a custom cell, and the cell should dynamically resize its height based on the text, but this doesn't work; the cell doesn't expand. Can you help me? Thank you very much. Here's the code: import UIKit protocol SimpleVerticalGlyphViewProtocol { } extension

How to parse UTF-8 characters in Excel files using POI

Question: I have been using POI to parse XLS and XLSX files successfully. However, I am unable to correctly extract special characters, such as UTF-8 encoded Chinese or Japanese characters, from an Excel spreadsheet. I have figured out how to extract data from a UTF-8 encoded CSV or tab-delimited file, but have had no luck with the Excel file. Can anyone help? (Edit: code snippet from comments) HSSFSheet sheet = workbook.getSheet(worksheet); HSSFEvaluationWorkbook ewb = HSSFEvaluationWorkbook.create

How to get the length of Japanese characters in Javascript?

Question: I have an ASP Classic page with the SHIFT_JIS charset. The meta tag in the page's head section looks like this: <meta http-equiv="Content-Type" content="text/html; charset=shift_jis"> My page has a text box (txtName) that should allow only 200 characters. I have a JavaScript function that validates the character length; it is called from the onclick event of my Submit button. if(document.frmPage.txtName.value.length > 200) { alert("You have exceeded the maximum length of 200."); return false;
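The snippet is cut off before the actual problem is stated. One frequent source of confusion on a Shift_JIS page is that a character count (which is roughly what value.length reports) and the encoded byte count are different quantities. The following is a minimal Python sketch of that difference only. It assumes the 200 limit might be enforced in bytes somewhere downstream, which the question does not confirm, and the function names char_length and sjis_byte_length are hypothetical.

```python
def char_length(text: str) -> int:
    """Return the number of Unicode code points in the text."""
    return len(text)


def sjis_byte_length(text: str) -> int:
    """Return the number of bytes the text occupies once encoded as Shift_JIS.

    Assumption: the storage or validation layer counts Shift_JIS bytes.
    """
    return len(text.encode("shift_jis"))


if __name__ == "__main__":
    # Print character count and Shift_JIS byte count side by side.
    for sample in ("abc", "日本語", "abc日本語"):
        print(sample, char_length(sample), sjis_byte_length(sample))
```

If the two numbers diverge for typical input, a limit checked in characters on the client can still be exceeded in bytes on the server.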

What is the replacement for Language Analysis framework's Morpheme analysis deprecated APIs

Question: The Language Analysis framework is deprecated, and it's not even available in 64-bit. The documentation says to use CFStringTokenizer, but the tokenizer doesn't provide the functionality available in the Language Analysis framework. What is the replacement for the morpheme analysis APIs that the Language Analysis framework provided? EDIT: Although Pantong's reply helped, it doesn't work in all cases; e.g., for words with 3-4 kanji characters it returns an incorrect result. (By incorrect I mean it's not the same as what it

Transliterate CJK to Latin — preferably in C++ [closed]

Question: (Closed 7 years ago.) I am trying to write a program that can transliterate CJK to Latin (e.g., Pinyin or Romaji). For example, you give a Chinese, Japanese, or Korean document as input, and then you get the transliterated version into

How to sort Japanese like Excel

Question: I want to sort Japanese words (kanji) the way the sort feature in Excel does. I have tried many ways to sort Japanese text in PHP, but the result is not 100% the same as Excel's. First, I tried converting kanji to katakana using this library (https://osdn.net/projects/igo-php/), but in some cases the result differs from Excel's. I want to sort these words in ascending order: けやきの家, 高森台病院, みのりの里. My result: けやきの家, 高森台病院, みのりの里. Excel result: けやきの家, みのりの里, 高森台病院. Second, I tried another approach using this function: mb_convert_kana($text, "KVc",

Chinese character in source code when UTF-8 settings can't be used [duplicate]

Question: This question already has an answer here: PHP and C++ for UTF-8 code unit in reverse order in Chinese character (1 answer). Closed 6 years ago. This is the scenario: I can only use the char* data type for the string, not wchar_t*. My MS Visual C++ compiler has to be set to MBCS, not UNICODE, because the third-party source code that I have uses MBCS; setting it to UNICODE causes data type issues. I am trying to print Chinese characters on a printer which needs to get a character string

Programmatically determine number of strokes in a Chinese character?

Question: Does Unicode store stroke count information about Chinese, Japanese, or other stroke-based characters?

Answer 1: A little googling turned up Unihan.zip, a file published by the Unicode Consortium that contains several text files, including Unihan_RadicalStrokeCounts.txt, which may be what you want. There is also an online Unihan Database Lookup based on this data.

Answer 2: In Python there is a library for that: >>> from cjklib.characterlookup import CharacterLookup >>> cjk = CharacterLookup('C') >>>
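As a rough illustration of Answer 1, the Unihan text files use a tab-separated layout of code point, field name, and value, and total stroke counts are published under the kTotalStrokes field. Which file inside Unihan.zip carries that field differs between Unihan releases, so treat the file choice as something to check. The sketch below parses such lines with the standard library only; the function name parse_total_strokes and the SAMPLE lines are hypothetical stand-ins for real file contents.

```python
from typing import Dict, Iterable


def parse_total_strokes(lines: Iterable[str]) -> Dict[str, int]:
    """Map each character to its first listed kTotalStrokes value."""
    strokes: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not a "code point, field, value" record
        code_point, field, value = parts
        if field != "kTotalStrokes":
            continue
        char = chr(int(code_point[2:], 16))  # "U+4E00" -> "一"
        # A record may list several space-separated counts; keep the first.
        strokes[char] = int(value.split()[0])
    return strokes


# Illustrative lines in the Unihan layout (not copied from a real release).
SAMPLE = [
    "# sample data",
    "U+4E00\tkTotalStrokes\t1",
    "U+4E8C\tkTotalStrokes\t2",
    "U+4E09\tkTotalStrokes\t3",
    "U+4E00\tkMandarin\tyī",
]

if __name__ == "__main__":
    print(parse_total_strokes(SAMPLE))
```

With a real download, the same function could be fed an open file object instead of SAMPLE.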

How to make Haskell or ghci able to show Chinese characters and run Chinese characters named scripts?

Question: I want to write a Haskell script that reads files in my /home folder. However, many files have names containing Chinese characters, and Haskell and GHCi cannot handle them. It seems Haskell and GHCi aren't good at displaying UTF-8 characters. Here is what I encountered:

Prelude> "让Haskell或者Ghci能正确显示汉字并且读取汉字命名的文档"
"\35753Haskell\25110\32773Ghci\33021\27491\30830\26174\31034\27721\23383\24182\19988\35835\21462\27721\23383\21629\21517\30340\25991\26723"

Answer 1:

Prelude> putStrLn "\35753Haskell\25110
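The escaped output in the question is not corrupted data: each \NNNNN is the decimal Unicode code point of one character (for example, 35753 is U+8BA9, 让), which is how the string is rendered when GHCi echoes it. The Python sketch below converts such decimal escapes back to characters to make that mapping visible. It is only an illustration, not a full parser for Haskell string literals: it handles decimal escapes and the \& separator and nothing else, and the function name decode_decimal_escapes is hypothetical.

```python
import re

# A backslash followed by decimal digits, optionally followed by the
# empty "\&" separator that can appear after a numeric escape.
_DECIMAL_ESCAPE = re.compile(r"\\(\d+)(?:\\&)?")


def decode_decimal_escapes(shown: str) -> str:
    """Replace each decimal escape with the character it denotes."""
    return _DECIMAL_ESCAPE.sub(lambda m: chr(int(m.group(1))), shown)


if __name__ == "__main__":
    print(decode_decimal_escapes(r"\35753Haskell\25110\32773Ghci"))
    print(decode_decimal_escapes(r"\27721\23383"))
```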