How to handle accents in Common Lisp (SBCL)?

Submitted by 给你一囗甜甜゛ on 2019-12-20 02:28:12

Question


This is probably very basic, but I didn't know where else to ask. I'm trying to process some text in a SLIME REPL. The text comes from a file written in Portuguese, so it uses lots of accented characters such as é, á, ô, etc.

When I'm handling texts in English I use the following function:

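(defun txt2list (name)
  ;; Read the file NAME and return its lines, in file order, as a list of strings.
  (with-open-file (in name)
    (let ((res))
      (do ((line (read-line in nil nil)
                 (read-line in nil nil)))
          ((null line)
           (reverse res))   ; result of DO: the accumulated lines, re-reversed
        (push line res)))))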

which cannot read accented characters and fails with the error "the octet sequence #(195) cannot be decoded".

So my question is: is there a way to handle those characters automatically? It would be fine to replace each accented character with its unaccented letter ('á' becomes 'a') or simply to delete such characters ('cômodo' becomes 'cmodo'), whether that is done in the file itself before reading or during the reading process.


Answer 1:


You would need to find out what text encoding is used for the file. Then tell WITH-OPEN-FILE to use the correct one.

See the SBCL manual: External Formats

Example:

 (with-open-file (stream pathname :external-format '(:utf-8 :replacement #\?))
   (read-line stream))
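
As a rough sketch of how this could be applied to the function from the question: octet 195 (#xC3) is the first byte of the two-byte UTF-8 encodings of é, á and ô, so the file is probably UTF-8, and the code below assumes that. The `strip-accents` helper is hypothetical (it is not part of the original answer) and relies on SBCL's `sb-unicode` package, so it is not portable to other implementations.

```lisp
;; Sketch only: assumes the file is UTF-8 encoded and that SBCL's
;; SB-UNICODE package is available (SBCL-specific, not portable).

(defun txt2list (name)
  "Return the lines of the UTF-8 encoded file NAME as a list of strings."
  (with-open-file (in name :external-format :utf-8)
    (loop for line = (read-line in nil nil)
          while line
          collect line)))

(defun strip-accents (string)
  "Hypothetical helper: return STRING with combining marks removed."
  ;; NFD normalization splits a precomposed letter such as U+00E9
  ;; (e with acute) into the base letter followed by a combining mark.
  ;; Dropping every character whose canonical combining class is
  ;; non-zero then leaves only the base letters.
  (remove-if (lambda (ch) (plusp (sb-unicode:combining-class ch)))
             (sb-unicode:normalize-string string :nfd)))
```

With these definitions, something like `(mapcar #'strip-accents (txt2list "texto.txt"))` (the file name is made up) should turn a line such as 'cômodo' into 'comodo'. Note that this also flattens ç to c, which may or may not be what you want.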


Source: https://stackoverflow.com/questions/41473029/how-to-handle-accents-in-common-lisp-sbcl
