iconv

Can I use iconv to convert multi-byte smart quotes to extended ASCII smart quotes?

為{幸葍}努か submitted on 2019-12-29 06:56:29
Question: I have some UTF-8 content that includes multi-byte smart quote characters. I've found that either of these calls will easily convert those characters to ASCII straight quotes (ASCII code 34):

    $content = iconv("UTF-8", "ASCII//TRANSLIT", $content);

or

    $content = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $content);

However, I'd rather convert these to extended ASCII smart quotes (ASCII codes 147 and 148 in Latin 1 encoding). Does anyone know how to do this?

Answer 1: You're looking for CP-1252 which contains
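The direction the answer points in (targeting CP-1252 rather than Latin 1) can be illustrated outside PHP. This is a minimal Python sketch, assuming the input is valid UTF-8; the helper name utf8_to_cp1252 is made up for illustration and is not part of the original question or answer.

```python
def utf8_to_cp1252(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as Windows-1252 (hypothetical helper)."""
    # Characters with no CP-1252 equivalent raise UnicodeEncodeError here.
    return data.decode("utf-8").encode("cp1252")


# Curly double quotes (U+201C and U+201D) take three bytes each in UTF-8.
source = "\u201cquoted\u201d".encode("utf-8")
converted = utf8_to_cp1252(source)

# Each smart quote is now a single byte.
print(converted[0], converted[-1])  # 147 148
```

The same bytes decoded as ISO-8859-1 would be C1 control characters, which is why the code page matters for the value range 128 to 159.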

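Linux: checking and changing a file's encoding

心已入冬 submitted on 2019-12-28 00:12:13
If you need to work with Windows files on Linux, you will often run into file-encoding conversion problems. On Simplified Chinese Windows the default text encoding is GBK (code page 936, a superset of GB2312), while Linux generally uses UTF-8. This post shows how to check a file's encoding on Linux and how to convert it.

Checking a file's encoding

On Linux there are several ways to check a file's encoding:

1. In Vim you can query the encoding directly:

    :set fileencoding

   This displays the file's encoding. If you only want to view files in other encodings, or want to fix garbled text when opening them in Vim, add the following to ~/.vimrc:

    set encoding=utf-8 fileencodings=ucs-bom,utf-8,cp936

   Vim will then detect the encoding automatically (it recognizes UTF-8 and GBK files). What actually happens is that Vim tries each entry in the fileencodings list in order; if none of them fits, the file is opened as latin-1.

2. Use enca (if the command is not installed on your system, install it with sudo yum install -y enca):

    $ enca filename
    filename: Universal transformation format 8 bits;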
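The "try each candidate in order, then fall back to latin-1" idea described for fileencodings can be sketched in a few lines. This is only an illustration in Python, not how Vim or enca are implemented; the function name guess_decode and the candidate list are assumptions chosen to mirror the ~/.vimrc setting above.

```python
import codecs


def guess_decode(data, candidates=("utf-8", "cp936")):
    """Decode bytes with the first candidate encoding that fits.

    Hypothetical helper: returns (text, encoding_name).
    """
    # A UTF-8 byte order mark is checked first, like the ucs-bom entry.
    if data.startswith(codecs.BOM_UTF8):
        candidates = ("utf-8-sig",) + tuple(candidates)
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this cannot fail.
    return data.decode("latin-1"), "latin-1"


gbk_bytes = "中文".encode("gbk")        # what a GBK-encoded file would contain
text, detected = guess_decode(gbk_bytes)
utf8_bytes = text.encode("utf-8")       # re-encode for use on Linux
print(detected)
```

Order matters: GBK accepts many byte sequences, so the stricter UTF-8 check has to come first, which is also why utf-8 precedes cp936 in the Vim setting.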

Using iconv translit to convert from UTF-8 to CP1251

自闭症网瘾萝莉.ら submitted on 2019-12-25 07:19:36
Question: I'm trying to convert the string "aÜ" from UTF-8 to CP1251 with the C++ iconv library (iconv.h) using TRANSLIT. As a result I get the string "a?", when I expect "aU". When I run the PHP script <?php echo iconv("UTF-8", "Windows-1251//TRANSLIT", "Ü"); ?> on the same computer, I get the string "aU" as the result. Here's the code:

    #include <cstdlib>
    #include <iconv.h>
    #include <locale.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    using namespace std;
    int IConvert(char *buf,char *outbuf, size_t len, const
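The result the question expects from TRANSLIT ("aÜ" becoming "aU") can be approximated without iconv. The sketch below is in Python and is not iconv's transliteration table: it assumes that stripping combining marks after Unicode decomposition is an acceptable substitute, and the name translit_encode is hypothetical.

```python
import unicodedata


def translit_encode(text, encoding="cp1251", placeholder="?"):
    """Encode text, replacing unencodable letters by their base letter.

    Hypothetical helper that only approximates //TRANSLIT.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(encoding)
            continue
        except UnicodeEncodeError:
            pass
        # Decompose (Ü becomes U plus a combining diaeresis), drop the marks.
        base = "".join(
            c for c in unicodedata.normalize("NFKD", ch)
            if not unicodedata.combining(c)
        )
        try:
            out += base.encode(encoding)
        except UnicodeEncodeError:
            # Nothing usable is left, so emit the placeholder instead.
            out += placeholder.encode(encoding)
    return bytes(out)


print(translit_encode("aÜ"))  # b'aU'
```

Characters that CP1251 already contains, such as Cyrillic letters, pass through the first branch unchanged.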

Encoding Problem when filling out a text_field with Watir in Ruby

烈酒焚心 submitted on 2019-12-24 02:56:06
Question: I'm using Watir to fill out a text_field with HTML code that I scraped earlier with another program. The website content I'm transferring is in German, so it contains some special characters that don't exist in the English alphabet. Those characters are displayed properly in the HTML file, but when transferred into the text_field of the Joomla installation (I'm transferring a website to Joomla with this program), the special characters are not displayed properly

iconv returns strange results

微笑、不失礼 submitted on 2019-12-24 01:53:36
Question: I'm working on a way to solve the problem with special characters in an automated PHP script for creating accounts. Since special characters are unwanted in email addresses and other places, I'm trying to get rid of them, but I can't remove them before feeding them to the script, since the user's name has to be displayed properly to other users. Example: Jörgen Götz should get the email address jorgen.gotz@domain.com but in the user database his first name should still be Jörgen and his last
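The requirement in the example (keep "Jörgen Götz" for display, derive jorgen.gotz@domain.com for the address) amounts to folding accents on a copy of the name. This is a minimal Python sketch of that idea rather than the PHP/iconv approach from the question; the function names ascii_fold and email_for are hypothetical, and it assumes dropping diacritics is the desired mapping (ö to o, not ö to oe).

```python
import unicodedata


def ascii_fold(text):
    """Strip diacritics and anything else that is not plain ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    return without_marks.encode("ascii", "ignore").decode("ascii")


def email_for(first, last, domain="domain.com"):
    """Build an address from a folded copy; the inputs stay untouched."""
    return f"{ascii_fold(first)}.{ascii_fold(last)}@{domain}".lower()


first_name, last_name = "Jörgen", "Götz"   # stored and displayed as-is
print(email_for(first_name, last_name))    # jorgen.gotz@domain.com
```

Because the folding happens on a derived value, the database fields keep the original spelling.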

PHP iconv greek/cyrillic transliteration does not work

↘锁芯ラ submitted on 2019-12-22 09:07:42
Question: I have the following test code:

    setlocale(LC_ALL, 'en_US.UTF8');
    function t($text) {
        echo "$text\n";
        echo "encoding: ", mb_detect_encoding($text), "\n";
        // transliterate
        $text = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);
        echo "iconv: ", $text, "\n";
    }
    // Latvian alphabet
    t('AĀBCČDEĒFGĢHIĪJKĶLĻMNŅOPRSŠTUŪVZŽ aābcčdeēfgģhiījkķlļmnņoprsštuūvzž');
    // Greek alphabet
    t('ΑαΒβΓγΔδΕεΖζΗηΘθΙιΚκΜμΝνΞξΟοΠπΡρΣσςΤτΥυΦφΧχΨψΩω');
    // Cyrillic alphabet + some rarer versions
    t(

Convert escaped codepoint to unicode character

六眼飞鱼酱① submitted on 2019-12-21 12:16:10
Question: I am trying to take a chunk of JSON whose strings contain the literal characters \u00e9 and convert those characters to the associated single Unicode character, in this case é. I use curl or wget to download the JSON, which looks like:

    { "name": "Kitsun\u00e9" }

And I need to translate this in Vim to:

    { "name": "Kitsuné" }

My first thought was to use Vim's iconv, but it does not evaluate the string as a single character and just returns the input.

    let code = '\u00e9'
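The \uXXXX sequences are JSON string escapes rather than an encoding, which is why a charset converter returns them unchanged; a JSON parser is what resolves them. The sketch below shows this in Python, outside Vim, as an illustration only. The function name unescape_json is hypothetical.

```python
import json


def unescape_json(raw):
    """Parse JSON and re-serialize it with real Unicode characters."""
    # json.loads resolves \uXXXX escapes; ensure_ascii=False keeps the
    # resulting characters instead of escaping them again on output.
    return json.dumps(json.loads(raw), ensure_ascii=False)


# A raw string keeps the backslash sequence literal, as in the download.
downloaded = r'{ "name": "Kitsun\u00e9" }'
print(unescape_json(downloaded))  # {"name": "Kitsuné"}
```

Note that re-serializing normalizes whitespace, so the output layout differs slightly from the input.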

iconv or mbstring?

北战南征 submitted on 2019-12-21 07:36:34
Question: Which multibyte-handling library should I use: iconv or mbstring? After some Googling I didn't find enough arguments to convince me to use one in particular, and I could not find any benchmark (and I'm too lazy to create one :-p). After all, maybe this choice doesn't really matter? Thanks for any piece of advice.

Answer 1: I tend to use a combination of both, depending on my needs. I use iconv to convert from one charset to another, but mbstring for simpler operations like mb_strtoupper() and mb

Is it possible to use a gcc compiled library with MSVC?

半腔热情 submitted on 2019-12-21 05:29:20
Question: I have a project that relies on libiconv for several operations. I was using precompiled binaries of iconv.lib for Visual Studio 2008, but now I have had to move on to Visual Studio 2010 and no more precompiled binaries were available. I decided to compile it myself but, as the libiconv documentation states, there is no official support for MSVC compilers. However, I read somewhere that gcc could generate static libraries that were binary compatible with MSVC compilers, as long as the binary

PHP, convert UTF-8 to ASCII 8-bit

半腔热情 submitted on 2019-12-21 04:58:22
Question: I'm trying to convert a string from UTF-8 to ASCII 8-bit by using the iconv function. The string is meant to be imported into accounting software (some basic instructions parsed according to the SIE standard). What I'm running now:

    iconv("UTF-8", "ASCII", $this->_output)

This works for accounting software #1, but software #2 complains about the encoding. The encoding specified by the standard is: IBM PC 8-bit extended ASCII (Codepage 437). My question is, what version of ASCII is PHP encoding
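Plain ASCII is 7-bit, so it has no slots for accented letters at all, whereas Codepage 437 assigns them byte values above 127. The Python sketch below shows the difference under the assumption that the target really is CP437 as the quoted standard says; the function name utf8_to_cp437 and the sample text are made up for illustration.

```python
def utf8_to_cp437(data, errors="replace"):
    """Re-encode UTF-8 bytes as IBM PC code page 437 (hypothetical helper)."""
    # With errors="replace", characters missing from CP437 become "?".
    return data.decode("utf-8").encode("cp437", errors=errors)


sample = "åäö".encode("utf-8")          # two bytes per letter in UTF-8
converted = utf8_to_cp437(sample)
print(list(converted))                  # [134, 132, 148]

# The same letters cannot be represented in 7-bit ASCII at all.
try:
    sample.decode("utf-8").encode("ascii")
except UnicodeEncodeError as exc:
    print("not ASCII:", exc.reason)
```

Choosing "replace" over strict error handling is a judgment call: an import file with a stray "?" may be preferable to a failed export, but that depends on the accounting software.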