non-ascii-characters | 易学教程

c reading non ASCII characters

Question: I am parsing a file that involves characters such as æ ø å. Assume I have stored a line of the text file as follows:

    #define MAXLINESIZE 1024
    char* buffer = malloc(MAXLINESIZE);
    ...
    fgets(buffer, MAXLINESIZE, handle);
    ...

Suppose I want to count the number of characters on a line. If I try the following:

    char* p = buffer;
    int count = 0;
    while (*p != '\n') {
        if (isgraph(*p)) {
            count++;
        }
        p++;
    }

this ignores any occurrence of æ ø å, i.e. counting "aåeæioøu" returns 5, not 8. Do I need
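One likely cause, assuming the file is UTF-8 encoded: each of æ, ø, å occupies two bytes, and neither byte counts as printable for isgraph in the default "C" locale. The sketch below is in Python rather than the asker's C, and the function name count_utf8_chars is hypothetical. It counts characters by skipping UTF-8 continuation bytes, a byte-level rule that carries over to C directly.

```python
def count_utf8_chars(line: bytes) -> int:
    """Count code points in a UTF-8 byte string, stopping at the first newline.

    Continuation bytes have the bit pattern 10xxxxxx; every other byte
    starts a new character, so counting those bytes counts characters.
    """
    count = 0
    for byte in line:
        if byte == 0x0A:  # '\n' ends the line, as in the original loop
            break
        if (byte & 0xC0) != 0x80:  # not a continuation byte
            count += 1
    return count


raw = "aåeæioøu\n".encode("utf-8")
print(len(raw) - 1)           # number of bytes before the newline
print(count_utf8_chars(raw))  # number of characters before the newline
```

Unlike the isgraph loop, this version also counts spaces and other non-graphic characters; a filter for those would have to be added separately.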

Node JS crypto, cannot create hmac on chars with accents

I am having an issue generating the correct signature in Node.js (using crypto.js) when the text I am trying to sign has accented characters (such as ä, ï, ë):

    generateSignature = function (str, secKey) {
        var hmac = crypto.createHmac('sha1', secKey);
        var sig = hmac.update(str).digest('hex');
        return sig;
    };

This function returns the correct HMAC signature if 'str' contains no accented characters. If accented characters are present in the text, it does not return the correct HMAC. The accented characters are valid in UTF-8 encoding, so I don't know why crypto has a problem
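An HMAC is computed over bytes, not characters, so a common cause of this kind of mismatch is that the two sides encode the same text differently before hashing. Whether that is what happens in the asker's setup is an assumption. The Python sketch below uses the standard hmac and hashlib modules with a made-up key and message, and shows that the signature changes with the encoding only when non-ASCII characters are present.

```python
import hashlib
import hmac


def generate_signature(text: str, secret_key: str, encoding: str = "utf-8") -> str:
    """Return the hex HMAC-SHA1 of text, encoded explicitly before hashing."""
    return hmac.new(
        secret_key.encode("utf-8"),
        text.encode(encoding),
        hashlib.sha1,
    ).hexdigest()


message = "äïë"          # illustrative input with accented characters
key = "example-secret"   # illustrative key, not from the original post

sig_utf8 = generate_signature(message, key, "utf-8")
sig_latin1 = generate_signature(message, key, "latin-1")

# Accented text: the two encodings produce different byte strings.
print(sig_utf8 != sig_latin1)

# Pure ASCII text: both encodings produce the same bytes.
print(generate_signature("abc", key, "utf-8") == generate_signature("abc", key, "latin-1"))
```

Fixing the encoding explicitly on both the signing and the verifying side removes this source of disagreement.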

Python encoding/decoding problems

Question: How do I decode strings such as "weren\xe2\x80\x99t" back to the normal encoding? The word is actually weren't, not "weren\xe2\x80\x99t". For example:

    print "\xe2\x80\x9cThings"
    string = "\xe2\x80\x9cThings"
    print string.decode('utf-8')
    print string.encode('ascii', 'ignore')

prints:

    â€œThings
    “Things
    Things

But I actually want to get "Things. Or:

    print "weren\xe2\x80\x99t"
    string = "weren\xe2\x80\x99t"
    print string.decode('utf-8')
    print string.encode('ascii', 'ignore')

prints:

    werenâ€™t
    weren
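In Python 3 the distinction between bytes and text is explicit, which makes this easier to reason about. The sketch below assumes the input is UTF-8 bytes (as the \xe2\x80\x99 sequences suggest) and that the goal is plain ASCII quotes. The replacement table and the function name to_plain_quotes are illustrative choices, not part of the original question.

```python
# Map typographic quotes to their ASCII look-alikes (illustrative subset).
ASCII_QUOTES = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def to_plain_quotes(raw: bytes) -> str:
    """Decode UTF-8 bytes and replace curly quotes with straight ones."""
    return raw.decode("utf-8").translate(ASCII_QUOTES)


print(to_plain_quotes(b"weren\xe2\x80\x99t"))   # weren't
print(to_plain_quotes(b"\xe2\x80\x9cThings"))   # "Things
```

Replacing the quotes rather than dropping them with encode('ascii', 'ignore') keeps the punctuation the text was meant to have.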

Encode extended ASCII characters in a Code 128 barcode

I want to encode the string "QuiÑones" in a Code 128 bar code. Is it possible to include extended ASCII characters in the Code 128 encoding? I did some research on Google which suggested that it is possible by using FNC4, but I didn't find exactly how to do it. It would be of great help if someone could assist me with a solution in the C language.

"Extended ASCII" characters with byte values from 128 to 255 can indeed be represented in Code 128 encodation by using the special FNC4 function character. For general use (in open applications) it is necessary that such characters belong to the

Ignoring accents while searching the database using Entity Framework

I have a database table that contains names with accented characters, like ä and so on. I need to get all records using EF4 from a table that contains some substring regardless of accents. So the following code:

    myEntities.Items.Where(i => i.Name.Contains("a"));

should return all items with a name containing a, but also all items containing ä, â and so on. Is this possible?

If you set an accent-insensitive collation order on the Name column then the queries should work as required.

Setting an accent-insensitive collation will fix the problem. You can change the collation for a column in SQL

How to account for accent characters for regex in Python?

Question: I currently use re.findall to find and isolate words after the '#' character for hashtags in a string:

    hashtags = re.findall(r'#([A-Za-z0-9_]+)', str1)

It searches str1 and finds all the hashtags. This works, but it doesn't account for accented characters such as áéíóúñü¿. If one of these letters is in str1, it saves the hashtag only up to the letter before it. So, for example, #yogenfrüz would be #yogenfr. I need to be able to account for all accented letters that
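In Python 3, \w in a str pattern matches Unicode word characters by default, so accented letters are included without listing them one by one. A minimal sketch, assuming Python 3 and that letters, digits, and underscore are the only characters allowed in a hashtag (the sample text is made up):

```python
import re

# \w is Unicode-aware for str patterns in Python 3, so ü, ñ, é, ... match.
HASHTAG = re.compile(r"#(\w+)")

text = "Trying #yogenfrüz and #año_nuevo today"
print(HASHTAG.findall(text))  # ['yogenfrüz', 'año_nuevo']
```

Punctuation such as ¿ is not a word character, so it still ends a hashtag under this pattern.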

PHP-REGEX: accented letters matches non-accented ones, and vice versa. How to achieve this?

I want to do typical highlight code. So I have something like:

    $valor = preg_replace("/(".$_REQUEST['txt_search'].")/iu", "<span style='background-color:yellow; font-weight:bold;'>\\1</span>", $valor);

Now, the requested word could be something like "josé", and with it I want "jose", "JOSÉ", "José", etc. highlighted too. With this expression, if I write "josé", it matches "josé" and "JOSÉ" (and all the case variants), but only the accented variants. If I search "jose", it matches "JOSE", "jose", "Jose" but not the accented ones. So I have part of what I want, because I have case

jQuery DataTables - Accent-Insensitive Alphabetization and Searching

When using jQuery DataTables, is it possible to do accent-insensitive searches when using the filter? For instance, when I type the 'e' character, I'd like to find every word with 'e', 'é', or 'è'. Something that came to mind is normalizing the strings and putting them into a separate, hidden column, but that wouldn't solve the alphabetizing issue.

EDIT: I tried the following:

    $.fn.dataTableExt.ofnSearch = function ( data ) {
        return ! data ? '' :
            typeof data === 'string' ?
                data
                    .replace( /\n/g, ' ' )
                    .replace( /á/g, 'a' )
                    .replace( /é/g, 'e' )
                    .replace( /í/g, 'i' )
                    .replace( /ó/g, 'o' )
                    .replace(
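The normalization idea mentioned above can be done without a hand-written replacement table. The sketch below shows the language-neutral idea in Python rather than JavaScript, and it is not DataTables code: decompose each string with Unicode NFD, drop the combining marks, and use the folded form as the key for both searching and sorting. The helper name strip_accents and the sample names are hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold(text: str) -> str:
    """Accent- and case-insensitive comparison key."""
    return strip_accents(text).casefold()


names = ["Émile", "eva", "Zoé", "Ángel", "adam"]

# Accent-insensitive search: compare folded strings.
query = "e"
matches = [name for name in names if fold(query) in fold(name)]

# Accent-insensitive ordering: sort on the folded key, keep original text.
ordered = sorted(names, key=fold)

print(matches)
print(ordered)
```

Letters that have no canonical decomposition, such as ø or æ, pass through unchanged and would still need explicit handling.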

Why is this symbol showing up on Chrome and not Firefox or Edge?

So this web page is rendering with these symbols, and they are found throughout this website/application but on no other sites. Can anyone tell me 1) what the symbol is and 2) why it is showing up in only one browser?

9999years

That character is U+2028 Line Separator, which is a kind of newline character. Think of it as the Unicode equivalent of HTML’s <br>. As to why it shows up here: my guess would be that an internal database uses LSEP to not conflict with literal newlines or HTML tags (which might break the database or cause security errors), and either: The server-side scripts that convert
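A quick way to check whether a string contains U+2028 and to normalize it, sketched in Python. Treating the separator as an ordinary newline is an assumption about what the application wants, and the function name is hypothetical.

```python
LINE_SEPARATOR = "\u2028"


def normalize_line_separators(text: str) -> str:
    """Replace U+2028 LINE SEPARATOR with a plain newline."""
    return text.replace(LINE_SEPARATOR, "\n")


sample = "first line\u2028second line"
print(LINE_SEPARATOR in sample)                       # True
print(normalize_line_separators(sample).split("\n"))  # ['first line', 'second line']
```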