string-search

php - Is strpos the fastest way to search for a string in a large body of text?

旧时模样 submitted on 2019-11-30 01:30:50
Question: if (strpos(htmlentities($storage->getMessage($i)),'chocolate')) Hi, I'm using Gmail OAuth access to find specific text strings in email addresses. Is there a way to find text instances more quickly and efficiently than using strpos in the above code? Should I be using a hash technique? Answer 1: According to the PHP manual, yes: strpos() is the quickest way to determine if one string contains another. Note: If you only want to determine if a particular needle occurs within haystack, use the faster

Boyer Moore Algorithm Understanding and Example?

主宰稳场 submitted on 2019-11-29 18:47:43
I am having trouble understanding the Boyer Moore string search algorithm. I am following this document: Link. I cannot work out what delta1 and delta2 really mean here, or how they are applied in the string search algorithm. The language looked a little vague. If anybody can help me understand this, it would be really helpful. Or, if you know of any other link or document that is easier to understand, please share. Thanks in advance. Answer (btilly): First piece of advice, take a deep breath. You're clearly
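To make the role of delta1 more concrete, here is a minimal Python sketch of a bad-character shift table. It uses the Horspool simplification of Boyer Moore rather than the full algorithm from the linked document: delta2 (the good-suffix table) is omitted entirely, and the shift is looked up for the text character aligned with the last pattern position. The function names `build_delta1` and `horspool_search` are illustrative, not taken from the question or the answer.

```python
def build_delta1(pattern):
    """Bad-character table: how far the pattern may shift when a given
    character is aligned with the pattern's last position."""
    m = len(pattern)
    table = {}
    # The last character is excluded so that its shift is never zero.
    # Later (rightmost) occurrences overwrite earlier ones on purpose.
    for i, ch in enumerate(pattern[:-1]):
        table[ch] = m - 1 - i
    return table


def horspool_search(text, pattern):
    """Return the index of the first match of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    delta1 = build_delta1(pattern)
    pos = 0
    while pos <= n - m:
        # Compare right to left, as Boyer Moore does.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            return pos
        # Characters that do not occur in the pattern allow a full shift of m.
        pos += delta1.get(text[pos + m - 1], m)
    return -1
```

The table is built once per pattern, so the preprocessing cost is paid back when the text is long and many alignments can be skipped.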

String searching algorithms in Java

a 夏天 submitted on 2019-11-29 15:45:59
Question: I am doing string matching with a large amount of data. EDIT: I am matching words contained in a big list against some ontology text files. I take each file from the ontology and search for a match between the third String of each file line and any word from the list. I made the mistake of overlooking the fact that what I need is not pure matching (the results are poor); I need a looser matching function that also returns results when the string is contained inside another string. I did this

Fastest way to search in a string collection

岁酱吖の submitted on 2019-11-28 15:22:34
Problem: I have a text file of around 120,000 users (strings) which I would like to store in a collection and later search. The search will run every time the user changes the text of a TextBox, and the result should be the strings that contain the text in the TextBox. I don't have to change the list, just pull the results and put them in a ListBox. What I've tried so far: I tried two different collections/containers, into which I dump the string entries from an external text file (once, of course): List<string> allUsers; HashSet<string> allUsers; With

Boyer Moore Algorithm Understanding and Example?

﹥>﹥吖頭↗ submitted on 2019-11-28 13:42:28
Question: I am having trouble understanding the Boyer Moore string search algorithm. I am following this document: Link. I cannot work out what delta1 and delta2 really mean here, or how they are applied in the string search algorithm. The language looked a little vague. If anybody can help me understand this, it would be really helpful. Or, if you know of any other link or document that is easier to understand, then

How to find best fuzzy match for a string in a large string database

混江龙づ霸主 submitted on 2019-11-28 06:26:15
I have a database of strings (arbitrary length) which holds more than one million items (potentially more). I need to compare a user-provided string against the whole database and retrieve an identical string if it exists or otherwise return the closest fuzzy match(es) (60% similarity or better). The search time should ideally be under one second. My idea is to use edit distance for comparing each db string to the search string after narrowing down the candidates from the db based on their length. However, as I will need to perform this operation very often, I'm thinking about building an
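The idea described here, narrowing candidates by length and then comparing by edit distance, can be sketched in Python as follows. This is a sketch under stated assumptions, not the asker's implementation: similarity is assumed to mean 1 minus the Levenshtein distance divided by the longer length, the candidates are a plain in-memory list, and the names `edit_distance`, `similarity`, and `best_matches` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance, keeping only one previous row in memory."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed definition: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def best_matches(query, candidates, threshold=0.6):
    """Return (score, candidate) pairs at or above threshold, best first."""
    results = []
    for cand in candidates:
        longest = max(len(query), len(cand))
        # The distance is at least the length difference, so a candidate
        # whose length differs too much cannot reach the threshold.
        if abs(len(query) - len(cand)) > (1 - threshold) * longest:
            continue
        score = similarity(query, cand)
        if score >= threshold:
            results.append((score, cand))
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return results
```

The length filter only skips work; every surviving candidate still costs a full distance computation, so a scan like this over a million strings is unlikely to meet a one-second target without some form of index.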

stripos returns false when special characters is used

旧巷老猫 submitted on 2019-11-28 01:57:57
I am using the stripos function to check whether a string is located inside another string, ignoring case. Here is the problem: stripos("ø", "Ø") returns false, while stripos("Ø", "Ø") returns true. As you can see, the function does NOT do a case-insensitive search in this case. It has the same problem with characters like Ææ and Åå. These are Danish characters. Answer: Use mb_stripos() instead. It's character set aware and will handle multi-byte character sets. stripos() is a holdover from the good old days when there was only ASCII and all chars were only 1 byte. You need
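The underlying point, that case-insensitive matching has to work on characters rather than single bytes, can be illustrated outside PHP. The following Python sketch is only an analogy and does not reproduce the behavior of mb_stripos(); `contains_ignore_case` is a hypothetical helper name.

```python
def contains_ignore_case(haystack, needle):
    """Substring test after Unicode case folding of both strings."""
    # str.casefold() applies Unicode case folding, so "Ø" and "ø" compare equal.
    return needle.casefold() in haystack.casefold()


print(contains_ignore_case("Ø", "ø"))
print(contains_ignore_case("Blåbærgrød", "BÆR"))
```

This sketch assumes both strings use the same normalization form; a precomposed "å" and an "a" followed by a combining ring would not match without normalizing first.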

MySQL: How to search multiple tables for a string existing in any column

生来就可爱ヽ(ⅴ<●) submitted on 2019-11-27 19:11:11
How can I search for a string in table_a, table_b, and table_c, which have an arbitrary number of columns? I know this is not proper SQL, but it would be something like: SELECT * FROM users, accounts, something_else WHERE ->ANY COLUMN CONTAINS 'this_string'<- Thanks in advance to the SO community. Answer: Add fulltext indexes to all of the string columns in all of those tables, then union the results: select * from table1 where match(col1, col2, col3) against ('some string') union all select * from table2 where match(col1, col2) against ('some string') union all select * from table3 where match(col1, col2, col3,

How to find best fuzzy match for a string in a large string database

筅森魡賤 submitted on 2019-11-27 05:36:57
Question: I have a database of strings (arbitrary length) which holds more than one million items (potentially more). I need to compare a user-provided string against the whole database and retrieve an identical string if it exists or otherwise return the closest fuzzy match(es) (60% similarity or better). The search time should ideally be under one second. My idea is to use edit distance for comparing each db string to the search string after narrowing down the candidates from the db based on their

Change foreign characters to their normal equivalent

江枫思渺然 submitted on 2019-11-27 04:03:00
I am using PHP and was wondering whether there is a predefined way to convert foreign characters to their non-foreign alternatives, with characters such as ê, ë, é all resulting in 'e'. I'm looking for a function that would take a string and return it without the special characters. Any ideas would be greatly appreciated! Answer (Edgar Zagórski): After failing to find suitable converters I created my own collection that suits my needs, including my favorite Cyrillic conversion, which by default has numerous variations. function transliterateString($txt) { $transliterationTable = array('á' => 'a', 'Á' => 'A', 'à'
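A different technique from the hand-written lookup table above is to decompose each character into a base letter plus combining marks and then drop the marks. The Python sketch below shows that approach with the standard `unicodedata` module; it is not a port of the PHP function, and `strip_accents` is a hypothetical name.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("êëé"))
```

Letters that have no canonical decomposition, such as "ø", pass through unchanged, and Cyrillic is not converted to Latin at all, so an explicit table like the one in the answer is still needed for those cases.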