Question
Background
Suppose you want to find a partial match inside a formatted phone number and highlight what was found.
For example, given the phone number "+972 50-123-4567" and the query 2501, you should be able to mark the text "2 50-1" within it.
Here are more examples, as a hashmap from each query to its expected result (start index, and end index excluding), when the text to search in is "+972 50-123-45678" and the allowed characters are "01234567890+*#":
val tests = hashMapOf(
    "" to Pair(0, 0),
    "9" to Pair(1, 2),
    "97" to Pair(1, 3),
    "250" to Pair(3, 7),
    "250123" to Pair(3, 11),
    "250118" to null,
    "++" to null,
    "8" to Pair(16, 17),
    "+" to Pair(0, 1),
    "+8" to null,
    "78" to Pair(15, 17),
    "5678" to Pair(13, 17),
    "788" to null,
    "+ " to Pair(0, 1),
    " " to Pair(0, 0),
    "+ 5" to null,
    "+ 9" to Pair(0, 2)
)
The problem
You might think: why not just use "indexOf", or clean the string and find the occurrence there?
But that doesn't work, because I want to mark the occurrence in the original text, ignoring some characters along the way.
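To make that concrete, here is a minimal sketch of the naive approach. The helper name `naiveFind` is made up for illustration. It strips the ignored characters and calls `indexOf`, so the range it returns refers to the cleaned string, and applying it to the original text marks the wrong part.

```kotlin
// Hypothetical helper, for illustration only: strips ignored characters, then uses indexOf.
// The returned range is relative to the CLEANED text, not to the original one.
fun naiveFind(text: String, query: String, allowed: Set<Char>): Pair<Int, Int>? {
    val cleanedText = text.filter { it in allowed }
    val cleanedQuery = query.filter { it in allowed }
    val start = cleanedText.indexOf(cleanedQuery)
    return if (start == -1) null else Pair(start, start + cleanedQuery.length)
}

fun main() {
    val text = "+972 50-123-4567"
    val allowed = "0123456789+*#".toSet()
    val range = naiveFind(text, "2501", allowed)!!
    // Applying the cleaned-string range to the original text marks the wrong part.
    println(text.substring(range.first, range.second)) // prints "2 50", not the wanted "2 50-1"
}
```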
What I've tried
I actually already have the answer, after working on it for quite some time. I just wanted to share it, and optionally see if anyone can write nicer/shorter code that produces the same behavior.
I had a solution before that was quite a bit shorter, but it assumed that the query contains only allowed characters.
The question
Well, there is no question this time, because I've found an answer myself.
However, if you can think of a more elegant and/or shorter solution that is as efficient as what I wrote, please let me know.
I'm pretty sure regular expressions could be a solution here, but they tend to be unreadable, and I'd expect them to be less efficient than hand-written code. Still, it would be nice to know how they'd handle this kind of problem. Maybe I could run a small benchmark on it too.
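As a rough idea of how a regular-expression version might look, here is a minimal sketch. The name `findWithRegex` is hypothetical, it assumes the JVM regex engine and Kotlin 1.5+ (for `Char.code`), and it has not been benchmarked. It matches each allowed character of the query literally and permits any run of ignored characters in between.

```kotlin
// Hypothetical regex-based variant, for illustration only (not benchmarked).
fun findWithRegex(text: String, query: String, allowed: Set<Char>): Pair<Int, Int>? {
    // Keep only the characters of the query that are actually searched for.
    val wanted = query.filter { it in allowed }
    if (wanted.isEmpty()) return Pair(0, 0)
    // Body of a character class listing the allowed characters; \uXXXX escapes avoid quoting issues.
    val allowedClass = allowed.joinToString("") { "\\u%04x".format(it.code) }
    // Any run of characters that are NOT allowed, i.e. the ignored ones.
    val gap = "[^$allowedClass]*"
    // Each wanted character is matched literally, with optional ignored characters in between.
    val pattern = wanted.map { Regex.escape(it.toString()) }.joinToString(gap)
    val match = Regex(pattern).find(text) ?: return null
    // IntRange.last is inclusive, so add 1 to get an exclusive end.
    return Pair(match.range.first, match.range.last + 1)
}

fun main() {
    val allowed = "0123456789+*#".toSet()
    val text = "+972 50-123-45678"
    println(findWithRegex(text, "250", allowed)) // prints (3, 7)
    println(findWithRegex(text, "+ 9", allowed)) // prints (0, 2)
    println(findWithRegex(text, "+8", allowed))  // prints null
}
```

The pattern is rebuilt on every call, so for repeated searches with the same query it would make sense to cache the compiled `Regex`.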
Answer 1:
OK, so here's my solution, including a sample to test it:
TextSearchUtil.kt
object TextSearchUtil {
    /**@return where the query was found. The first integer is the start index (inclusive), the second is the end index (exclusive).
     * Special cases: Pair(0,0) if the query is empty or contains only ignored characters, null if not found.
     * @param text the text to search within. Only allowed characters are searched for. The rest are ignored
     * @param query what to search for. Only allowed characters are searched for. The rest are ignored
     * @param allowedCharactersSet the only characters that are compared. The rest are ignored*/
    fun findOccurrenceWhileIgnoringCharacters(text: String, query: String, allowedCharactersSet: HashSet<Char>): Pair<Int, Int>? {
        //get index of first char to search for
        var searchIndexStart = -1
        for ((index, c) in query.withIndex())
            if (allowedCharactersSet.contains(c)) {
                searchIndexStart = index
                break
            }
        if (searchIndexStart == -1) {
            //query contains only ignored characters, so it's like an empty one
            return Pair(0, 0)
        }
        //got index of first character to search for
        if (text.isEmpty())
            //need to search for a character, but the text is empty, so not found
            return null
        var mainIndex = 0
        while (mainIndex < text.length) {
            var searchIndex = searchIndexStart
            var isFirstCharToSearchFor = true
            var secondaryIndex = mainIndex
            var charToSearch = query[searchIndex]
            secondaryLoop@ while (secondaryIndex < text.length) {
                //skip ignored characters on text
                var c: Char? = null
                while (secondaryIndex < text.length) {
                    c = text[secondaryIndex]
                    if (allowedCharactersSet.contains(c))
                        break
                    else {
                        //a match must start on an allowed character
                        if (isFirstCharToSearchFor)
                            break@secondaryLoop
                        ++secondaryIndex
                    }
                }
                //reached end of text before the whole query was matched
                if (secondaryIndex == text.length)
                    break@secondaryLoop
                //time to compare
                if (c != charToSearch)
                    break@secondaryLoop
                ++searchIndex
                isFirstCharToSearchFor = false
                //skip ignored characters on query right away, so trailing ignored characters
                //are also handled when the match ends at the end of the text (e.g. query "8 ")
                while (searchIndex < query.length && !allowedCharactersSet.contains(query[searchIndex]))
                    ++searchIndex
                if (searchIndex >= query.length) {
                    //reached end of query while all characters were fine, so found the match
                    return Pair(mainIndex, secondaryIndex + 1)
                }
                charToSearch = query[searchIndex]
                ++secondaryIndex
            }
            ++mainIndex
        }
        return null
    }
}
Sample usage to test it:
MainActivity.kt
import android.os.Bundle
import android.util.Log
// assumes AndroidX; with the old support library, use android.support.v7.app.AppCompatActivity
import androidx.appcompat.app.AppCompatActivity

class MainActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        //
        val text = "+972 50-123-45678"
        val allowedCharacters = "01234567890+*#"
        val allowedPhoneCharactersSet = allowedCharacters.toHashSet()
        //map of each query to its expected result
        val tests = hashMapOf(
            "" to Pair(0, 0),
            "9" to Pair(1, 2),
            "97" to Pair(1, 3),
            "250" to Pair(3, 7),
            "250123" to Pair(3, 11),
            "250118" to null,
            "++" to null,
            "8" to Pair(16, 17),
            "+" to Pair(0, 1),
            "+8" to null,
            "78" to Pair(15, 17),
            "5678" to Pair(13, 17),
            "788" to null,
            "+ " to Pair(0, 1),
            " " to Pair(0, 0),
            "+ 5" to null,
            "+ 9" to Pair(0, 2)
        )
        for (test in tests) {
            val result = TextSearchUtil.findOccurrenceWhileIgnoringCharacters(text, test.key, allowedPhoneCharactersSet)
            val isResultCorrect = result == test.value
            val foundStr = if (result == null) null else text.substring(result.first, result.second)
            when {
                !isResultCorrect -> Log.e("AppLog", "checking query of \"${test.key}\" inside \"$text\" . Succeeded?$isResultCorrect Result: $result found String: \"$foundStr\"")
                foundStr == null -> Log.d("AppLog", "checking query of \"${test.key}\" inside \"$text\" . Succeeded?$isResultCorrect Result: $result")
                else -> Log.d("AppLog", "checking query of \"${test.key}\" inside \"$text\" . Succeeded?$isResultCorrect Result: $result found String: \"$foundStr\"")
            }
        }
        //queries that contain only ignored characters behave like an empty query
        Log.d("AppLog", "special cases:")
        Log.d("AppLog", "${TextSearchUtil.findOccurrenceWhileIgnoringCharacters("a", "c", allowedPhoneCharactersSet) == Pair(0, 0)}")
        Log.d("AppLog", "${TextSearchUtil.findOccurrenceWhileIgnoringCharacters("ab", "c", allowedPhoneCharactersSet) == Pair(0, 0)}")
        Log.d("AppLog", "${TextSearchUtil.findOccurrenceWhileIgnoringCharacters("ab", "cd", allowedPhoneCharactersSet) == Pair(0, 0)}")
        Log.d("AppLog", "${TextSearchUtil.findOccurrenceWhileIgnoringCharacters("a", "cd", allowedPhoneCharactersSet) == Pair(0, 0)}")
    }
}
If I want to highlight the result, I can use something like this:
val pair = TextSearchUtil.findOccurrenceWhileIgnoringCharacters(text, "2501", allowedPhoneCharactersSet)
if (pair == null)
    textView.text = text
else {
    val wordToSpan = SpannableString(text)
    wordToSpan.setSpan(BackgroundColorSpan(0xFFFFFF00.toInt()), pair.first, pair.second, Spannable.SPAN_EXCLUSIVE_EXCLUSIVE)
    textView.setText(wordToSpan, TextView.BufferType.SPANNABLE)
}
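For comparison, a shorter variant is possible if some extra allocation is acceptable. The following is only a sketch under that assumption, and the name `findWithIndexMap` is hypothetical. It searches in a cleaned copy of the text, but remembers where each kept character came from, so the match can be translated back to positions in the original text.

```kotlin
// Hypothetical alternative, for illustration only: search in the cleaned text,
// then translate the match back to positions in the original text.
fun findWithIndexMap(text: String, query: String, allowed: Set<Char>): Pair<Int, Int>? {
    // Keep only the characters of the query that are actually searched for.
    val wanted = query.filter { it in allowed }
    if (wanted.isEmpty()) return Pair(0, 0)
    // positions[i] is the index in the original text of the i-th kept character.
    val positions = text.indices.filter { text[it] in allowed }
    val cleaned = positions.map { text[it] }.joinToString("")
    val start = cleaned.indexOf(wanted)
    if (start == -1) return null
    // The end is exclusive, so add 1 to the original index of the last matched character.
    return Pair(positions[start], positions[start + wanted.length - 1] + 1)
}

fun main() {
    val text = "+972 50-123-4567"
    val allowed = "0123456789+*#".toSet()
    val range = findWithIndexMap(text, "2501", allowed)!!
    println(range)                                      // prints (3, 9)
    println(text.substring(range.first, range.second))  // prints 2 50-1
}
```

This trades memory for brevity: it builds a list of positions and a cleaned string on every call, whereas the loop-based function works directly on the original text.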
Source: https://stackoverflow.com/questions/57534062/how-to-find-a-string-within-another-ignoring-some-characters