Extract links from string optimization

予麋鹿 2021-01-03 05:21

I get data (an HTML string) from a website and want to extract all the links from it. I wrote a function that works, but it is very slow...

Can you help me to optimize it? What standard

7 answers
  •  一生所求
    2021-01-03 05:35

    As others have pointed out, you are better off using regexes, data detectors or a parsing library. However, as specific feedback on your string processing:

    The key with Swift strings is to embrace their forward-only nature. More often than not, integer indexing and random access are not necessary. As @gnasher729 pointed out, every time you call count you are iterating over the whole string. Similarly, the integer indexing extensions are linear, so if you use them in a loop, you can easily end up with an accidentally quadratic or cubic algorithm.
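
    To make the cost difference concrete, here is a minimal sketch in current Swift syntax (an assumption, since the code in this thread predates it). Both function names are hypothetical and the task, counting occurrences of a character, is chosen only for illustration. The first function steps a native index forward; the second converts an integer offset to an index on every iteration.

    ```swift
    // Hypothetical helper: one forward pass using native String.Index values.
    func countOccurrences(of target: Character, in text: String) -> Int {
        var total = 0
        var idx = text.startIndex
        while idx != text.endIndex {
            if text[idx] == target { total += 1 }
            idx = text.index(after: idx)   // a single step forward
        }
        return total
    }

    // Hypothetical counter-example: integer offsets force a re-walk from the start.
    func countOccurrencesSlow(of target: Character, in text: String) -> Int {
        var total = 0
        // text.count walks the whole string once to produce the upper bound.
        for offset in 0..<text.count {
            // index(_:offsetBy:) walks `offset` characters from the start each time,
            // so the loop as a whole is quadratic in the length of the string.
            let idx = text.index(text.startIndex, offsetBy: offset)
            if text[idx] == target { total += 1 }
        }
        return total
    }
    ```

    Both return the same result; only the amount of work per iteration differs.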

    But in this case, there's no need to do all that work to convert string indices to random-access integers. Here is a version that I think is performing similar logic (look for a prefix, then look from there for a " character - ignoring that this doesn't cater for https, upper/lower case etc) using only native string indices:

    func extractAllLinks(text: String) -> [String] {
        var links: [String] = []
        let prefix = "http://"
        let prefixLen = count(prefix)
    
        for var idx = text.startIndex; idx != text.endIndex; ++idx {
            let candidate = text[idx..

    Even this could be further optimized (the advance(idx, count()) is a little inefficient) if there were other helpers such as findFromIndex etc. or a willingness to do without string slices and hand-roll the search for the end character.
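
    For readers on a current toolchain, the following is a rough sketch of the same idea (find the prefix, then scan forward for a " character) using only native string indices. It is written in current Swift syntax and is not the code from this answer: the name extractHTTPLinks(in:) is hypothetical, and dropping a link that has no closing quote is an assumption made here.

    ```swift
    // Hypothetical sketch: collects every substring that starts with "http://"
    // and runs up to (not including) the next double quote.
    func extractHTTPLinks(in text: String) -> [String] {
        var links: [String] = []
        let prefix = "http://"
        var idx = text.startIndex

        while idx != text.endIndex {
            let rest = text[idx...]   // Substring view sharing indices with `text`
            if rest.hasPrefix(prefix) {
                // Scan forward from the match for the terminating quote.
                guard let quote = rest.firstIndex(of: "\"") else {
                    break             // assumption: an unterminated link is dropped
                }
                links.append(String(text[idx..<quote]))
                idx = quote           // resume scanning at the closing quote
            } else {
                idx = text.index(after: idx)
            }
        }
        return links
    }
    ```

    Because Substring shares its indices with the original string, the index returned by firstIndex(of:) can be used directly to slice text, so no integer offsets are ever computed. As in the original logic, https and upper/lower case variants are not handled.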
