Extract links from string optimization

予麋鹿 2021-01-03 05:21

I get data (an HTML string) from a website and want to extract all the links from it. I wrote a function that works, but it is very slow...

Can you help me to optimize it? What standard

7 answers

一生所求 (original poster)

2021-01-03 05:35
As others have pointed out, you are better off using regexes, data detectors or a parsing library. However, as specific feedback on your string processing:

The key with Swift strings is to embrace their forward-only nature. More often than not, integer indexing and random access are not necessary. As @gnasher729 pointed out, every time you call count you are iterating over the whole string. Similarly, the integer indexing extensions are linear, so if you use them in a loop, you can easily end up with a quadratic or even cubic algorithm without noticing.
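To make that cost concrete, here is a small sketch in current Swift syntax. The two helper names, `countQuotesByOffset` and `countQuotesByIndex`, are made up for illustration and are not from the question. Both count the `"` characters in a string. The first rebuilds an index from an integer offset on every iteration, which walks the string from the start each time; the second moves a native `String.Index` forward one step at a time.

```swift
// Hypothetical helper: integer offsets. Quadratic on a String.
func countQuotesByOffset(_ text: String) -> Int {
    var total = 0
    let length = text.count                      // one full pass over the string
    for offset in 0..<length {
        // index(_:offsetBy:) walks `offset` characters from the start: O(n) per call.
        let idx = text.index(text.startIndex, offsetBy: offset)
        if text[idx] == "\"" { total += 1 }
    }
    return total
}

// Hypothetical helper: native indices. Linear.
func countQuotesByIndex(_ text: String) -> Int {
    var total = 0
    var idx = text.startIndex
    while idx != text.endIndex {
        if text[idx] == "\"" { total += 1 }
        idx = text.index(after: idx)             // single step forward
    }
    return total
}
```

Both functions return the same result; only the amount of work differs, and the gap grows with the length of the input.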

But in this case, there's no need to do all that work to convert string indices to random-access integers. Here is a version that I think is performing similar logic (look for a prefix, then look from there for a " character - ignoring that this doesn't cater for https, upper/lower case etc) using only native string indices:
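```
func extractAllLinks(text: String) -> [String] {
    var links: [String] = []
    let prefix = "http://"

    for var idx = text.startIndex; idx != text.endIndex; ++idx {
        let candidate = text[idx..<text.endIndex]
        // Swift 1.2 syntax. The source is cut off after `text[idx..`;
        // everything below is reconstructed from the description above.
        if candidate.hasPrefix(prefix), let closingQuote = find(candidate, "\"") {
            let link = candidate[candidate.startIndex..<closingQuote]
            links.append(link)
            idx = advance(idx, count(link)) // lands on the closing quote; ++idx then steps past it
        }
    }
    return links
}
```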
Even this could be further optimized (the advance(idx, count()) is a little inefficient) if there were other helpers such as findFromIndex etc. or a willingness to do without string slices and hand-roll the search for the end character.
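As a rough illustration of that last point, here is a sketch in Swift 5 syntax. It is an assumption about how the idea might look today, not code from the answer, and the name `extractAllLinksForwardOnly` is made up. It keeps the same simplifications (only `http://`, case-sensitive, a link ends at the next `"`), but reuses the index of the closing quote directly instead of counting the link's characters again.

```swift
// Sketch only: same matching rules as the version above, Swift 5 syntax.
func extractAllLinksForwardOnly(in text: String) -> [String] {
    var links: [String] = []
    let prefix = "http://"
    var idx = text.startIndex

    while idx != text.endIndex {
        let candidate = text[idx...]             // Substring: a view onto text, no copy
        if candidate.hasPrefix(prefix),
           let closingQuote = candidate.firstIndex(of: "\"") {
            links.append(String(text[idx..<closingQuote]))
            // A Substring shares indices with its base string, so we can jump
            // straight past the quote without re-counting the link.
            idx = text.index(after: closingQuote)
        } else {
            idx = text.index(after: idx)
        }
    }
    return links
}
```

For example, calling it on `<a href="http://a.example/x">A</a>` yields a single entry, `http://a.example/x`. Input with many `http://` occurrences that are never followed by a quote still makes `firstIndex(of:)` scan to the end each time, so this is a sketch of the indexing idea rather than a tuned implementation.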