Speed up large string data parser function

情深已故 2021-01-28 07:15

I currently have a file with 1 million characters; the file is 1 MB in size. I am trying to parse its data with this old function, which still works but is very slow.

st         


        
1 answer
    北恋 (OP)
    2021-01-28 07:23

    It is so slow because you keep destroying and recreating a 1 MB string over and over. Strings are immutable, so strData = Mid(strData... creates a new string and copies the remainder of the 1 MB of data into it on every pass, which means the total amount copied grows roughly with the square of the input size. Interestingly, even VB6 allowed for a progressive index.

    I would have processed the disk file LINE BY LINE and plucked out the info as it was read (see StreamReader.ReadLine) to avoid working with a 1 MB string at all. Pretty much the same method could be used there.

    ' needs Imports System.Text at the top of the file (for StringBuilder)
    ' 1 MB textbox data (!?)
    Dim sData As String = TextBox1.Text
    ' start/stop - probably fake
    Dim sStart As String = "start"
    Dim sStop As String = "end"
    
    ' result
    Dim sbResult As New StringBuilder
    ' progressive index
    Dim nNDX As Integer = 0
    
    ' shortcut at least as far as typing and readability
    Dim MagicNumber As Integer = sStart.Length
    ' NEXT index of start/stop after nNDX
    Dim i As Integer = sData.IndexOf(sStart, nNDX)
    Dim j As Integer = 0
    
    ' loop as long as another start marker is found
    Do While i >= 0
        j = sData.IndexOf(sStop, i + MagicNumber)   ' stop index, searched after the start marker
        If j < 0 Then Exit Do                       ' start marker without a matching stop
    
        ' Extract and append bracketed substring 
        sbResult.Append(sData.Substring(i + MagicNumber, j - (i + MagicNumber)))
        ' add a cute comma
        sbResult.Append(",")
    
        nNDX = j + sStop.Length                     ' where we start next time
        i = sData.IndexOf(sStart, nNDX)             ' next start index (-1 ends the loop)
    Loop
    
    ' remove last comma (only if something was found)
    If sbResult.Length > 0 Then sbResult.Length -= 1
    
    ' show my work
    Console.WriteLine(sbResult.ToString())
    

    EDIT: Small mod for the ad hoc test data
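
    For the line-by-line idea mentioned earlier, here is a rough sketch rather than a drop-in replacement. It assumes every start/stop pair sits on a single line (a pair split across a line break would be missed). ExtractBetween is a made-up helper name, and a StringReader stands in for the disk file so the snippet runs on its own; for a real file you would pass a StreamReader instead.

    ```vbnet
    Imports System
    Imports System.Collections.Generic
    Imports System.IO

    Module LineByLineDemo
        ' Hypothetical helper: reads one line at a time and collects the text
        ' found between sStart and sStop, so the full 1 MB is never held in
        ' a single string.
        Function ExtractBetween(reader As TextReader, sStart As String, sStop As String) As String
            Dim parts As New List(Of String)
            Dim line As String = reader.ReadLine()

            Do While line IsNot Nothing
                Dim nNDX As Integer = 0            ' progressive index within this line
                Do
                    Dim i As Integer = line.IndexOf(sStart, nNDX, StringComparison.Ordinal)
                    If i < 0 Then Exit Do          ' no more start markers on this line
                    Dim first As Integer = i + sStart.Length
                    Dim j As Integer = line.IndexOf(sStop, first, StringComparison.Ordinal)
                    If j < 0 Then Exit Do          ' start marker without a stop marker
                    parts.Add(line.Substring(first, j - first))
                    nNDX = j + sStop.Length        ' continue after the stop marker
                Loop
                line = reader.ReadLine()
            Loop

            ' Join adds the commas, so there is no trailing comma to remove
            Return String.Join(",", parts)
        End Function

        Sub Main()
            ' Made-up test data; swap in New StreamReader(path) for a real file
            Dim sample As String = "xstartAAAend" & Environment.NewLine & "startBBBend junk startCend"
            Using reader As New StringReader(sample)
                Console.WriteLine(ExtractBetween(reader, "start", "end"))   ' prints AAA,BBB,C
            End Using
        End Sub
    End Module
    ```

    The ordinal comparison is a choice made for this sketch: it skips culture-aware matching, which is usually what you want for fixed markers like these.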
