What is the fastest way to parse a line in Delphi?

深忆病人 2020-12-13 01:18

I have a huge file that I must parse line by line. Speed is of the essence.

Example of a line:

Token-1   Here-is-the-Next-Token      La         


        
9 Answers
  • 2020-12-13 01:53

    The quickest code to write (not necessarily the fastest to run) is probably to create a TStringList and assign each line of your text file to its CommaText property. CommaText treats spaces as well as commas as delimiters, and double quotes as quoting characters, so you will get one string-list item per token.

    MyStringList.CommaText := s;
    for i := 0 to MyStringList.Count - 1 do
    begin
      // process each token here
    end;
    

    You'll probably get better performance by parsing each line yourself, though.
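
    As a rough illustration of parsing each line yourself, here is a minimal sketch that splits a line on runs of whitespace. It assumes tokens are never quoted and that any character at or below a space counts as a separator. `SplitOnWhitespace` is a hypothetical helper name, not an RTL routine.

    ```pascal
    program SplitLineDemo;
    {$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
    {$APPTYPE CONSOLE}
    uses
      SysUtils, Classes;

    // Hypothetical helper: fills Tokens with the whitespace-separated
    // tokens found in Line. Assumes no quoting rules apply.
    procedure SplitOnWhitespace(const Line: string; Tokens: TStrings);
    var
      i, Start, Len: Integer;
    begin
      Tokens.Clear;
      Len := Length(Line);
      i := 1;
      while i <= Len do
      begin
        // Skip the run of blanks (space, tab, control characters).
        while (i <= Len) and (Line[i] <= ' ') do
          Inc(i);
        Start := i;
        // Advance to the first blank after the token.
        while (i <= Len) and (Line[i] > ' ') do
          Inc(i);
        // Only add when at least one token character was consumed.
        if i > Start then
          Tokens.Add(Copy(Line, Start, i - Start));
      end;
    end;

    var
      Tokens: TStringList;
      i: Integer;
    begin
      Tokens := TStringList.Create;
      try
        SplitOnWhitespace('Token-1   Here-is-the-Next-Token      La         ', Tokens);
        for i := 0 to Tokens.Count - 1 do
          WriteLn(Tokens[i]);
      finally
        Tokens.Free;
      end;
    end.
    ```

    Unlike CommaText, this does not treat commas or double quotes specially, which may or may not match your data. Each Copy call still allocates a new string, so for the very hottest loops you may prefer to work with start/length pairs instead.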

  • 2020-12-13 01:55

    If speed is of the essence, custom code is the answer. Look at the Windows file-mapping API (CreateFileMapping and MapViewOfFile), which maps your file into memory. You can then walk a pointer through the data one character at a time, picking out tokens as you go.

    This is my code for doing a mapping:

    procedure TMyReader.InitialiseMapping(szFilename : string);
    var
    //  nError : DWORD;
        bGood : boolean;
    begin
        bGood := False;
        m_hFile := CreateFile(PChar(szFilename), GENERIC_READ, 0, nil, OPEN_EXISTING, 0, 0);
        if m_hFile <> INVALID_HANDLE_VALUE then
        begin
            m_hMap := CreateFileMapping(m_hFile, nil, PAGE_READONLY, 0, 0, nil);
            if m_hMap <> 0 then
            begin
                m_pMemory := MapViewOfFile(m_hMap, FILE_MAP_READ, 0, 0, 0);
                if m_pMemory <> nil then
                begin
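                    // Byte-wise pointer arithmetic; casting the pointer to Integer would truncate the address in a 64-bit build.
                    htlArray := Pointer(PAnsiChar(m_pMemory) + m_dwDataPosition);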
                    bGood := True;
                end
                else
                begin
    //              nError := GetLastError;
                end;
            end;
        end;
        if not bGood then
            raise Exception.Create('Unable to map token file into memory');
    end;
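
    Once the view is mapped, marching a pointer through it might look like the sketch below. It is only an illustration under some assumptions: the text is in a single-byte encoding (ANSI or UTF-8), separators are any bytes at or below a space, and lines end in a line feed. `ScanBuffer` is a hypothetical name, and the AnsiString constant stands in for the mapped memory so the example runs without a file. With a real mapping you would pass the view pointer and that pointer plus the file size (for example from GetFileSize).

    ```pascal
    program ScanBufferDemo;
    {$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
    {$APPTYPE CONSOLE}

    // Hypothetical helper: walks the bytes in [P, PEnd) and counts
    // whitespace-separated tokens and line feeds without copying data.
    procedure ScanBuffer(P, PEnd: PAnsiChar; out TokenCount, LineCount: Integer);
    var
      TokenStart: PAnsiChar;
    begin
      TokenCount := 0;
      LineCount := 0;
      while P < PEnd do
      begin
        // Skip separators, counting line feeds as they go by.
        while (P < PEnd) and (P^ <= ' ') do
        begin
          if P^ = #10 then
            Inc(LineCount);
          Inc(P);
        end;
        TokenStart := P;
        // Advance to the first separator after the token.
        while (P < PEnd) and (P^ > ' ') do
          Inc(P);
        // The token is the bytes TokenStart .. P-1; copy it with
        // SetString only if a separate string is really needed.
        if P > TokenStart then
          Inc(TokenCount);
      end;
    end;

    const
      // Stand-in for the mapped view (assumption for this demo).
      Data: AnsiString =
        'Token-1   Here-is-the-Next-Token      La         '#13#10'A B'#13#10;
    var
      Tokens, Lines: Integer;
    begin
      ScanBuffer(PAnsiChar(Data), PAnsiChar(Data) + Length(Data), Tokens, Lines);
      WriteLn(Tokens, ' tokens on ', Lines, ' lines');
    end.
    ```

    The explicit end pointer matters because a mapped file is not null-terminated, so the loop cannot rely on a trailing #0 the way ordinary PChar code often does.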
    
  • 2020-12-13 01:59

    Rolling your own is the fastest way for sure. For more on this topic, see the source code of SynEdit, which contains lexers (called highlighters in that project) for just about any language around. I suggest you take one of those lexers as a base and modify it for your own use.
