How to merge two string arrays in Delphi

有刺的猬 2021-01-04 20:11

I have two or more dynamic string arrays filled with a large amount of data, and I want to merge them into a single array. I know I can do it with a for loop.



        
3 Answers
  • 2021-01-04 20:30

    You can use the built-in Move procedure, which copies a block of memory to another location. Its parameters are the source block, the destination block, and the number of bytes to move.

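    Because Move copies the string references without updating their reference counts, each string ends up referenced from two arrays but counted only once. The source arrays must therefore be cleared after the merge by filling them with zeros. Otherwise the refcounts for the strings will all be wrong, causing havoc and destruction later in the program.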

    var
      Arr1, Arr2, MergedArr: Array of string;
      I: Integer;
    begin
      SetLength(Arr1, 5000000);
      for I := Low(Arr1) to High(Arr1) do
        Arr1[I] := IntToStr(I);
    
      SetLength(Arr2, 5000000);
      for I := Low(Arr2) to High(Arr2) do
        Arr2[I] := IntToStr(I);
    
      // Set the length of MergedArr to Length(Arr1) + Length(Arr2)
      SetLength(MergedArr, Length(Arr1) + Length(Arr2));
    
      // Add Arr1 to MergedArr
      Move(Arr1[Low(Arr1)], MergedArr[Low(MergedArr)], Length(Arr1)*SizeOf(Arr1[0]));
    
      // Add Arr2 to MergedArr
      Move(Arr2[Low(Arr2)], MergedArr[High(Arr1)+1], Length(Arr2)*SizeOf(Arr2[0]));
    
      // Cleanup Arr1 and Arr2 without touching string refcount.
      FillChar(Arr1[Low(Arr1)], Length(Arr1)*SizeOf(Arr1[0]), 0);
      FillChar(Arr2[Low(Arr2)], Length(Arr2)*SizeOf(Arr2[0]), 0);
    
      // Test
      for I := Low(Arr1) to High(Arr1) do begin
        Assert(MergedArr[I] = IntToStr(I));
        Assert(MergedArr[I] = MergedArr[Length(Arr1) + I]);
      end;
    
      // Clear the array to see if something is wrong with refcounts
      for I := Low(MergedArr) to High(MergedArr) do
        MergedArr[I] := '';
    end;
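
    The same Move-and-zero technique can be wrapped in a reusable routine, which also helps when more than two arrays have to be merged. The following is only a minimal sketch: TStrArray and MoveAppend are hypothetical names introduced for illustration, and it assumes the source arrays are no longer needed afterwards. Unlike the inline version, it skips empty sources so that Src[0] is never evaluated on an empty array.

```pascal
program MoveAppendDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses SysUtils;

type
  TStrArray = array of string;  // hypothetical alias used only in this sketch

// Appends all strings of Src to Dest by moving the raw string pointers,
// then empties Src so that every string keeps exactly one owner.
procedure MoveAppend(var Dest, Src: TStrArray);
var
  OldLen, N: Integer;
begin
  N := Length(Src);
  if (N = 0) or (Pointer(Dest) = Pointer(Src)) then
    Exit;                                    // nothing to move, or same array
  OldLen := Length(Dest);
  SetLength(Dest, OldLen + N);               // new slots are zero-initialized
  Move(Src[0], Dest[OldLen], N * SizeOf(Src[0]));
  FillChar(Src[0], N * SizeOf(Src[0]), 0);   // drop references, refcounts untouched
  SetLength(Src, 0);
end;

var
  A, B, C: TStrArray;
  I: Integer;
begin
  SetLength(A, 3);
  SetLength(B, 2);
  for I := 0 to High(A) do
    A[I] := 'a' + IntToStr(I);
  for I := 0 to High(B) do
    B[I] := 'b' + IntToStr(I);

  MoveAppend(C, A);
  MoveAppend(C, B);

  for I := 0 to High(C) do
    WriteLn(C[I]);
end.
```

    The call empties Src. If another variable still refers to the same source array, it would see the emptied contents as well, so this only suits the case where the sources really are disposable.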
    
  • 2021-01-04 20:35

    First of all, string is special, so it should be treated specially: don't try to outsmart the compiler, and keep your code unchanged. String is special because it is reference counted. Every time you copy a string from one place to another, its reference count is incremented. When the reference count reaches 0, the string is destroyed. Your code plays nice because it lets the compiler know what you're doing, and in turn the compiler gets the chance to properly increment all reference counts.
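
    The reference counting described above can be observed directly. This is a small sketch, assuming a compiler that provides StringRefCount in the System unit (Delphi 2009 or later, or a recent Free Pascal); the Demo procedure is a name made up for the example. It avoids string literals on purpose, because constant strings are not counted the same way as heap-allocated ones.

```pascal
program RefCountDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$ASSERTIONS ON}

uses SysUtils;

procedure Demo;
var
  S, T: string;
  Before, After: NativeInt;
begin
  S := IntToStr(42);                 // heap-allocated, reference-counted string
  Before := StringRefCount(S);

  T := S;                            // plain assignment: no characters are copied
  After := StringRefCount(S);

  WriteLn('before: ', Before, ', after: ', After);
  Assert(After = Before + 1);        // the compiler incremented the count
  Assert(Pointer(S) = Pointer(T));   // both variables share one string

  T := '';                           // releasing T decrements the count again
  Assert(StringRefCount(S) = Before);
end;

begin
  Demo;
end.
```

    A raw Move of the same bytes would make T point at the string without the increment, which is why the Move-based merge has to zero its sources afterwards.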

    Sure, you can play all sorts of tricks as suggested in the comments to gabr's answer, like filling the old arrays with zeros so the reference counts in the new array remain valid, but you can't do that if you actually need the old arrays as well. It is also a bit of a hack, albeit one that will probably remain valid for the foreseeable future (and, to be noted, I actually like this hack).

    Anyway, and this is the important part of my answer: copying the strings from one array to the other is most likely not what makes your code slow; the bottleneck is most likely somewhere else. Here's a short console application that creates two arrays, each with 5M random strings, then merges the two arrays into a third and displays how long the merge took. Merging only takes about 300 milliseconds on my machine. Filling the arrays takes much longer, but I'm not timing that:

    program Project26;

    {$APPTYPE CONSOLE}
    
    uses SysUtils, Windows;
    
    var a, b, c: array of string;
        i: Integer;
    
        Freq: Int64;
        Start, Stop: Int64;
        Ticks: Cardinal;
    
    const count = 5000000;
    
    begin
      SetLength(a,count);
      SetLength(b,count);
      for i:=0 to count-1 do
      begin
        a[i] := IntToStr(Random(MaxInt));
        b[i] := IntToStr(Random(MaxInt));
      end;
    
      WriteLn('Moving');
    
      QueryPerformanceFrequency(Freq);
      QueryPerformanceCounter(Start);
    
      SetLength(c, Length(a) + Length(b));
      for i:=0 to High(a) do
        c[i] := a[i];
      for i:=0 to High(b) do
        c[i+Length(a)] := b[i];
    
      QueryPerformanceCounter(Stop);
      WriteLn((Stop - Start) div (Freq div 1000), ' milliseconds');
      ReadLn;
    
    end.
    
  • 2021-01-04 20:40

    An excellent maxim is that the fastest code is the code that never runs. Since copying is expensive, you should look to avoid the cost of copying.

    You can do this with a virtual array. Create a class which holds an array of array of string. In your example the outer array would hold two string arrays.

    • Add a Count property that returns the total number of strings in all of the arrays.
    • Add a default indexed property that operates by working out which of the outer arrays the index refers to and then returns the appropriate value from the inner array.
    • For extra points, implement an enumerator so that for..in loops work.
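
    The virtual array outlined above could look roughly like this. It is a sketch under assumptions, not a finished implementation: the names TStrArray, TVirtualStringArray, and AddPart are invented for the example, the enumerator is left out, and the index lookup simply walks the outer array, which is fine for a handful of parts.

```pascal
program VirtualArrayDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses SysUtils;

type
  TStrArray = array of string;  // hypothetical alias used only in this sketch

  // Hypothetical class: presents several string arrays as one sequence
  // without copying any strings.
  TVirtualStringArray = class
  private
    FParts: array of TStrArray;
    function GetCount: Integer;
    function GetItem(Index: Integer): string;
  public
    procedure AddPart(const Part: TStrArray);
    property Count: Integer read GetCount;
    property Items[Index: Integer]: string read GetItem; default;
  end;

procedure TVirtualStringArray.AddPart(const Part: TStrArray);
begin
  SetLength(FParts, Length(FParts) + 1);
  FParts[High(FParts)] := Part;  // stores a reference to the array, not a copy
end;

function TVirtualStringArray.GetCount: Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to High(FParts) do
    Inc(Result, Length(FParts[I]));
end;

function TVirtualStringArray.GetItem(Index: Integer): string;
var
  I: Integer;
begin
  if Index >= 0 then
    for I := 0 to High(FParts) do
    begin
      if Index < Length(FParts[I]) then
      begin
        Result := FParts[I][Index];   // index falls inside this inner array
        Exit;
      end;
      Dec(Index, Length(FParts[I]));  // skip past this inner array
    end;
  raise ERangeError.Create('Index out of bounds');
end;

var
  A, B: TStrArray;
  V: TVirtualStringArray;
  I: Integer;
begin
  SetLength(A, 2);
  A[0] := 'a0'; A[1] := 'a1';
  SetLength(B, 3);
  B[0] := 'b0'; B[1] := 'b1'; B[2] := 'b2';

  V := TVirtualStringArray.Create;
  try
    V.AddPart(A);
    V.AddPart(B);
    for I := 0 to V.Count - 1 do
      WriteLn(V[I]);
  finally
    V.Free;
  end;
end.
```

    Because AddPart only stores array references, later changes to the strings in A or B may show through the virtual array; whether that is acceptable depends on how the arrays are used afterwards.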