.NET compression of XML to store in SQL Server database

北海茫月 2020-12-19 17:26

Currently our .NET application constructs XML data in memory that we persist to a SQL Server database. The XElement object is converted to a string using ToString() and then

4 Answers
  • 2020-12-19 17:43

    This article may help you get started.

    The following snippet GZip-compresses a string and returns the result Base64-encoded, prefixed with the 4-byte length of the uncompressed data:

    // Requires: using System; using System.IO; using System.IO.Compression; using System.Text;
    public static string Compress(string text)
    {
        byte[] buffer = Encoding.UTF8.GetBytes(text);
        using (MemoryStream ms = new MemoryStream())
        {
            // leaveOpen: true keeps ms usable after the GZipStream is disposed;
            // disposing the GZipStream is what flushes the final gzip block.
            using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
            {
                zip.Write(buffer, 0, buffer.Length);
            }

            byte[] compressed = ms.ToArray();

            // Layout: [4-byte uncompressed length][gzip data], so a reader can size its buffer up front.
            byte[] gzBuffer = new byte[compressed.Length + 4];
            Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
            Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
            return Convert.ToBase64String(gzBuffer);
        }
    }
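
    Reading the value back needs the reverse operation. A minimal sketch of one possible counterpart, assuming the layout produced by Compress above (a 4-byte length prefix followed by the gzip data); the class name XmlStringCompression and the method name Decompress are illustrative and not part of the original snippet:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class XmlStringCompression
{
    // Hypothetical counterpart to Compress; assumes [4-byte length][gzip data], Base64-encoded.
    public static string Decompress(string compressedText)
    {
        byte[] gzBuffer = Convert.FromBase64String(compressedText);

        // The first 4 bytes hold the length of the original UTF-8 data,
        // as written by BitConverter.GetBytes(int).
        int originalLength = BitConverter.ToInt32(gzBuffer, 0);
        byte[] buffer = new byte[originalLength];

        using (MemoryStream ms = new MemoryStream(gzBuffer, 4, gzBuffer.Length - 4))
        using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
        {
            // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
            int total = 0;
            while (total < buffer.Length)
            {
                int read = zip.Read(buffer, total, buffer.Length - total);
                if (read == 0)
                {
                    throw new InvalidDataException("Compressed data ended before the declared length.");
                }
                total += read;
            }
        }

        return Encoding.UTF8.GetString(buffer);
    }
}
```

    Note that Base64 adds roughly a third to the size of the compressed bytes, so if the column can hold binary data, storing the byte array directly avoids that overhead.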
    

    EDIT: As an aside, you may want to use a CLOB type (in SQL Server, varchar(max) or nvarchar(max)) even when storing XML as text, because a plain varchar(n) column is capped at 8,000 bytes, a limit that XML can quickly exceed.

  • 2020-12-19 18:00

    I know you tagged the question SQL 2005, but you should consider upgrading to SQL 2008 and using the new data compression capabilities that come with it. It works out of the box, is transparent to your application, and will save you a large implementation, testing, and support cost.

  • 2020-12-19 18:04

    I think you should also re-test the XML column type. As far as I know, it stores the data in a binary format rather than as text, so it could be smaller and may not perform badly, even if you don't actually need the additional features.

  • 2020-12-19 18:04

    Besides possibly compressing the string itself (perhaps using LBushkin's Base64 method above), you probably want to start by stripping all the formatting whitespace. The default XElement.ToString() method saves the element with indenting. Use the ToString(SaveOptions options) overload with SaveOptions.DisableFormatting if you want just the tags and data.
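
    To illustrate the difference, here is a small sketch comparing the two overloads. The class XmlWhitespaceDemo, its method names, and the sample element are hypothetical and only there for illustration:

```csharp
using System;
using System.Xml.Linq;

public static class XmlWhitespaceDemo
{
    // Hypothetical sample document built in memory, for illustration only.
    public static XElement BuildSample()
    {
        return new XElement("order",
            new XElement("customer", "Contoso"),
            new XElement("item", new XAttribute("qty", 2), "Widget"));
    }

    // Default overload: output is indented, one element per line.
    public static string ToIndentedString(XElement element)
    {
        return element.ToString();
    }

    // DisableFormatting: no indentation or line breaks are added between elements.
    public static string ToCompactString(XElement element)
    {
        return element.ToString(SaveOptions.DisableFormatting);
    }
}
```

    For the sample element, ToCompactString returns `<order><customer>Contoso</customer><item qty="2">Widget</item></order>` on a single line, while ToIndentedString spreads the same content over several indented lines. DisableFormatting only suppresses the whitespace added during serialization; whitespace text nodes already in the tree (for example, from parsing with LoadOptions.PreserveWhitespace) are still written out.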
