How could I encode a string of 1s and 0s for transport?

后端 未结 5 1779
后悔当初
后悔当初 2020-12-21 12:02

For a genetic algorithm application, I\'m using a whole load of binary strings. Most of the time they literally take the form of 01001010110, so that they can b

相关标签:
5条回答
  • 2020-12-21 12:24

    What about converting it to it's base 10 integer equivalent?

    int myBin = Convert.ToInt32("01001010110", 2);
    

    Convert.ToInt32() documentation

    0 讨论(0)
  • 2020-12-21 12:28

    Or implement Run length encoding or Huffman coding. Both are fairly easy to implement. RLE is by far the easiest, but will in most cases have worse compression ratio. If your data typically has many consecutive characters of the same value, it could still provide a substantial improvement.

    0 讨论(0)
  • 2020-12-21 12:35

    I would just store them as an array of bytes and use a helper function to translate between the byte array version and the string version.

    0 讨论(0)
  • 2020-12-21 12:44

    The simplest would be to take each digit and treat it as a bit. Each group of 8 bits can be stored in a byte. Then you can send it as a stream of bytes. You will also need to store the length of the original string so that you can distinguish between "0" and "00".

    Here is one way you could write the conversion from string to a byte array:

    byte[] convertToBytes(string s)
    {
        byte[] result = new byte[(s.Length + 7) / 8];
    
        int i = 0;
        int j = 0;
        foreach (char c in s)
        {
            result[i] <<= 1;
            if (c == '1')
                result[i] |= 1;
            j++;
            if (j == 8)
            {
                i++;
                j = 0;
            }
        }
        return result;
    }
    

    Reversing the operation is very similar.

    If you need to transmit the data as a string you can base 64 encode the resulting byte array.

    You may also want to consider keeping it in this form in memory too. This will be much more efficient than storing it as a string where each digit is stored as a 2 byte character. You are using roughly 16 times more memory than you need to for storing your data. The disadvtange is that it is slightly more difficult to use in this form, so if you have enough memory then what you are currently doing might be just fine.

    0 讨论(0)
  • 2020-12-21 12:44

    Abe Miessler's answer is a good one, but with caveat mentioned in comments.

    If 64 bits is not enough to represent your string, then consider using a BigInt class http://www.codeproject.com/KB/cs/BigInt.aspx (you would probably want to add to/fromBinary() extension methods to it. Alternatively represent this as a ... linked list of bytes.

    Either approach has the problem of discarding any leading zeroes, so you would want to store the original length as well.

    0 讨论(0)
提交回复
热议问题