I have an array of bytes (any length), and I want to encode this array into a string using my own base encoder. .NET has a standard Base64 encoder, but w
If performance is not an issue, use the BigInteger class in the background. You have a constructor for BigInteger that takes byte array, and you can then manually run loops of division and modulus to get the representation in other non-standard bases.
Also take a look at this.
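The division-and-modulus loop described above can be sketched roughly as follows. This is a Java sketch to match the other code in this thread, not a definitive implementation. The class name `BigIntegerBaseEncoder`, the 53-character `ALPHABET`, and the rule of writing each leading zero byte as the alphabet's first character are all assumptions made for illustration. Note that in .NET the `BigInteger(byte[])` constructor expects little-endian bytes and treats the top bit as a sign, so the array would need to be reversed and given a trailing zero byte first.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

final class BigIntegerBaseEncoder {
    // Hypothetical 53-character alphabet, chosen only for illustration.
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0";

    /** Encodes data as digits of base alphabet.length(), most significant digit first. */
    static String encode(byte[] data, String alphabet) {
        BigInteger base = BigInteger.valueOf(alphabet.length());
        // Signum 1 makes Java read the bytes as an unsigned big-endian number.
        BigInteger value = new BigInteger(1, data);
        StringBuilder out = new StringBuilder();
        // Repeated division: each remainder is the next least significant digit.
        while (value.signum() > 0) {
            BigInteger[] quotientAndRemainder = value.divideAndRemainder(base);
            out.append(alphabet.charAt(quotientAndRemainder[1].intValue()));
            value = quotientAndRemainder[0];
        }
        // Leading zero bytes do not change the number, so they would be lost.
        // Assumption: keep each one as a leading "zero digit" of the alphabet.
        for (int i = 0; i < data.length && data[i] == 0; i++) {
            out.append(alphabet.charAt(0));
        }
        // Digits were produced least significant first, so reverse them.
        return out.reverse().toString();
    }

    public static void main(String[] args) {
        byte[] data = "ABC123".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encode(data, ALPHABET));
        System.out.println(encode(new byte[] {(byte) 255}, "0123456789")); // prints 255
    }
}
```

Each division walks the whole number, so the cost grows quadratically with the input length. That is acceptable for short inputs, which is the "performance is not an issue" case.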
Here is a sample code snippet that converts a byte array to Base64 (using the URL-safe alphabet, without padding). There is a very good article on this, which I used as a reference.
    import java.nio.charset.StandardCharsets;

    public class Test {
        // URL-safe Base64 alphabet: '-' and '_' take the place of '+' and '/'.
        private static final char[] toBase64URL = {
            'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
            'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
            'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '-', '_'
        };

        public static void main(String[] args) {
            byte[] mess = "ABC123".getBytes(StandardCharsets.US_ASCII);

            // Step 1: write every byte as 8 binary digits, most significant bit first.
            StringBuilder builder = new StringBuilder();
            for (byte b : mess) {
                for (int bit = 7; bit >= 0; bit--) {
                    builder.append(((b >> bit) & 1) == 1 ? '1' : '0');
                }
            }
            // Pad with zero bits to a multiple of 6. Without this, substring(i, i + 6)
            // throws when the input length is not a multiple of 3 bytes.
            while (builder.length() % 6 != 0) {
                builder.append('0');
            }
            System.out.println(builder);

            // Step 2: map each 6-bit group to one character of the alphabet.
            StringBuilder output = new StringBuilder();
            for (int i = 0; i < builder.length(); i += 6) {
                int index = Integer.parseInt(builder.substring(i, i + 6), 2);
                output.append(toBase64URL[index]);
            }
            // No '=' padding characters are emitted.
            System.out.println(output); // prints QUJDMTIz
        }
    }
Another example to look at is Ascii85, used in Adobe PostScript and PDF documents. In Ascii85, 5 characters are used to encode 4 bytes. You can figure out the efficiency of this coding as (256^4)/(85^5) = 96.8%. This is the fraction of the 85^5 possible 5-character combinations that will actually be used.
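To make the 4-bytes-to-5-characters mapping concrete, here is a minimal Java sketch that encodes a single 4-byte group. The class and method names (`Ascii85Group`, `encodeGroup`) are made up for this example. It deliberately leaves out the parts of the real format that go beyond one full group, namely the `z` shortcut for an all-zero group, the handling of a final partial group, and any delimiters.

```java
import java.nio.charset.StandardCharsets;

final class Ascii85Group {
    /** Encodes exactly four bytes as five characters in the range '!' to 'u'. */
    static String encodeGroup(byte[] four) {
        if (four.length != 4) {
            throw new IllegalArgumentException("expected exactly 4 bytes");
        }
        // Combine the bytes into one unsigned 32-bit big-endian value.
        // A long is used because Java's int is signed.
        long value = 0;
        for (byte b : four) {
            value = (value << 8) | (b & 0xFF);
        }
        // Extract five base-85 digits, least significant first,
        // filling the output from right to left.
        char[] out = new char[5];
        for (int i = 4; i >= 0; i--) {
            out[i] = (char) ('!' + value % 85);
            value /= 85;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        byte[] group = "Man ".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encodeGroup(group)); // prints 9jqo^
    }
}
```

Because 85^5 is slightly larger than 256^4, every 32-bit value fits in five digits, and the digits that are never produced account for the unused 3.2%.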
So, for whatever new base you want to use to encode your data, look for a power of that base that lands just above a power of 256 if you're trying to maximize coding efficiency. This might not be easy for every base. Checking base 53 shows that the best you'll probably get is using 7 characters to encode 5 bytes (93.6% efficiency), unless you feel like using 88 characters to encode 63 bytes.
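The search for a good block size can be automated. The Java sketch below is one way to do it; the names `BaseEfficiency`, `charsNeeded`, and `efficiency` are invented for this example. For a given base and block size in bytes, it finds the smallest number of characters whose combinations cover every possible block, then reports the efficiency as 256^bytes / base^chars.

```java
import java.math.BigInteger;

final class BaseEfficiency {
    /** Smallest number of characters such that base^chars >= 256^byteCount. */
    static int charsNeeded(int base, int byteCount) {
        BigInteger target = BigInteger.ONE.shiftLeft(8 * byteCount); // 256^byteCount
        BigInteger capacity = BigInteger.ONE;
        int chars = 0;
        // Exact integer arithmetic avoids rounding trouble near the boundary.
        while (capacity.compareTo(target) < 0) {
            capacity = capacity.multiply(BigInteger.valueOf(base));
            chars++;
        }
        return chars;
    }

    /** Returns 256^byteCount / base^chars, computed through base-2 logarithms. */
    static double efficiency(int base, int byteCount, int chars) {
        double log2OfBase = Math.log(base) / Math.log(2);
        return Math.pow(2, 8.0 * byteCount - chars * log2OfBase);
    }

    public static void main(String[] args) {
        int base = 53;
        for (int bytes = 1; bytes <= 8; bytes++) {
            int chars = charsNeeded(base, bytes);
            System.out.printf("%d bytes -> %d chars, efficiency %.1f%%%n",
                    bytes, chars, 100 * efficiency(base, bytes, chars));
        }
    }
}
```

Raising the loop limit in `main` lets you look for larger blocks such as the 63-byte case mentioned above, at the cost of an encoder that has to work on much bigger numbers per block.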