I have a binary string that I am encoding in Base64. I need to know beforehand what the size of the final Base64-encoded string will be.
Is there any way to calculate it?
If you do Base64 exactly right, which includes padding the end with = characters, and you break the output with a CR LF every 72 characters, the size can be computed with:
code_size = ((input_size * 4) + 2) / 3;  /* unpadded characters, rounded up */
padding_size = (input_size % 3) ? (3 - (input_size % 3)) : 0;
crlfs_size = 2 + (2 * (code_size + padding_size) / 72);
total_size = code_size + padding_size + crlfs_size;
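As a cross-check, here is a minimal sketch that computes an exact count under one specific convention: padded output, with a CR LF written after every line of up to 72 characters, including the last (possibly shorter) line. The function name `base64_crlf72_size` is hypothetical, and an encoder that handles the final line differently will need a slightly different count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size of padded Base64 output for input_size bytes,
 * assuming a CR LF after every line of up to 72 characters, including the
 * last line. The trailing \0 is not counted. */
size_t base64_crlf72_size(size_t input_size)
{
    /* Keep the arithmetic below well away from size_t overflow. */
    assert(input_size <= ((size_t)-1) / 2);

    size_t encoded = ((input_size + 2) / 3) * 4; /* 4 chars per 3-byte group, '=' included */
    size_t lines   = (encoded + 71) / 72;        /* number of lines, rounded up */
    return encoded + 2 * lines;                  /* each line ends in CR LF */
}
```

For instance, 54 input bytes give exactly one full 72-character line plus its CR LF, so 74 bytes in total.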
In C, you may also terminate the output with a \0 byte, so there will be one extra byte there. You may also want to length-check after every code you write. So if you are just looking for a value to pass to malloc(), you might prefer a version that wastes a few bytes in order to make the coding simpler:
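/* 72 output characters correspond to 54 input bytes, so the CRLFs cost 2 bytes per 54 input bytes */
output_size = ((input_size * 4) / 3) + (input_size / 27) + 8;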
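A sketch of how such a deliberately generous estimate might be used with malloc(). The helper names `base64_buffer_bound` and `base64_alloc` are hypothetical. The bound assumes a CR LF after every 72 output characters, which costs 2 bytes per 54 input bytes, plus a few bytes of slack for '=' padding, rounding, a final CR LF and the \0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: conservative upper bound on the buffer size.
 * Assumes CR LF every 72 output characters (2 bytes per 54 input bytes). */
size_t base64_buffer_bound(size_t input_size)
{
    return (input_size * 4) / 3   /* encoded characters, rounded down */
         + (input_size / 27)      /* CR LF overhead */
         + 8;                     /* slack: padding, rounding, final CR LF, \0 */
}

/* Hypothetical helper: allocate a buffer large enough for the encoded
 * output and report its capacity. Returns NULL on overflow or if
 * malloc() fails. The caller frees the buffer. */
char *base64_alloc(size_t input_size, size_t *capacity)
{
    assert(capacity != NULL);

    if (input_size > SIZE_MAX / 8)   /* input_size * 4 must not overflow */
        return NULL;

    size_t n = base64_buffer_bound(input_size);
    char *buf = malloc(n);
    if (buf != NULL)
        *capacity = n;
    return buf;
}
```

The encoder can then compare its write position against the returned capacity instead of recomputing the exact size.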
I ran into a similar situation in Python. Using codecs.iterencode(text, "base64"), the correct calculation was:
def encoded_size(input_size):
    adjustment = 3 - (input_size % 3) if (input_size % 3) else 0
    code_padded_size = ((input_size + adjustment) // 3) * 4  # integer division
    newline_size = (code_padded_size // 76) * 1  # one newline per full 76-character line
    return code_padded_size + newline_size
Here is a simple C implementation (without modulus or ternary operators) of the raw Base64-encoded size (with standard '=' padding):
int output_size;
/* Assumes input_size >= 1; an empty input encodes to 0 characters. */
output_size = ((input_size - 1) / 3) * 4 + 4;
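To see why the `(input_size - 1) / 3` form works, it can be compared against a more literal version that rounds the number of 3-byte groups up using modulus and the ternary operator. This is only a sketch, and the function names are hypothetical. The two agree for every input size of at least 1. For an empty input the branch-free form would report 4 instead of 0, which is why the sketch asserts on it.

```c
#include <assert.h>

/* Branch-free form: valid for input_size >= 1. */
int base64_size_branch_free(int input_size)
{
    assert(input_size >= 1);
    return ((input_size - 1) / 3) * 4 + 4;
}

/* Reference form: round the number of 3-byte groups up, 4 chars each. */
int base64_size_reference(int input_size)
{
    int groups = input_size / 3 + ((input_size % 3) ? 1 : 0);
    return groups * 4;
}

/* Returns 1 if both forms agree for every size in [1, limit], else 0. */
int base64_sizes_agree(int limit)
{
    for (int n = 1; n <= limit; n++) {
        if (base64_size_branch_free(n) != base64_size_reference(n))
            return 0;
    }
    return 1;
}
```

For example, 1, 2 and 3 input bytes all produce 4 characters, and 4 input bytes produce 8.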
To that you will need to add any overhead for CRLF line breaks, if they are required. The standard Base64 encoding (RFC 3548, superseded by RFC 4648) does not require line breaks; other specifications built on it add them, typically at 64 characters (PEM) or 76 characters (MIME). The MIME variant (RFC 2045) requires that encoded lines be no longer than 76 characters.
For example, building on the above, the total encoded length with 76-character lines is:
int final_size;
final_size = output_size + (output_size / 76) * 2;
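The same calculation can be wrapped in a function that takes the line length as a parameter, so it covers both the 64- and 76-character conventions. This is a sketch with a hypothetical name, `base64_size_with_breaks`. Like the expression above, it counts one CRLF per complete line and none after a shorter final line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: padded Base64 size plus one CRLF (2 bytes) for
 * every complete line of line_len characters. A shorter final line gets
 * no CRLF, and the trailing \0 is not counted. */
size_t base64_size_with_breaks(size_t input_size, size_t line_len)
{
    assert(line_len > 0);

    size_t encoded = ((input_size + 2) / 3) * 4;  /* raw size with '=' padding */
    return encoded + (encoded / line_len) * 2;    /* CRLF per complete line */
}
```

For instance, 57 input bytes encode to exactly 76 characters, so with 76-character lines the total is 78.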
See the Base64 Wikipedia entry for more variants.