I have a string that I'm encoding into base64 to conserve space. Is it a big deal if I remove the equal sign at the end? Would this significantly decrease entropy? What can I d
It's fine to remove the equals signs, as long as you know what they do.
Base64 outputs 4 characters for every 3 bytes it encodes (in other words, each character encodes 6 bits). The padding characters are added so that the length of any base64 string is always a multiple of 4; the padding chars don't actually encode any data. (I can't say for sure why this was done: as a way of error checking if a string was truncated, to ease decoding, or something else?)
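To make the 3-bytes-to-4-characters grouping concrete, here is a small sketch that encodes inputs of a given length and counts the trailing padding. The helper name `padding_info` is made up for this illustration, and the all-zero input is arbitrary; only the length matters.

```python
from base64 import b64encode

def padding_info(n):
    """Return (encoded length, number of '=' characters) for n input bytes."""
    enc = b64encode(bytes(n))        # bytes(n) is n zero bytes; the content is irrelevant here
    stripped = enc.rstrip(b"=")      # '=' only ever appears as trailing padding
    return len(enc), len(enc) - len(stripped)

for n in range(1, 7):
    print(n, padding_info(n))
```

The encoded length is always a multiple of 4, and the padding count cycles through 2, 1, 0 as the input length grows.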
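In any case, that means if you have x base64 characters (sans padding), there will be (-x)%4 padding characters, which is the same as (4-(x%4))%4: none at all when x is already a multiple of 4. (Though x%4=1 will never happen due to the factorization of 6 and 8: a single leftover character holds only 6 bits, which is not enough for a whole byte.) Since these contain no actual data, and can be recovered, I frequently strip these off when I want to save space, e.g. the following::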
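from base64 import b64encode, b64decode
# encode data, then strip the padding (b64encode returns bytes in Python 3)
raw = b'\x00\x01'
enc = b64encode(raw).rstrip(b"=")
# func to restore padding: (-len) % 4 is 0, 1 or 2 for valid input
def repad(data):
    return data + b"=" * (-len(data) % 4)
# decode after restoring the padding
raw = b64decode(repad(enc))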
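As a sanity check on the claim that the padding is recoverable, the following sketch round-trips random inputs of many lengths. It assumes Python 3, where `b64encode` returns `bytes`; `strip_padding` and `check_roundtrip` are names invented for this example, not part of the `base64` module.

```python
import os
from base64 import b64encode, b64decode

def strip_padding(enc):
    # Drop the trailing '=' characters; they carry no data.
    return enc.rstrip(b"=")

def repad(data):
    # Restore the padding so the length is a multiple of 4 again.
    return data + b"=" * (-len(data) % 4)

def check_roundtrip(max_len=64):
    """Round-trip random inputs and collect the unpadded lengths modulo 4."""
    seen = set()
    for n in range(max_len + 1):
        raw = os.urandom(n)
        enc = b64encode(raw)
        short = strip_padding(enc)
        assert repad(short) == enc              # padding is fully recoverable
        assert b64decode(repad(short)) == raw   # and the data survives
        seen.add(len(short) % 4)
    return seen

print(check_roundtrip())
```

The returned set never contains 1, which matches the observation that x%4=1 cannot occur for a valid unpadded string.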