8B EC 56 8B F4 68 00 70 40 00 FF 15 BC 82 40
A sequence like the one above can be segmented in various ways, and each segment can be translated to correspon
You might find it interesting to think about the other direction: how would you have to design your code so that it is easy for others to segment? You could require the most significant bit of the byte that starts a sequence to be zero, and that of every byte in the middle of a sequence to be one, similar in spirit to UTF-8, which marks its continuation bytes with the leading bits 10. Then if you start from a random position – assuming you know where the byte boundaries are – it is easy to find the start of the next sequence. Going one step further, how would you code a pure bit stream such that the start of a sequence is easy to find? How many bits would be wasted by such a coding?
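To make the byte-level idea concrete, here is a minimal Python sketch of the scheme described above: the first byte of each sequence has its top bit cleared, and every following byte has it set. The function names `encode`, `next_start` and `segment` are made up for this illustration, and the 7-bit payload per byte is an assumption of the sketch, not part of any real format.

```python
def encode(sequences):
    """Pack sequences of 7-bit values: first byte has MSB 0, the rest MSB 1."""
    out = bytearray()
    for seq in sequences:
        for i, value in enumerate(seq):
            if not 0 <= value < 128:
                raise ValueError("payload values must fit in 7 bits")
            # Only continuation bytes get the marker bit 0x80.
            out.append(value if i == 0 else value | 0x80)
    return bytes(out)


def next_start(data, pos):
    """Index of the first start byte at or after pos (len(data) if none)."""
    while pos < len(data) and data[pos] & 0x80:
        pos += 1
    return pos


def segment(data, pos=0):
    """Split data into sequences, resynchronising from an arbitrary pos."""
    pos = next_start(data, pos)
    segments = []
    while pos < len(data):
        end = next_start(data, pos + 1)
        # Strip the marker bit to recover the 7-bit payload values.
        segments.append([b & 0x7F for b in data[pos:end]])
        pos = end
    return segments


data = encode([[1, 2, 3], [4], [5, 6]])
print(segment(data))      # decoding from the beginning
print(segment(data, 1))   # dropped into the middle of the first sequence
```

Starting in the middle of a sequence only costs you the remainder of that one sequence; everything after it is segmented correctly. The price in this sketch is one marker bit in every byte, i.e. 1/8 (12.5%) of the stream carries no payload.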
Since you asked about the maths, I think the relevant topics are “Coding Theory”, “Variable-length codes” or “Prefix codes”.
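As a small illustration of what a prefix code buys you, here is a hedged Python sketch. The code table `CODE` is a made-up example, and `is_prefix_free` and `decode` are hypothetical helper names. The point is only that when no codeword is a prefix of another, a bit string read from its start can be segmented in exactly one way.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of (or equal to) another codeword."""
    words = sorted(codewords)
    # After sorting, any word that extends `a` sits directly behind `a`,
    # so checking neighbours is sufficient.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, code):
    """Greedily split a bit string using a prefix code (symbol -> codeword)."""
    inverse = {word: symbol for symbol, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            # With a prefix-free code, the first match is the only possible one.
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return out


# Made-up example code table, chosen to be prefix-free.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}

print(is_prefix_free(CODE.values()))
print(decode("0101100111", CODE))
```

Note that this only works when you know where the stream begins. A prefix code by itself does not tell you how to resynchronise from a random bit position, which is the harder question asked above.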
How do you find a gene in a sequence of base pairs?