Ok, I am reading in dat files into a byte array. For some reason, the people who generate these files put about a half meg's worth of useless null bytes at the end of the
I agree with Jon. The critical bit is that you must "touch" every byte from the last one until the first non-zero byte. Something like this:
byte[] foo;
// populate foo
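int i = foo.Length - 1;
// the i >= 0 check stops the scan if foo is empty or contains only zeros
while (i >= 0 && foo[i] == 0)
    --i;
// now foo[i] is the last non-zero byte (i is -1 if there is none)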
byte[] bar = new byte[i+1];
Array.Copy(foo, bar, i+1);
I'm pretty sure that's about as efficient as you're going to be able to make it.
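For what it's worth, here is a minimal sketch of the same backward scan wrapped in a reusable helper. The names ByteArrayTrimmer and TrimTrailingZeros are made up for illustration; Array.FindLastIndex does the scan from the end and returns -1 when the array is empty or all zeros, in which case the result is an empty array.

```csharp
using System;

static class ByteArrayTrimmer
{
    // Hypothetical helper: returns a copy of source without its trailing zero bytes.
    public static byte[] TrimTrailingZeros(byte[] source)
    {
        // Index of the last non-zero byte, or -1 if the array is empty or all zeros.
        int last = Array.FindLastIndex(source, b => b != 0);

        byte[] trimmed = new byte[last + 1];
        Array.Copy(source, trimmed, last + 1);
        return trimmed;
    }
}
```

This still touches every trailing byte once, so it is not faster than the loop above, just harder to get wrong at the edges.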
@Factor Mystic,
I think there is a shorter way:
var data = new byte[] { 0x01, 0x02, 0x00, 0x03, 0x04, 0x00, 0x00, 0x00, 0x00 };
var new_data = data.TakeWhile((v, index) => data.Skip(index).Any(w => w != 0x00)).ToArray();
There is always a LINQ answer
byte[] data = new byte[] { 0x01, 0x02, 0x00, 0x03, 0x04, 0x00, 0x00, 0x00, 0x00 };
// SkipWhile stops skipping at the first non-zero byte, so no extra flag is needed
byte[] new_data = data.Reverse().SkipWhile(point => point == 0x00).Reverse().ToArray();
You could just count the number of zeros at the end of the array and use the resulting length instead of .Length when iterating the array later on. You could encapsulate this however you like. The main point is that you don't really need to copy the data into a new structure. If the arrays are big, avoiding the copy may be worth it.
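As a rough sketch of that idea, assuming an ArraySegment<byte> is an acceptable way to carry the shortened length around: the class and method names below are made up for illustration, and the original array is never copied.

```csharp
using System;

static class TrailingZeroView
{
    // Hypothetical helper: counts the bytes that precede the trailing zeros.
    public static int UsefulLength(byte[] data)
    {
        int length = data.Length;
        while (length > 0 && data[length - 1] == 0)
            length--;
        return length;
    }

    // Wraps the useful part of the array without copying it.
    public static ArraySegment<byte> WithoutTrailingZeros(byte[] data)
    {
        return new ArraySegment<byte>(data, 0, UsefulLength(data));
    }
}
```

Later code can then use segment.Array, segment.Offset and segment.Count (for example when calling Stream.Write) instead of relying on data.Length.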
If null bytes can be valid values in the file, do you know that the last real byte in the file cannot be null? If so, iterating backwards and looking for the first non-null entry is probably best. If not, there is no way to tell where the actual end of the data is.
If you know more about the data format, for example that it can contain no sequence of null bytes longer than two bytes (or some similar constraint), then you may be able to do a binary search for the 'transition point'. This should be much faster than the linear search (assuming that you can read in the whole file).
The basic idea (assuming here that the real data never contains two consecutive null bytes, and that the padding is nothing but null bytes) would be:
static int FindDataLength(byte[] data)
{
    int lo = 0;
    int hi = data.Length; // the answer always lies in [lo, hi]
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2; // integer division
        // two nulls in a row (or a null as the very last byte) can only be padding
        bool inPadding = data[mid] == 0 && (mid + 1 == data.Length || data[mid + 1] == 0);
        if (inPadding)
            hi = mid;     // too close to the end, go left
        else
            lo = mid + 1; // still inside the real data, go right
    }
    return lo; // number of bytes before the padding starts
}
Note that a single null byte sitting directly in front of the padding cannot be told apart from the padding itself, so it is counted as padding.
Given the extra questions now answered, it sounds like you're fundamentally doing the right thing. In particular, you have to touch every byte of the file from the last 0 onwards, to check that it only has 0s.
Now, whether you have to copy everything or not depends on what you're then doing with the data.
The "you have to read every byte between the truncation point and the end of the file" is the critical part though.