Finding position of '1's efficiently in a bit array

Question

I'm writing a program that tests a set of wires for open or short circuits. The program, which runs on an AVR, drives a test vector (a walking '1') onto the wires and receives the result back. It compares this resultant vector with the expected data, which is already stored on an SD card or external EEPROM.

Here's an example: assume we have a set of 8 wires, all of which are straight through, i.e. they have no junctions. So if we drive 0b00000010 we should receive 0b00000010.

Suppose we receive 0b11000010. This implies there is a short circuit between wires 7 and 8 and wire 2. I can isolate the bits I'm interested in with 0b00000010 ^ 0b11000010 = 0b11000000. This tells me clearly that wires 7 and 8 are at fault, but how do I find the positions of these '1's efficiently in a large bit array? It's easy to do this for just 8 wires using bit masks, but the system I'm developing must handle up to 300 wires (bits). Before I start using macros like the following and testing each bit in an array of 300*300 bits, I wanted to ask here whether there is a more elegant solution.

 #define BITMASK(b) (1 << ((b) % 8))
 #define BITSLOT(b) ((b) / 8)
 #define BITSET(a, b) ((a)[BITSLOT(b)] |= BITMASK(b))
 #define BITCLEAR(a, b) ((a)[BITSLOT(b)] &= ~BITMASK(b))
 #define BITTEST(a, b) ((a)[BITSLOT(b)] & BITMASK(b))
 #define BITNSLOTS(nb) (((nb) + 8 - 1) / 8)

To further show how to detect an open circuit: expected data 0b00000010, received data 0b00000000 (the wire isn't pulled high). 0b00000010 ^ 0b00000000 = 0b00000010, so wire 2 is open.
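The XOR on its own only says which wires differ. As a small sketch (the function and parameter names are illustrative, not from the original program), ANDing the difference with the received and expected bytes separates the two cases described above: a bit that is unexpectedly high suggests a short, and a bit that is unexpectedly low suggests an open wire.

```c
#include <stdint.h>

/* Bits that read 1 but were expected to be 0 (possible shorts). */
static inline uint8_t short_bits(uint8_t expected, uint8_t received)
{
    return (uint8_t)((expected ^ received) & received);
}

/* Bits that read 0 but were expected to be 1 (possible opens). */
static inline uint8_t open_bits(uint8_t expected, uint8_t received)
{
    return (uint8_t)((expected ^ received) & expected);
}
```

With the values from the examples, short_bits(0x02, 0xC2) gives 0xC0 and open_bits(0x02, 0x00) gives 0x02.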

NOTE: I know testing 300 wires is not something the tiny RAM inside an AVR Mega 1281 can handle; that is why I'll split this into groups, i.e. test 50 wires, compare, display the result and then move on.

Answer 1:

Many architectures provide specific instructions for locating the first set bit in a word, or for counting the number of set bits. Compilers usually provide intrinsics for these operations, so that you don't have to write inline assembly. GCC, for example, provides __builtin_ffs, __builtin_ctz, __builtin_popcount, etc., each of which should map to the appropriate instruction on the target architecture, exploiting bit-level parallelism.

If the target architecture doesn't support these, an efficient software implementation is emitted by the compiler. The naive approach of testing the vector bit by bit in software is not very efficient.
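A minimal sketch of how such an intrinsic might be used to list every set bit in one XOR result, assuming GCC (or a compatible compiler) and 0-based bit positions. The function name and its parameters are hypothetical; only __builtin_ctz comes from the compiler.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Writes the 0-based position of every set bit in diff to positions[]
 * and returns how many were found. base is added to each position so
 * the caller can pass the bit offset of this byte within a larger array.
 */
static size_t collect_set_bits(uint8_t diff, uint16_t base, uint16_t positions[8])
{
    size_t n = 0;
    while (diff != 0) {
        /* __builtin_ctz is undefined for 0; the loop condition rules that out. */
        positions[n++] = (uint16_t)(base + (uint16_t)__builtin_ctz(diff));
        diff &= (uint8_t)(diff - 1);   /* clear the lowest set bit */
    }
    return n;
}
```

The loop runs once per set bit rather than once per bit, so a byte with two faults costs two iterations.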

If your compiler doesn't implement these, you can still code your own implementation using a de Bruijn sequence.
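As a rough illustration of the de Bruijn idea on a single byte: the constant 0x1D, the table, and the function name below are this sketch's own choices, not something given in the answer. The lowest set bit is isolated, multiplied by the sequence, and the top three bits of the product index a small table.

```c
#include <stdint.h>

/* 0x1D (binary 00011101) is an 8-bit de Bruijn sequence: each of its
 * eight left shifts, truncated to 8 bits, has a distinct top 3 bits. */
static const uint8_t debruijn_pos[8] = { 0, 1, 6, 2, 7, 5, 4, 3 };

/* Returns the 0-based index of the lowest set bit; v must be non-zero. */
static uint8_t lowest_set_bit(uint8_t v)
{
    uint8_t isolated = (uint8_t)(v & -v);        /* keep only the lowest set bit */
    uint8_t hash = (uint8_t)(isolated * 0x1Du);  /* shifts the sequence left by the bit index */
    return debruijn_pos[hash >> 5];              /* top 3 bits select the table entry */
}
```

On an 8-bit AVR a byte-wide version like this avoids 32-bit multiplies; wider words would need a longer sequence and table.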

Answer 2:

How often do you expect faults? If you don't expect them that often, then it seems pointless to optimize the "fault exists" case -- the only part that will really matter for speed is the "no fault" case.

To optimize the no-fault case, simply XOR the actual result with the expected result and use an (input ^ expected) == 0 test to see whether any bits are set. Note the parentheses: in C, == binds tighter than ^.

You can use a similar strategy to optimize the "few faults" case, if you further expect the number of faults to typically be small when they do exist -- mask the input ^ expected value to get just the first 8 bits, just the second 8 bits, and so on, and compare each of those results to zero. Then, you just need to search for the set bits within the ones that are not equal to zero, which should narrow the search space to something that can be done pretty quickly.
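The two-level strategy could look roughly like this. It is only a sketch: the function name, the output array, and the 0-based wire numbering are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compares received[] against expected[] byte by byte. Bytes that match
 * are skipped with a single test; only mismatching bytes are scanned bit
 * by bit. Stores up to max_faults 0-based bit positions in faults[] and
 * returns the number stored.
 */
static size_t find_faults(const uint8_t *received, const uint8_t *expected,
                          size_t nbytes, uint16_t *faults, size_t max_faults)
{
    size_t count = 0;
    for (size_t i = 0; i < nbytes; i++) {
        uint8_t diff = (uint8_t)(received[i] ^ expected[i]);
        if (diff == 0)
            continue;                      /* fast path: no fault in this byte */
        for (uint8_t bit = 0; bit < 8; bit++) {
            if ((diff & (1u << bit)) && count < max_faults)
                faults[count++] = (uint16_t)(i * 8 + bit);
        }
    }
    return count;
}
```

For 300 wires the fast path is 38 byte comparisons per test vector when nothing is wrong.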

Answer 3:

You can use a lookup table. For example, a log-base-2 lookup table of 256 bytes (one entry per possible byte value) can be used to find the most-significant 1-bit in a byte:

uint8_t bit1 = log2[bit_mask];

where log2 is defined as follows:

uint8_t const log2[] = {
   0,               /* not used: log2[0] */
   0,               /* log2[0x01] */
   1, 1,            /* log2[0x02], log2[0x03] */
   2, 2, 2, 2,      /* log2[0x04],..,log2[0x07] */
   3, 3, 3, 3, 3, 3, 3, 3, /* log2[0x08],..,log2[0x0F] */
   ...
};

On most processors a lookup table like this will go to ROM. But AVR is a Harvard machine, and placing data in code space (ROM) requires a special non-standard extension, which depends on the compiler. For example, the IAR AVR compiler would need the extended keyword __flash. In WinAVR (GNU AVR) you would need to use the PROGMEM attribute, but it's more complex than that, because you would also need to use special macros to read from program space.
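For the GNU toolchain, a sketch along these lines shows the PROGMEM attribute together with the pgm_read_byte macro from avr-libc. To keep it short it uses a 16-entry nibble table and two lookups instead of the full byte-wide table; the table name, the function name, and the non-AVR fallback macros (there only so the code also compiles on a PC) are assumptions of this example.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* PROGMEM, pgm_read_byte */
#else
/* Fallback for non-AVR hosts: plain memory, plain reads. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* log2 of each nibble value; entry 0 is unused. Stored in flash on AVR. */
static const uint8_t log2_nibble[16] PROGMEM = {
    0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3
};

/* Returns the 0-based index of the most-significant set bit; v must be non-zero. */
static uint8_t msb_index(uint8_t v)
{
    uint8_t hi = (uint8_t)(v >> 4);
    if (hi != 0)
        return (uint8_t)(4 + pgm_read_byte(&log2_nibble[hi]));
    return pgm_read_byte(&log2_nibble[v]);
}
```

The nibble table trades one extra branch for 240 fewer bytes of flash; whether that is worthwhile depends on how much program space is free.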

Answer 4:

I think there is only one way to do this:

Create an array called "outdata". Each item of the array can, for example, correspond to an 8-bit port register.
Send the outdata on the wires.
Read back this data as "indata".
Store the indata in an array mapped exactly as the outdata.
In a loop, XOR each byte of outdata with the corresponding byte of indata.
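The steps above can be sketched as follows. The actual port I/O is left out because it depends on the hardware; the constants, function names, and 0-based wire numbering are assumptions of this example, and it compares outdata directly with indata, which matches the straight-through case.

```c
#include <stdint.h>
#include <string.h>

enum { NUM_WIRES = 300, NUM_BYTES = (NUM_WIRES + 7) / 8 };  /* 38 bytes */

/* Step 1: fill outdata with a walking '1' on the given 0-based wire. */
static void set_walking_one(uint8_t outdata[NUM_BYTES], uint16_t wire)
{
    memset(outdata, 0, NUM_BYTES);
    outdata[wire / 8] = (uint8_t)(1u << (wire % 8));
}

/* Step 5: XOR outdata with indata byte by byte into diff;
 * returns 1 if any byte differs, 0 otherwise. */
static int xor_compare(const uint8_t *outdata, const uint8_t *indata, uint8_t *diff)
{
    int any = 0;
    for (uint8_t i = 0; i < NUM_BYTES; i++) {
        diff[i] = (uint8_t)(outdata[i] ^ indata[i]);
        if (diff[i] != 0)
            any = 1;
    }
    return any;
}
```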

I would strongly recommend inline functions instead of those macros.
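One possible inline-function version of the macros from the question, as a sketch. The lower-case names are invented here; the behavior is meant to mirror BITMASK, BITSLOT, BITSET, BITCLEAR, BITTEST and BITNSLOTS, with the added benefit of type checking and no double evaluation of arguments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the byte that holds bit b. */
static inline size_t bit_slot(size_t b) { return b / 8; }

/* Mask selecting bit b within its byte. */
static inline uint8_t bit_mask(size_t b) { return (uint8_t)(1u << (b % 8)); }

/* Number of bytes needed to hold nb bits. */
static inline size_t bit_nslots(size_t nb) { return (nb + 7) / 8; }

static inline void bit_set(uint8_t *a, size_t b)
{
    a[bit_slot(b)] |= bit_mask(b);
}

static inline void bit_clear(uint8_t *a, size_t b)
{
    a[bit_slot(b)] &= (uint8_t)~bit_mask(b);
}

static inline bool bit_test(const uint8_t *a, size_t b)
{
    return (a[bit_slot(b)] & bit_mask(b)) != 0;
}
```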

Why can't your MCU handle 300 wires?

300/8 = 37.5 bytes. Rounded to 38. It needs to be stored twice, outdata and indata, 38*2 = 76 bytes.

You can't spare 76 bytes of RAM?

Answer 5:

I think you're missing the forest for the trees. This seems like a bed-of-nails test. First, check some assumptions: 1) you know which pins should be live for each pin tested/energized; 2) you have a netlist, translated for step 1 into a file on the SD card.

If you operate on a byte level as well as a bit level, it simplifies the issue. If you energize a pin, there is an expected output pattern stored in your file. First find the mismatched bytes; then identify the mismatched pins in each byte; finally store the energized pin with the faulty pin numbers.

You don't need an array for searching or for results. The general idea:

unsigned int numwires = 300;

// round up to whole bytes; the parentheses matter because ?: binds looser than +
unsigned char numbytes = numwires / 8 + ((numwires % 8) ? 1 : 0);

for (unsigned char currbyte = 0; currbyte < numbytes; currbyte++)
{
    unsigned char testbyte = inchar(baseaddr + currbyte);
    unsigned char goodbyte = getgoodbyte(testpin, currbyte /*byte offset*/);
    if (testbyte ^ goodbyte) {
        // have a mismatch: report the pins
        unsigned char j = 0;
        // mask wraps to 0 after 0x80, so all 8 bits are tested
        for (unsigned char mask = 0x01; mask != 0; mask <<= 1, j++) {
            if ((mask & testbyte) != (mask & goodbyte)) // for clarity
                logbadpin(testpin, currbyte * 8 + j /*pin/wire value*/, mask & testbyte /*bad value*/);
        }
    }
}

Source: https://stackoverflow.com/questions/9295938/finding-position-of-1s-efficiently-in-an-bit-array

Tags

embedded

bit-manipulation

avr