Counting the number of zeros and ones in a byte

死守一世寂寞 2021-01-28 03:41

I previously posted a program to find the total number of 1s in a byte. Now I am trying to find the number of 0s in a byte. Following is my code:

MOV AL,1
MOV CX         


        
2 Answers
  • 2021-01-28 03:55

    The bit shifted out of AL goes to the carry flag (CF), not to the zero flag (ZF), so your conditional jumps need to test CF:

        MOV AL,1       ; The byte to examine.
        MOV CX,08H     ; Number of bits in the byte.
        MOV BX,0000H   ; Result: number of 1s.
        MOV DX,0000H   ; Result: number of 0s.
    NextBit:
        SHR AL,01H     ; Shift the byte; the least significant bit goes to CF.
        JNC IsZero     ; CF=0: the bit was a 0.
        INC BX         ; CF=1: count a 1.
        JMP Skip
    IsZero:
        INC DX         ; Count a 0.
    Skip:
        LOOP NextBit   ; Repeat CX times.
        HLT
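
    For readers who want to check the logic outside an emulator, here is a minimal C sketch of the same shift-and-test loop. The names `bit_counts` and `count_bits` are hypothetical and not part of the answer; the `carry` variable only stands in for CF.

    ```c
    #include <stdint.h>

    /* Hypothetical helper type: both counters produced by the loop. */
    typedef struct {
        unsigned ones;   /* plays the role of BX */
        unsigned zeros;  /* plays the role of DX */
    } bit_counts;

    /* Models the 8-iteration SHR loop: test the bit that would land in CF. */
    bit_counts count_bits(uint8_t value)
    {
        bit_counts result = {0, 0};
        for (int i = 0; i < 8; i++) {       /* CX = 8 iterations */
            unsigned carry = value & 1u;    /* the bit SHR moves into CF */
            value >>= 1;
            if (carry)
                result.ones++;              /* INC BX */
            else
                result.zeros++;             /* INC DX */
        }
        return result;
    }
    ```

    Because all 8 bits are always visited, `ones + zeros` is 8 for every input.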
    

    BTW, there is a specialized instruction for this task, POPCNT, on newer Intel processors (Nehalem and later): https://www.felixcloutier.com/x86/popcnt

        MOV AL,1     ; An investigated byte.  
        XOR AH,AH    
        POPCNT BX,AX ; Count the number of 1s in AX and put result to BX.
        MOV DX,8
        SUB DX,BX    ; The number of 0s in AL is 8-BX.
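
    The same "8 minus the population count" idea can be sketched in C. This assumes GCC or Clang, whose `__builtin_popcount` builtin can be compiled down to POPCNT when the target supports it; the function name `count_zeros_popcnt` is made up for this illustration.

    ```c
    #include <stdint.h>

    /* Hypothetical helper: zeros in a byte = 8 - (number of 1 bits). */
    unsigned count_zeros_popcnt(uint8_t value)
    {
        /* Assumption: GCC/Clang builtin; other compilers need a different call. */
        unsigned ones = (unsigned)__builtin_popcount(value);
        return 8u - ones;
    }
    ```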
    
  • 2021-01-28 03:58

    When you write in assembly, it's often fun to look at possible optimizations. In a loop, you can use the zero and carry flags like so:

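        MOV AL, <your value>
        MOV BL, 8        ; start at 8 and subtract one per 1 bit
        CLC              ; CF = 0 before the first SBB
    Next:                ; (a label named "Loop" would clash with the LOOP mnemonic)
        SBB BL, 0        ; subtract the bit shifted out on the previous pass
        SHR AL, 1        ; low bit goes to CF; ZF is set once AL becomes 0
        JNZ Next
    Done:
        SBB BL, 0        ; account for the last bit shifted out
        ; result (number of 0s) is in BL
        HLT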
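
    As a rough C model of this loop (a sketch, with the hypothetical name `count_zeros_early_exit`), the version below also reports how many shifts were executed, which shows that the trip count depends on the input value. The `carry` variable stands in for CF.

    ```c
    #include <stdint.h>

    /* Hypothetical helper: mirrors the SBB/SHR loop and counts its shifts. */
    unsigned count_zeros_early_exit(uint8_t value, unsigned *shifts)
    {
        unsigned zeros = 8;          /* BL */
        unsigned carry = 0;          /* CF after CLC */
        *shifts = 0;
        do {
            zeros -= carry;          /* SBB BL, 0 */
            carry = value & 1u;      /* SHR AL, 1: low bit goes to CF */
            value >>= 1;
            (*shifts)++;
        } while (value != 0);        /* JNZ: stop once AL is zero */
        zeros -= carry;              /* final SBB BL, 0 */
        return zeros;
    }
    ```

    For an input of 1 the loop body runs once; for 0x80 it runs eight times, even though both values contain seven 0 bits.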
    

    A faster way on older processors is to keep a 256-byte table and do a lookup. As mentioned by vitsoft, on modern processors using the POPCNT instruction is probably the fastest (it counts all the bits of a 64-bit register in hardware, in roughly one to a few clock cycles depending on the CPU).
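
    The table lookup can be sketched in C as follows. This is only an illustration: the names `ones_table`, `init_ones_table` and `count_zeros_lookup` are made up here, and the table is filled at startup instead of being written out as 256 literal bytes.

    ```c
    #include <stdint.h>

    /* ones_table[v] holds the number of 1 bits in the byte value v. */
    static uint8_t ones_table[256];

    /* Fill the table once: bits(v) = lowest bit of v + bits(v >> 1). */
    void init_ones_table(void)
    {
        ones_table[0] = 0;
        for (unsigned v = 1; v < 256; v++)
            ones_table[v] = (uint8_t)((v & 1u) + ones_table[v >> 1]);
    }

    /* One memory read per query; init_ones_table() must have run first. */
    unsigned count_zeros_lookup(uint8_t value)
    {
        return 8u - ones_table[value];
    }
    ```

    After the one-time setup, every query costs the same regardless of the byte's value.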

    If you need to know the exact timing, my loop is not practical, because the number of iterations depends on the value in AL (the loop ends as soon as AL becomes zero). Another way to make it fast is to unroll the loop:

        MOV AL, <your value>
        MOV BL, 8
        SHR AL, 1    ; 1
        SBB BL, 0
        SHR AL, 1    ; 2
        SBB BL, 0
        SHR AL, 1    ; 3
        SBB BL, 0
        SHR AL, 1    ; 4
        SBB BL, 0
        SHR AL, 1    ; 5
        SBB BL, 0
        SHR AL, 1    ; 6
        SBB BL, 0
        SHR AL, 1    ; 7
        SBB BL, 0
        SHR AL, 1    ; 8
        SBB BL, 0
        HLT
    

    This is practical because it has zero branches, so it always takes the same number of instructions. Modern processors love that (although in this case, since we're dealing with just 2 registers, I don't think it's a huge advantage).
