Incomprehensible behavior of the CF flag

Question

Let's say there is a piece of code:

mov al, 12
mov bl, 4
sub al, bl

In this case CF = 0, but in my opinion it should be 1. Subtraction is implemented as addition, and the processor does not know whether we are giving it signed or unsigned numbers as input; it just does its job.

That is, the code above is equivalent to the following:

Enter the value 12 into the al register, i.e. 0000 1100

Enter the value 4 into the bl register, i.e. 0000 0100

Next comes the subtraction. The first operand is positive, so it is not converted to two's complement. The second is also positive, but because a subtraction is being performed, it is converted to two's complement and the processor performs an addition (subtraction is implemented by addition), that is:

12: 0000 1100
-4: 1111 1100



12 - 4 = 12 + (-4) = 0000 1100 + 1111 1100 = 1 0000 1000

That is, we got the correct result, 8, but the debugger shows CF = 0. Why is that? The 1 that went beyond the 8-bit width should be placed in CF, but CF = 0.

Answer 1:

According to the manual, the CF flag indicates unsigned overflow. However, for a subtraction this is not the same thing as a carry out of the equivalent addition.

You assume that bit 8 being set indicates overflow, but for subtraction it is the opposite. If it were not set, that would indicate that a borrow from the next higher bit occurred.

If we replace 4 with 16, so that we would see overflow:

00001100b - 00010000b = 00001100b + 11110000b = 11111100b = 252d

You can see that there is no carry into bit 8, even though unsigned overflow occurred. The CF flag is not the carry out of adding the negated operand. It simply indicates overflow in the subtraction, which here means it is the inverse of that carry bit.
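To make that relationship concrete, here is a minimal C sketch that models an 8-bit sub as a + ~b + 1 on a 9-bit adder. It is only an illustration under that assumption, not a description of how any real chip is wired, and the names `sub8_t`, `sub8`, `raw_carry` and `sub8_demo` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an 8-bit SUB done on an adder: a + ~b + 1.
   raw_carry is bit 8 of that sum; cf is the borrow flag x86 reports. */
typedef struct {
    uint8_t  result;
    unsigned raw_carry;
    unsigned cf;
} sub8_t;

sub8_t sub8(uint8_t a, uint8_t b)
{
    sub8_t r;
    unsigned sum = (unsigned)a + (unsigned)(uint8_t)~b + 1u; /* 9-bit sum */

    r.result    = (uint8_t)sum;
    r.raw_carry = (sum >> 8) & 1u;
    r.cf        = r.raw_carry ^ 1u;   /* inverted carry out == borrow */
    return r;
}

/* Usage: the two cases discussed above. */
void sub8_demo(void)
{
    sub8_t r = sub8(12, 4);           /* 12 - 4: no borrow */
    assert(r.result == 8 && r.raw_carry == 1 && r.cf == 0);

    r = sub8(12, 16);                 /* 12 - 16: unsigned wraparound */
    assert(r.result == 252 && r.raw_carry == 0 && r.cf == 1);
}
```

In this model `cf` is 1 exactly when a < b as unsigned numbers, which is the opposite of the raw carry out of the adder.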

Different architectures treat the carry flag differently. Some set it to indicate that a borrow occurred, while others have it represent the bit that would be borrowed from. They often have an instruction (sbb in x86) that allows chaining by using the flags as part of the input.
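As a rough sketch of what that chaining buys you, the C below subtracts two 16-bit values one byte at a time and passes the borrow from the low byte into the high byte, the way a sub followed by an sbb would. The function names are hypothetical and the borrow is computed with plain comparisons rather than real flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: 16-bit subtract built from two 8-bit steps.
   *borrow_out is 1 if the whole 16-bit subtraction had to borrow. */
uint16_t sub16_by_bytes(uint16_t a, uint16_t b, unsigned *borrow_out)
{
    uint8_t a_lo = (uint8_t)a, a_hi = (uint8_t)(a >> 8);
    uint8_t b_lo = (uint8_t)b, b_hi = (uint8_t)(b >> 8);

    /* low byte: plain subtract, borrow if a_lo < b_lo */
    unsigned borrow = a_lo < b_lo;
    uint8_t r_lo = (uint8_t)(a_lo - b_lo);

    /* high byte: subtract with the incoming borrow */
    unsigned need = (unsigned)b_hi + borrow;       /* at most 256 */
    uint8_t r_hi = (uint8_t)(a_hi - b_hi - borrow);
    *borrow_out = a_hi < need;

    return (uint16_t)(((unsigned)r_hi << 8) | r_lo);
}

/* Usage: 0x0100 - 0x0001 borrows out of the low byte only. */
void sub16_demo(void)
{
    unsigned borrow;
    assert(sub16_by_bytes(0x0100, 0x0001, &borrow) == 0x00FF && borrow == 0);
}
```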

Answer 2:

From grade school, a - b = a + (-b). From intro to programming, -a = ~a + 1. So in logic:

        1
 00001100
+11111011
=========

and finish it

111111111
 00001100
+11111011
=========
 00001000

Then comes a question of architecture. On the way in, you invert the carry in and the second operand for a subtract. On the way out, some architectures invert the carry out so that it represents a borrow flag, and some leave it untouched as a not-borrow flag. 12 - 4 does not borrow, so we can predict the borrow, but we have to look at the architecture's documentation to see how the flag represents it. In particular, look at things like less than and greater than and which flags they examine...

Also, a side note: the carry in and carry out of the msbit are the same, so there is no signed overflow in this case (+8 can be represented in 8 bits).
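The carry-in versus carry-out rule for signed overflow can be sketched in C as follows. This is an illustrative model of a - b computed as a + ~b + 1 on 8 bits; the helper name `sub8_signed_overflow` is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed overflow of the 8-bit operation a - b,
   computed as (carry into bit 7) XOR (carry out of bit 7)
   of the addition a + ~b + 1. */
unsigned sub8_signed_overflow(uint8_t a, uint8_t b)
{
    uint8_t nb = (uint8_t)~b;

    /* Add only the low 7 bits: bit 7 of this sum is the carry into bit 7. */
    unsigned low = (a & 0x7Fu) + (nb & 0x7Fu) + 1u;
    unsigned carry_in = (low >> 7) & 1u;

    /* Full 9-bit sum: bit 8 is the carry out of bit 7. */
    unsigned full = (unsigned)a + nb + 1u;
    unsigned carry_out = (full >> 8) & 1u;

    return carry_in ^ carry_out;
}

/* Usage */
void overflow_demo(void)
{
    assert(sub8_signed_overflow(12, 4) == 0);       /* 12 - 4 = 8 fits */
    assert(sub8_signed_overflow(0x7F, 0xFF) == 1);  /* 127 - (-1) does not fit */
}
```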

Wikipedia indicates that x86 uses a borrow flag:

The first uses the bit as a borrow flag, setting it if a<b when computing a−b, and a borrow must be performed.

 00111
  0010
+ 1011
======
  1110

2 - 4, 2 < 4, carry is a 0, so set the borrow flag

This plus your evidence indicates that x86 inverts the carry bit on the way out of a subtraction to indicate borrow.

Everything looks good here with your experiment.

Note that this is the beauty of two's complement: add and subtract work with the same logic, which does not have to know signed from unsigned. Multiply and divide with uneven sizes (x bits times x bits = x+x bits; x+x bits divided by x bits) do care about signed vs unsigned. Architectures often add a signed overflow flag alongside the unsigned overflow (carry) flag to help with conditional branches, which very much care about signed vs unsigned: a signed conditional branch and an unsigned one of the same flavor do not examine the same flags most of the time.
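To illustrate how the same subtraction feeds different conditions, here is a small C model of an 8-bit compare. On x86 the unsigned "below" condition looks at CF and the signed "less" condition looks at SF != OF; everything else here (`flags8_t`, `cmp8`, `is_below`, `is_less`) is a made-up sketch, not real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag model for an 8-bit compare (the flags of a - b). */
typedef struct { unsigned cf, sf, of; } flags8_t;

flags8_t cmp8(uint8_t a, uint8_t b)
{
    flags8_t f;
    uint8_t r = (uint8_t)(a - b);

    f.cf = a < b;                             /* unsigned borrow */
    f.sf = (r >> 7) & 1u;                     /* sign bit of the result */
    f.of = (((a ^ b) & (a ^ r)) >> 7) & 1u;   /* signed overflow of a - b */
    return f;
}

/* "below" is the unsigned test, "less" is the signed test. */
unsigned is_below(flags8_t f) { return f.cf; }
unsigned is_less(flags8_t f)  { return f.sf != f.of; }

/* Usage: 0xFF vs 0x01 is 255 vs 1 unsigned, but -1 vs 1 signed. */
void cmp8_demo(void)
{
    flags8_t f = cmp8(0xFF, 0x01);
    assert(!is_below(f) && is_less(f));
}
```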

Edit

We are still not understanding your confusion. The processor knows which operation it is performing; even if it is implemented as an add, it is still a subtract operation. Think about how you would do it if you were writing it in software.

My old print copy of an 8088 Intel manual says:

If CF (the carry flag) is set, there has been a carry out of, or a borrow into, the high order bit of the result (8- or 16-bit). The flag is used by instructions that add and subtract multi-byte numbers. Rotation instructions can also isolate a bit in memory or a register by placing it in the carry flag.

For an add operation, CF is set if there has been a carry out (1). For a sub operation, as in this case, CF is set if there has been a borrow into the high-order bit. This is the definition of the flag, independent of how you implement the operation in logic. If you build dedicated subtract logic (unlikely), you still end up inverting something; if you use add logic to do a subtract, you end up inverting the carry out. (In both cases you end up inverting the carry out.)

Using 4 bits because it all scales to n number of bits.

In logic you would do something like

result = 0aaaa + 0iiii + 1;

result = 1aaaa - 0bbbb;

where iiii = ~b;

Let's take 1 - 4 as an example

Using an adder to implement subtraction: 1 - 4 = 1 + (-4). That is a math thing that has nothing to do with computers; it predates them by centuries.

      1
  00001
+ 01011
========

fill it in

  00111
  00001
+ 01011
========
  01101

The carry out bit is a 0, so the carry flag = ~0 = 1 to represent that a borrow happened. The result is 1101 (-3).

Using grade school subtraction with a small twist to not have to flip operands and negate the result

 10001
-00100
=======

Because this is a text representation, I am going to use a 2 to represent binary 10 in a single column, since I cannot fit two digits there.

 10001
-00100
=======

Doing it this way takes more work. Two's complement specifically makes add and subtract easier because you can use an adder to do subtraction, so it will take me many steps to perform this using long subtraction.

I can get this far without having to borrow

 10001
-00100
=======
    01

Thanks to your questions and looking up borrow at Wikipedia, I now know that they call this the American method.

I need to borrow one from the 16s column to make the 8s column a 2. Because the fours column has a zero, you have to keep going until you hit a non-zero column and then borrow from there.

 02001
-00100
=======
    01

And then I have to borrow from the eights column to put something in the fours column.

 01201
-00100
=======
    01

Now we can continue

 01201
-00100
=======
 01101

We get the same result as with the addition method. The 1 in the 16s column is needed to do a proper borrow without reversing the operands. In grade school we would do 1 - 4 by doing 4 - 1 and then negating the result:

  4
- 1
=====
  3

and we would turn that into -3, because how can you demonstrate a borrow (with the American method) with nothing to borrow from?
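The column-by-column borrowing can also be sketched in C. This is a hypothetical bit-serial subtractor (`ripple_sub` and its parameters are names made up for illustration), not a claim about how any real ALU is built; it just carries a borrow from each column into the next, like the pencil-and-paper method.

```c
#include <assert.h>

/* Hypothetical bit-serial subtractor: computes a - b over nbits columns.
   Returns the nbits-wide difference; *borrow_out is the final borrow. */
unsigned ripple_sub(unsigned a, unsigned b, unsigned nbits, unsigned *borrow_out)
{
    unsigned result = 0, borrow = 0;

    for (unsigned i = 0; i < nbits; i++) {
        unsigned ai = (a >> i) & 1u;
        unsigned bi = (b >> i) & 1u;

        /* difference bit of this column */
        result |= (ai ^ bi ^ borrow) << i;
        /* borrow needed from the next column */
        borrow = ((ai ^ 1u) & (bi | borrow)) | (bi & borrow);
    }
    *borrow_out = borrow;
    return result;
}

/* Usage: 1 - 4 on 4 bits gives 1101 and a final borrow of 1. */
void ripple_sub_demo(void)
{
    unsigned borrow;
    assert(ripple_sub(1, 4, 4, &borrow) == 13 && borrow == 1);
}
```

The final borrow is the value the borrow-style carry flag would take, with no inversion needed in this formulation.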

So both end up with 01101, which is a carry out of 0 and a result of -3. But the definition says

If CF (the carry flag) is set, there has been ... a borrow into, the high order bit of the result

So we need the carry flag to be a 1 here, so

if(add) cf = carry_out;
if(sub) cf = ~carry_out;

You can implement all of this yourself in some C code very similar to what you would see in logic.

Say a 4 bit alu add and two possible forms of subtract, with the carry flag representing a borrow (if borrow then set).

unsigned int c_flag; /* carry; after a subtract, 1 means borrow */
unsigned int n_flag; /* negative: bit 3 of the result */
unsigned int z_flag; /* zero: 1 when the result is 0 */
unsigned int v_flag; /* signed overflow */
unsigned int alu_add ( unsigned int a, unsigned int b )
{
    unsigned int c;

    c = a + b;
    c_flag = (c>>4)&1;
    c &= 0xF;
    if(c) z_flag = 0; else z_flag = 1;
    n_flag = (c>>3)&1;
    v_flag = 0;
    /* overflow: operands have the same sign, result has the other one */
    if((a&8)==(b&8)) if((b&8)!=(c&8)) v_flag = 1;
    return(c);
}
unsigned int alu_sub ( unsigned int a, unsigned int b )
{
    unsigned int c;

    b = (~b) & 0xF;
    c = a + b + 1;
    c_flag = ((~c)>>4)&1; /* inverted carry out = borrow */
    c &= 0xF;
    if(c) z_flag = 0; else z_flag = 1;
    n_flag = (c>>3)&1;
    v_flag = 0;
    /* b is already inverted here, so the add rule applies */
    if((a&8)==(b&8)) if((b&8)!=(c&8)) v_flag = 1;
    return(c);
}
unsigned int alu_sub_alt ( unsigned int a, unsigned int b )
{
    unsigned int c;

    c = (0x10|a) - b;
    c_flag = ((~c)>>4)&1; /* bit 4 cleared means it was borrowed */
    c &= 0xF;
    if(c) z_flag = 0; else z_flag = 1;
    n_flag = (c>>3)&1;
    v_flag = 0;
    /* b is not inverted: overflow when signs differ and result sign != a */
    if((a&8)!=(b&8)) if((a&8)!=(c&8)) v_flag = 1;
    return(c);
}

A selection of test vectors (an uppercase letter means the flag is set)

1 0  1( 1)  0( 0) add =  1( 1) cznv sub =  1( 1) cznv sub_alt =  1( 1) cznv
1 1  1( 1)  1( 1) add =  2( 2) cznv sub =  0( 0) cZnv sub_alt =  0( 0) cZnv
1 2  1( 1)  2( 2) add =  3( 3) cznv sub = 15(-1) CzNv sub_alt = 15(-1) CzNv
1 3  1( 1)  3( 3) add =  4( 4) cznv sub = 14(-2) CzNv sub_alt = 14(-2) CzNv
1 4  1( 1)  4( 4) add =  5( 5) cznv sub = 13(-3) CzNv sub_alt = 13(-3) CzNv
1 5  1( 1)  5( 5) add =  6( 6) cznv sub = 12(-4) CzNv sub_alt = 12(-4) CzNv
1 6  1( 1)  6( 6) add =  7( 7) cznv sub = 11(-5) CzNv sub_alt = 11(-5) CzNv

None of us here can say we have access to the source code of actual Intel chips in production; anyone who did would be bound by an NDA, an employee contract, etc.

But the code for today's processors is going to be written in some HDL. I suspect Intel does not use Verilog directly but some other, maybe in-house, language that compiles to Verilog; the Verilog is then fed to the synthesis tool, and that tool ultimately decides how to handle things.

You could do something like this

assign fadder_out       = { 1'd0,a} + {1'd0,b_not} + 5'b00001;

or something like this

assign fadder_out       = { 1'd1,a} - {1'd0,b};

and then

assign cf = alu_op == add ? fadder_out[4] :
            alu_op == sub ? ~fadder_out[4] :
                            1'b0; // placeholder for other operations

Why did Intel choose to represent the cf as a borrow flag instead of a raw carry out? Probably because the 8080 did it because the 8008 did it because the 4004 did it. I suspect the folks in the building at the time are no longer alive to ask, and if they were they are probably not able to be reached.

The choice is truly arbitrary. So long as your conditionals, and your subtract with borrow if you have one, all fall in line with the same choice, either case works just fine. We know this because some percentage of the processors in the world take the raw carry out, which we can think of as a not-borrow (0 if borrow); the rest call it a borrow (1 if borrow).
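Here is a small C sketch of the two conventions side by side, assuming an 8-bit subtract-with-flag built on a + ~b + carry_in. The function names are hypothetical. The point is only that a multi-byte subtraction comes out the same either way, as long as the flag going in and the flag coming out follow the same convention.

```c
#include <assert.h>
#include <stdint.h>

/* Convention A (borrow flag): *flag == 1 means a borrow happened. */
uint8_t sbb_borrow(uint8_t a, uint8_t b, unsigned *flag)
{
    unsigned sum = (unsigned)a + (uint8_t)~b + (*flag ^ 1u); /* carry in = !borrow */
    *flag = ((sum >> 8) & 1u) ^ 1u;                          /* invert carry out */
    return (uint8_t)sum;
}

/* Convention B (not-borrow flag): *flag == 0 means a borrow happened. */
uint8_t sbc_not_borrow(uint8_t a, uint8_t b, unsigned *flag)
{
    unsigned sum = (unsigned)a + (uint8_t)~b + *flag;        /* carry in = flag */
    *flag = (sum >> 8) & 1u;                                 /* raw carry out */
    return (uint8_t)sum;
}

/* Usage: 0x0100 - 0x0001, low byte first, then high byte. */
void convention_demo(void)
{
    unsigned fa = 0, fb = 1;            /* both start as "no borrow yet" */
    uint8_t lo_a = sbb_borrow(0x00, 0x01, &fa);
    uint8_t hi_a = sbb_borrow(0x01, 0x00, &fa);
    uint8_t lo_b = sbc_not_borrow(0x00, 0x01, &fb);
    uint8_t hi_b = sbc_not_borrow(0x01, 0x00, &fb);

    assert(lo_a == 0xFF && hi_a == 0x00);   /* 0x00FF */
    assert(lo_a == lo_b && hi_a == hi_b);   /* same bytes either way */
    assert(fa == 0 && fb == 1);             /* both flags say "no borrow" */
}
```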

As far back as the 8086 and earlier, the layout/mask was done by hand on a drafting table: each layer, each transistor, drawn by hand. Do you implement subtract in logic using a subtractor, do you do it using an adder, etc.? A big reason for CISC before RISC was that you didn't implement the whole damn thing in raw gates; you implemented modules that had lots of wiring, and you fed them from a ROM-based state machine. A CISC instruction causes an offset in the ROM to be executed; the microcode, if you will, or state machine inputs then cause gates to flip here and there. A subtract may mean: invert the b operand, invert the carry in, latch the adder output, latch the inverted carry out. An add could be: do not invert the b operand, do not invert the carry in, latch the adder output, latch the non-inverted carry out.

Or they may have implemented a separate block for subtraction, apart from addition, just as there is a separate one for XOR and OR and AND, etc. These blocks can all be fed the a and b operands in whatever form (inverted or not), because their carry outs, zero flags and such are generally operation specific, as they are in the case of x86.

So we do not know. The 4004, probably the 8008, probably the 8080, and probably some 8086s have been sliced and scanned by the folks behind visual6502, if not others, and maybe you can actually look at the gates and see how it was implemented if you want to know the history. You can also look at the 8080, the 8008 and the 4004 to see whether they had flags (most likely) and, if so, whether they specified a borrow flag or not, and so on.

Whether it is implemented in logic as an add or a subtract, if you define the CF as 1 for a borrow, then you need to invert the carry out into the CF if the operation is a subtract, subtract with borrow, or compare.

As for your follow-up question, where you investigated the difference between 4 + (-12) and 4 - 12: one uses an add instruction, the other a subtract instruction. As shown above, for a subtract cf = ~carry_out if either of the two logic models above is used. If you instead have a result = 0aaaa - 0bbbb form that produces a 1 in the top bit of the result on a borrow (the Austrian method, perhaps), then that logic can pass that bit out directly. I cannot visualize that nor demonstrate it.

Source: https://stackoverflow.com/questions/64934531/incomprehensible-behavior-of-the-cf-flag

Tags

assembly

x86

carryflag

eflags