Delphi 2009 - Bug? Adding supposedly invalid values to a set

Question

First of all, I'm not a very experienced programmer. I'm using Delphi 2009 and have been working with sets, which seem to behave very strangely and even inconsistently to me. I guess it might be me, but the following looks like there's clearly something wrong:

unit test;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Edit1: TEdit;
    procedure Button1Click(Sender: TObject);
  private
    test: set of 1..2;
  end;

var Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.Button1Click(Sender: TObject);
begin
  test := [3];
  if 3 in test then
    Edit1.Text := '3';
end;

end.

If you run the program and click the button, then, sure enough, it will display the string "3" in the text field. However, if you try the same thing with a number like 100, nothing will be displayed (as it should, in my opinion). Am I missing something or is this some kind of bug? Advice would be appreciated!

EDIT: So far, it seems that I'm not alone with my observation. If someone has some inside knowledge of this, I'd be very glad to hear about it. Also, if there are people with Delphi 2010 (or even Delphi XE), I would appreciate it if you could do some tests on this or even general set behavior (such as "test: set of 256..257") as it would be interesting to see if anything has changed in newer versions.

Answer 1:

I was curious enough to take a look at the compiled code that gets produced, and I figured out the following about how sets work in Delphi 2010. It explains why you can do test := [8] when test: set of 1..2, and why Assert(8 in test) fails immediately after.

How much space is actually used?

A set of Byte has one bit for every possible byte value: 256 bits in all, or 32 bytes. A set of 1..2 requires 1 byte, but surprisingly a set of 100..101 also requires one byte, so Delphi's compiler is pretty smart about memory allocation. On the other hand, a set of 7..8 requires 2 bytes, and a set based on an enumeration that only includes the values 0 and 101 requires (gasp) 13 bytes!

Test code:

TTestEnumeration = (te0=0, te101=101);
TTestEnumeration2 = (tex58=58, tex101=101);

procedure Test;
var A: set of 1..2;
    B: set of 7..8;
    C: set of 100..101;
    D: set of TTestEnumeration;
    E: set of TTestEnumeration2;
begin
  ShowMessage(IntToStr(SizeOf(A))); // => 1
  ShowMessage(IntToStr(SizeOf(B))); // => 2
  ShowMessage(IntToStr(SizeOf(C))); // => 1
  ShowMessage(IntToStr(SizeOf(D))); // => 13
  ShowMessage(IntToStr(SizeOf(E))); // => 6
end;

Conclusions:

The basic model behind the set is the set of byte, with 256 possible bits, 32 bytes.
Delphi determines the required contiguous sub-range of the total 32-byte range and uses that. For the set of 1..2 it probably only uses the first byte, so SizeOf() returns 1. For the set of 100..101 it probably only uses the 13th byte, so SizeOf() returns 1. For the set of 7..8 it's probably using the first two bytes, so we get SizeOf()=2. This is an especially interesting case, because it shows us that bits are not shifted left or right to optimize storage. The other interesting case is the set of TTestEnumeration2: it uses 6 bytes, even though there are lots of unusable bits in there.
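The byte-range model above can be written down as a small formula: the set spans from the byte that holds its lowest element to the byte that holds its highest one. Below is a minimal sketch of that arithmetic. ExpectedSetSize and the program name are hypothetical, and the formula is only inferred from the five SizeOf results in the test code, not taken from compiler documentation.

```pascal
program SetSizeSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper: number of bytes between the byte holding Lo
// and the byte holding Hi, inclusive. Inferred from the SizeOf
// results above; not an official rule.
function ExpectedSetSize(Lo, Hi: Byte): Integer;
begin
  Result := (Hi div 8) - (Lo div 8) + 1;
end;

begin
  Writeln(ExpectedSetSize(1, 2));     // 0 - 0 + 1 = 1
  Writeln(ExpectedSetSize(7, 8));     // 1 - 0 + 1 = 2
  Writeln(ExpectedSetSize(100, 101)); // 12 - 12 + 1 = 1
  Writeln(ExpectedSetSize(0, 101));   // 12 - 0 + 1 = 13
  Writeln(ExpectedSetSize(58, 101));  // 12 - 7 + 1 = 6
end.
```

The five results match the sizes reported for A through E. The sketch does not model any alignment rounding the compiler may apply to other set sizes.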

What kind of code is generated by the compiler?

Test 1, two sets, both using the "first byte".

procedure Test;
var A: set of 1..2;
    B: set of 2..3;
begin
  A := [1];
  B := [1];
end;

For those who understand assembler, have a look at the generated code yourself. For those who don't, the generated code is equivalent to:

begin
  A := CompilerGeneratedArray[1];
  B := CompilerGeneratedArray[1];
end;

And that's not a typo: the compiler uses the same pre-compiled value for both assignments. CompilerGeneratedArray[1] = 2.

Here's another test:

procedure Test2;
var A: set of 1..2;
    B: set of 100..101;
begin
  A := [1];
  B := [1];
end;

Again, in pseudo-code, the compiled code looks like this:

begin
  A := CompilerGeneratedArray1[1];
  B := CompilerGeneratedArray2[1];
end;

Again, no typo: This time the compiler uses different pre-compiled values for the two assignments. CompilerGeneratedArray1[1]=2 while CompilerGeneratedArray2[1]=0; The compiler generated code is smart enough not to overwrite the bits in "B" with invalid values (because B holds information about bits 96..103), yet it uses very similar code for both assignments.
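For readers who would rather not read assembler, one way to look at the stored bits is to typecast a one-byte set variable to Byte. This is a sketch, assuming both sets occupy exactly one byte as the SizeOf results suggest. The program name is hypothetical, and the values in the comments are what the analysis above predicts, not independently verified output.

```pascal
program SetBitsSketch;
{$APPTYPE CONSOLE}

var
  A: set of 1..2;      // stored in the byte covering bits 0..7
  B: set of 100..101;  // stored in the byte covering bits 96..103
begin
  A := [1];
  B := [1];            // out-of-range value, accepted as in Test2 above
  // Variable typecast: valid because each set is one byte in size.
  Writeln(Byte(A));    // predicted by the analysis above: 2 (bit 1 set)
  Writeln(Byte(B));    // predicted by the analysis above: 0 (no bit in 96..103 set)
end.
```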

Conclusions

All set operations work perfectly well IF you test with values that are in the base set. For the set of 1..2, test with 1 and 2. For the set of 7..8, test only with 7 and 8. I don't consider the set to be broken. It serves its purpose very well all over the VCL (and it has a place in my own code as well).
In my opinion the compiler generates sub-optimal code for set assignments. I don't think the table lookups are required; the compiler could generate the values inline, and the code would have the same size but better locality.
My opinion is that having set of 1..2 behave the same as set of 0..7 is a side effect of the lack of optimization described in the previous point.
In the OP's case (var test: set of 1..2; test := [7]) the compiler should generate an error. I would not classify this as a bug, because I don't think the compiler's behavior is supposed to be defined in terms of "what to do with bad code by the programmer" but in terms of "what to do with good code by the programmer". Nonetheless, the compiler should generate the "Constant expression violates subrange bounds" error, as it does if you try this code:

procedure Test;
var t: 1..2;
begin
  t := 3;
end;

At runtime, if the code is compiled with {$R+}, the bad assignment should raise an error, as it does if you try this code:

procedure Test;
var t: 1..2;
    i: Integer;
begin
  {$R+}
  for i:=1 to 3 do
    t := i;
  {$R-}
end;

Answer 2:

According to the official documentation on sets (my emphasis):

The syntax for a set constructor is: [ item1, ..., itemn ] where each item is either an expression denoting an ordinal of the set's base type

Now, according to Subrange types:

When you use numeric or character constants to define a subrange, the base type is the smallest integer or character type that contains the specified range.

Therefore, if you specify

type
  TNum = 1..2;

then the base type will be byte (most likely), and so, if

type
  TSet = set of TNum;
var
  test: TSet;

then

test := [255];

will work, but not

test := [256];

all according to the official specification.

Answer 3:

I have no "inside knowledge", but the compiler logic seems rather transparent.

First, the compiler treats any set like set of 1..2 as a subset of set of 0..255. That is why set of 256..257 is not allowed.

Second, the compiler optimizes memory allocation, so it allocates only 1 byte for set of 1..2. The same 1 byte is allocated for set of 0..7, and there seems to be no difference between the two sets at the binary level. In short, the compiler allocates as little memory as possible with alignment taken into account (that means, for example, that the compiler never allocates 3 bytes for a set: it allocates 4 bytes even if the set fits into 3 bytes, like set of 1..20).

There is some inconsistency in the way the compiler treats sets, which can be demonstrated by the following code sample:
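The allocation claims in the previous paragraph are easy to check with SizeOf. The following is a minimal sketch; the type names and program name are hypothetical, and the comments repeat what this answer reports rather than measured output.

```pascal
program SetAllocSketch;
{$APPTYPE CONSOLE}

type
  TSmall    = set of 1..2;   // hypothetical names, for illustration only
  TByteWide = set of 0..7;
  TTwenty   = set of 1..20;
begin
  Writeln(SizeOf(TSmall));    // reported above: 1
  Writeln(SizeOf(TByteWide)); // reported above: 1
  Writeln(SizeOf(TTwenty));   // reported above: 4, although 3 bytes would fit
end.
```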

type
   TTestSet = set of 1..2;
   TTestRec = packed record
     FSet: TTestSet;
     FByte: Byte;
   end;

var
  Rec: TTestRec;

procedure TForm9.Button3Click(Sender: TObject);
begin
  Rec.FSet:= [];
  Rec.FByte:= 1;           // as a side effect we set 8-th element of FSet
                           //   (FSet actually has no 8-th element - only 0..7)
  Assert(8 in Rec.FSet);   // The assert should fail, but it does not!
  if 8 in Rec.FSet then    // another display of the bug
    Edit1.Text := '8';
end;

Answer 4:

A set is stored as a number and can actually hold values that are not in the enumeration on which the set is based. I would expect an error, at least when Range Checking is on in the compiler options, but this doesn't seem to be the case. I'm not sure if this is a bug or by design though.

[edit]

It is odd, though:

type
  TNum = 1..2;
  TSet = set of TNum;

var
  test: TSet;
  test2: TNum;

test2 := 4;  // Not accepted
test := [4]; // Accepted

Answer 5:

Off the top of my head, this was a side effect of allowing non-contiguous enumeration types.

The same holds for .NET bitflags: because in both cases the underlying types are compatible with integer, you can insert any integer in it (in Delphi limited to 0..255).

--jeroen

Answer 6:

As far as I'm concerned, no bugs there.

For example, take the following code:

var aByte: Byte;
begin
  aByte := 255;
  aByte := aByte + 1;
  if aByte = 0 then
    ShowMessage('Is this a bug?');
end;

Now, you can get two results from this code. If you compile with Range Checking on, an exception will be raised on the 2nd line. If you do NOT compile with Range Checking, the code will execute without any error and display the message dialog.
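To see both outcomes from a console program instead of a form, the same assignment can be wrapped in a try..except block. This is a sketch: the program name is hypothetical, and switching {$R+} to {$R-} should show the wrap-around case.

```pascal
program RangeCheckSketch;
{$APPTYPE CONSOLE}
{$R+} // range checking on; change to {$R-} to see the wrap-around case

uses
  SysUtils;

var
  aByte: Byte;
begin
  aByte := 255;
  try
    // aByte + 1 is evaluated as an integer (256), which does not fit in a Byte.
    aByte := aByte + 1;
    Writeln('No exception, aByte = ', aByte);
  except
    on E: ERangeError do
      Writeln('ERangeError: ', E.Message);
  end;
end.
```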

The situation you encountered with the sets is similar, except that there is no compiler switch to force an exception to be raised in this situation (Well, as far as I know...).

Now, from your example:

private         
  test: set of 1..2;

That essentially declares a Byte-sized set (if you call SizeOf(test), it should return 1). A Byte-sized set can only contain 8 elements. In this case, it can contain [0] to [7].

Now, some examples:

begin
  test := [8]; // Here, we try to set the 9th bit of a Byte-sized variable. It doesn't work.
  test := [4]; // Here, we try to set the 5th bit of a Byte-sized variable. It works.
end;

Now, I need to admit I would kind of expect the "Constant expression violates subrange bounds" error on the first line (but not on the 2nd).

So yeah... there might be a small issue with the compiler.

As for your results being inconsistent: I'm pretty sure that using values outside the set's subrange isn't guaranteed to give consistent results across different versions of Delphi (maybe not even across different compiles). So if your range is 1..2, stick with [1] and [2].
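Since no compiler switch seems to catch this, one option is to do the bounds check by hand before adding a value to the set. The sketch below assumes a named subrange type. TNum, TNumSet and SafeInclude are hypothetical names, and this is a workaround idea rather than anything the compiler or RTL provides.

```pascal
program SafeIncludeSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TNum    = 1..2;
  TNumSet = set of TNum;

// Hypothetical helper: rejects values outside TNum before touching the set.
procedure SafeInclude(var S: TNumSet; Value: Integer);
begin
  if (Value < Low(TNum)) or (Value > High(TNum)) then
    raise ERangeError.CreateFmt('%d is outside %d..%d',
      [Value, Low(TNum), High(TNum)]);
  Include(S, TNum(Value));
end;

var
  test: TNumSet;
begin
  test := [];
  SafeInclude(test, 2);   // within 1..2, accepted
  try
    SafeInclude(test, 3); // outside 1..2, rejected by the explicit check
  except
    on E: ERangeError do
      Writeln('Rejected: ', E.Message);
  end;
end.
```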

Source: https://stackoverflow.com/questions/4839745/delphi-2009-bug-adding-supposedly-invalid-values-to-a-set

Tags

Delphi

set

delphi-2009