Maybe a C# compiler bug in Visual Studio 2015

久未见 submitted on 2019-11-27 01:41:01
Corey

This is not a bug in VS2015 but possibly a C# language bug. The discussion below explains why instance members cannot introduce loops and why a Nullable<T> causes this error, but that reasoning should not apply to static members.

I would submit it as a language bug, not a compiler bug.


Compiling this code in VS2013 gives the following compile error:

Struct member 'ConsoleApplication1.Program.MyStruct.Empty' of type 'System.Nullable<ConsoleApplication1.Program.MyStruct>' causes a cycle in the struct layout

A quick search turns up this answer which states:

It's not legal to have a struct that contains itself as a member.

Unfortunately the System.Nullable<T> type which is used for nullable instances of value types is also a value type and must therefore have a fixed size. It's tempting to think of MyStruct? as a reference type, but it really isn't. The size of MyStruct? is based on the size of MyStruct... which apparently introduces a loop in the compiler.

Take for instance:

public struct Struct1
{
    public int a;
    public int b;
    public int c;
}

public struct Struct2
{
    public Struct1? s;
}

Using System.Runtime.InteropServices.Marshal.SizeOf() you'll find that Struct2 is 16 bytes long. Struct1 itself is 12 bytes (three ints), so Struct1? is not a reference but a struct that is 4 bytes (standard padding size) longer than Struct1.
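The size comparison can also be checked on a recent .NET runtime. The sketch below is an assumption on my part rather than what the answer used: it swaps Marshal.SizeOf() for System.Runtime.CompilerServices.Unsafe.SizeOf<T>(), which reports the managed size and is available in recent .NET versions (or through the System.Runtime.CompilerServices.Unsafe package). SizeDemo is a hypothetical name.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct Struct1
{
    public int a;
    public int b;
    public int c;
}

public struct Struct2
{
    public Struct1? s;
}

public static class SizeDemo
{
    public static void Main()
    {
        // Managed size of the plain struct: three 4-byte ints.
        Console.WriteLine(Unsafe.SizeOf<Struct1>());

        // Nullable<Struct1> stores a HasValue flag next to the value,
        // so it is larger than Struct1 itself.
        Console.WriteLine(Unsafe.SizeOf<Struct1?>());

        // Struct2 holds the nullable inline, not as a pointer-sized reference.
        Console.WriteLine(Unsafe.SizeOf<Struct2>());
    }
}
```

If Struct1? were a reference, the size of Struct2 would track the pointer size of the process instead of the size of Struct1.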


What's not happening here

In response to Julius Depulla's answer and comments, here is what is actually happening when you access a static Nullable<T> field. From this code:

public struct foo
{
    public static int? Empty = null;
}

public void Main()
{
    Console.WriteLine(foo.Empty == null);
}

Here is the generated IL from LINQPad:

IL_0000:  ldsflda     UserQuery+foo.Empty
IL_0005:  call        System.Nullable<System.Int32>.get_HasValue
IL_000A:  ldc.i4.0    
IL_000B:  ceq         
IL_000D:  call        System.Console.WriteLine
IL_0012:  ret         

The first instruction gets the address of the static field foo.Empty and pushes it on the stack. This address is guaranteed to be non-null as Nullable<Int32> is a structure and not a reference type.

Next the Nullable<Int32> hidden member function get_HasValue is called to retrieve the HasValue property value. This cannot result in a null reference since, as mentioned previously, the address of a value type field must be non-null, regardless of the value contained at the address.

The rest is just comparing the result to 0 and sending the result to the console.

At no point in this process is it possible to 'invoke a null on a type' whatever that means. Value types do not have null addresses, so method invocation on value types cannot directly result in a null object reference error. That's why we don't call them reference types.
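A small sketch of the same point at the C# level, with the hypothetical class name NullableDemo standing in for foo: member calls on a "null" nullable succeed because the struct itself always exists.

```csharp
using System;

public static class NullableDemo
{
    // Static nullable field, mirroring foo.Empty from the example above.
    public static int? Empty = null;

    public static void Main()
    {
        // HasValue is an ordinary property call on a struct; it just reads a flag.
        Console.WriteLine(Empty.HasValue);            // False

        // Comparing a nullable with null is the same HasValue check, as the IL shows.
        Console.WriteLine(Empty == null);             // True

        // Another member call on the "null" value; no exception is thrown.
        Console.WriteLine(Empty.GetValueOrDefault()); // 0
    }
}
```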

First off, it is important when analyzing these issues to make a minimal reproducer, so that we can narrow down where the problem is. In the original code there are three red herrings: the readonly, the static and the Nullable<T>. None are necessary to repro the issue. Here's a minimal repro:

struct N<T> {}
struct M { public N<M> E; }
class P { static void Main() { var x = default(M); } }

This compiles in the current version of VS, but throws a type load exception when run.

  • The exception is not triggered by use of E. It is triggered by any attempt to access the type M. (As one would expect in the case of a type load exception.)
  • The exception reproduces whether the field is static or instance, readonly or not; this has nothing to do with the nature of the field. (However it must be a field! The issue does not repro if it is, say, a method.)
  • The exception has nothing whatsoever to do with "invocation"; nothing is being "invoked" in the minimal repro.
  • The exception has nothing whatsoever to do with the member access operator ".". It does not appear in the minimal repro.
  • The exception has nothing whatsoever to do with nullables; nothing is nullable in the minimal repro.

Now let's do some more experiments. What if we make N and M classes? I will tell you the results:

  • The behaviour only reproduces when both are structs.
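As a sketch of one of those experiments, here is the minimal repro with N changed to a class. Everything else is unchanged apart from an added Console.WriteLine, which is only there to show that M loads and can be used.

```csharp
using System;

// N is now a reference type; M still mentions itself through N<M>.
class N<T> { }

struct M
{
    // Stored as a reference, so laying out M does not require laying out N<M>.
    public N<M> E;
}

static class P
{
    static void Main()
    {
        var x = default(M);
        // default(M) leaves the reference field null.
        Console.WriteLine(x.E == null); // True
    }
}
```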

We could go on to discuss whether the issue reproduces only when M in some sense "directly" mentions itself, or whether an "indirect" cycle also reproduces the bug. (The latter is true.) And as Corey notes in his answer, we could also ask "do the types have to be generic?" No; there is a reproducer even more minimal than this one with no generics.

However I think we have enough to complete our discussion of the reproducer and move on to the question at hand, which is "is it a bug, and if so, in what?"

Plainly something is messed up here, and I lack the time today to sort out where the blame ought to fall. Here are some thoughts:

  • The rule against structs containing members of themselves plainly does not apply here. (See section 11.3.1 of the C# 5 specification, which is the one I have present at hand. I note that this section could benefit from a careful rewriting with generics in mind; some of the language here is a bit imprecise.) If E is static then that section does not apply; if it is not static then the layouts of N<M> and M can both be computed regardless.

  • I know of no other rule in the C# language that would prohibit this arrangement of types.

  • It might be the case that the CLR specification prohibits this arrangement of types, and the CLR is right to throw an exception here.

So now let us sum up the possibilities:

  • The CLR has a bug. This type topology should be legal, and it is wrong of the CLR to throw here.

  • The CLR behaviour is correct. This type topology is illegal, and it is correct of the CLR to throw here. (In this scenario it may be the case that the CLR has a spec bug, in that this fact may not be adequately explained in the specification. I don't have time to do CLR spec diving today.)

Let us suppose for the sake of argument that the second is true. What can we now say about C#? Some possibilities:

  • The C# language specification prohibits this program, but the implementation allows it. The implementation has a bug. (I believe this scenario to be false.)

  • The C# language specification does not prohibit this program, but it could be made to do so at a reasonable implementation cost. In this scenario the C# specification is at fault, it should be fixed, and the implementation should be fixed to match.

  • The C# language specification does not prohibit the program, but detecting the problem at compile time cannot be done at reasonable cost. This is the case with pretty much any runtime crash; your program crashed at runtime because the compiler couldn't stop you from writing a buggy program. This is just one more buggy program; unfortunately, you had no reason to know it was buggy.

Summing up, our possibilities are:

  • The CLR has a bug
  • The C# spec has a bug
  • The C# implementation has a bug
  • The program has a bug

One of these four must be true. I do not know which it is. Were I asked to guess, I'd pick the first one; I see no reason why the CLR type loader ought to balk on this one. But perhaps there is a good reason that I do not know; hopefully an expert on the CLR type loading semantics will chime in.


UPDATE:

This issue is tracked here:

https://github.com/dotnet/roslyn/issues/10126

To sum up the conclusions from the C# team in that issue:

  • The program is legal according to both the CLI and C# specifications.
  • The C# 6 compiler allows the program, but some implementations of the CLI throw a type load exception. This is a bug in those implementations.
  • The CLR team is aware of the bug, and apparently it is hard to fix on the buggy implementations.
  • The C# team is considering making the legal code produce a warning, since it will fail at runtime on some, but not all, versions of the CLI.

The C# and CLR teams are on this; follow up with them. If you have any more concerns with this issue please post to the tracking issue, not here.

Now that we've had a lengthy discussion about what and why, here's a way to work around the issue without having to wait on the various .NET teams to track down the issue and determine what if anything will be done about it.

The issue appears to be restricted to field types that are value types which reference back to this type in some way, either as generic parameters or static members. For instance:

public struct A { public static B b; }
public struct B { public static A a; }

Ugh, I feel dirty now. Bad OOP, but it demonstrates that the problem exists without invoking generics in any way.

Because both are value types, the type loader decides that there is a circularity here, even though that circularity should be ignored because the fields are static. The C# compiler was smart enough to figure that out. Whether it should have or not is up to the specs, on which I have no comment.

However, by changing either A or B to class the problem evaporates:

public struct A { public static B b; }
public class B { public static A a; }

So the problem can be avoided by using a reference type to store the actual value and convert the field to a property:

public struct MyStruct
{
    private static class _internal { public static MyStruct? empty = null; }
    public static MyStruct? Empty => _internal.empty;
}

This may be somewhat slower, because it's a property instead of a field and calls to it go through the get method, so I wouldn't use it in performance-critical code without measuring. As a workaround, though, it at least lets you do the job until a proper solution is available.
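For completeness, a minimal usage sketch of the workaround. WorkaroundDemo is a hypothetical name, and I have not re-verified the behaviour on the affected runtimes; the point is only that callers use MyStruct.Empty exactly as they would have used the field.

```csharp
using System;

public struct MyStruct
{
    // The holder is a reference type, so the MyStruct? field is no longer
    // a field of MyStruct itself.
    private static class _internal { public static MyStruct? empty = null; }

    public static MyStruct? Empty => _internal.empty;
}

public static class WorkaroundDemo
{
    public static void Main()
    {
        MyStruct? value = MyStruct.Empty;
        Console.WriteLine(value.HasValue); // False
    }
}
```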

And if it turns out that this doesn't get resolved, at least we have a kludge we can use to bypass it.
