Is C# type system sound and decidable?

梦想的初衷 提交于 2020-03-17 03:50:04

问题


I know that Java's type system is unsound (it fails to type check constructs that are semantically legal) and undecidable (it fails to type check some construct).

For instance, if you copy/paste the following snippet in a class and compile it, the compiler will crash with a StackOverflowException (how apt). This is undecidability.

static class ListX<T> {}
static class C<P> extends ListX<ListX<? super C<C<P>>>> {}
ListX<? super C<Byte>> crash = new C<Byte>();

Java uses wildcards with type bounds, which are a form of use-site variance. C# on the other hand, uses declaration site variance annotation (with the in and out keywords). It is known that declaration-site variance is weaker than use-site variance (use-site variance can express everything declaration-site variance can, and more -- on the down side, it's much more verbose).

So my question is: Is C# type system sound and decidable? If not, why?


回答1:


Is C# type system decidable?

A type system is "decidable" if the compiler is in theory always able to decide whether the program type checks or not in finite time.

The C# type system is not decidable.

C# has "nominal" subtyping -- that is, you give classes and interfaces names and say what the base classes and interfaces are by name when you declare a class.

C# also has generic types, and, as of C# 4, covariance and contravariance of generic interfaces.

Those three things -- nominal subtyping, generic interfaces, and contravariance -- are sufficient to make a type system undecidable (in the absence of other restrictions on the ways that subtypes may mention each other.)

When this answer was originally written in 2014, that was suspected but not known. The history of this discovery is interesting.

First, the designers of the C# generic type system wondered the same thing, and wrote a paper in 2007 describing different ways in which type checking can go wrong, and what restrictions one can put on a nominal subtyping system that make it decidable.

https://www.microsoft.com/en-us/research/publication/on-decidability-of-nominal-subtyping-with-variance/

A more gentle introduction to the problem can be found on my blog, here:

http://blogs.msdn.com/b/ericlippert/archive/2008/05/07/covariance-and-contravariance-part-twelve-to-infinity-but-not-beyond.aspx

I have written about this subject on SE sites before; a researcher noticed the problem mentioned in that posting and solved it; we now know that nominal subtyping is in general undecidable if there is generic contravariance thrown into the mix. You can encode a Turing Machine into the type system and force the compiler to emulate its operation, and since the question "does this TM halt?" is undecidable, so must type checking be undecidable.

See https://arxiv.org/abs/1605.05274 for the details.

Is the C# type system sound?

A type system is "sound" if we are guaranteed that a program which type checks at compile time has no type errors at runtime.

The C# type system is not sound.

There are many reasons why it is not, but my least favourite is array covariance:

Giraffe[] giraffes = new[] { new Giraffe() };
Animal[] animals = giraffes; // This is legal!
animals[0] = new Tiger(); // crashes at runtime with a type error

The idea here is that most methods that takes arrays only read the array, they do not write it, and it is safe to read an animal out of an array of giraffes. Java allows this, and so the CLR allows it because the CLR designers wanted to be able to implement variations on Java. C# allows it because the CLR allows it. The consequence is that every time you write anything into an array of a base class, the runtime must do a check to verify that the array is not an array of an incompatible derived class. The common case gets slower so that the rare error case can get an exception.

That brings up a good point though: C# is at least well-defined as to the consequences of a type error. Type errors at runtime produce sane behaviour in the form of exceptions. It's not like C or C++ where the compiler can and will blithely generate code that does arbitrarily crazy things.

There are a few other ways in which the C# type system is unsound by design.

  • If you consider getting a null reference exception to be a kind of runtime type error, then C# pre C# 8 is very unsound in that it does almost nothing to prevent this kind of error. C# 8 has many improvements in support for detecting nullity errors statically, but the null reference type checking is not sound; it has both false positives and false negatives. The idea is that some compile-time checking is better than none, even if it is not 100% reliable.

  • Many cast expressions allow the user to override the type system and declare "I know this expression will be of a more specific type at runtime, and if I'm wrong, throw an exception". (Some casts mean the opposite: "I know this expression is of type X, please generate code to convert it to an equivalent value of type Y". Those are generally safe.) Since this is a place where the developer is specifically saying that they know better than the type system, one can hardly blame the type system for the resulting crash.

There are also a handful of features that generate cast-like behaviour even though there is no cast in the code. For example, if you have a list of animals you can say

foreach(Giraffe g in animals)

and if there is a tiger in there, your program will crash. As the specification notes, the compiler simply inserts a cast on your behalf. (If you want to loop over all the giraffes and ignore the tigers, that's foreach(Giraffe g in animals.OfType<Giraffe>()).)

  • The unsafe subset of C# makes all bets off; you can break the rules of the runtime arbitrarily with it. Turning off a safety system turns a safety system off, so it should not be surprising that C# is not sound when you turn off soundness checking.



回答2:


It's not particularly hard to create problems that the C# complier cannot solve in a reasonable amount of time. Some of the problems it is posed with (often related to generics/type inference) are NP-hard problems. Eric Lippert describes one such example here:

class MainClass
{
    class T{}
    class F{}
    delegate void DT(T t);
    delegate void DF(F f);
    static void M(DT dt)
    {
        System.Console.WriteLine("true");
        dt(new T());
    }
    static void M(DF df)
    {
        System.Console.WriteLine("false");
        df(new F());
    }
    static T Or(T a1, T a2, T a3){return new T();}
    static T Or(T a1, T a2, F a3){return new T();}
    static T Or(T a1, F a2, T a3){return new T();}
    static T Or(T a1, F a2, F a3){return new T();}
    static T Or(F a1, T a2, T a3){return new T();}
    static T Or(F a1, T a2, F a3){return new T();}
    static T Or(F a1, F a2, T a3){return new T();}
    static F Or(F a1, F a2, F a3){return new F();}
    static T And(T a1, T a2){return new T();}
    static F And(T a1, F a2){return new F();}
    static F And(F a1, T a2){return new F();}
    static F And(F a1, F a2){return new F();}
    static F Not(T a){return new F();}
    static T Not(F a){return new T();}
    static void MustBeT(T t){}
    static void Main()
    {
        // Introduce enough variables and then encode any Boolean predicate:
        // eg, here we encode (!x3) & ((!x1) & ((x1 | x2 | x1) & (x2 | x3 | x2)))
        M(x1=>M(x2=>M(x3=>MustBeT(
          And(
            Not(x3), 
            And(
              Not(x1), 
              And(
                Or(x1, x2, x1), 
                Or(x2, x3, x2))))))));
    }
}


来源:https://stackoverflow.com/questions/23939168/is-c-sharp-type-system-sound-and-decidable

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!