My question is somewhat related to this one: How does a generic constraint prevent boxing of a value type with an implicitly implemented interface?, but different because it
The problem is that there's no such thing as a value or variable which is "just" an interface type; instead, when an attempt is made to define to such a variable or cast to such a value, the real type that is used is, effectively, "an Object
that implements the interface".
This distinction comes into play with generics. Suppose a routine accepts a parameter of type T
where T:IFoo
. If one passes such a routine a struct which implements IFoo, the passed-in parameter won't be a class type that inherits from Object, but will instead be the appropriate struct type. If the routine were to assign the passed-in parameter to a local variable of type T
, the parameter would be copied by value, without boxing. If it were assigned to a local variable of type IFoo
, however, the type of that variable would be "an Object
that implements IFoo
", and thus boxing would be required that that point.
It may be helpful to define a static ExecF<T>(ref T thing) where T:I
method which could then invoke the I.F()
method on thing
. Such a method would not require any boxing, and would respect any self-mutations performed by I.F()
.
The value doesn't necessarily get boxed. The C#-to-MSIL translation step usually doesn't do most of the cool optimizations (for a few reasons, at least some of which are really good ones), so you'll likely still see the box
instruction if you look at the MSIL, but the JIT can sometimes legally elide the actual allocation if it detects that it can get away with it. As of .NET Fat 4.7.1, it looks like the developers never invested in teaching the JIT how to figure out when this was legal. .NET Core 2.1's JIT does this (not sure when it was added, I just know that it works in 2.1).
Here are the results from a benchmark I ran to prove it:
BenchmarkDotNet=v0.10.14, OS=Windows 10.0.17134
Intel Core i7-6850K CPU 3.60GHz (Skylake), 1 CPU, 12 logical and 6 physical cores
Frequency=3515626 Hz, Resolution=284.4444 ns, Timer=TSC
.NET Core SDK=2.1.302
[Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Clr : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3131.0
Core : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Method | Job | Runtime | Mean | Error | StdDev | Gen 0 | Allocated |
---------------------- |----- |-------- |---------:|----------:|----------:|-------:|----------:|
ViaExplicitCast | Clr | Clr | 5.139 us | 0.0116 us | 0.0109 us | 3.8071 | 24000 B |
ViaConstrainedGeneric | Clr | Clr | 2.635 us | 0.0034 us | 0.0028 us | - | 0 B |
ViaExplicitCast | Core | Core | 1.681 us | 0.0095 us | 0.0084 us | - | 0 B |
ViaConstrainedGeneric | Core | Core | 2.635 us | 0.0034 us | 0.0027 us | - | 0 B |
Benchmark source code:
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Attributes.Exporters;
using BenchmarkDotNet.Attributes.Jobs;
using BenchmarkDotNet.Running;
[MemoryDiagnoser, ClrJob, CoreJob, MarkdownExporterAttribute.StackOverflow]
public class Program
{
public static void Main() => BenchmarkRunner.Run<Program>();
[Benchmark]
public int ViaExplicitCast()
{
int sum = 0;
for (int i = 0; i < 1000; i++)
{
sum += ((IValGetter)new ValGetter(i)).GetVal();
}
return sum;
}
[Benchmark]
public int ViaConstrainedGeneric()
{
int sum = 0;
for (int i = 0; i < 1000; i++)
{
sum += GetVal(new ValGetter(i));
}
return sum;
}
[MethodImpl(MethodImplOptions.NoInlining)]
private static int GetVal<T>(T val) where T : IValGetter => val.GetVal();
public interface IValGetter { int GetVal(); }
public struct ValGetter : IValGetter
{
public int _val;
public ValGetter(int val) => _val = val;
[MethodImpl(MethodImplOptions.NoInlining)]
int IValGetter.GetVal() => _val;
}
}
I think the answer is in the C# specification of how interfaces can be treated. From the Spec:
There are several kinds of variables in C#, including fields, array elements, local variables, and parameters. Variables represent storage locations, and every variable has a type that determines what values can be stored in the variable, as shown by the following table.
Under the table that follows it says for an Interface
A null reference, a reference to an instance of a class type that implements that interface type, or a reference to a boxed value of a value type that implements that interface type
It says explicitly that it will be a boxed value of a value type. The compiler is just obeying the specification
To add more information based upon the comment. The compiler is free to rewrite if it has the same effect but because the boxing occurs you make a copy of the value type not have the same value type. From the specification again:
A boxing conversion implies making a copy of the value being boxed. This is different from a conversion of a reference-type to type object, in which the value continues to reference the same instance and simply is regarded as the less derived type object.
This means it has to do the boxing every time or you'd get inconsistent behavior. A simple example of this can be shown by doing the following with the provided program:
public interface I { void F(); }
public struct C : I {
public int i;
public void F() { i++; }
public int GetI() { return i; }
}
class P
{
static void Main(string[] args)
{
C x = new C();
I ix = (I)x;
ix.F();
ix.F();
x.F();
((I)x).F();
Console.WriteLine(x.GetI());
Console.WriteLine(((C)ix).GetI());
Console.ReadLine();
}
}
I added an internal member to struct C
that is incremented by 1 every time that F()
is called on that object. This lets us see what is happening to the data of our value type. If boxing was not performed on x
then you would expect the program to write out 4 for both calls to GetI()
as we call F()
four times. However the actual result we get is 1 and 2. The reason is that the boxing has made a copy.
This shows us that there is a difference between if we box the value and if we don't box the value