Trace.CorrelationManager.LogicalOperationStack
enables having nested logical operation identifiers where the most common case is logging (NDC). Should it still work
If you're still interested in this, I believe it's a bug in how they flow LogicalOperationStack
and I think it's a good idea to report it.
They give special treatment to LogicalOperationStack
's stack here in LogicalCallContext.Clone, by doing a deep copy (unlike with other data stored via CallContext.LogicalSetData/LogicalGetData
, on which only a shallow copy is performed).
This LogicalCallContext.Clone
is called every time ExecutionContext.CreateCopy
or ExecutionContext.CreateMutableCopy
is called to flow the ExecutionContext
.
Based on your code, I did a little experiment by providing my own mutable stack for "System.Diagnostics.Trace.CorrelationManagerSlot"
slot in LogicalCallContext
, to see when and how many times it actually gets cloned.
The code:
using System;
using System.Collections;
using System.Diagnostics;
using System.Linq;
using System.Runtime.Remoting.Messaging;
using System.Threading;
using System.Threading.Tasks;
namespace ConsoleApplication
{
class Program
{
static readonly string CorrelationManagerSlot = "System.Diagnostics.Trace.CorrelationManagerSlot";
public static void ShowCorrelationManagerStack(object where)
{
object top = "null";
var stack = (MyStack)CallContext.LogicalGetData(CorrelationManagerSlot);
if (stack.Count > 0)
top = stack.Peek();
Console.WriteLine("{0}: MyStack Id={1}, Count={2}, on thread {3}, top: {4}",
where, stack.Id, stack.Count, Environment.CurrentManagedThreadId, top);
}
private static void Main()
{
CallContext.LogicalSetData(CorrelationManagerSlot, new MyStack());
OuterOperationAsync().Wait();
Console.ReadLine();
}
private static async Task OuterOperationAsync()
{
ShowCorrelationManagerStack(1.1);
using (LogicalFlow.StartScope())
{
ShowCorrelationManagerStack(1.2);
Console.WriteLine("\t" + LogicalFlow.CurrentOperationId);
await InnerOperationAsync();
ShowCorrelationManagerStack(1.3);
Console.WriteLine("\t" + LogicalFlow.CurrentOperationId);
await InnerOperationAsync();
ShowCorrelationManagerStack(1.4);
Console.WriteLine("\t" + LogicalFlow.CurrentOperationId);
}
ShowCorrelationManagerStack(1.5);
}
private static async Task InnerOperationAsync()
{
ShowCorrelationManagerStack(2.1);
using (LogicalFlow.StartScope())
{
ShowCorrelationManagerStack(2.2);
await Task.Delay(100);
ShowCorrelationManagerStack(2.3);
}
ShowCorrelationManagerStack(2.4);
}
}
public class MyStack : Stack, ICloneable
{
public static int s_Id = 0;
public int Id { get; private set; }
object ICloneable.Clone()
{
var cloneId = Interlocked.Increment(ref s_Id); ;
Console.WriteLine("Cloning MyStack Id={0} into {1} on thread {2}", this.Id, cloneId, Environment.CurrentManagedThreadId);
var clone = new MyStack();
clone.Id = cloneId;
foreach (var item in this.ToArray().Reverse())
clone.Push(item);
return clone;
}
}
public static class LogicalFlow
{
public static Guid CurrentOperationId
{
get
{
return Trace.CorrelationManager.LogicalOperationStack.Count > 0
? (Guid)Trace.CorrelationManager.LogicalOperationStack.Peek()
: Guid.Empty;
}
}
public static IDisposable StartScope()
{
Program.ShowCorrelationManagerStack("Before StartLogicalOperation");
Trace.CorrelationManager.StartLogicalOperation();
Program.ShowCorrelationManagerStack("After StartLogicalOperation");
return new Stopper();
}
private static void StopScope()
{
Program.ShowCorrelationManagerStack("Before StopLogicalOperation");
Trace.CorrelationManager.StopLogicalOperation();
Program.ShowCorrelationManagerStack("After StopLogicalOperation");
}
private class Stopper : IDisposable
{
private bool _isDisposed;
public void Dispose()
{
if (!_isDisposed)
{
StopScope();
_isDisposed = true;
}
}
}
}
}
The result is quite surprising. Even though there're only two threads involved in this async workflow, the stack gets cloned as many as 4 times. And the problem is, the matching Stack.Push
and Stack.Pop
operations (called by StartLogicalOperation
/StopLogicalOperation
) operate on the different, non-matching clones of the stack, thus disbalancing the "logical" stack. That's where the bug lays in.
This indeed makes LogicalOperationStack
totally unusable across async calls, even though there's no concurrent forks of tasks.
Updated, I also did a little research about how it may behave for synchronous calls, to address these comments:
Agreed, not a dupe. Did you check if it works as expected on the same thread, e.g. if you replace await Task.Delay(100) with Task.Delay(100).Wait()? – Noseratio Feb 27 at 21:00
@Noseratio yes. It works of course, because there's only a single thread (and so a single CallContext). It's as if the method wasn't async to begin with. – i3arnon Feb 27 at 21:01
Single thread doesn't mean single CallContext
. Even for synchronous continuations on the same single thread the execution context (and its inner LogicalCallContext
) can get cloned. Example, using the above code:
private static void Main()
{
CallContext.LogicalSetData(CorrelationManagerSlot, new MyStack());
ShowCorrelationManagerStack(0.1);
CallContext.LogicalSetData("slot1", "value1");
Console.WriteLine(CallContext.LogicalGetData("slot1"));
Task.FromResult(0).ContinueWith(t =>
{
ShowCorrelationManagerStack(0.2);
CallContext.LogicalSetData("slot1", "value2");
Console.WriteLine(CallContext.LogicalGetData("slot1"));
},
CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
ShowCorrelationManagerStack(0.3);
Console.WriteLine(CallContext.LogicalGetData("slot1"));
// ...
}
Output (note how we lose "value2"
):
0.1: MyStack Id=0, Count=0, on thread 9, top: value1 Cloning MyStack Id=0 into 1 on thread 9 0.2: MyStack Id=1, Count=0, on thread 9, top: value2 0.3: MyStack Id=0, Count=0, on thread 9, top: value1
One of the solutions mentioned here and on the web is to call LogicalSetData on context:
CallContext.LogicalSetData("one", null);
Trace.CorrelationManager.StartLogicalOperation();
But in fact, just reading current execution context is enough:
var context = Thread.CurrentThread.ExecutionContext;
Trace.CorrelationManager.StartLogicalOperation();
Yes, LogicalOperationStack
should work with async-await
and it is a bug that it doesn't.
I've contacted the relevant developer at Microsoft and his response was this:
"I wasn't aware of this, but it does seem broken. The copy-on-write logic is supposed to behave exactly as if we'd really created a copy of the
ExecutionContext
on entry into the method. However, copying theExecutionContext
would have created a deep copy of theCorrelationManager
context, as it's special-cased inCallContext.Clone()
. We don't take that into account in the copy-on-write logic."
Moreover, he recommended using the new System.Threading.AsyncLocal<T> class added in .Net 4.6 instead which should handle that issue correctly.
So, I went ahead and implemented LogicalFlow
on top of an AsyncLocal
instead of the LogicalOperationStack
using VS2015 RC and .Net 4.6:
public static class LogicalFlow
{
private static AsyncLocal<Stack> _asyncLogicalOperationStack = new AsyncLocal<Stack>();
private static Stack AsyncLogicalOperationStack
{
get
{
if (_asyncLogicalOperationStack.Value == null)
{
_asyncLogicalOperationStack.Value = new Stack();
}
return _asyncLogicalOperationStack.Value;
}
}
public static Guid CurrentOperationId =>
AsyncLogicalOperationStack.Count > 0
? (Guid)AsyncLogicalOperationStack.Peek()
: Guid.Empty;
public static IDisposable StartScope()
{
AsyncLogicalOperationStack.Push(Guid.NewGuid());
return new Stopper();
}
private static void StopScope() =>
AsyncLogicalOperationStack.Pop();
}
And the output for the same test is indeed as it should be:
00000000-0000-0000-0000-000000000000
ae90c3e3-c801-4bc8-bc34-9bccfc2b692a
ae90c3e3-c801-4bc8-bc34-9bccfc2b692a
ae90c3e3-c801-4bc8-bc34-9bccfc2b692a
00000000-0000-0000-0000-000000000000