Processing multiple results from CLR stored procedure in T-SQL

问题

I have some complex algorithm written in C# as CLR Stored procedure. Procedure is not deterministic (it depends on current time). The result of procedure are two tables. I didn't found any solution how to process multi-results from stored procedures in T-SQL. The performance of this procedure is key (procedure is called every ~2 seconds).

I found the fastest way how to update tables is:

UPDATE [db-table] SET ... SELECT * FROM [clr-func]

It's much faster then update db-table from CLR procedure via ADO.NET.

I used static field to store results and query it after the execution of clr stored procedure.

The call stack is:

T-SQL proc
    -> CLR proc (MyStoredProcedure)
        -> T-SQL proc (UpdateDataFromMyStoredProcedure)
            -> CLR func (GetFirstResultOfMyStoredProcedure)
            -> CLR func (GetSecondResultOfMyStoredProcedure)

The problem is, sometimes CLR functions has null in static field result, but in CLR procedure is result not null. I found, sometimes the CLR functions are called in another AppDomain than CLR procedure. However CLR procedure is still running and can do next operations and no exception is thrown.

Is there some way, how to force CLR functions to be called in same AppDomain as "parent" CLR procedure?

Or is there some another way, how to achieve my intention?

P.S.: Originally the complex algorithm was written in T-SQL, but performance was poor (~100x slower than algorithm in C#).

Thanks!

Simplified code:

// T-SQL
CREATE PROC [dbo].[UpdateDataFromMyStoredProcedure] AS BEGIN
    UPDATE [dbo].[tblObject]
        SET ...
        SELECT * FROM [dbo].[GetFirstResultOfMyStoredProcedure]()
    UPDATE [dbo].[tblObjectAction]
        SET ...
        SELECT * FROM [dbo].[GetSecondResultOfMyStoredProcedure]()
END

// ... somewhere else
EXEC [dbo].[MyStoredProcedure]

// C#
public class StoredProcedures {
    // store for result of "MyStoredProcedure ()"
    private static MyStoredProcedureResult result;
    [SqlProcedure]
    public static int MyStoredProcedure() {
        result = null;
        result = ComputeComplexAlgorithm();
        UpdateDataFromMyStoredProcedure();
        result = null;
    }
    [SqlFunction(...)]
    public static IEnumerable GetFirstResultOfMyStoredProcedure() {
        return result.First;
    }
    [SqlFunction(...)]
    public static IEnumerable GetSecondResultOfMyStoredProcedure() {
        return result.Second;
    }
    private static void UpdateDataFromMyStoredProcedure() {
        using(var cnn = new SqlConnection("context connection=true")) {
            using(var cmd = cnn.CreateCommand()) {
                cmd.CommandType = System.Data.CommandType.StoredProcedure;
                cmd.CommandText = "[dbo].[UpdateDataFromMyStoredProcedure]";
                cmd.ExecuteNonQuery();
            }
        }
    }
}

回答1:

According to Bob Beauchemin "SQLCLR creates one appdomain per assembly owner, not one appdomain per database" Do both your SQLCLR Assemblies have the same owner?

回答2:

There are two possibilities:

The more likely scenario has to do with App Domains getting unloaded due to memory pressure. Generally speaking, there is only a single App Domain for a particular Assembly (and hence for the code in it) since App Domains are per-Database, per-Owner. So your code is not being called in two App Domains, at least not conceptually.

However, there is a particular sequence-of-events nuance to how SQL Server handles unloading App Domains that you are experiencing. What is happening is that your system is experiencing memory pressure and is marking the App Domain to be unloaded. This can be seen in the SQL Server logs as it will tell you the exact name of the App Domain that is being unloaded.

AppDomain 61 ({database_name}.{owner_name}[runtime].60) is marked for unload due to memory pressure.

When an App Domain is marked for unload, it is allowed to continue running until all currently running process complete. At this point it has a "state" of E_APPDOMAIN_DOOMED instead of the normal E_APPDOMAIN_SHARED. If another process is started, even from within the doomed App Domain, a new App Domain is created. And this is what causes the behavior you are experiencing. The sequence of events is as follows (and yes, I have reproduced this behavior):
1. Execute MyStoredProcedure: App Domain 1 is created if not already existing. App Domain 1 "state" is E_APPDOMAIN_SHARED. result is set to null.
  1. result is populated as expected
  2. MyStoredProcedure executes GetFirstResultOfMyStoredProcedure: App Domain 1 "state" is still E_APPDOMAIN_SHARED. result is retrieved as expected.
  3. App Domain 1 is marked for unload: App Domain 1 "state" is changed to E_APPDOMAIN_DOOMED
  4. MyStoredProcedure executes GetSecondResultOfMyStoredProcedure: App Domain 1 "state" is still E_APPDOMAIN_DOOMED and so cannot be used. App Domain 2 is created. App Domain 2 "state" is E_APPDOMAIN_SHARED. result is set to null. This is why you sometimes get nothing back: this process is in App Domain 2 (even though it was initiated from App Domain 1), with no access to App Domain 1.
2. MyStoredProcedure completes: App Domain 1 is unloaded.
And there is another possibility of how this sequence of events could occur: App Domain 1 could be marked for unload prior to executing GetFirstResultOfMyStoredProcedure. In this case, App Domain 2 is created when GetFirstResultOfMyStoredProcedure is executed, and both it and GetSecondResultOfMyStoredProcedure run in App Domain 2 and return nothing.

Hence, if you want / need an error to be thrown under these conditions, then your Get*ResultOfMyStoredProcedure methods need to check if result == null before attempting to retrieve, and if it is null, then throw an error. OR, if it is possible to recalculate the value of what is being stored in the static variable, then if it is null simply repopulate it (e.g. call ComputeComplexAlgorithm again).
A less likely possibility is that since the App Domain is shared by all sessions / callers to that code, it is possible, if you have not otherwise ensured that there is only ever 1 execution of this process at a time, that someone else or a SQL Agent job or something, executed MyStoredProcedure which would null out the static variable as it starts.

Since you have already accepted using an UNSAFE Assembly in order to get the updateable static variable, you might as well add a locking mechanism to ensure that MyStoredProcedure is single-threaded.

Aside from those areas to look at, this process could most likely be done even faster and in a less convoluted manner. You can use Table-Valued Parameters (TVPs) to stream data back to SQL Server, just like you would from app code. Just create one or two User-Defined Table Types (UDTTs) that match the structures of the two result sets being returned by the TVFs (GetFirstResultOfMyStoredProcedure and GetSecondResultOfMyStoredProcedure). Please see my answer here regarding how to properly stream in the results. By using this model, you can:

do both updates inline in the MyStoredProcedure CLR proc
get rid of the static variable
Possibly no more need for UNSAFE (if it was only being used for the static variable). You might still need EXTERNAL_ACCESS if you can't pass back the results over the Context Connection, in which case you would use a regular connection (i.e. connection string uses either "Server=(local)" or doesn't specify "Server").
get rid of UpdateDataFromMyStoredProcedure method
get rid of UpdateDataFromMyStoredProcedure T-SQL proc
get rid of GetFirstResultOfMyStoredProcedure CLR function
get rid of GetSecondResultOfMyStoredProcedure CLR function
free up all of the memory that is currently being used by that static variable to hold two result sets!!

Not only is this approach easier to maintain and most likely faster, it also does not allow for the new App Domain with uninitialized static variable issue that you are running into here :-).

来源：https://stackoverflow.com/questions/6151289/processing-multiple-results-from-clr-stored-procedure-in-t-sql

标签

sql-server

performance

stored-procedures

sqlclr