In .NET there is the null reference, which is used everywhere to denote that an object reference is empty, and then there is DBNull, which is used by database drivers (and a few others) to denote... pretty much the same thing. Naturally, this creates a lot of confusion, and conversion routines have to be churned out, etc.
So why did the original .NET authors decide to make this? To me it makes no sense. Their documentation makes no sense either:
The DBNull class represents a nonexistent value. In a database, for example, a column in a row of a table might not contain any data whatsoever. That is, the column is considered to not exist at all instead of merely not having a value. A DBNull object represents the nonexistent column. Additionally, COM interop uses the DBNull class to distinguish between a VT_NULL variant, which indicates a nonexistent value, and a VT_EMPTY variant, which indicates an unspecified value.
What's this crap about a "column not existing"? A column exists, it just doesn't have a value for the particular row. If it didn't exist, I'd get an exception trying to access the specific cell, not a DBNull! I can understand the need to differentiate between VT_NULL and VT_EMPTY, but then why not make a COMEmpty class instead? That would be a much neater fit in the whole .NET framework.
Am I missing something? Can anyone shed some light on why DBNull was invented and what problems it helps to solve?
The point is that in certain situations there is a difference between a database value being NULL and a .NET null.
For example, if you are using ExecuteScalar (which returns the first column of the first row in the result set) and you get null back, that means the SQL executed did not return any rows. If you get DBNull back, it means a value was returned by the SQL and it was NULL. You need to be able to tell the difference.
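A minimal sketch of that distinction, runnable without a database. DescribeScalar is a hypothetical helper (not part of ADO.NET) that classifies whatever object ExecuteScalar handed back; the three calls at the bottom stand in for real query results.

```csharp
using System;

// Hypothetical helper: interprets the object returned by DbCommand.ExecuteScalar().
static string DescribeScalar(object result) => result switch
{
    null => "no rows returned",                               // empty result set
    DBNull => "a row was returned and its first column is NULL",
    _ => "value: " + result
};

Console.WriteLine(DescribeScalar(null));          // no rows returned
Console.WriteLine(DescribeScalar(DBNull.Value));  // a row was returned and its first column is NULL
Console.WriteLine(DescribeScalar(42));            // value: 42
```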
I'm going to disagree with the trend here. I'll go on record:
I do not agree that DBNull serves any useful purpose; it adds unnecessary confusion, while contributing virtually no value.
The argument is often put forward that null is an invalid reference, and that DBNull is a null object pattern; neither is true. For example:
int? x = null;
This is not an "invalid reference"; it is a null value. Indeed null means whatever you want it to mean, and frankly I have absolutely no problem working with values that may be null (indeed, even in SQL we need to correctly work with null - nothing changes here). Equally, the "null object pattern" only makes sense if you are actually treating it as an object in OOP terms, but if we have a value that can be "our value, or a DBNull" then it must be object, so we can't be doing anything useful with it.
There are so many bad things with DBNull:
- it forces you to work with object, since only object can hold DBNull or another value
- there is no real difference between "could be a value or DBNull" vs "could be a value or null"
- the argument that it stems from 1.1 (pre-nullable-types) is meaningless; we could use null perfectly well in 1.1
- most APIs have "is it null?" methods, for example DbDataReader.IsDBNull or DataRow.IsNull - neither of which actually require DBNull to exist in terms of the API
- DBNull fails in terms of null-coalescing; value ?? defaultValue doesn't work if the value is DBNull
- DBNull.Value can't be used in optional parameters, since it isn't a constant
- the runtime semantics of DBNull are identical to the semantics of null; in particular, DBNull actually equals DBNull - so it does not do the job of representing the SQL semantic
- it often forces value-type values to be boxed since it over-uses object
- if you need to test for DBNull, you might as well have tested for just null
- it causes huge problems for things like command-parameters, with a very silly behaviour that if a parameter has a null value it isn't sent... well, here's an idea: if you don't want a parameter sent - don't add it to the parameters collection
- every ORM I can think of works perfectly well without any need or use of DBNull, except as an extra nuisance when talking to the ADO.NET code
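A few of these points can be checked with an in-memory DataTable. This is only a sketch; the table, the "Name" column and the "default" fallback are made up for illustration.

```csharp
using System;
using System.Data;

// One-column table holding a single database NULL.
var table = new DataTable();
table.Columns.Add("Name", typeof(string));
table.Rows.Add(DBNull.Value);
DataRow row = table.Rows[0];

object raw = row["Name"];                        // DBNull.Value, not null
object coalesced = raw ?? "default";             // ?? does not fire: still DBNull.Value
bool equalsItself = DBNull.Value.Equals(raw);    // true, unlike SQL's NULL = NULL
bool isNull = row.IsNull("Name");                // the "is it null?" API answers without exposing DBNull

// The workaround that ends up at every call site:
string name = row.IsNull("Name") ? "default" : (string)row["Name"];

Console.WriteLine(ReferenceEquals(coalesced, DBNull.Value)); // True
Console.WriteLine(equalsItself);                             // True
Console.WriteLine(isNull);                                   // True
Console.WriteLine(name);                                     // default
```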
The only even remotely compelling argument I've ever seen to justify the existence of such a value is in DataTable, when passing in values to create a new row; a null means "use the default", a DBNull is explicitly a null - frankly this API could have had a specific treatment for this case - an imaginary DataRow.DefaultValue for example would be much better than introducing a DBNull.Value that infects vast swathes of code for no reason.
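For reference, the DataTable behaviour being described looks roughly like this. The "Status" column and its default of "new" are invented for the example.

```csharp
using System;
using System.Data;

var table = new DataTable();
DataColumn status = table.Columns.Add("Status", typeof(string));
status.DefaultValue = "new";

// A null entry in the values array means "use the column default".
table.Rows.Add(new object[] { null });
// DBNull.Value means "store an explicit database NULL".
table.Rows.Add(new object[] { DBNull.Value });

Console.WriteLine(table.Rows[0]["Status"]);         // new
Console.WriteLine(table.Rows[1].IsNull("Status"));  // True
```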
Equally, the ExecuteScalar scenario is... tenuous at best; if you are executing a scalar method, you expect a result. In the scenario where there are no rows, returning null doesn't seem too terrible. If you absolutely need to disambiguate between "no rows" and "one single null returned", there's the reader API.
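A sketch of that reader-based disambiguation, using DataTable.CreateDataReader as a stand-in for a real query so it runs without a database. ReadFirstCell is a hypothetical helper; the same code works against any DbDataReader.

```csharp
using System;
using System.Data;
using System.Data.Common;

// Hypothetical helper: tells "no rows" apart from "one row whose first column is NULL".
static string ReadFirstCell(DbDataReader reader)
{
    if (!reader.Read()) return "no rows";
    return reader.IsDBNull(0) ? "one row, NULL" : "one row, value " + reader.GetValue(0);
}

var empty = new DataTable();
empty.Columns.Add("n", typeof(int));

var withNull = new DataTable();
withNull.Columns.Add("n", typeof(int));
withNull.Rows.Add(DBNull.Value);

using (var emptyReader = empty.CreateDataReader())
    Console.WriteLine(ReadFirstCell(emptyReader));   // no rows
using (var nullReader = withNull.CreateDataReader())
    Console.WriteLine(ReadFirstCell(nullReader));    // one row, NULL
```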
This ship has sailed long ago, and it is far far too late to fix it. But! Please do not think that everyone agrees that this is an "obvious" thing. Many developers do not see value in this odd wrinkle on the BCL.
I actually wonder if all of this stems from two things:
- having to use the word Nothing instead of something involving "null" in VB
- being able to use the if(value is DBNull) syntax which "looks just like SQL", rather than the oh-so-tricky if(value==null)
Summary:
Having 3 options (null, DBNull, or an actual value) is only useful if there is a genuine example where you need to disambiguate between 3 different cases. I have yet to see an occasion where I need to represent two different "null" states, so DBNull is entirely redundant given that null already exists and has much better language and runtime support.
DBNull represents a box with no contents; null indicates the non-existence of the box.
You use DBNull for missing data. Null in the .NET language means that there is no pointer for an object/variable.
DBNull missing data: http://msdn.microsoft.com/en-us/library/system.dbnull.value.aspx
The effects of missing data on statistics:
There are some differences between a CLR null and a DBNull. First, null in relational databases has different "equals" semantics: null is not equal to null. CLR null IS equal to null.
But I suspect the main reason has to do with the way parameter default values work in SQL Server and the implementation of the provider.
To see the difference, create a procedure with a parameter that has a default value:
CREATE PROC [Echo] @s varchar(MAX) = 'hello'
AS
SELECT @s [Echo]
Well-structured DAL code should separate command creation from use (to enable using the same command many times, for example to invoke a stored procedure many times efficiently). Write a method that returns a SqlCommand representing the above procedure:
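using System.Data;
using System.Data.SqlClient; // or Microsoft.Data.SqlClient

SqlCommand GetEchoProc()
{
    // The caller assigns the connection before executing.
    var cmd = new SqlCommand("Echo");
    // Run "Echo" as a stored procedure call so that @s binds to the procedure's parameter.
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@s", SqlDbType.VarChar);
    return cmd;
}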
If you now invoke the command without setting the @s parameter, or set its value to (CLR) null, it will use the default value 'hello'. If on the other hand you set the parameter value to DBNull.Value, it will use that and echo NULL.
Since there are two different results when using CLR null or database null as the parameter value, you can't represent both cases with only one of them. If CLR null were the only one, it would have to work the way DBNull.Value does today. One way to indicate to the provider "I want to use the default value" could then be to not declare the parameter at all (a parameter with a default value of course makes sense to describe as an "optional parameter"), but in a scenario where the command object is cached and reused this does lead to removing and re-adding the parameter.
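With a cached command, one way to keep the two cases readable at the call site is a small helper that makes the intent explicit. This is only a sketch: ParameterValue and its useProcedureDefault flag are hypothetical names, and the "null means default" behaviour is the SqlClient stored-procedure behaviour described above, not something this snippet verifies.

```csharp
using System;

// Hypothetical helper: picks the object to assign to a parameter's Value.
static object ParameterValue(string text, bool useProcedureDefault)
{
    if (useProcedureDefault) return null;   // CLR null: let the procedure's default apply
    return (object)text ?? DBNull.Value;    // a null string becomes an explicit database NULL
}

// Intended use with the cached command, e.g.:
//   cmd.Parameters["@s"].Value = ParameterValue(input, useProcedureDefault: false);
Console.WriteLine(ParameterValue("hi", false));                   // hi
Console.WriteLine(ParameterValue(null, false) == DBNull.Value);   // True
Console.WriteLine(ParameterValue("hi", true) == null);            // True
```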
I'm not sure if I think DBNull was a good idea or not, but a lot of people are unaware of the things I've mentioned here, so I figured it worth mentioning.
Source: https://stackoverflow.com/questions/4488727/what-is-the-point-of-dbnull