Single SQL query to find null values in all columns in a data base

杀马特。学长 韩版系。学妹 提交于 2021-02-08 12:05:58

问题


I would like to identify the number of null values in each column in all tables.I have a data base it consist of around 250 tables.Most of them are in use.The problem is almost all tables contain unwanted columns which created for some short term use.Now we want to identify columns with null values for all tables.Since the count of table is large and time is less.I would like to know an easiest way to identify null record count on each table in column wise.

I tried this query which i got from internet.But in this i have to give each table name manually.

DECLARE @cols1 NVARCHAR(MAX);
DECLARE @sql NVARCHAR(MAX);

SELECT @cols1 = STUFF((
    SELECT ', COUNT(CASE WHEN ISNULL(CONVERT(NVARCHAR(MAX), [' + t1.NAME + ']),'''' ) = '''' THEN 1 END) AS ' + t1.name
    FROM sys.columns AS t1
    WHERE t1.object_id = OBJECT_ID('Area')
    -- ORDER BY ', COUNT([' + t1.name + ']) AS ' + t1.name
    FOR XML PATH('')
), 1, 2, '');

SET @sql = '
SELECT ' + @cols1 + '
FROM Area
'
EXEC(@sql)

Please help me get a improved query get Result.

Thanku


回答1:


This is a mess, but it works:

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);

CREATE TABLE #NullCounts (SchemaName sysname,
                          TableName sysname,
                          ColumnName sysname,
                          NULLCount bigint);

DECLARE @Delimiter nchar(3) = ',' +@CRLF;

SET @SQL = STUFF((SELECT @CRLF + @CRLF +
                         N'WITH Counts AS(' + @CRLF +
                         N'    SELECT N' + QUOTENAME(s.[name],'''') +N' AS SchemaName,' + @CRLF +
                         N'           N' + QUOTENAME(t.[name],'''') +N' AS TableName,' + @CRLF +
                         STRING_AGG(N'           COUNT_BIG(CASE WHEN N' + QUOTENAME(c.[name],'''') + N' IS NULL THEN 1 END) AS ' + QUOTENAME(c.[name]),@Delimiter) WITHIN GROUP(ORDER BY c.column_id) + @CRLF +
                         N'    FROM ' + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.[name]) + N' T)' + @CRLF +
                         N'INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)' + @CRLF +
                         N'SELECT SchemaName,' + @CRLF +
                         N'       TableName,' + @CRLF +
                         N'       V.ColumnName,' + @CRLF +
                         N'       V.NULLCount' + @CRLF +
                         N'FROM Counts C' + @CRLF +
                         N'     CROSS APPLY (VALUES' +
                         STUFF(STRING_AGG(N'                        (N' + QUOTENAME(c.[name], '''') + N', C.' + QUOTENAME(c.[name]) + N')',@Delimiter) WITHIN GROUP (ORDER BY c.column_id),1,24,N'') + N')V(ColumnName,NULLCount);'
                  FROM sys.schemas s
                       JOIN sys.tables t ON s.schema_id = t.schema_id
                       JOIN sys.columns c ON t.object_id = c.object_id
                  GROUP BY s.[name], t.[name]
                  FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,4,N'');

--PRINT @SQL; --This is gunna be way longer than 4,000 characters, so you'll want SELECT

EXEC sys.sp_executesql @SQL;

GO

SELECT *
FROM #NullCounts
ORDER BY SchemaName,
         TableName,
         ColumnName;

GO

DROP TABLE #NullCounts;

Yes, I mix STRING_AGG and FOR XML PATH, yes it's an eyesore, but the printed (selected) SQL produces some very nice statements. See below:

WITH Counts AS(
    SELECT N'dbo' AS SchemaName,
           N'PerformanceTest' AS TableName,
           COUNT_BIG(CASE WHEN N'TestID' IS NULL THEN 1 END) AS [TestID],
           COUNT_BIG(CASE WHEN N'TestTarget' IS NULL THEN 1 END) AS [TestTarget],
           COUNT_BIG(CASE WHEN N'TestName' IS NULL THEN 1 END) AS [TestName],
           COUNT_BIG(CASE WHEN N'TimeStart' IS NULL THEN 1 END) AS [TimeStart],
           COUNT_BIG(CASE WHEN N'TimeEnd' IS NULL THEN 1 END) AS [TimeEnd],
           COUNT_BIG(CASE WHEN N'TimeTaken_ms' IS NULL THEN 1 END) AS [TimeTaken_ms],
           COUNT_BIG(CASE WHEN N'TotalRows' IS NULL THEN 1 END) AS [TotalRows],
           COUNT_BIG(CASE WHEN N'RowSets' IS NULL THEN 1 END) AS [RowSets],
           COUNT_BIG(CASE WHEN N'AvgRowsPerSet' IS NULL THEN 1 END) AS [AvgRowsPerSet]
    FROM [dbo].[PerformanceTest] T)
INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)
SELECT SchemaName,
       TableName,
       V.ColumnName,
       V.NULLCount
FROM Counts C
     CROSS APPLY (VALUES(N'TestID', C.[TestID]),
                        (N'TestTarget', C.[TestTarget]),
                        (N'TestName', C.[TestName]),
                        (N'TimeStart', C.[TimeStart]),
                        (N'TimeEnd', C.[TimeEnd]),
                        (N'TimeTaken_ms', C.[TimeTaken_ms]),
                        (N'TotalRows', C.[TotalRows]),
                        (N'RowSets', C.[RowSets]),
                        (N'AvgRowsPerSet', C.[AvgRowsPerSet]))V(ColumnName,NULLCount);

WITH Counts AS(
    SELECT N'dbo' AS SchemaName,
           N'someTable' AS TableName,
           COUNT_BIG(CASE WHEN N'id' IS NULL THEN 1 END) AS [id],
           COUNT_BIG(CASE WHEN N'SomeCol' IS NULL THEN 1 END) AS [SomeCol]
    FROM [dbo].[someTable] T)
INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)
SELECT SchemaName,
       TableName,
       V.ColumnName,
       V.NULLCount
FROM Counts C
     CROSS APPLY (VALUES(N'id', C.[id]),
                        (N'SomeCol', C.[SomeCol]))V(ColumnName,NULLCount);

And yes, I really spent the last 45 minutes writing all that...

Honestly, this is not entry level, and if you don't understand it, you shouldn't be using it; but also, you I very much doubt you'll find a different solution that is entry level and that is as performant as this. A CURSOR, for example, although probably easier to understand would be really slow doing this.

Caveat: If you have any deprecated data types in your database (i.e. text) this will fail. If that is the case, you will need to ensure you eliminate them from the query in the WHERE. However, I suggest that you fix your data types (text, for example, has been deprecated for 15 years).




回答2:


For tabular results:

declare @sql  nvarchar(max) = 
(
select 'union all select (select object_schema_name('+ cast(tableobjectid as varchar(20))+') +''.''+ object_name('+ cast(tableobjectid as varchar(20))+') as "table/@name", count(*) as "table/@rowcount", ' + cols_concat + ' from ' + tablename + ' for xml path(''''), type)' as 'text()'
from
(
select 
    t.object_id as tableobjectid,
    quotename(schema_name(t.schema_id)) + '.' + quotename(t.name) as tablename,
    stuff( (select ', col_name(' + cast(c.object_id as varchar(20)) + ',' + cast(c.column_id as varchar(20)) + ') as "table/col/@name", count(*)-count('+case when type_name(c.system_type_id) in ('text', 'ntext', 'image') then ' case when ' + quotename(c.name) + ' is not null then 1 end' else quotename(c.name) end + ') as "table/col", null as "table"'
            from sys.columns as c where c.object_id = t.object_id for xml path(''), type).value('.', 'nvarchar(max)'), 1, 1, '') as cols_concat
from sys.tables as t
where t.is_ms_shipped = 0
) as tbl
for xml path(''), type).value('.', 'nvarchar(max)')
;

select @sql = 'select @xml = (select * from (select cast(null as xml) as "*"  '+ @sql + ') as u for xml path('''') )';
declare @x xml;

set transaction isolation level read uncommitted;
exec sp_executesql @sql, N'@xml xml output', @xml = @x output;
set transaction isolation level read committed;

--shred 
select 
    t.col.value('@name[1]', 'varchar(200)') as tablename,
    t.col.value('@rowcount[1]', 'int') as tablerowcount,    
    r.col.value('@name[1]', 'varchar(200)') as columnname,  
    r.col.value('.[1]', 'int') as null_values
from @x.nodes('table') as t(col)
cross apply t.col.nodes('col') as r(col);

For multiple resultsets (one per table):

  declare @sql  nvarchar(max) = 
    (
    select 'select object_schema_name('+ cast(tableobjectid as varchar(20))+') +''.''+ object_name('+ cast(tableobjectid as varchar(20))+') as tablename, ' + cols_concat + ' from ' + tablename + ';' as 'text()'
    from
    (
    select 
        t.object_id as tableobjectid,
        quotename(schema_name(t.schema_id)) + '.' + quotename(t.name) as tablename,
        stuff( (select ', count(*)-count('+case when type_name(c.system_type_id) in ('text', 'ntext', 'image') then ' case when ' + quotename(c.name) + ' is not null then 1 end' else quotename(c.name) end + ') as "NULL_' + replace(quotename(name), '"', '""')+'"'  from sys.columns as c where c.object_id = t.object_id for xml path(''), type).value('.', 'nvarchar(max)'), 1, 1, '') as cols_concat
    from sys.tables as t
    where t.is_ms_shipped = 0
    ) as tbl
    for xml path(''), type).value('.', 'nvarchar(max)');

    set transaction isolation level read uncommitted;
    exec(@sql);    
    set transaction isolation level read committed;


来源:https://stackoverflow.com/questions/60186884/single-sql-query-to-find-null-values-in-all-columns-in-a-data-base

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!