Find all those columns which have only null values, in a MySQL table

太阳男子 2020-12-08 07:43

The situation is as follows:

I have a substantial number of tables, each with a substantial number of columns. I need to deal with this old, to-be-deprecated data

6 Answers
  • 2020-12-08 08:22
    select column_name
    from user_tab_columns
    where table_name='Table_name' and num_nulls>=1;
    

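    This simple query lists the columns that contain at least one NULL. Note that user_tab_columns is an Oracle data dictionary view (num_nulls comes from gathered optimizer statistics), so the query does not run on MySQL as written, and num_nulls>=1 alone does not prove that every value in the column is NULL.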

  • 2020-12-08 08:33

    I am not an expert in SQL procedures, so here is the general idea using SQL queries driven by a PHP/Python script.

    • use SHOW TABLES or some other query on INFORMATION_SCHEMA database to get all tables in your database MY_DATABASE

    • for each table, run a query that generates the list of MAX() expressions for all of its columns; this list is used in the next query.

     SELECT GROUP_CONCAT(CONCAT('MAX(', column_name, ')')
                         ORDER BY ordinal_position)
     FROM   information_schema.columns
     WHERE  table_schema = 'MY_DATABASE'
            AND table_name = 'MY_TABLE'
    
    • You will get an output like MAX(column_a),MAX(column_b),MAX(column_c),MAX(column_d)

    • Use this output to generate final query :

    SELECT Max(column_a), Max(column_b), Max(column_c), Max(column_d) FROM MY_DATABASE.MY_TABLE

    The output would be :

       MAX(column_a)    MAX(column_b)   MAX(column_c)   MAX(column_d)
         NULL            1           NULL                1
    
    • Every column whose MAX is NULL contains only NULL values, because MAX ignores NULLs and returns NULL only when it finds no non-NULL value (an empty table also yields NULL).
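
The script side of the steps above could look roughly like the following Python sketch. It only builds the SQL text and interprets one result row; running the query against MySQL is left to whatever driver you use. The helper names (quote_ident, build_max_query, all_null_columns) are made up for illustration, and the sample row simply mirrors the example output above.

```python
from typing import Dict, List, Optional, Sequence


def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_max_query(schema: str, table: str, columns: Sequence[str]) -> str:
    """Build 'SELECT MAX(col) AS col, ... FROM schema.table' for the given columns."""
    if not columns:
        raise ValueError("need at least one column")
    select_list = ", ".join(
        f"MAX({quote_ident(c)}) AS {quote_ident(c)}" for c in columns
    )
    return f"SELECT {select_list} FROM {quote_ident(schema)}.{quote_ident(table)}"


def all_null_columns(row: Dict[str, Optional[object]]) -> List[str]:
    """Return the columns whose MAX came back as NULL (None in Python)."""
    return [name for name, value in row.items() if value is None]


# Hypothetical result row, written as a dict keyed by column name.
sample_row = {"column_a": None, "column_b": 1, "column_c": None, "column_d": 1}
print(build_max_query("MY_DATABASE", "MY_TABLE", list(sample_row)))
print(all_null_columns(sample_row))
```

Aliasing each MAX() with the column name makes it easy to map the single result row back to columns when the driver returns rows as dictionaries.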
  • 2020-12-08 08:41

    SQL Fiddle Demo Link

    I have created 4 tables: three for the demo, plus nullcolumns, which is the compulsory part of the solution. Among the three demo tables, only salary and dept have columns in which every value is NULL (you may have a look at their script).

    The compulsory table and the procedure are given at the end.

    You can copy, paste and run the script (the compulsory part or all of it) as SQL in your desired database on your localhost; you only have to change the delimiter to //. Then run call `get`('your_database'); passing the name of the database to scan (the procedure's dn parameter), and see the results.

    CREATE TABLE IF NOT EXISTS `dept` (
      `did` int(11) NOT NULL,
      `dname` varchar(50) DEFAULT NULL,
      PRIMARY KEY (`did`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    
    INSERT INTO `dept` (`did`, `dname`) VALUES
    (1, NULL),
    (2, NULL),
    (3, NULL),
    (4, NULL),
    (5, NULL);
    
    CREATE TABLE IF NOT EXISTS `emp` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `ename` varchar(50) NOT NULL,
      `did` int(11) NOT NULL,
      PRIMARY KEY (`ename`),
      KEY `deptid` (`did`),
      KEY `id` (`id`)
    ) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=6 ;
    
    
    INSERT INTO `emp` (`id`, `ename`, `did`) VALUES
    (1, 'e1', 4),
    (2, 'e2', 4),
    (3, 'e3', 2),
    (4, 'e4', 4),
    (5, 'e5', 3);
    
    
    CREATE TABLE IF NOT EXISTS `salary` (
      `EmpCode` varchar(50) NOT NULL,
      `Amount` int(11) DEFAULT NULL,
      `Date` int(11) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    INSERT INTO `salary` (`EmpCode`, `Amount`, `Date`) VALUES
    ('1', 344, NULL),
    ('2', NULL, NULL);
    
    -- ---------------------------------------------------------------------
    -- Compulsory part: the result table and the procedure
    -- ---------------------------------------------------------------------
    
    CREATE TABLE IF NOT EXISTS `nullcolumns` (
      `Table_Name` varchar(100) NOT NULL,
      `Column_Name` varchar(100) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    -- Only one procedure now. Set the statement delimiter to // before running it.
    -- `get` is quoted because GET is a reserved word in recent MySQL versions.
    CREATE PROCEDURE `get`(dn varchar(64))
    BEGIN
      declare c1 int; declare b1 int default 0; declare tn varchar(64);
      declare c2 int; declare b2 int; declare cn varchar(64);

      select count(*) into c1 from information_schema.tables where table_schema=dn;
      delete from nullcolumns;
      -- outer loop: one iteration per table in schema dn
      while b1<c1 do
        select table_name into tn from information_schema.tables
        where table_schema=dn order by table_name limit b1,1;

        select count(*) into c2 from information_schema.columns
        where table_schema=dn and table_name=tn;
        set b2=0;
        -- inner loop: one iteration per column of table tn
        while b2<c2 do
          select column_name into cn from information_schema.columns
          where table_schema=dn and table_name=tn
          order by ordinal_position limit b2,1;

          -- @nor = number of rows in the table; empty tables are skipped
          set @nor := 0;
          set @query := concat("select count(*) into @nor from `", dn,"`.`",tn,"`");
          prepare s1 from @query;
          execute s1; deallocate prepare s1;

          if @nor>0 then
            -- @res = 1 when max(column) is NULL, i.e. every value is NULL
            set @res := 0;
            set @query := concat("select ((select max(`",cn,"`) from `", dn,"`.`",tn,"`) is NULL) into @res");
            prepare s1 from @query;
            execute s1; deallocate prepare s1;

            if @res=1 then
              insert into nullcolumns values(tn,cn);
            end if;
          end if;

          set b2=b2+1;
        end while;

        set b1=b1+1;
      end while;
      select * from nullcolumns;
    END //
    

    You can execute the stored procedure as SQL in your phpMyAdmin 'as it is'; just change the delimiter (at the bottom of the SQL query box) to //. Then, with the name of the database to scan as the argument:

    call `get`('your_database');
    

    And Enjoy :)

    You can now see the table nullcolumns listing every column whose values are 100% NULL, along with its table name.

    In the procedure code, the check if @nor>0 keeps empty tables out of the results; you can remove that restriction.

  • 2020-12-08 08:42

    You can avoid using a procedure by dynamically creating (from the INFORMATION_SCHEMA.COLUMNS table) a string that contains the SQL you wish to execute, then preparing a statement from that string and executing it.

    The SQL we wish to build will look like:

    SELECT * FROM (
      SELECT 'tableA' AS `table`,
             IF(COUNT(`column_a`), NULL, 'column_a') AS `column`
      FROM   tableA
    UNION ALL
      SELECT 'tableB' AS `table`,
             IF(COUNT(`column_b`), NULL, 'column_b') AS `column`
      FROM   tableB
    UNION ALL
      -- etc.
    ) t WHERE `column` IS NOT NULL
    

    This can be done using the following:

    SET group_concat_max_len = 4294967295; -- to overcome default 1KB limitation
    
    SELECT CONCAT(
             'SELECT * FROM ('
           ,  GROUP_CONCAT(
                'SELECT ', QUOTE(TABLE_NAME), ' AS `table`,'
              , 'IF('
              ,   'COUNT(`', REPLACE(COLUMN_NAME, '`', '``'), '`),'
              ,   'NULL,'
              ,    QUOTE(COLUMN_NAME)
              , ') AS `column` '
              , 'FROM `', REPLACE(TABLE_NAME, '`', '``'), '`'
              SEPARATOR ' UNION ALL '
             )
           , ') t WHERE `column` IS NOT NULL'
           )
    INTO   @sql
    FROM   INFORMATION_SCHEMA.COLUMNS
    WHERE  TABLE_SCHEMA = DATABASE();
    
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    

    See it on sqlfiddle.
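
If you would rather assemble the same statement in a client script than with GROUP_CONCAT, a rough Python sketch follows. It takes (table, column) pairs, for example the rows of INFORMATION_SCHEMA.COLUMNS for the current schema, and returns the UNION ALL query text. The function names are hypothetical, and the string quoting assumes MySQL's default sql_mode (backslash escapes enabled).

```python
from typing import Iterable, Tuple


def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def quote_string(value: str) -> str:
    """Quote a SQL string literal (assumes backslash escapes are enabled)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def build_union_query(columns: Iterable[Tuple[str, str]]) -> str:
    """Build one SELECT per (table, column) pair and join them with UNION ALL."""
    parts = [
        f"SELECT {quote_string(t)} AS `table`, "
        f"IF(COUNT({quote_ident(c)}), NULL, {quote_string(c)}) AS `column` "
        f"FROM {quote_ident(t)}"
        for t, c in columns
    ]
    if not parts:
        raise ValueError("no columns given")
    return (
        "SELECT * FROM ("
        + " UNION ALL ".join(parts)
        + ") t WHERE `column` IS NOT NULL"
    )


print(build_union_query([("tableA", "column_a"), ("tableB", "column_b")]))
```

Building the text client-side sidesteps the group_concat_max_len limit, at the cost of one extra round trip to read the column list first.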

  • 2020-12-08 08:46

    You can take advantage of how the COUNT aggregate function treats NULLs. With a column as its argument, COUNT returns the number of non-NULL values, while COUNT(*) returns the total number of rows. Dividing the two gives the fraction of non-NULL values in each column; a column whose ratio is 0 contains only NULLs.

    I will give an example with the following table structure:

    CREATE TABLE `t1` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
       `col_1` int(10) unsigned DEFAULT NULL,
       `col_2` int(10) unsigned DEFAULT NULL,
       PRIMARY KEY (`id`)
    ) ;
    
    -- let's fill the table with random values
    INSERT INTO t1(col_1,col_2) VALUES(1,2);
    INSERT INTO t1(col_1,col_2) 
    SELECT 
    IF(RAND() > 0.5, NULL, FLOOR(RAND()*1000)), 
    IF(RAND() > 0.5, NULL, FLOOR(RAND()*1000)) FROM t1;
    
    -- run the last INSERT-SELECT statement a few times
    SELECT COUNT(col_1)/COUNT(*) AS col_1_ratio, 
    COUNT(col_2)/COUNT(*) AS col_2_ratio FROM t1;
    

    You can write a function that automatically constructs a query from the INFORMATION_SCHEMA database by passing the table name as input variable. Here's how to obtain the structure data directly from INFORMATION_SCHEMA tables:

    -- build one COUNT(col)/COUNT(*) expression per non-primary-key column of t1
    SELECT GROUP_CONCAT(
             CONCAT('COUNT(`', c.COLUMN_NAME, '`)/COUNT(*) AS `', c.COLUMN_NAME, '_ratio`')
             ORDER BY c.ORDINAL_POSITION)
    INTO   @column_list
    FROM   INFORMATION_SCHEMA.COLUMNS c
    WHERE  c.COLUMN_KEY <> 'PRI' AND c.TABLE_SCHEMA = DATABASE()
           AND c.TABLE_NAME = 't1';
    -- run the generated query; a ratio of 0 marks an all-NULL column
    SET @null_counters_sql := CONCAT('SELECT ', @column_list, ' FROM t1');
    PREPARE NULL_COUNTERS FROM @null_counters_sql;
    EXECUTE NULL_COUNTERS;
    DEALLOCATE PREPARE NULL_COUNTERS;
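
To see the COUNT behaviour in isolation, here is a small self-contained Python sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL (an assumption made purely so the example runs without a server; COUNT(col) skipping NULLs is standard SQL behaviour in both). The table contents are invented for the demo, and the division is done in Python because SQLite would divide two integers as integers.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table t1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (id INTEGER PRIMARY KEY, col_1 INTEGER, col_2 INTEGER)"
)
conn.executemany(
    "INSERT INTO t1 (col_1, col_2) VALUES (?, ?)",
    [(1, None), (None, None), (7, None)],  # col_2 is NULL in every row
)

columns = ["col_1", "col_2"]
select_list = ", ".join(f'COUNT("{c}")' for c in columns)
row = conn.execute(f"SELECT COUNT(*), {select_list} FROM t1").fetchone()
total, non_null_counts = row[0], row[1:]

# Fraction of non-NULL values per column (skipped for an empty table).
ratios = {c: n / total for c, n in zip(columns, non_null_counts)} if total else {}
# A column is all-NULL when the table has rows but COUNT(col) is 0.
all_null = [c for c, n in zip(columns, non_null_counts) if total > 0 and n == 0]

print(ratios)
print(all_null)
```

Comparing COUNT(col) with 0 directly, rather than inspecting the ratio, avoids any rounding questions and makes the empty-table case explicit.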
    
  • 2020-12-08 08:46

    I think you can do this with GROUP_CONCAT and GROUP BY:

    select length(replace(GROUP_CONCAT(my_col), ',', ''))
    from my_table
    group by my_col
    

    (untested)

    EDIT: the docs don't seem to state that GROUP_CONCAT needs a corresponding GROUP BY, so try this:

    select 
        length(replace(GROUP_CONCAT(col_a), ',', '')) as len_a
        , length(replace(GROUP_CONCAT(col_b), ',', '')) as len_b
        , length(replace(GROUP_CONCAT(col_c), ',', '')) as len_c
    from my_table
    