Data load to huge partitioned table

有刺的猬 2021-02-10 14:25

I have a huge table, range-partitioned by price_date and then hash-subpartitioned by fund_id. The table has 430 million rows. Every day I run a batch job which inserts 1.5 mi

2 Answers
  • 2021-02-10 15:07

    I am not sure which database you are using.

    In the case of SQL Server, try creating a staging table, loading the data into it, and creating the indexes and constraints on that staging table. Then use

    ALTER TABLE with the SWITCH clause to add it as a new partition of your current table.

  • 2021-02-10 15:22

    Go read this:

    http://www.evdbt.com/TGorman%20TD2005%20DWScale.doc

    This works.

    Do you need the staging area to be accessible to online queries, or do you have late-arriving data (for example, can you receive a row today for any day other than today or yesterday)?

    I've got code which scans through the set of records I'm about to load and marks the local index subpartitions unusable wherever the table subpartition is going to be modified. (I'm using this instead of the approach in Tim Gorman's document above because I have late-arriving data and need the staging area and the warehouse proper to be available to end users simultaneously.)

    My table is range/list, not range/hash, so you're going to have to modify it somewhat, probably using the ORA_HASH function to find the right subpartition(s). I also write out to a table which subpartitions I'm going to mark as unusable, so I can do all of that in a single pass. It may be slightly more efficient to mark all of a subpartition's indexes as unusable in a single ALTER TABLE statement. I was originally disabling only the BITMAP indexes, but having even a single B*tree index offline during the data load improved efficiency significantly.
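
    For a range/hash table, the ORA_HASH lookup mentioned above might look roughly like the sketch below. It assumes 16 hash subpartitions per partition (a power of two) and a single-column subpartition key. The ORA_HASH(key, N - 1) + 1 mapping to subpartition position is widely observed but is not something Oracle documents as a guarantee, so check it against your own data first. MY_SOURCE_TABLE, FUND_ID, the :target_partition bind and SOME_SUBPARTITION are placeholders, not names from my system.

      -- Assumption: 16 hash subpartitions, so the bucket range is 0..15.
      -- Lists the subpartitions of one range partition that the incoming
      -- rows would land in.
      select sp.partition_name, sp.subpartition_name
        from user_tab_subpartitions sp
       where sp.table_name     = 'MY_TARGET_TABLE'
         and sp.partition_name = :target_partition
         and sp.subpartition_position in
               (select ora_hash(s.fund_id, 15) + 1
                  from my_source_table s);

      -- Single-statement form: marks every local index on one table
      -- subpartition unusable, instead of one ALTER INDEX per index.
      alter table MY_TARGET_TABLE
        modify subpartition SOME_SUBPARTITION unusable local indexes;

    The range partition itself still has to be found from the date column, as the procedure below does with the partition high values.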

      procedure DISABLE_LOCAL_INDEXES as
         l_part_name varchar2(30);
         l_subpart_name varchar2(30);
         l_sql varchar2(2000);
         type partition_rec_type is record
         (table_name         varchar2(30),
          partition_name     varchar2(30),
          subpartition_name  varchar2(30),
          list_value         varchar2(10),
          min_ts             timestamp,
          max_ts             timestamp);
         type partition_recs_type
                             is table of partition_rec_type;
         l_partition_recs    partition_recs_type := partition_recs_type();
         l_partition_rec     partition_rec_type;
         l_subpart_id        number := 1;
         l_start_ts          timestamp;
         l_end_ts            timestamp;
         l_found_list_part boolean;
       begin
         -- build set of subpartitions
         l_start_ts := to_timestamp ('1970-01-01', 'yyyy-mm-dd');
         for i in (select p.table_name, p.partition_name, sp.subpartition_name,
                          p.high_value as part_high_value, 
                          sp.high_value as subpart_high_value,
                          p.partition_position, sp.subpartition_position
                     from user_tab_subpartitions sp
                          inner join user_tab_partitions p
                             on p.table_name     = sp.table_name
                            and p.partition_name = sp.partition_name
                    where p.table_name = 'MY_TARGET_TABLE'
                    order by p.partition_position, sp.subpartition_position)
         loop
           if ( (i.partition_position <> 1) and (i.subpartition_position = 1) ) then
             l_start_ts    := l_end_ts + to_dsinterval('0 00:00:00.000000001');
           end if;
           if (i.subpartition_position = 1) then
             l_end_ts := high_val_to_ts (i.part_high_value);
             l_end_ts := l_end_ts - to_dsinterval('0 00:00:00.000000001');
           end if;
           l_partition_rec.table_name        := i.table_name;
           l_partition_rec.partition_name    := i.partition_name;
           l_partition_rec.subpartition_name := i.subpartition_name;
           l_partition_rec.list_value        := i.subpart_high_value;
           l_partition_rec.min_ts            := l_start_ts;
           l_partition_rec.max_ts            := l_end_ts;
           l_partition_recs.extend();
           l_partition_recs(l_subpart_id) := l_partition_rec;
           l_subpart_id := l_subpart_id + 1;
         end loop;
         -- for every combination of list column and date column
         -- which is going to be pushed to MY_TARGET_TABLE
         -- find the subpartition
         -- otherwise find the partition and default subpartition
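         for i in (select distinct LIST_COLUMN, DATE_COLUMN as DATE_VALUE
                     from MY_SOURCE_TABLE
                    -- placeholder predicate: substitute your own filter for
                    -- the rows that are about to be moved to the target
                    where IT_IS_BEING_MOVED_TO_TARGET IS TRUE)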
         loop
           -- iterate over the partitions
           l_found_list_part := false;
           for k in l_partition_recs.first..l_partition_recs.last
           loop
             -- find the right partition / subpartition for list_value / date_value
             if (    (i.DATE_VALUE >= l_partition_recs(k).min_ts)
                 and (i.DATE_VALUE <= l_partition_recs(k).max_ts) ) then
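               if (l_found_list_part = false) then
                 if (to_char(i.LIST_COLUMN, '9999') = l_partition_recs(k).list_value) then
                   -- exact list match: remember this subpartition
                   l_partition_rec := l_partition_recs(k);
                   l_found_list_part := true;
                 elsif (l_partition_recs(k).list_value = 'DEFAULT') then
                   -- no exact match so far: fall back to the DEFAULT subpartition
                   l_partition_rec := l_partition_recs(k);
                 end if;
               end if;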
             end if;
           end loop;  -- over l_partition_recs
           -- log those partitions for later index rebuild
           begin
             insert into index_subpart_rebuild
               (table_name, partition_name, subpartition_name)
             values
               (l_partition_rec.table_name, l_partition_rec.partition_name,
                l_partition_rec.subpartition_name);
           exception
             -- the subpartition is already logged; nothing to do
             when dup_val_on_index then null;
           end;
         end loop;  -- over MY_SOURCE_TABLE (LIST_COLUMN, DATE_COLUMN) values
         commit;
         for i in (select ui.index_name, uis.subpartition_name
                     from user_indexes ui
                          inner join user_ind_subpartitions uis
                             on ui.index_name = uis.index_name
                          inner join index_subpart_rebuild re
                             on re.subpartition_name = uis.subpartition_name
                    where ui.table_name = 'MY_TARGET_TABLE')
         loop
           l_sql := 'alter index ' || i.index_name ||
                    ' modify subpartition ' || i.subpartition_name || ' unusable';
           execute immediate l_sql;
         end loop;
       end DISABLE_LOCAL_INDEXES;
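
    The procedure above calls HIGH_VAL_TO_TS, which is not shown, and it only logs the subpartitions for a later rebuild. Below is a minimal sketch of those two missing pieces, written as assumptions rather than as the original code. HIGH_VAL_TO_TS here simply evaluates the partition's high-value expression through dynamic SQL and treats MAXVALUE as an arbitrary far-future timestamp. REBUILD_LOCAL_INDEXES is a hypothetical name for the step that runs after the load.

      function HIGH_VAL_TO_TS (p_high_value in varchar2) return timestamp as
        l_ts timestamp;
      begin
        -- Assumption: a MAXVALUE partition is treated as ending far in the future.
        if upper(trim(p_high_value)) = 'MAXVALUE' then
          return to_timestamp ('9999-12-31', 'yyyy-mm-dd');
        end if;
        -- HIGH_VALUE holds a SQL expression (a TIMESTAMP literal or a TO_DATE
        -- call), so evaluate it; a DATE result converts implicitly.
        execute immediate 'select ' || p_high_value || ' from dual' into l_ts;
        return l_ts;
      end HIGH_VAL_TO_TS;

      procedure REBUILD_LOCAL_INDEXES as
      begin
        -- Rebuild every index subpartition on the target that is still unusable.
        for i in (select uis.index_name, uis.subpartition_name
                    from user_ind_subpartitions uis
                         inner join user_indexes ui
                            on ui.index_name = uis.index_name
                   where ui.table_name = 'MY_TARGET_TABLE'
                     and uis.status    = 'UNUSABLE')
        loop
          execute immediate 'alter index ' || i.index_name ||
                            ' rebuild subpartition ' || i.subpartition_name;
        end loop;
        -- Clear the log so the next load starts from an empty list.
        delete from index_subpart_rebuild;
        commit;
      end REBUILD_LOCAL_INDEXES;

    The intended order would be DISABLE_LOCAL_INDEXES, then the data load, then REBUILD_LOCAL_INDEXES. Note that an unusable unique index will still cause inserts to fail, so this approach fits non-unique local indexes.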
    