How do I merge two similar database-schema in PL/SQL?

问题

The database-schema (Source and target) are very large (each has over 350 tables). I have got the task to somehow merge these two tables into one. The data itself (whats in the tables) has to be migrated. I have to be careful that there are no double entries for primary keys before or while merging the schemata. Has anybody ever done that already and would be able to provide me his solution or could anyone help me get a approach to the task? My approaches all failed and my advisor just tells me to get help online :/

To my approach: I have tried using the "all_constraints" table to get all pks from my db.

SELECT cols.table_name, cols.column_name, cols.position, cons.status, cons.owner
FROM all_constraints cons, all_cons_columns cols
WHERE cols.owner = 'DB'
AND cons.constraint_type = 'P'  
AND cons.constraint_name = cols.constraint_name
AND cons.owner = cols.owner
ORDER BY cols.table_name, cols.position;

I also "know" that there has to be a sequence for the primary keys to add values to it:

CREATE SEQUENCE seq_pk_addition
MINVALUE 1
MAXVALUE 99999999999999999999
START WITH 1
INCREMENT BY 1
CACHE 20;

Because I am a noob if it comes to pl/sql (or sql in general) So how/what I should do next? :/

Here is a link for an ERD of the database: https://ufile.io/9tdoj

virus scan: https://www.virustotal.com/#/file/dbe5f418115e50313a2268fb33a924cc8cb57a43bc85b3bbf5f6a571b184627e/detection

回答1:

As promised to help in my comment, i had prepared a dynamic code which you can try to get the data merged with the source and target tables. The logic is as below:

Step1: Get all the table names from the SOURCE schema. In the query below you can you need to replace the schema(owner) name respectively. For testing purpose i had taken only 1 table so when you run it,remove the table name filtering clause.

Step2: Get the constrained columns names for the table. This is used to prepared the ON clause which would be later used for MERGE statement.

Step3: Get the non-constrainted column names for the table. This would be used in UPDATE clause while using MERGE.

Step4: Prepare the insert list when the data doesnot match ON conditon of MERGE statement.

Read my inline comments to understand each step.

CREATE OR REPLACE PROCEDURE COPY_TABLE
AS
Type OBJ_NME is table of varchar2(100) index by pls_integer;

--To hold Table name
v_obj_nm OBJ_NME ;

--To hold Columns of table
v_col_nm OBJ_NME;

v_othr_col_nm OBJ_NME;
on_clause VARCHAR2(2000);
upd_clause VARCHAR2(4000);
cntr number:=0;
v_sql VARCHAR2(4000);

col_list1  VARCHAR2(4000);
col_list2  VARCHAR2(4000);
col_list3  VARCHAR2(4000);
col_list4  varchar2(4000);
col_list5  VARCHAR2(4000);
col_list6  VARCHAR2(4000);
col_list7  VARCHAR2(4000);
col_list8  varchar2(4000);

BEGIN

--Get Source table names
SELECT OBJECT_NAME
BULK COLLECT INTO v_obj_nm
FROM all_objects 
WHERE owner LIKE  'RU%' -- Replace `RU%` with your Source schema name here
AND object_type = 'TABLE'
and object_name ='TEST'; --remove this condition if you want this to run for all tables

FOR I IN 1..v_obj_nm.count
loop
--Columns with Constraints 
  SELECT column_name
  bulk collect into v_col_nm 
  FROM user_cons_columns
  WHERE table_name = v_obj_nm(i);  

--Columns without Constraints remain columns of table
SELECT *
BULK COLLECT INTO v_othr_col_nm
from (
      SELECT column_name 
      FROM user_tab_cols
      WHERE table_name = v_obj_nm(i)
      MINUS
      SELECT column_name  
      FROM user_cons_columns
      WHERE table_name = v_obj_nm(i));

--Prepare Update Clause  
 FOR l IN 1..v_othr_col_nm.count
  loop
   cntr:=cntr+1;
   upd_clause := 't1.'||v_othr_col_nm(l)||' = t2.' ||v_othr_col_nm(l);    
   upd_clause:=upd_clause ||' and ' ;

   col_list1:= 't1.'||v_othr_col_nm(l) ||',';
   col_list2:= col_list2||col_list1;   

   col_list5:= 't2.'||v_othr_col_nm(l) ||',';
   col_list6:= col_list6||col_list5;

   IF (cntr = v_othr_col_nm.count)
   THEN 
    --dbms_output.put_line('YES');
     upd_clause:=rtrim(upd_clause,' and');
     col_list2:=rtrim( col_list2,',');
     col_list6:=rtrim( col_list6,',');
   END IF;
     dbms_output.put_line(col_list2||col_list6); 
   --dbms_output.put_line(upd_clause);
   End loop;
  --Update caluse ends     

   cntr:=0; --Counter reset  

 --Prepare ON clause  
  FOR k IN 1..v_col_nm.count
  loop
   cntr:=cntr+1;
   --dbms_output.put_line(v_col_nm.count || cntr);
   on_clause := 't1.'||v_col_nm(k)||' = t2.' ||v_col_nm(k);    
   on_clause:=on_clause ||' and ' ;

   col_list3:= 't1.'||v_col_nm(k) ||',';
   col_list4:= col_list4||col_list3;    

   col_list7:= 't2.'||v_col_nm(k) ||',';
   col_list8:= col_list8||col_list7;    

   IF (cntr = v_col_nm.count)
   THEN 
    --dbms_output.put_line('YES');
    on_clause:=rtrim(on_clause,' and');
    col_list4:=rtrim( col_list4,',');
    col_list8:=rtrim( col_list8,',');
   end if;

   dbms_output.put_line(col_list4||col_list8);
 -- ON clause ends

 --Prepare merge Statement

    v_sql:= 'MERGE INTO '|| v_obj_nm(i)||' t1--put target schema name before v_obj_nm
              USING (SELECT * FROM '|| v_obj_nm(i)||') t2-- put source schema name befire v_obj_nm here 
              ON ('||on_clause||')
              WHEN MATCHED THEN
              UPDATE
              SET '||upd_clause ||              
              ' WHEN NOT MATCHED THEN
              INSERT  
              ('||col_list2||','
                ||col_list4||
              ')
              VALUES
              ('||col_list6||','
                ||col_list8||          
               ')';

      dbms_output.put_line(v_sql);   
      execute immediate v_sql;
  end loop;    
End loop;
END;
/

Execution:

exec COPY_TABLE

Output:

anonymous block completed

PS: i have tested this with a table with 2 columns out of which i was having unique key constraint .The DDL of table is as below:

At the end i wish you could understand my code(you being a noob) and implement something similar if the above fails for your requirement.

 CREATE TABLE TEST
       (    COL2 NUMBER, 
            COLUMN1 VARCHAR2(20 BYTE), 
            CONSTRAINT TEST_UK1 UNIQUE (COLUMN1)  
       ) ;

回答2:

Oh dear! Normally, such a question would be quickly closed as "too broad", but we need to support victims of evil advisors!

As for the effort, I would need a week full time for an experienced expert plus two days quality checking for an experierenced QA engineer.

First of all, there is no way that such a complex data merge will work on the first try. That means that you'll need test copies of both schemas that can be easily rebuild. And you'll need a place to try it out. Normally this is done with an export of both schemas and an empty dev database.

Next, you need both schemas close enough to be able to compare the data. I'd do it with an import of the export files mentione above. If the schema names are identical than rename one during import.

Next, I'd doublecheck if the structure is really identical, with queries like

 SELECT a.owner, a.table_name, b.owner, b.table_name
   FROM all_tables a 
   FULL JOIN all_tables b 
     ON a.table_name = b.table_name
    AND a.owner = 'SCHEMAA' 
    AND b.owner = 'SCHEMAB'
  WHERE a.owner IS NULL or b.owner IS NULL;

Next, I'd check if the primary and unique keys have overlaps:

 SELECT id FROM schemaa.table1
 INTERSECT
 SELECT id FROM schemab.table1;

As there are 300+ tables, I'd generate those queries:

 DECLARE 
   stmt VARCHAR2(30000);
   n NUMBER;
   schema_a CONSTANT VARCHAR2(128 BYTE) := 'SCHEMAA';
   schema_b CONSTANT VARCHAR2(128 BYTE) := 'SCHEMAB';
 BEGIN
   FOR c IN (SELECT owner, constraint_name, table_name,
                    (SELECT LISTAGG(column_name,',') WITHIN GROUP (ORDER BY position)
                       FROM all_cons_columns c
                      WHERE s.owner = c.owner
                        AND s.constraint_name = c.constraint_name) AS cols
               FROM all_constraints s
              WHERE s.constraint_type IN ('P') 
                AND s.owner = schema_a) 
   LOOP
     dbms_output.put_line('Checking pk '||c.constraint_name||' on table '||c.table_name);
     stmt := 'SELECT count(*) FROM '||schema_a||'.'||c.table_name
          ||' JOIN '||schema_b||'.'||c.table_name
          || ' USING ('||c.cols||')';
     --dbms_output.put_line('Query '||stmt);
     EXECUTE IMMEDIATE stmt INTO n;
     dbms_output.put_line('Found '||n||' overlapping primary keys in table '||c.table_name);
   END LOOP;
 END;
 /

回答3:

1st of all, for 350 tables, most probably, would need an dynamic SQL.

Declare a CURSOR or a COLLECTION - table of VARCHAR2 with all table names.
Declare a string variable to build the dynamic SQL.
loop through the entire list of the tables name and, for each table generates a string which will be executed as SQL with EXECUTE IMMEDIATE command.
The dynamic SQL which will be built, should insert the values from source table into the target table. In case the PK already exists, in target table, should be checked the field which represents the last updated date and if in source table it is bigger than in target table, then perform an update in target table, else, do nothing.

来源：https://stackoverflow.com/questions/51059882/how-do-i-merge-two-similar-database-schema-in-pl-sql

标签

Oracle

primary-key

database-schema