问题
I make a query with INNER JOIN and the result was 12 millions lines. I like to put this in a table. I did some tests and when I created the table using clause AS SELECT was more faster than, create the table first and run a INSERT with SELECT after. I don't understand why. Somebody can explain for me? Tks
回答1:
If you use 'create table as select' (CTAS)
CREATE TABLE new_table AS
SELECT *
FROM old_table
you automatically do a direct-path insert of the data. If you do an
INSERT INTO new_table AS
SELECT *
FROM old_table
you do a conventional insert. You have to use the APPEND-hint, if you want to do a direct path insert instead. So you have to do
INSERT /*+ APPEND */ INTO new_table AS
SELECT *
FROM old_table
to get a similar performance as in 'CREATE TABLE AS SELECT'.
How does the usual conventional insert work?
Oracle checks the free list of the table for an already used block of the table segment that has still free space. If the block isn't in the buffer cache it is read into the buffer cache. Eventually this block is read back to the disk. During this process undo for the block is written (only a small amount of data is necessary here), data structures are updated, e.g. if necessary, the free list,that esides in the segment header and all these changes are written to the redo-buffer, too.
How does a direct-path insert work?
The process allocates space above the high water mark of the table, that is, beyond the already used space. It writes the data directly to the disk, without using a buffer cache. And it is also written to the redo buffer. When the session is committed, the highwater mark is raised beyond the new written data and this data is now visible to other sessions.
How can I improve CTAS and direct-path inserts?
- You can create he tale in NOLOGGING mode, than no redo information is written. If you do this, you should make a backup of the tablespace that contains the table after the insert, otherwisse you can not recover the table if you need this.
You can do the select in parallel
You can do the insert in parallel
If you have to maintain indexes and constraints or even triggers during an insert operation this can slow down your insert operation drastically. So you should avoid this and create indexes after the insert and maybe create constraints with novalidata.
回答2:
With SELECT STATEMENT The table you create has no primary key, index, identity ... the columns are always allow NULL. And It does not have to be written to the transaction log (and therefore does not rollback). It's seem like a "Naked Table".
With INSERT ... SELECT then table must be created before so when you create table you can define key, index, identity ... And it will use transaction logs When applied to large amounts of data, it is very slow.
来源:https://stackoverflow.com/questions/51432405/why-create-table-as-select-is-more-faster-than-insert-with-select