Bulk extraction of Oracle BLOBs into files - advice/tuning help needed


Question


I am working on a project that requires migrating existing Oracle BLOBs out to files. The environment to read from is a shared Oracle 10gR2 server. I currently have a script that uses UTL_FILE, but the process is quite slow: it takes around 3 hours to extract 25 GB of sample data (roughly 8 GB/hour), and the actual data to be moved is on the order of 1 TB, which at that rate would take about five days. I need help/advice on tuning this significantly.

Here is my process:

  1. Open a cursor to fetch the list of BLOB ids and file names
  2. Start a loop to go through each BLOB
  3. Extract the BLOB using BLOB2File, a custom stored procedure (picked up from a website and modified slightly); a rough sketch of the driving loop is shown below
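
The post only outlines steps 1 and 2, so here is a minimal sketch of what that driving loop might look like; the fileName column and the BLOB_DIR directory object are assumptions, not part of the original post:

DECLARE
  -- step 1: cursor over the BLOB ids and target file names
  CURSOR c_blobs IS
    SELECT id, fileName
    FROM blobTable;
BEGIN
  -- step 2: loop over each BLOB and extract it (step 3)
  FOR r IN c_blobs
  LOOP
    BLOB2File(r.id, r.fileName, 'BLOB_DIR');
  END LOOP;
END;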

Here is the code for BLOB2File:

create or replace
PROCEDURE BLOB2File(
    lngBlobID IN NUMBER,
    sFileName IN VARCHAR2,
    sDir      IN VARCHAR2)
AS
  iFileLen INTEGER;
  iLineLen INTEGER := 32000; -- chunk size; utl_file allows at most 32767 bytes per write
  vStart   NUMBER  := 1;
  vBlob BLOB;
  l_output utl_file.file_type;
  my_vr RAW(32000);
  iTmp INTEGER;
BEGIN
  -- get blob details
  LOG_IT('Entered. Blob Id: ' || lngBlobID || ', File Name: ' || sFileName || ', Directory: ' || sDir);
  SELECT blobData, lengthb(blobData)
  INTO vBlob, iFileLen
  FROM blobTable
  WHERE id = lngBlobID;
  LOG_IT('Acquired the blob. Blob size: ' || TO_CHAR(iFileLen));
  l_output := utl_file.fopen(sDir, sFileName,'wb', iLineLen);
  vStart   := 1;
  iTmp     := iFileLen;
  -- if small enough for a single write
  IF iFileLen < iLineLen THEN
    utl_file.put_raw(l_output,vBlob);
    utl_file.fflush(l_output);
  ELSE -- write in pieces
    vStart      := 1;
    WHILE vStart < iFileLen AND iLineLen > 0
    LOOP
      dbms_lob.read(vBlob,iLineLen,vStart,my_vr);
      utl_file.put_raw(l_output,my_vr);
      utl_file.fflush(l_output);
      -- advance the start position for the next chunk
      vStart := vStart + iLineLen;
      -- shrink the final read to the remaining bytes when fewer than 32000 remain
      iTmp := iTmp - iLineLen;
      IF iTmp < iLineLen THEN
        iLineLen := iTmp;
      END IF;
    END LOOP;
  END IF;
  utl_file.fclose(l_output);
  LOG_IT('Exited');

  EXCEPTION
  WHEN OTHERS THEN
    -- close the file handle if an error left it open, then log the failure
    IF utl_file.is_open(l_output) THEN
      utl_file.fclose(l_output);
    END IF;
    LOG_IT('**ERROR** ' || SQLERRM, SQLCODE, DBMS_UTILITY.FORMAT_ERROR_BACKTRACE);
END;

LOG_IT is a stored procedure that logs to a table; there should not be any significant hit there. I tried optimizing step 1 by using a BULK FETCH instead of a normal FETCH, but it didn't yield any significant improvement.
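
For reference, the bulk-fetch variant of step 1 looks roughly like this (a sketch under the same assumption about the fileName column; the LIMIT of 100 is arbitrary). It speeds up only the cursor fetch itself, which would explain why it made little difference if the time is dominated by reading and writing the LOB data:

DECLARE
  TYPE t_ids   IS TABLE OF blobTable.id%TYPE;
  TYPE t_names IS TABLE OF VARCHAR2(4000);
  l_ids   t_ids;
  l_names t_names;
  CURSOR c_blobs IS
    SELECT id, fileName FROM blobTable;
BEGIN
  OPEN c_blobs;
  LOOP
    -- fetch ids and names in batches instead of row by row
    FETCH c_blobs BULK COLLECT INTO l_ids, l_names LIMIT 100;
    EXIT WHEN l_ids.COUNT = 0;
    FOR i IN 1 .. l_ids.COUNT
    LOOP
      BLOB2File(l_ids(i), l_names(i), 'BLOB_DIR');
    END LOOP;
  END LOOP;
  CLOSE c_blobs;
END;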

Can anybody suggest any ideas for improvement or, even better, a more performant way of approaching this?


Answer 1:


Assuming that your hardware is sufficient to handle far more than 8 GB/hour of sustained writes to sDir (and to handle reading a similar amount from blobTable, plus whatever other I/O your system needs), the simplest option would likely be to spawn a few parallel sessions, each of which calls this procedure. For example, if you wanted to run three jobs in parallel, each extracting one LOB, you could do something like this:

DECLARE
  l_jobno INTEGER;
BEGIN
  dbms_job.submit( l_jobno, 'begin BLOB2File( 1, ''1.lob'', ''DIRECTORY'' ); end;', sysdate + interval '5' second );
  dbms_job.submit( l_jobno, 'begin BLOB2File( 2, ''2.lob'', ''DIRECTORY'' ); end;', sysdate + interval '5' second );
  dbms_job.submit( l_jobno, 'begin BLOB2File( 3, ''3.lob'', ''DIRECTORY'' ); end;', sysdate + interval '5' second );
  commit;
END;

In reality, you probably don't want a separate job for every BLOB; you probably want to generate a smaller number of jobs and give each one a range of lngBlobID values to work on. The number of jobs Oracle will run at any one time is limited by the JOB_QUEUE_PROCESSES parameter, so you could submit thousands of jobs and just let Oracle limit how many it runs simultaneously.
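
One way to carve up the work (a sketch; BLOB2File_Range is a hypothetical wrapper that calls BLOB2File for every id in a range, and it again assumes a fileName column):

create or replace
PROCEDURE BLOB2File_Range(
    p_from IN NUMBER,
    p_to   IN NUMBER,
    sDir   IN VARCHAR2)
AS
BEGIN
  -- extract every BLOB whose id falls in [p_from, p_to]
  FOR r IN (SELECT id, fileName
            FROM blobTable
            WHERE id BETWEEN p_from AND p_to)
  LOOP
    BLOB2File(r.id, r.fileName, sDir);
  END LOOP;
END;

Each range is then submitted as its own job, and JOB_QUEUE_PROCESSES caps how many actually run at once (the id boundaries here are made up for illustration):

DECLARE
  l_jobno INTEGER;
BEGIN
  -- four jobs, each covering a quarter of the id space
  dbms_job.submit( l_jobno, 'begin BLOB2File_Range(      1,  250000, ''DIRECTORY'' ); end;' );
  dbms_job.submit( l_jobno, 'begin BLOB2File_Range( 250001,  500000, ''DIRECTORY'' ); end;' );
  dbms_job.submit( l_jobno, 'begin BLOB2File_Range( 500001,  750000, ''DIRECTORY'' ); end;' );
  dbms_job.submit( l_jobno, 'begin BLOB2File_Range( 750001, 1000000, ''DIRECTORY'' ); end;' );
  commit;
END;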



Source: https://stackoverflow.com/questions/10456048/bulk-extraction-of-oracle-blobs-into-files-advice-tuning-help-needed
