Oracle: loading a large xml file?

猫巷女王i 2020-11-27 18:30

So now that I have a large bit of XML data I'm interested in:

http://blog.stackoverflow.com/2009/06/stack-overflow-creative-commons-data-dump

I'd like to l

3 Answers
  • 2020-11-27 18:47

    You can access XML files on the server via SQL. With your data in /tmp/tmp.xml, you would first declare the directory:

    SQL> create directory d as '/tmp';
    
    Directory created
    

    You could then query your XML file directly:

    SQL> SELECT XMLTYPE(bfilename('D', 'tmp.xml'), nls_charset_id('UTF8')) xml_data
      2    FROM dual;
    
    XML_DATA
    --------------------------------------------------------------------------------
    <?xml version="1.0" encoding="UTF-8"?>
    <badges>
      [...]
    

    To access the fields in your file, you could use the method described in another SO question, for example:

    SQL> SELECT UserId, Name, to_timestamp(dt, 'YYYY-MM-DD"T"HH24:MI:SS.FF3') dt
      2    FROM (SELECT XMLTYPE(bfilename('D', 'tmp.xml'), 
                                nls_charset_id('UTF8')) xml_data
      3            FROM dual),
      4         XMLTable('for $i in /badges/row
      5                              return $i'
      6                  passing xml_data
      7                  columns UserId NUMBER path '@UserId',
      8                          Name VARCHAR2(50) path '@Name',
      9                          dt VARCHAR2(25) path '@Date');
    
        USERID NAME       DT                         
    ---------- ---------- ---------------------------
          3718 Teacher    2008-09-15 08:55:03.923    
           994 Teacher    2008-09-15 08:55:03.957    
    
  • 2020-11-27 18:47

    Seems like you're talking about 2 issues -- first, getting the XML document to where Oracle can see it. And then maybe making it so that standard relational tools can be applied to the data.

    For the first, you or your DBA can create a table with a BLOB, CLOB, or BFILE column and load the data. If you have access to the server on which the database lives, you can define a DIRECTORY object in the database that points to an operating system directory. Then put your file there, and either set it up as a BFILE or read it in. (CLOB and BLOB store the data in the database; BFILE stores a pointer to a file on the operating system side.)

    Alternatively, use some tool that will let you write CLOBs directly to the database. Either way, that gets you to the point where you can see the XML instance document in the database.

    So now you have the instance document visible. Step 1 is done.

    Depending on the version, Oracle has some pretty good tools for shredding the XML into relational tables.

    It can be pretty declarative. While this gets beyond what I've actually done (I have a project where I'll be trying it this fall), you can theoretically load your XML Schema into the database and annotate it with the crosswalk between the relational tables and the XML. Then take your CLOB or BFILE and convert it to an XMLTYPE column with the defined schema and you're done -- the shredding happens automatically, the data is all there, it's all relational, and it's all available to standard SQL without the XQuery or XML extensions.

    Of course, if you'd rather use XQuery, then just take the CLOB or BFILE, convert it to an XMLTYPE, and go for it.

  • 2020-11-27 18:59

    I would do a simple:

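    # Assumes attribute order UserId, Name, Date; \047 emits the single quotes Oracle string literals need.
    # The column is named dt here because DATE is a reserved word in Oracle.
    grep '<row' file.xml |\
    gawk -F '"' '{printf("insert into badges(userid,name,dt) values (\047%s\047,\047%s\047,\047%s\047);\n",$2,$4,$6); }' > request.sql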
    

    Or you can create a Java program using a SAX parser. Each time your handler finds a new 'row' element, you get the attributes and insert a new record in your database.
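
    A minimal sketch of the SAX approach, assuming the badges file has the shape shown in the first answer (`<badges>` containing `<row UserId=… Name=… Date=…/>` elements). The class name `BadgeSaxLoader`, the `Badge` holder and the `SAMPLE` string are hypothetical and only there for illustration. To keep it runnable without a database, the handler collects rows into a list; in a real load you would presumably bind each row to a JDBC `PreparedStatement` and batch the inserts instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class BadgeSaxLoader {

    // Hypothetical holder for one <row>; fields mirror the attributes in the dump.
    static final class Badge {
        final long userId;
        final String name;
        final String date;

        Badge(long userId, String name, String date) {
            this.userId = userId;
            this.name = name;
            this.date = date;
        }
    }

    // Tiny stand-in for the real file, using values from the query output above.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<badges>"
        + "<row UserId=\"3718\" Name=\"Teacher\" Date=\"2008-09-15T08:55:03.923\"/>"
        + "<row UserId=\"994\" Name=\"Teacher\" Date=\"2008-09-15T08:55:03.957\"/>"
        + "</badges>";

    // Streams the document and records one Badge per <row> element.
    // SAX never builds the whole tree, so memory use stays flat for large files.
    static List<Badge> parse(InputStream in) throws Exception {
        final List<Badge> rows = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("row".equals(qName)) {
                    // This is the point where a database insert would go.
                    rows.add(new Badge(
                        Long.parseLong(atts.getValue("UserId")),
                        atts.getValue("Name"),
                        atts.getValue("Date")));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return rows;
    }

    public static void main(String[] args) throws Exception {
        InputStream in =
            new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8));
        for (Badge b : parse(in)) {
            System.out.println(b.userId + " " + b.name + " " + b.date);
        }
    }
}
```

    For the actual dump you would pass a `FileInputStream` for the file instead of the in-memory sample. Unlike the grep/gawk one-liner, this does not depend on the order of the attributes or on each row sitting on its own line.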
