Wednesday, January 11, 2012

Load/Import XML file into database table (Oracle or SQL Server)

Load/Import XML file into database table (Oracle or SQL Server)

Let say you have some XML file and you want to load xml data into database table, there are tons of xml structures and way to do the same.
I have taken very simple type of XML so that it should be easy to understand the basic of this functionality. I have seen lots of example given to take input parameter as xml string not xml as file. I am giving example to take input as xml file.
I also have given example that works with Oracle database as well in SQL server, because database could be any for this requirement.

I have tested following example with
  • Oracle Express Edition 10.2
  • SQL Server 2008
  • Window 7

  XML File Name: employee.xml

<ROWSET>
  <ROW ID="10">
    <EmpName>RiteshEmpName>
    <EmpSal>10000EmpSal>
    <DeptNo>1010DeptNo>
    <JoinDate>03-Apr-2006JoinDate>
  ROW>
  <ROW ID="20">
    <EmpName>ArtiEmpName>
    <EmpSal>20000EmpSal>
    <DeptNo>2020DeptNo>
    <JoinDate>05-May-2007JoinDate>
  ROW>
ROWSET>

Note: “ID” value just taken here to demo to read attribute value.

Oracle Solution

1)  Create Virtual directory from Oracle SQL Prompt

          SQL>  CREATE directory test_dir  AS 'c:\Test';

2) Copy “employee.xml” file into “c:\Test”

3) Create database Table name “EmployeeXML”


On the above table for Oracle I have created table with all columns data type as varchar2, I could not find correct syntax in a SQL query to convert datatype from String to int etc.
But anyway this is good to start, I will update later if I get a time.

3) Oracle SQL query to load data into EmplyeeXML Table

INSERT INTO EmployeeXML(EmpID,EmpName,EmpSal,DeptNo,JoinDate)
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','employeexml.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'/ROW/@ID') empid          ///* READ ATTRIBUTES */
,extractValue(value(x),'/ROW/EMPNAME') empname  ///*  READ VALUES */
,extractValue(value(x),'ROW/EMPSAL') empsal
,extractValue(value(x),'ROW/DEPTNO') deptno
,extractValue(value(x),'ROW/JOINDATE') joindate
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/ROWSET/ROW'))) x;

SQL Server Solution

In SQL server mostly you pass xml as string to stored procedure input parameter, but I was looking to pass XML file name as input parameter and below I am giving example using Bulk Insert.
In SQL server we need to go with 2 steps process, add data into temp table and then load data from temp table to emplyeexml table.

1)  Create Temp table and load XML into it
--Create Temp Table
CREATE TABLE #TempTable (Data XML)

2)  Create ‘EmployeeXML’ table and load XML data into it
--Check EmployeeXML table if already exist then TRUNCATE
IF NOT EXISTS (SELECT *
                 FROM   sys.objects
                 WHERE  object_id = OBJECT_ID(N'[dbo].[EmployeeXML]')
                        AND type in (N'U'))
    CREATE TABLE [dbo].[EmployeeXML] (
       [EmpId]         [int]   NOT NULL
      ,[EmpName]       [nvarchar](200)   NULL
      ,[EmpSal]        [int]   NULL
      ,[DeptNo]        [int]   NULL
      ,[JoinDate]    [datetime]   NULL
      )  
  ELSE
    TRUNCATE TABLE dbo.[EmployeeXML] 

3)  Insert XML into Temp table 
  /* Populate the temp table with the employee.xml file */
  INSERT INTO #TempTable
  SELECT *
  FROM   OPENROWSET(BULK 'c:\Ritesh\employee.xml',SINGLE_BLOB) AS data

4)  Insert XML into Temp table
  /* Import the users records from the working table */
  DECLARE  @XML    AS XML
           ,@hDoc  AS INT
 
  SELECT @XML = Data
  FROM   #TempTable
 
  EXEC sp_xml_preparedocument
    @hDoc OUTPUT ,
    @XML
 
   /* Insert data into employeexml table from temptable */
  INSERT INTO dbo.EmployeeXML
             ( EmpId
              ,EmpName
              , EmpSal
              ,DeptNo
              ,JoinDate)
  SELECT  Id
         ,EmpName
         ,EmpSal
         ,DeptNo
         ,CAST(JoinDate AS DATETIME)
  FROM   OPENXML (@hDoc, '/ROWSET/ROW', 2)  /* 2- READ VALUE (Hint below) */
               WITH ID            INT '@ID', /* READ ATTRIBUTE*/
                       EmpName        NVARCHAR(200) 'EmpName', /*READ VALUES */
                       EmpSal         INT 'EmpSale',
                       DeptNo         INT 'DeptNo',
                       JoinDate       VARCHAR(50) 'JoinDate') 

  /* Clean up and empty out temporary table */
  EXEC sp_xml_removedocument  @hDoc
 
4)  Drop temp table
  DROP TABLE #TempTable 

RESULT
Data will be inserted into database table

Hint:
Open the XML Document and Insert into the Database, Here the OpenXML takes 3 parameters
  1. the Handle we created for the XML document in memory,
  2. the XPath to access the various elements of the XML document and
  3. a flag here 2 means access the XPath as elements
Troubleshooting

You might get below error when you run from client; I solved this error putting my “employee.xml” file into server “C:\Ritesh\” folder.
Msg 4861, Level 16, State 1, Line 2
Cannot bulk load because the file "C:\Ritesh\employee.xml" could not be opened. Operating system error code 3(The system cannot find the path specified.).

You can go to following site to see the cause and solution for this error.

More Examples

You want to do the same load from .NET application into Oracle/SQL Server see following link
Oracle

SQL Server
LOAD CSV FILE
you can also load CSV comma separator file into database see below link 

Good links to see more on this

20 comments:

Milán Szakács said...

I would like to use your solution in case of ORACLE(not SQL server), but my xml has more than one node. for examlpe: xykl

Can I use your solution in ORACLE? and if yes, what should I change to import the xml?

Thanks
Milan

Milán Szakács said...
This comment has been removed by the author.
Milán Szakács said...
This comment has been removed by the author.
Ritesh_Kesharwani said...

Hi Milan,
Could you send me your XML with below format, you can modify below XML as per your requirement, I can send you the Qracle solution for that
<ROWSET><ROW ID="10"> <EmpName>RiteshEmpName> <EmpSal>10000EmpSal> <DeptNo>1010DeptNo> <JoinDate>03-Apr-2006JoinDate> ROW> <ROW ID="20"> <EmpName>ArtiEmpName> <EmpSal>20000EmpSal> <DeptNo>2020DeptNo> <JoinDate>05-May-2007JoinDate> ROW>ROWSET>

Milán Szakács said...
This comment has been removed by the author.
Milán Szakács said...
This comment has been removed by the author.
Milán Szakács said...

SPECTRAEXCHANGE>
APPLICATION>
SV_SV_ID>kClongSV_SV_ID>
SS_SS_ID>kClongSS_SS_ID>
AP_NAME>kCstring (64)AP_NAME>
AP_PRJ_IDENT>kCstring (32)AP_PRJ_IDENT>
STATION>
TCS_NAME>kCstring (64)TCS_NAME>
TCS_CALL>kCstring (256)TCS_CALL>
HORIZONTAL_ELEVATIONS>
HORIZONTAL_ELEVATION>
HE_AZIMUT>kCdoubleHE_AZIMUT>
HE_ELEVATION>kCdoubleHE_ELEVATION>
HORIZONTAL_ELEVATION>
HORIZONTAL_ELEVATIONS>
TRANSMITTER>
EQP_EQUIP_NAME>kCstring (128)EQP_EQUIP_NAME>
EQP_EQUIP_TYPE>kCstring (16)EQP_EQUIP_TYPE>
FREQUENCY>
EFL_FREQ>kCdoubleEFL_FREQ>
COORDINATED_FREQUENCY>
COF_DAT>kWrDateCOF_DAT>
COORDINATED_FREQUENCY>
FREQUENCY>
TRANSMITTER>
STATION>
APPLICATION>
SPECTRAEXCHANGE>

I could load data to SQL till STATION node, I need the solution for 'child' node for example STATION, TRANSMITTER etc...

Ritesh_Kesharwani said...

Could you please copy your XML with tag stating with < "<" and ending with > ">"
& percentage lt;
& percentage gt;

Milán Szakács said...

<"<"SPECTRAEXCHANGE>">"
<"<"APPLICATION>">"
<"<"SV_SV_ID>">"kClong<"<"SV_SV_ID/>">"
<"<"SS_SS_ID>">"kClong<"<"SS_SS_ID/>">"
<"<"AP_NAME>">"kCstring (64)<"<"AP_NAME/>">"
<"<"AP_PRJ_IDENT>">"kCstring (32)<"<"AP_PRJ_IDENT/>">"
<"<"STATION>">"
<"<"TCS_NAME>">"kCstring (64)<"<"TCS_NAME/>">"
<"<"TCS_CALL>">"kCstring (256)<"<"TCS_CALL/>">"
<"<"HORIZONTAL_ELEVATIONS>">"
<"<"HORIZONTAL_ELEVATION>">"
<"<"HE_AZIMUT>">"kCdouble<"<"HE_AZIMUT/>">"
<"<"HE_ELEVATION>">"kCdouble<"<"HE_ELEVATION/>">"
<"<"HORIZONTAL_ELEVATION/>">"
<"<"HORIZONTAL_ELEVATIONS/>">"
<"<"TRANSMITTER>">"
<"<"EQP_EQUIP_NAME>">"kCstring (128)<"<"EQP_EQUIP_NAME/>">"
<"<"EQP_EQUIP_TYPE>">"kCstring (16)<"<"EQP_EQUIP_TYPE/>">"
<"<"FREQUENCY>">"
<"<"EFL_FREQ>">"kCdouble<"<"EFL_FREQ/>">"
<"<"COORDINATED_FREQUENCY>">"
<"<"COF_DAT>">"kWrDate<"<"COF_DAT/>">"
<"<"COORDINATED_FREQUENCY/>">"
<"<"FREQUENCY/>">"
<"<"TRANSMITTER/>">"
<"<"STATION/>">"
<"<"APPLICATION/>">"
<"<"SPECTRAEXCHANGE/>">"

Ritesh_Kesharwani said...

Follow the other info as same given in the document for creating XML file and place after that you need to change your Insert statement like below , I am assuming that all the info from the XML you are trying to insert into database table and that table have all the columns present as per XML tag.


INSERT INTO TABLENAME(SV_SV_ID,
SS_SS_ID,
AP_NAME,
AP_PRJ_IDENT,
TCS_NAME,
TCS_CALL,
HE_AZIMUT,
HE_ELEVATION,
EQP_EQUIP_NAME,
EQP_EQUIP_TYPE,
EFL_FREQ,COF_DAT)
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'/APPLICATION/SV_SV_ID') SV_SV_ID /* value will be kClong */
,extractValue(value(x),'/APPLICATION/SS_SS_ID') SS_SS_ID /* value will be kClong*/
,extractValue(value(x),'APPLICATION/AP_NAME') AP_NAME /* value will be kCstring (64)*/
,extractValue(value(x),'APPLICATION/AP_PRJ_IDENT') AP_PRJ_IDENT
,extractValue(value(x),'APPLICATION/STATION/TCS_NAME') TCS_NAME
,extractValue(value(x),'APPLICATION/STATION/TCS_CALL') TCS_CALL
,extractValue(value(x),'APPLICATION/STATION/HORIZONTAL_ELEVATIONS/HORIZONTAL_ELEVATION/HE_AZIMUT') HE_AZIMUT
,extractValue(value(x),'APPLICATION/STATION/HORIZONTAL_ELEVATIONS/HORIZONTAL_ELEVATION/HE_ELEVATION') HE_ELEVATION
,extractValue(value(x),'APPLICATION/STATION/TRANSMITTER/EQP_EQUIP_NAME') EQP_EQUIP_NAME
,extractValue(value(x),'APPLICATION/STATION/TRANSMITTER/EQP_EQUIP_TYPE') EQP_EQUIP_TYPE
,extractValue(value(x),'APPLICATION/STATION/TRANSMITTER/FREQUENCY/EFL_FREQ') EFL_FREQ
,extractValue(value(x),'APPLICATION/STATION/TRANSMITTER/FREQUENCY/COORDINATED_FREQUENCY/COF_DAT') COF_DAT
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION'))) x;

Milán Szakács said...

I got the following error messages:

19025. 00000 - "EXTRACTVALUE returns value of only one node"
*Cause: Given XPath points to more than one node.
*Action: Rewrite the query so that exactly one node is returned.

Ritesh_Kesharwani said...

ok, in this case I would suggest that either you can convert your xml into single node xml or write multiple insert statement to insert all node values into the table like below

INSERT INTO TABLENAME(SV_SV_ID,
SS_SS_ID,
AP_NAME,
AP_PRJ_IDENT )
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'/APPLICATION/SV_SV_ID') SV_SV_ID /* value will be kClong */
,extractValue(value(x),'/APPLICATION/SS_SS_ID') SS_SS_ID /* value will be kClong*/
,extractValue(value(x),'APPLICATION/AP_NAME') AP_NAME /* value will be kCstring (64)*/
,extractValue(value(x),'APPLICATION/AP_PRJ_IDENT') AP_PRJ_IDENT
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION'))) x;


INSERT INTO TABLENAME(
TCS_NAME,
TCS_CALL )
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'APPLICATION/STATION/TCS_NAME') TCS_NAME
,extractValue(value(x),'APPLICATION/STATION/TCS_CALL') TCS_CALL
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION/STATION'))) x;

INSERT INTO TABLENAME(
HE_AZIMUT,
HE_ELEVATION )
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'APPLICATION/STATION/HORIZONTAL_ELEVATIONS/HORIZONTAL_ELEVATION/HE_AZIMUT') HE_AZIMUT
,extractValue(value(x),'APPLICATION/STATION/HORIZONTAL_ELEVATIONS/HORIZONTAL_ELEVATION/HE_ELEVATION') HE_ELEVATION
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION/STATION/HORIZONTAL_ELEVATIONS/HORIZONTAL_ELEVATION'))) x;

INSERT INTO TABLENAME(
EQP_EQUIP_NAME,
EQP_EQUIP_TYPE )
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'APPLICATION/STATION/TRANSMITTER/EQP_EQUIP_NAME') EQP_EQUIP_NAME
,extractValue(value(x),'APPLICATION/STATION/TRANSMITTER/EQP_EQUIP_TYPE') EQP_EQUIP_TYPE
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION/STATION/TRANSMITTER'))) x;


INSERT INTO TABLENAME(
EFL_FREQ,
COF_DAT)
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'APPLICATION/STATION/TRANSMITTER/FREQUENCY/EFL_FREQ') EFL_FREQ
,extractValue(value(x),'APPLICATION/STATION/TRANSMITTER/FREQUENCY/COORDINATED_FREQUENCY/COF_DAT') COF_DAT
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION/STATION/TRANSMITTER/FREQUENCY'))) x;

Ritesh_Kesharwani said...

Here you go , Write First INSERT statement and rest of the UPDATE statement, like below ,
Note: Correct the syntax error in update statement if any , i do not have oracle with me

INSERT INTO TABLENAME(SV_SV_ID,
SS_SS_ID,
AP_NAME,
AP_PRJ_IDENT )
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'/APPLICATION/SV_SV_ID') SV_SV_ID /* value will be kClong */
,extractValue(value(x),'/APPLICATION/SS_SS_ID') SS_SS_ID /* value will be kClong*/
,extractValue(value(x),'APPLICATION/AP_NAME') AP_NAME /* value will be kCstring (64)*/
,extractValue(value(x),'APPLICATION/AP_PRJ_IDENT') AP_PRJ_IDENT
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION'))) x;


UPDATE EmployeeXML SET TCS_NAME = extractValue(value(x),'/APPLICATION/STATION/TCS_NAME'),
TCS_CALL = extractValue(value(x),'/APPLICATION/STATION/TCS_CALL')
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION/STATION'))) x;

etc..

Milán Szakács said...

in UPDATE session i got error message:
00933. 00000 - "SQL command not properly ended"

Ritesh_Kesharwani said...

I told you in a note, I do not have Oracle to test the query right now, please fixed syntax error , it should be very simple, try following

UPDATE EmployeeXML SET
TCS_NAME = x.TCS_NAME,
TCS_CALL = x.TCS_CALL
WITH t AS (SELECT xmltype(bfilename('TEST_DIR','yourXMLFileName.xml'), nls_charset_id('WE8ISO8859P1')) xmlcol FROM dual)
SELECT
extractValue(value(x),'APPLICATION/STATION/TCS_NAME') TCS_NAME
,extractValue(value(x),'APPLICATION/STATION/TCS_CALL') TCS_CALL
FROM t,TABLE(XMLSequence(extract(t.xmlcol,'/SPECTRAEXCHANGE/APPLICATION/STATION'))) x;

Milán Szakács said...

I know that you mentioned in your previously post, but I am new with oracle I have a few days experience.

I will try to solve the UPDATE session...
Thanks a lot.

Milán Szakács said...

I couldn't fix the issue...

Sudhir DBAKings said...

Nice post very helpful

dbakings

Sikandar Rana said...




1
2




HOW TO READ IN ORACLE

Edgar F Delgado said...

Hi, your solution is good, but i think is simple,
i think does not work with complex XML files, like this:

http://www.treasury.gov/ofac/downloads/sdn.xml

Can you recommendme some free tools that automatize creating the schema and importing data into ORACLE normal tables (not XMLtype)

Greetings from Mexico.