I highly recommend reading the first three links on References, at the end of this post.
What says the official documentation: “Each group in an XML source definition is analogous to a relational table, and the Designer treats each group within the XML Source Qualifier transformation as a separate source of data.”
The best way to understand how you can consume an XML, like, getting data from XML to your database, is to think about relational data. Look this:
This is an XML file. You can see that we have Customers and Orders. That’s like 2 tables, so, when consuming this XML we will probably end up with those two tables.
Inside CUSTOMERS we have 4 customers, those are rows in our table. If we do our ETL from XML to Database, this is what we got:
Same thing happens for Orders:
Did you notice that we have an extra “table” inside Customers that’s FULLADDRESS, and in Orders it’s SHIPINFO.
Therefore, if you want to go as full normalized as you can, you may end up with 4 tables, can you picture this?
Customers can have many orders, we can find this information by looking again to our XML file:
XML Schema file – XSD
It is not everyday you will have XML schema file for your XML files, but if you do, that helps a lot. What’s a XSD file? Something like this:
XML Hands on Powercenter
When using Powercenter you can import your XML source from an XSD or XML file, however, regardless of how you import your sorce, it’s always a good idea to double check your XML Definition.
Here’s how you can import your XML source file in Powercenter. Go sources > import XML definition > browse your file
Then you get your file, but before hitting OK or OPEN, you can check the ADVANCED Options. Honestly, I don’t know what each and every button on this advanced option does, but the documentation can tell you.
Advanced option explained, from the official docs:
The next step is called XML import Wizard.
Next screen on the wizard should be hard to tell, but you already know how XML works like a relational database. What you need to tell the wizard here is how your XML file is laid out. In our case we know we have hierarchy, because FULLADDRESS and SHIPINFO have a parent-son relationship to CUSTOMERS and ORDERS.
Then you finish the wizard and this is what you get:
To adjust your XML views go > Edit XML Definition.
This is our XML views. As far as I know, XML views is synonym for table. Here we have two views; X_Order and X_Customer. Here you will adjust your views accordingly to your expectations. What I mean is, if a field is in the wrong place, you can correct it, or if a new view needs to be created, you can create here, also, if a view needs to be recreated from the ground up, you can also do it here.
By the way, if you noticed on XML Wizard first screen, we have a “do not generate XML views” option. If we take that option then all views will require manual creation.
In the “getting started” link in the References you can better understand how to work with views.
Using XML Source Qualifier Transformation in Mapping
In this mapping we have our XML source flowing data from it’s source to two different tables, and also sending the same data to two sort transformation, then joining them in a sorted join, and then storing it in a single table. I will not go in details of sort and join since I have already wrote about them.
Using XML Parser Transformation in Mapping
I won’t go into much details about the parser transformation, but what it does is, it receives and DATAINPUT that is a string with XML data and do the same thing the XML Source Qualifier do, like we already did in previous examples.
The XML Parser transformation is an active transformation.
Using XML Generator Transformation in Mapping
The XML Generator is the opposite of the parsers generator. Here the XML Generator will receive relational data and generate and XML out of it. Say you have a database and want to extract a table in XML format to a flat file, for that you will use this XML generator transformation.
The XML Generator transformation is an active transformation.