On a project a few weeks ago a coworker introduced me to a very easy way to generate XML documents. We were going to use it for a flat file to be distributed with a locally run application but the first thing that jumped to mind for me was AJAX applications. I remember my first experience with writing an AJAX application and thinking their had to be a better way that looping through data that was returned to me. Of course this was before you could store XML natively with SQL Server 2005. On to the show I suppose.
The command I would like to introduce is the SQL command "XML Explicit". At first glance many of the samples I saw seemed very complicated and it took me quite a few tries of experimentation to fully understand everything that was going on. Another reason why it can seem intimidating is that many people in the software world have become very comfortable with ORM tools and therefore their SQL skills get pretty rusty. I have to admit that I was one of these people, my SQL skills were average but if you put something pretty complicated in front of me it would take me a bit to grasp it. But since with my new employer I was thrust right into SQL world, no fluffy ORM tools to help me now.
So lets say you have SQL database with a table for employees, departments, positions, etc. and you want put them into a nice flat file with all their information and who works for who.
Employees
| id |
name |
position |
reports_to |
skills |
| 1 |
Tom |
Manager |
1 |
Good with money |
| 2 |
Brian |
Developer |
5 |
C++, C, MFC, C#, VB.Net, Java, HTML, XML |
| 3 |
Sharon |
Accountant |
2 |
Good with money |
| 4 |
Dave |
Developer |
5 |
C#, VB.Net, Java, HTML, XML |
| 5 |
Jim |
Manager |
1 |
Tech oriented |
| 6 |
Bob |
CEO |
0 |
Good at running companies |
| 7 |
Ron |
Manager |
1 |
Creative marketing ideas |
So we want to see an XML file that looks something like this:
<employees>
<employee id="1" name="Bob" position="CEO">
<skills>Good at running companies</skills>
<employee id="2" name="Tom" position="Manager">
<skills>Good with money</skills>
<employee id="6" name="Sharon" position="Accountant">
<skills>Good with money</skills>
</employee>
</employee>
<employee id="3" name="Jim" position="Manager">
<skills>Tech oriented</skills>
<employee id="7" name="Dave" position="Developer">
<skills>C#, VB.Net, Java, HTML, XML</skills>
</employee>
<employee id="8" name="Brian" position="Developer">
<skills>C++, C, MFC, C#, VB.Net, Java, HTML, XML</skills>
</employee>
</employee>
<employee id="4" name="Ron" position="Manager">
<skills>Creative marketing ideas</skills>
</employee>
</employee>
</employees>
Now on to the fun, we are going to make this file using the XMLExplicit command. One of the links that helped me while I was learning how to use this was this
article over at TopXML.com
To sum up the command it relies on making a "select" for each layer of your XML file and then unions all the different layers together. There are two required tags that will have to be used, "tag" and "parent". As you can probably imagine, they are used to define how the layers of your XML file are to be stacked. Here is a sample of the select for the root node, I will explain it below.
select
top 1
1 as tag,
null as parent,
null as [employees!1],
null as [employee!2!id],
null as [employee!2!name],,
null as [employee!2!position],
null as [employee!2!skills!element],
null as [employee!3!id],
null as [employee!3!name],
null as [employee!3!position],
null as [employee!3!skills!element]
from
employees
Now this looks pretty scary at first glance but lets break it apart. First lets look at defining how our XML document will work.
We will have 4 layers from outer most to the inner most. To define our root node we have this command [employees!1] where "employees" is the name of the element, "!" is a seperator, and "1" is the level of the node. 1 indicates that it is the outer most node. Now to define each employee for the second list we use several commands [employee!2!id] represents part of the second layer. "employee" is the tag name, "2" is the layer for it to be in, and "id" is the name of the attribute on the "employee" element to fill in. We have several like this to define each of the attributes we want on the element. To place a "skills" element within the employee element we use the notation "[employee!2!skills!element] where "employee" is the element name, 2 is the layer, "skills" is the name of the element within the employee element, and finally "element" to indicate that it is an element instead of an attribute.
Now using the tag and parent sections we will define how each layer of the XML file will be organized. The example below will show the outer most list of employees
select
2 as tag,
1 as parent,
null as [employees!1],
e.id as [employee!2!id],
e.name as [employee!2!name],
e.position as [employee!2!position],
e.skills as [employee!2!skills!element],
null as [employee!3!id],
null as [employee!3!name],
null as [employee!3!position],
null as [employee!3!skills!element],
null as [employee!4!id],
null as [employee!4!name],
null as [employee!4!position],
null as [employee!4!skills!element]
from
employees e
where
e.reports_to is NULL
As you can see the second layer looks much like the one above. Now the tag is set to "2" so that we will fill in the 2nd layer of the XML. The parent tag is set to "1" to point back to the "employees" element we did a select for before. Each of the layer 2 attributes are now set to the value of the query. This will grab "Bob" the CEO element and put it in the XML doc. In order to get our top layer node we say "where e.reports_to is null" to get the "Bob" record and everything else will percolate down from the upper most layer. The most important thing to remember when doing this is to keep your queries the same. It is fine to do joins with additional tables but you must be absolutely certain that the same data is queried each time. Now we need to select the employees for Bob and here is how we do it.
select
3 as tag,
2 as parent,
null as [employees!1],
e.id as [employee!2!id],
null as [employee!2!name],
null as [employee!2!position],
null as [employee!2!skills!element],
ee.id as [employee!3!id],
ee.name as [employee!3!name],
ee.position as [employee!3!position],
ee.skills as [employee!3!skills!element],
null as [employee!4!id],
null as [employee!4!name],
null as [employee!4!position],
null as [employee!4!skills!element]
from
employees e
inner join
employees ee
on
ee.reports_to = e.id
where
e.reports_to is null
Again, Nearly the same as above, with a few changes, the tag and parent values are now set to reflect the 3rd layer, and it is below the second layer. If you notice now for the [employee!2!id] we have id there instead of null. Now were looking for the "Jim" and "Ron" records. So do do that we will query for the same data from before but do a another join for the "reports_to" is the value of the id on the node above it. Using this method you can easily and simply create XML documents from SQL server with no heavy lifting required.
To put this all together you would now do a "UNION ALL" for each of your layers to combine them all together. The last thing we have to do is define an ORDER BY clause. This clause will sort the order that each layer is displayed. It is very important that the elements that you ORDER BY are included in the results of all the selects. So in this instance it would be the "id" of each employee.
ORDER BY
[employee!2!id], [employee!3!id],[employee!4!id]
for xml explicit, type
Now some trouble shooting tips. You are going to run into an error saying that "Tag # is not open". Its just gonna happen, so here are the things to check for. First, Make sure that data exists at all layers that a sublayer exists for. Second, Verify that any of the ORDER BY clauses are included in all the queries. If one of them does not exist it will throw this error.
I hope you learned something from this and let me know if there are any questions or comments. Here is a copy of the query in total.
SET NOCOUNT ON;
select
top 1
1 as tag,
null as parent,
null as [employees!1],
null as [employee!2!id],
null as [employee!2!name],
null as [employee!2!position],
null as [employee!2!skills!element],
null as [employee!3!id],
null as [employee!3!name],
null as [employee!3!position],
null as [employee!3!skills!element],
null as [employee!4!id],
null as [employee!4!name],
null as [employee!4!position],
null as [employee!4!skills!element]
from
employees
UNION ALL
select
2 as tag,
1 as parent,
null as [employees!1],
e.id as [employee!2!id],
e.name as [employee!2!name],
e.position as [employee!2!position],
e.skills as [employee!2!skills!element],
null as [employee!3!id],
null as [employee!3!name],
null as [employee!3!position],
null as [employee!3!skills!element],
null as [employee!4!id],
null as [employee!4!name],
null as [employee!4!position],
null as [employee!4!skills!element]
from
employees e
where
e.reports_to is NULL
UNION ALL
select
3 as tag,
2 as parent,
null as [employees!1],
e.id as [employee!2!id],
null as [employee!2!name],
null as [employee!2!position],
null as [employee!2!skills!element],
ee.id as [employee!3!id],
ee.name as [employee!3!name],
ee.position as [employee!3!position],
ee.skills as [employee!3!skills!element],
null as [employee!4!id],
null as [employee!4!name],
null as [employee!4!position],
null as [employee!4!skills!element]
from
employees e
inner join
employees ee
on
ee.reports_to = e.id
where
e.reports_to is null
UNION ALL
select
4 as tag,
3 as parent,
null as [employees!1],
e.id as [employee!2!id],
null as [employee!2!name],
null as [employee!2!position],
null as [employee!2!skills!element],
ee.id as [employee!3!id],
null as [employee!3!name],
null as [employee!3!position],
null as [employee!3!skills!element],
eee.id as [employee!4!id],
eee.name as [employee!4!name],
eee.position as [employee!4!position],
eee.skills as [employee!4!skills!element]
from
employees e
inner join
employees ee
on
ee.reports_to = e.id
inner join
employees eee
on
eee.reports_to = ee.id
where
e.reports_to is null
ORDER BY
[employee!2!id], [employee!3!id],[employee!4!id]
for xml explicit, type