Using XML with SQL Server
Over the past several releases, Microsoft SQL Server has become increasingly integrated with XML. In the current release, you can generate XML with SQL statements, using Microsoft T-SQL extensions to the SQL standard query language. You can also update SQL Server tables by sending properly formed XML messages, called DiffGrams, to a SQL Server database. In this section, you learn the basics of interacting with SQL Server via XML.
Generating XML with SQL Statements
SQL Server allows you to retrieve the results of any query as XML rather than as a SQL resultset. To do this, you use the Microsoft-specific FOR XML clause. You can use a variety of options in the FOR XML clause to customize the XML that SQL Server generates.
The first option is FOR XML RAW. When you use raw mode with FOR XML, SQL Server returns one element (always named row) for each row of the resultset, with the individual columns represented as attributes. For example, consider this query:
SELECT Customers.CustomerID, Customers.CompanyName, Orders.OrderID, Orders.OrderDate FROM Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID WHERE Country = 'Brazil' AND OrderDate BETWEEN '1997-03-15' AND '1997-04-15' FOR XML RAW
If you execute this query (for example, using SQL Query Analyzer) in the Northwind sample database, you'll get back these results:
<row CustomerID="RICAR" CompanyName="Ricardo Adocicados" OrderID="10481" OrderDate="1997-03-20T00:00:00"/> <row CustomerID="QUEEN" CompanyName="Queen Cozinha" OrderID="10487" OrderDate="1997-03-26T00:00:00"/> <row CustomerID="COMMI" CompanyName="Comércio Mineiro" OrderID="10494" OrderDate="1997-04-02T00:00:00"/> <row CustomerID="TRADH" CompanyName="Tradiçaõ Hipermercados" OrderID="10496" OrderDate="1997-04-04T00:00:00"/>
NOTE
SQL Query Analyzer returns XML results as one long string. I've reformatted these results for easier display on the printed page. If you have trouble seeing all the results in SQL Query Analyzer, select Tools, Options, Results and increase the Maximum Characters Per Column setting.
The second variant of the FOR XML clause is FOR XML AUTO. When you use auto mode with FOR XML, nested tables in the returned data are represented as nested elements in the XML. Columns are still represented as attributes. For example, here's a query that uses FOR XML AUTO:
SELECT Customers.CustomerID, Customers.CompanyName, Orders.OrderID, Orders.OrderDate FROM Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID WHERE Country = 'Brazil' AND OrderDate BETWEEN '1997-03-15' AND '1997-04-15' FOR XML AUTO
Here's the corresponding returned data:
<Customers CustomerID="RICAR" CompanyName="Ricardo Adocicados"> <Orders OrderID="10481" OrderDate="1997-03-20T00:00:00"/> </Customers> <Customers CustomerID="QUEEN" CompanyName="Queen Cozinha"> <Orders OrderID="10487" OrderDate="1997-03-26T00:00:00"/> </Customers> <Customers CustomerID="COMMI" CompanyName="Comércio Mineiro"> <Orders OrderID="10494" OrderDate="1997-04-02T00:00:00"/> </Customers> <Customers CustomerID="TRADH" CompanyName="Tradiçaõ Hipermercados"> <Orders OrderID="10496" OrderDate="1997-04-04T00:00:00"/> </Customers>
Note that in this output, the Orders element is nested within the Customers element for each order. If there were multiple orders for a single customer, the Orders element would repeat as many times as necessary.
There's a second variant of FOR XML AUTO. You can include the ELEMENTS option to represent columns as elements rather than as attributes. Here's a query that uses this option:
SELECT Customers.CustomerID, Customers.CompanyName, Orders.OrderID, Orders.OrderDate FROM Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID WHERE Country = 'Brazil' AND OrderDate BETWEEN '1997-03-15' AND '1997-04-15' FOR XML AUTO, ELEMENTS
Here's the corresponding output:
<Customers> <CustomerID>RICAR</CustomerID> <CompanyName>Ricardo Adocicados</CompanyName> <Orders> <OrderID>10481</OrderID> <OrderDate>1997-03-20T00:00:00</OrderDate> </Orders> </Customers> <Customers> <CustomerID>QUEEN</CustomerID> <CompanyName>Queen Cozinha</CompanyName> <Orders> <OrderID>10487</OrderID> <OrderDate>1997-03-26T00:00:00</OrderDate> </Orders> </Customers> <Customers> <CustomerID>COMMI</CustomerID> <CompanyName>Comércio Mineiro</CompanyName> <Orders> <OrderID>10494</OrderID> <OrderDate>1997-04-02T00:00:00</OrderDate> </Orders> </Customers> <Customers> <CustomerID>TRADH</CustomerID> <CompanyName>Tradiçaõ Hipermercados</CompanyName> <Orders> <OrderID>10496</OrderID> <OrderDate>1997-04-04T00:00:00</OrderDate> </Orders> </Customers>
The final variant of FOR XML is FOR XML EXPLICIT. In explicit mode, you must construct your query so as to create a resultset with the first column named Tag and the second column named Parent. These columns create a self-join in the results that is used to determine the hierarchy of the created XML file. Here's a relatively simple query in explicit mode:
SELECT 1 AS Tag, NULL AS Parent, Customers.CustomerID AS [Customer!1!CustomerID], Customers.CompanyName AS [Customer!1!CompanyName], NULL AS [Order!2!OrderID], NULL AS [Order!2!OrderDate] FROM Customers WHERE COUNTRY = 'Brazil' UNION ALL SELECT 2, 1, Customers.CustomerID, Customers.CompanyName, Orders.OrderID, Orders.OrderDate FROM Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID WHERE Country = 'Brazil' AND OrderDate BETWEEN '1997-03-15' AND '1997-04-15' ORDER BY [Customer!1!CustomerID], [Order!2!OrderID] FOR XML EXPLICIT
The resulting XML from this query is as follows:
<Customer CustomerID="COMMI" CompanyName="Comércio Mineiro"> <Order OrderID="10494" OrderDate="1997-04-02T00:00:00"/> </Customer> <Customer CustomerID="FAMIA" CompanyName="Familia Arquibaldo"/> <Customer CustomerID="GOURL" CompanyName="Gourmet Lanchonetes"/> <Customer CustomerID="HANAR" CompanyName="Hanari Carnes"/> <Customer CustomerID="QUEDE" CompanyName="Que Delícia"/> <Customer CustomerID="QUEEN" CompanyName="Queen Cozinha"> <Order OrderID="10487" OrderDate="1997-03-26T00:00:00"/> </Customer> <Customer CustomerID="RICAR" CompanyName="Ricardo Adocicados"> <Order OrderID="10481" OrderDate="1997-03-20T00:00:00"/> </Customer> <Customer CustomerID="TRADH" CompanyName="Tradiçaõ Hipermercados"> <Order OrderID="10496" OrderDate="1997-04-04T00:00:00"/> </Customer><Customer CustomerID="WELLI" CompanyName="Wellington Importadora"/>
Note that in this case, even customers without orders in the specified time period are included because the first half of the query retrieves all customers from Brazil. Explicit mode allows you the finest control over the generated XML, but it's also the most complex mode to use in practice. You should stick to raw or auto mode whenever possible.
Finally, you can generate schema information as part of a SQL Server query by including the XMLDATA option in the query. You can do this in any of the FOR XML modes. For example, here's a query you saw earlier in this section with the XMLDATA option added:
SELECT Customers.CustomerID, Customers.CompanyName, Orders.OrderID, Orders.OrderDate FROM Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID WHERE Country = 'Brazil' AND OrderDate BETWEEN '1997-03-15' AND '1997-04-15' FOR XML AUTO, ELEMENTS, XMLDATA
The resulting XML is as follows:
<Schema name="Schema1" xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes"> <ElementType name="Customers" content="eltOnly" model="closed" order="many"> <element type="Orders" maxOccurs="*"/> <element type="CustomerID"/> <element type="CompanyName"/> </ElementType> <ElementType name="CustomerID" content="textOnly" model="closed" dt:type="string"/> <ElementType name="CompanyName" content="textOnly" model="closed" dt:type="string"/> <ElementType name="Orders" content="eltOnly" model="closed" order="many"> <element type="OrderID"/> <element type="OrderDate"/> </ElementType> <ElementType name="OrderID" content="textOnly" model="closed" dt:type="i4"/> <ElementType name="OrderDate" content="textOnly" \ model="closed" dt:type="dateTime"/>\ </Schema> <Customers xmlns="x-schema:#Schema1"> <CustomerID>RICAR</CustomerID> <CompanyName>Ricardo Adocicados</CompanyName> <Orders> <OrderID>10481</OrderID> <OrderDate>1997-03-20T00:00:00</OrderDate> </Orders> </Customers> <Customers xmlns="x-schema:#Schema1"> <CustomerID>QUEEN</CustomerID> <CompanyName>Queen Cozinha</CompanyName> <Orders> <OrderID>10487</OrderID> <OrderDate>1997-03-26T00:00:00</OrderDate> </Orders> </Customers> <Customers xmlns="x-schema:#Schema1"> <CustomerID>COMMI</CustomerID> <CompanyName>Comércio Mineiro</CompanyName> <Orders> <OrderID>10494</OrderID> <OrderDate>1997-04-02T00:00:00</OrderDate> </Orders> </Customers> <Customers xmlns="x-schema:#Schema1"> <CustomerID>TRADH</CustomerID> <CompanyName>Tradiçaõ Hipermercados</CompanyName> <Orders> <OrderID>10496</OrderID> <OrderDate>1997-04-04T00:00:00</OrderDate> </Orders> </Customers>
Using ExecuteXmlReader
ADO.NET provides a means to integrate SQL Server's XML capabilities with the .NET Framework classes. The ExecuteXmlReader method of the SqlCommand object allows you to retrieve an XmlReader directly from a SQL statement, provided that the SQL statement uses the FOR XML clause. Here's an example:
Dim cmd As SqlCommand = _ SqlConnection1.CreateCommand ' Create a command to retrieve XML cmd.CommandType = CommandType.Text cmd.CommandText = _ "SELECT Customers.CustomerID, " & _ "Customers.CompanyName," & _ "Orders.OrderID, Orders.OrderDate " & _ "FROM Customers INNER JOIN Orders " & _ "ON Customers.CustomerID = " & _ "Orders.CustomerID " & _ "WHERE Country = 'Brazil' AND " & _ "OrderDate BETWEEN '1997-03-15' " & _ "AND '1997-04-15' " & _ "FOR XML AUTO, ELEMENTS" SqlConnection1.Open() ' Read the XML into an XmlReader Dim xr As XmlReader = _ cmd.ExecuteXmlReader() ' Dump the contents of the reader Dim strNode As String Dim intI As Integer Do While xr.Read ' Do something with the nodes here Loop xr.Close() SqlConnection1.Close()
Updating SQL Server Data by Using XML
You can also update SQL Server data by using special XML messages called DiffGrams. You can think of a DiffGram as a before-and-after snapshot of a part of a SQL Server table. The .NET Framework uses DiffGrams internally as a means of serializing changes in a DataSet. For example, if you pass the changes in a DataSet from one tier to another, the .NET Framework will use a DiffGram to send the changes.
You can also use DiffGrams yourself to update data in a SQL Server. However, before you can do so, you'll need to install the SQLXML managed classes, an interface between SQL Server and .NET.
Installing SQLXML
Although SQL Server 2000 includes some XML support (for example, the FOR XML syntax is built in to the product), there have been many advances in XML since that version of SQL Server was released. Microsoft has kept SQL Server in tune with these advances by issuing a series of free upgrade packages with the general name of SQLXML. As of this writing, the current release of SQLXML is SQLXML 3.0 SP1.
To install SQLXML, you need to download the current release directly from Microsoft's Web site. You can always find the current release by starting at the SQLXML home page, http://msdn.microsoft.com/sqlxml.
Using DiffGrams
After you've installed SQLXML, you can use the SqlXmlCommand object to execute a DiffGram. A DiffGram is an XML file that includes changes to a DataSet. Here's a small example:
<?xml version="1.0" standalone="yes"?> <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr= "urn:schemas-microsoft-com:xml-diffgram-v1"> <NewDataSet> <Customers diffgr:id="Customers1" msdata:rowOrder="0" diffgr:hasChanges="modified"> <CustomerID>ALFKI</CustomerID> <ContactName>Maria Anderson</ContactName> </Customers> </NewDataSet> <diffgr:before> <Customers diffgr:id="Customers1" msdata:rowOrder="0"> <CustomerID>ALFKI</CustomerID> <ContactName>Maria Anders</ContactName> </Customers> </diffgr:before> </diffgr:diffgram>
In this case, the first part of the XML file lists a row in the Customers table and indicates that it has been modified. The second part of the DiffGram contains the original data from the SQL Server table. SQL Server can use this data to find the row to be modified.
In addition to the DiffGram, you'll also need a schema file that maps the element names in the DiffGram back to tables and columns in the SQL Server database. The sql:relation attribute in the schema file indicates the table mapping, whereas the sql:field attributes indicate the field mappings.
<xsd:schema xmlns: xsd="http://www.w3.org/2001/XMLSchema" xmlns:sql="urn:schemas-microsoft-com:mapping-schema"> <xsd:element name="Customers" sql:relation="Customers" > <xsd:complexType> <xsd:sequence> <xsd:element name="CustomerID" sql:field="CustomerID" type="xsd:string" /> <xsd:element name="ContactName" sql:field="ContactName" type="xsd:string" /> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:schema>
Given these two pieces, you can use SQLXML to apply the DiffGram to a database with code such as this:
Private Sub ApplyDiffGram() ' Connect to the SQL Server database Dim sxc As SqlXmlCommand = _ New SqlXmlCommand("Provider=SQLOLEDB;" & _ "Server=(local);database=Northwind;" & _ "Integrated Security=SSPI") ' Set up the DiffGram sxc.CommandType = SqlXmlCommandType.DiffGram sxc.SchemaPath = "diffgram.xsd" sxc.CommandStream = _ New FileStream("diffgram.xml", FileMode.Open) ' And execute it sxc.ExecuteNonQuery() MessageBox.Show("Database was updated!") End Sub
DiffGrams can insert or delete data as well as modify data. For an insertion, the DiffGram will contain the data for the new row and no old data. For a deletion, the DiffGram will contain the row to be deleted but no new row.