Microsoft SQLXML

Created: Oct 9, 1999; last updated: Feb 1, 2026

Introduction

In 1999, Adam Bosworth hired me into the WebData team at Microsoft to implement XSLT on top of SQL Server as part of the SQLXML project. By the time I started in October, however, the team had decided that was too ambitious, and wanted to start with just XPath. I implemented several features starting with XPath over Annotated Schemas, and eventually became Technical Lead for the team.

The idea behind SQLXML was that most of the world's data at the time was locked up in relational databases, and by enabling this data to be queried and retrieved as XML, we would enable applications to combine data from many sources. We were enabling "mashups" before that term had been invented. This idea later became my inspiration for the Yahoo Query Language.

Millions of customers around the world — including the European Environmental Protection Agency's website, Jim Gray's PhotoServer, and MSN Music Search, to name just a few — used SQLXML and XPath over Annotated XSD.

Annotated XML Schemas

SQLXML required you to first write an XML Schema with annotations to describe how your desired XML structure maps to your database. These were W3C standard XSD schemas (and before that, XDR) with additional namespaced attributes to describe the mapping.

Example 1: An Annotated Schema describing a Customer element that maps to the Customers table containing an Order element (joined with the Orders table).
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
  <xs:element name="Customer" sql:relation="Customers">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Order" sql:relation="Orders" >
          <xs:complexType>
            <xs:attribute name="OrderID" type="xs:int" />
          </xs:complexType>
          <xs:annotation>
            <xs:appinfo>
              <sql:relationship parent="Customer" child="Order"
                                   parent-key="CustomerID" child-key="CustomerID"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="Region" type="xs:string" />
    </xs:complexType>
  </xs:element>
</xs:schema>

You could materialize this entire document as XML by performing the root-level XPath /. However, it was more common to leave the document virtualized and extract only parts of it. For example, an XPath like /Customer[@Region='WA']/Order retrieves all the orders for all customers in WA state. Conceptually, this is similar to the SQL query shown in Example 2:

Example 2: A SQL query similar to the XPath /Customer[@Region='WA']
select * from Orders
where exists (
    select * from Customers
    where Customers.Region = 'WA'
      and Customers.CustomerID = Orders.CustomerID
)

Implementation

My initial C++ implementation consisted of a recursive-descent parser for XPath and then code that walked the ASTs for the XPath query and the Annotated Schema together. This worked ok, and shipped in the first few versions of SQLXML to customers.

However, I noticed that assigning an engineer to implement every combination of query lanuage (XPath, XSLT, XQuery) ✕ data store (in-memory XML, SQL Server databases) ✕ platform (SQL Server C++, .NET C#) was very inefficient, and a classic compiler problem. I invented the Common Query Runtime platform to solve this, which was we shipped as System.Xml.Query in beta versions of Microsoft's .NET Framework. In Whidbey Beta 2 (months after I left the team), Microsoft cut XQuery because the W3C took longer to standardize XQuery than the .NET shipping schedule.

One of the key insights was that what we were doing was related to the nested relational algebra and query/view composition. The nested relational algebra was later used in many other systems, including the Dremel query language.

The annotated schema and XPath above are conceptually similar to the XQuery shown in Example 3, with a hyptothetical sql:table() function that retrieves records from the database.

Example 3: An XQuery equivalent to the Annotated Schema and XPath above.
declare function sql:view() {
    for $i in sql:table("Customers")
    return <Customer Region="{$i/@Region}">{
               for $j in sql:table("Orders")
               where $i/@CustomerID = $j/@CustomerID
               return <Order ID="{$i/@OrderID}"/>
          }</Customer>
};

sql:view()/Customer[@Region='WA']/Order

Instead of executing one query to construct the entire document and then another query to extract only part of it, SQLXML combines (composes) the XPath with the XML View to create a more efficient query, and then converts only this query's smaller result set into XML, as shown in Example 4.

Example 4: An XQuery equivalent to the composed view and XPath.
for $i in sql:table("Customers")
where $i/@Region = "WA"
return
  for $j in sql:table("Orders")
  where $i/@CustomerID = $j/@CustomerID
  return <Order ID="{$i/@OrderID}"/>

Chris Suver, my mentor and at the time our team Architect, and I invented an XML query algebra I named query intermediate language (QIL) to represent views and queries from many different XML query languages and enable performing this composition in a more rigorous way.

Additional Reading

Websites and Newsgroups

Books

Presentations