• 4
name Punditsdkoslkdosdkoskdo

Any advantages of storing xml in a relational database?

I was poking around the AdventureWorks database today and I noticed that a number of tables (HumanResources.JobCandidate and Sales.Individualfor example) have a column which is storing xml data.

What I would to know is, what is the advantage of storing basically a database table row's worth of data in another table's column? Doesn't this make it difficult to query off of this information? Or is the assumption that the data won't need to be queried and just needs to be stored?

Often you get mixed data that is both XML and relational. (A fine example of this is a document store where each document can have metadata fields like title, date of creation, owner and so on.)

At this point you have to choose from three options:

  1. Store everything in a relational DB.
  2. Store everything in a native XML DB.
  3. Store data in two separate DBs, XML in native XML and metadata in relational.

Option 3 is probably the cleanest but also the most expensive and the hardest to implement, plus you don't necessarily want distributed transactions in a not-very-big system. Option 2 isn't very good as native XML databases are usually extremely poor at handling relational data (which you're more likely to use in searches) and the technology is overall less mature than relational DB.

So that leaves you with option 1 as certainly not the best solution but maybe the least bad.

  • 0
Reply Report

In my experience, the XML data is usually stored and rarely queried, but often extracted when necessary, usually when some other system needs an XML representation of some data that may be difficult or impossible to generate on-the-fly from relational data. The XML data might be pre-populated by some other process.

  • 0
Reply Report

Because not all data needs to be stored relationally and writing code to process data you've been passed as XML for relational storage is time consuming (and very very tedious). This is particularly true when a lot of XML data is coming from systems which are throwing out large generic responses.

I've frequently seen situations where a message is received from another system and we don't care about 98% of what it contains. So we parse it to split out the 2% we do care about, store that relationally and then store the whole message in case we do need any of the remaining 98% later.

And SQL Server gives you some OK-ish tools and syntax for working with XML in T-SQL so it's not as if it's totally beyond practical reach for ad-hoc queries in the way it might be if you were storing, say, the contents of a CSV.

And that excludes the possibility that what you actually want to store is XML (for instance for support and debug purposes)...

  • 3
Reply Report