I run a web service query daily. The entire resulting web service XML response is saved into an xml column in a new row each day. The results from one day to the next will largely overlap, save status updates, etc. For example, the results on one day might be:
<TransactionArray>
<Transaction>
<ID>1</ID>
<AmountPaid>99.00</AmountPaid>
<Buyer>
<BuyerInfo>
<ShippingAddress>
<Name>Larry</Name>
</ShippingAddress>
</BuyerInfo>
</Buyer>
<Item>
<ItemID>1001</ItemID>
<Title>Item 1</Title>
</Item>
<Status>Sold</Status>
</Transaction>
<Transaction>
<ID>2</ID>
<AmountPaid>99.00</AmountPaid>
<Buyer>
<BuyerInfo>
<ShippingAddress>
<Name>Curly</Name>
</ShippingAddress>
</BuyerInfo>
</Buyer>
<Item>
<ItemID>1002</ItemID>
<Title>Item 2</Title>
</Item>
<Status>Shipped</Status>
<PaidTime>2005-11-04</PaidTime>
<ShippedTime>2005-11-08</ShippedTime>
</Transaction>
</TransactionArray>
And the next day it might be:
<TransactionArray>
<Transaction>
<ID>1</ID>
<AmountPaid>99.00</AmountPaid>
<Buyer>
<BuyerInfo>
<ShippingAddress>
<Name>Larry</Name>
</ShippingAddress>
</BuyerInfo>
</Buyer>
<Item>
<ItemID>1001</ItemID>
<Title>Item 1</Title>
</Item>
<Status>Paid</Status>
<PaidTime>2005-11-30</PaidTime>
</Transaction>
</TransactionArray>
Note that the transaction with ID 1 went from Sold to Paid, and a PaidTime element was added.
What I want to do is an xquery across all the rows in the table but include only the most recent occurrence of a transaction within the entire table. I've got the CROSS APPLY .nodes() working so that each transaction becomes its own row in the result set. What I can't quite figure out is to only include the most recent occurence of, in my example, Transaction ID 1.
On a related note, I need the values of multiple elements within each transaction element returned, each as different columns in the result set. I know that I can brute-force it, e.g. :
select
txn.value('(Item/ItemID)[1]', 'nvarchar(max)')) as 'ItemID',
txn.value('(ID)[1]', 'nvarchar(max)') as 'TxnID',
txn.value('(AmountPaid)[1]', 'nvarchar(max)') as 'Sale Price'
from SoapResponses rslts
CROSS APPLY SoapResponseXML.nodes('//TransactionArray/Transaction') AS R(txn)
I'm wondering if there's a better way to get each of those values out of the transaction node, with the requirement that each show up as a different column.
Thanks for bearing with this XQuery newbie. ; )

Help with XQuery formulation - selecting *only* most recent element from multiple table rows
rrrpak67191
I'd gone the route you suggested for question #1. In my case, I created a view using CROSS APPLY .nodes(). Then when I need to find the latest, I use a subquery and criteria that lets me know what's the latest.
The advice on question #2 is helpful. Should I understand that the text() function is equivalent to InnerXml for the element in question
I appreciate the helpfulness.
Donnie Hale
Hermy
For question #1, you would do it in a similar fashion to how you would do it in plain old T-SQL... which would be to have a subquery in some where clause which finds the latest date for a given transaction and filter on that. Your usage of XML in this case doesn't really provide an alternative which immediately comes to mind. If instead you stored all given operations related to given transaction in the same XML instance, then you could probably take advantage of document ordering to provide you with the "latest" operation on that transaction.
Regarding your 2nd question: Yes, you have to use the multiple value() methods. We don't provide any functionality for performing automatic value extraction. Are your values typed using an XML Schema If not, then you should probably rewrite expressions like (Item/ItemID)[1] to (Item/ItemID/text())[1]. Otherwise you are incurring an extra self join for each value() method invocation because we need to aggregate all of the text nodes under "Item/ItemID" (including all descendants).
-John Gallardo
SQL Server Engine
darthkatzs
text() is not equivalent to .InnerXml (which is a DOM notion).
/foo/text() will grab the text nodes which are direct children of /foo. This is quite different then @x.value('/foo', 'nvarchar(max)') because .value() implies the XQuery data() accessor, which really means "get all of the descendant text nodes" and it is this "get all of the descendant" part which can be expensive due to the extra plan complexity. By putting /foo/text() you are effectively telling us "i only want the direct children of foo, so don't bother looking any deeper in the tree" which allows us to build a much cleaner faster execution plan.