Split XML document into chunks
Solution 1
Another naive solution, this time for .NET 2.0. It should give you an idea of how to approach the problem. It uses XPath expressions instead of LINQ to XML. It chunks a 100-order docket into 10 dockets in under a second on my dev box.
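    using System;
    using System.Collections.Generic;
    using System.Xml;

    public List<XmlDocument> ChunkDocket(XmlDocument docket, int chunkSize)
    {
        // A non-positive chunk size would never advance chunkStart and loop forever.
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        List<XmlDocument> newDockets = new List<XmlDocument>();

        // Count the orders once so we know when to stop.
        int orderCount = docket.SelectNodes("//docket/order").Count;
        int chunkStart = 0;

        while (chunkStart < orderCount)
        {
            // Each chunk gets its own document with a fresh <docket> root.
            XmlDocument newDocket = new XmlDocument();
            XmlElement root = newDocket.CreateElement("docket");
            newDocket.AppendChild(root);

            // XPath position() is 1-based, so this selects orders
            // chunkStart + 1 through chunkStart + chunkSize.
            XmlNodeList chunk = docket.SelectNodes(String.Format(
                "//docket/order[position() > {0} and position() <= {1}]",
                chunkStart, chunkStart + chunkSize));
            chunkStart += chunkSize;

            foreach (XmlNode c in chunk)
            {
                // A node belongs to its owner document, so import a deep copy before appending.
                XmlNode targetNode = newDocket.ImportNode(c, true);
                root.AppendChild(targetNode);
            }
            newDockets.Add(newDocket);
        }
        return newDockets;
    }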
Solution 2
Naive and iterative, but it works [EDIT: requires .NET 3.5, since it uses LINQ to XML]
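    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    public List<XDocument> ChunkDocket(XDocument docket, int chunkSize)
    {
        // A non-positive chunk size would never remove anything and loop forever.
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        var newDockets = new List<XDocument>();
        // Work on a copy so the caller's document is left untouched.
        var d = new XDocument(docket);
        // Lazy query: it is re-evaluated on each use, so it only sees orders not yet removed.
        var orders = d.Root.Elements("order");
        do
        {
            var newDocket = new XDocument(new XElement("docket"));
            var chunk = orders.Take(chunkSize);
            // Elements that already have a parent are copied when added to the new root.
            newDocket.Root.Add(chunk);
            // Remove the same elements from the working copy so the next pass starts after them.
            chunk.Remove();
            newDockets.Add(newDocket);
        } while (orders.Any());
        return newDockets;
    }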
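The question asks for each chunk to keep the parent "docket" node with the same namespace declarations, while both solutions above create a bare "docket" root. A minimal sketch of one way to carry the root's attributes across with LINQ to XML follows. The class name `DocketChunker` and method name `ChunkDocketKeepingRoot` are hypothetical, chosen only for illustration, and the sketch assumes the "order" elements are direct children of the root and are in no namespace, as in the sample document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DocketChunker
{
    // Hypothetical helper (not part of either original answer).
    public static List<XDocument> ChunkDocketKeepingRoot(XDocument docket, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        XElement sourceRoot = docket.Root;
        // Materialize once so each chunk is a simple slice of the list.
        List<XElement> orders = sourceRoot.Elements("order").ToList();
        var newDockets = new List<XDocument>();

        for (int i = 0; i < orders.Count; i += chunkSize)
        {
            // In LINQ to XML, namespace declarations such as xmlns:xsi are attributes,
            // so copying Attributes() carries them over. Attributes and elements that
            // already have a parent are cloned when added, so the source is not modified.
            var root = new XElement(
                sourceRoot.Name,
                sourceRoot.Attributes(),
                orders.Skip(i).Take(chunkSize));
            newDockets.Add(new XDocument(root));
        }
        return newDockets;
    }
}
```

Unlike Solution 2, this version returns an empty list for a docket with no orders and never removes anything from a working copy; whether that behavior is preferable depends on the caller.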
Author: ChrisCa
Updated on June 04, 2022
Comments
- ChrisCa, almost 2 years ago:
I have a large XML document that needs to be processed 100 records at a time.
This is being done within a Windows Service written in C#.
The structure is as follows:
    <docket xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="docket.xsd">
        <order>
            <Date>2008-10-13</Date>
            <orderNumber>050758023</orderNumber>
            <ParcelID/>
            <CustomerName>sddsf</CustomerName>
            <DeliveryName>dsfd</DeliveryName>
            <Address1>sdf</Address1>
            <Address2>sdfsdd</Address2>
            <Address3>sdfdsfdf</Address3>
            <Address4>dffddf</Address4>
            <PostCode/>
        </order>
        <order>
            <Date>2008-10-13</Date>
            <orderNumber>050758023</orderNumber>
            <ParcelID/>
            <CustomerName>sddsf</CustomerName>
            <DeliveryName>dsfd</DeliveryName>
            <Address1>sdf</Address1>
            <Address2>sdfsdd</Address2>
            <Address3>sdfdsfdf</Address3>
            <Address4>dffddf</Address4>
            <PostCode/>
        </order>
        .....
        .....
    </docket>
There could be thousands of orders in a docket.
I need to chop this into chunks of 100 orders.
However, each chunk of 100 orders still needs to be wrapped in the parent "docket" node and have the same namespace, etc.
Is this possible?
- Jim Burger, over 15 years ago: I know it's horribly inefficient.