What are the differences between DOM, SAX and StAX XML parsers?


Solution 1

It mostly depends on your needs. Each has its own strengths.

DOM - Pull the whole thing into memory and walk around inside it. Good for comparatively small chunks of XML that you want to do complex things with. XSLT processors typically work on an in-memory tree of this kind.

SAX - Walk the XML as it arrives, watching for things as they fly past. Good for large amounts of data or comparatively simple processing.

StAX - Much like SAX, but instead of responding to events that the parser pushes at you from the stream, you pull them by iterating through the XML yourself. See "When should I choose SAX over StAX?" for a discussion of which is best.
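To make the pull style concrete, here is a minimal sketch using the JDK's `javax.xml.stream` cursor API. The class name `StaxTitles`, the method `itemTitles` and the RSS-like element names (`item`, `title`) are assumptions chosen for illustration, not something from the linked posts.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxTitles {
    // Collects the text of every <title> that appears inside an <item>.
    // The element names are hypothetical, modelled on a simple RSS feed.
    static List<String> itemTitles(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            boolean inItem = false;
            // The caller drives the loop: we ask for the next event when we want it.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("item".equals(name)) {
                        inItem = true;
                    } else if (inItem && "title".equals(name)) {
                        // Reads the element's text and leaves the cursor on its end tag.
                        titles.add(reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    inItem = false;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return titles;
    }
}
```

Because the loop belongs to your code rather than to the parser, you can stop early (for example after the first ten items) simply by leaving the loop.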

There's a good discussion in "Parsing XML using DOM, SAX and StAX Parser in Java" by Mohamed Sanaulla. NB: There's a fault in his SAX parser. He should append characters, not replace them, because character data is cumulative and may arrive in chunks:

  content = String.copyValueOf(ch, start, length);

should be

  content += String.copyValueOf(ch, start, length);
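The same fix can be sketched with a `StringBuilder`, which avoids repeated string concatenation. This is a variation on the `+=` fix above, not the blog's code; the class name `TextCollector`, the helper `titles` and the `title` element name are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TextCollector extends DefaultHandler {
    private final StringBuilder content = new StringBuilder();
    private final List<String> titles = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // Start with an empty buffer for each new element.
        content.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append, never replace: the parser may deliver one text node in several calls.
        content.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only at the end tag is the text known to be complete.
        if ("title".equals(qName)) {
            titles.add(content.toString().trim());
        }
    }

    // Hypothetical convenience wrapper: parse a string and return all <title> texts.
    static List<String> titles(String xml) {
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler.titles;
    }
}
```

Resetting the buffer in `startElement` is enough for simple leaf elements like these; mixed content (text interleaved with child elements) would need a buffer per element.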

Also see the blog post by Kaan Yamanyar, "Differences between DOM, SAX or StAX".

Solution 2

I don't know StAX, but I can say something about DOM and SAX:

DOM holds the XML data in memory as an object model. The advantage is that you can access and change the data conveniently and quickly in memory. The disadvantage is high memory consumption.
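As a rough illustration of accessing and changing data in the tree, here is a minimal sketch with the JDK's DOM API. The class name `DomTitles`, the method `numberItemTitles` and the RSS-like element names are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTitles {
    // Loads the whole document, prefixes each item's <title> with its position
    // (changing the tree in place), then reads the changed text back from the tree.
    static List<String> numberItemTitles(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The complete tree is built in memory before we touch any node.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                NodeList found = item.getElementsByTagName("title");
                if (found.getLength() == 0) {
                    continue; // item without a title
                }
                Node title = found.item(0);
                title.setTextContent((i + 1) + ". " + title.getTextContent());
                titles.add(title.getTextContent());
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return titles;
    }
}
```

Every node stays reachable for as long as the `Document` is referenced, which is what makes this style convenient and also what costs the memory.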

SAX uses an event (callback) pattern to read the data and doesn't keep the document in memory. The advantage is that it is relatively fast and doesn't need much memory. The disadvantage is that you have to build your own data model if you want to change the data in a convenient way.
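The event pattern might look like the following minimal sketch, which counts elements without storing any of the document. The class name `SaxItemCounter` and the `item` element name are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxItemCounter {
    // Counts <item> elements; the only state kept is the counter itself.
    static int countItems(String xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // The parser calls this for every start tag it encounters.
                if ("item".equals(qName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return count[0];
    }
}
```

Anything you want to keep beyond a callback, such as the item's text, has to be stored by the handler itself, which is the "own data model" cost mentioned above.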

DOM is a little more complex to use compared to SAX.

Use SAX if you need to parse large data as a stream. Use DOM if you want to hold the complete data in memory to work with it and the data is small enough to fit safely in memory.

For example: XSLT doesn't fit the SAX model well because it needs to look at arbitrary parts of the document, including ones further ahead in the stream, while processing it. So it works on an in-memory tree such as DOM, even if that leads to memory issues with large data.

Hope that helped :-)

Author: user3067088

Updated on June 27, 2022

Comments

  • user3067088 almost 2 years

    I'm developing an RSS feed aggregator with Apache Tomcat. I was wondering which parser to use to read RSS feeds. Should I use DOM, SAX or StAX? I know that there are Java libraries specifically for reading RSS feeds, but since this is a university project I am not supposed to use those. Thank you.

  • Praba over 10 years
    Add to that, DOM is very useful if we want to navigate back and forth in the tree, like looking for siblings and such. With SAX, once we get past a node, it's lost, unless we have stored it in a user-defined data structure.
  • OldCurmudgeon over 10 years
    @prabugp - That is what I meant by walk around inside as opposed to watching for things as they fly past. Thanks for highlighting the point though.