How to solve the XML parsing performance issue on Android


Solution 1

Original answer, in 2012

(note: make sure you read the 2016 update below!)

I just did some perf testing comparing parsers on Android (and other platforms). The XML file being parsed is only 500 lines or so (it's a Twitter search Atom feed), yet Pull and DOM parsing can only churn through about 5 such documents a second on a Samsung Galaxy S2 or Motorola Xoom2. SimpleXML (pink in the chart), as used by the OP, ties for slowest with DOM parsing.

SAX Parsing is an order of magnitude faster on both of my Android devices, managing 40 docs/sec single-threaded, and 65+/sec multi-threaded.
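For anyone who hasn't written a SAX handler before, here is a minimal sketch of the style being measured. It is not the benchmark code from the repository: the class name `AtomTitleHandler` and the choice to collect entry titles are assumptions made for illustration. It uses only the standard `javax.xml.parsers` and `org.xml.sax` APIs, which are available on Android as well as on a desktop JVM.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the <title> of every <entry> in an Atom feed.
class AtomTitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inEntry;

    // Depending on the parser and its namespace settings, the element name
    // arrives in localName or in qName, so accept either.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("entry".equals(name(localName, qName))) {
            inEntry = true;
        }
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for a single text node.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String element = name(localName, qName);
        if (inEntry && "title".equals(element)) {
            titles.add(text.toString().trim());
        }
        if ("entry".equals(element)) {
            inEntry = false;
        }
    }

    // Parses the given XML text and returns the entry titles in document order.
    static List<String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        AtomTitleHandler handler = new AtomTitleHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }
}
```

Calling `AtomTitleHandler.parse(feedXml)` returns the entry titles as a list. The handler builds no tree and keeps only the text of the element currently being read, which is the main structural difference from DOM parsing.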

Android 2.3.4:

[Chart: performance comparison of XML parsing methods on Android 2.3.4]

The code is available on GitHub, and there is a discussion here.

Update 18th March 2016

OK, so it's been almost 4 years and the world has moved on. I finally got around to re-running the tests on:

  1. A Samsung Galaxy S3 running Android 4.1.2
  2. A Nexus7 (2012) running Android 4.4.4
  3. A Nexus5 running Android 6.0.1

Somewhere between Android 4.4.4 and Android 6.0.1 the situation changed drastically and we have a new winner: Pull Parsing FTW, at more than twice the throughput of SAX. Unfortunately I don't know exactly when this change arrived, as I don't have any devices running an Android version between 4.4.4 and 6.0.1.

Android 4.1.2:

[Chart: performance comparison of XML parsing methods on Android 4.1.2]

Android 4.4.4:

[Chart: performance comparison of XML parsing methods on Android 4.4.4]

Android 6.0.1:

[Chart: performance comparison of XML parsing methods on Android 6.0.1]

Solution 2

I think the best way to work with XML on Android is to use the VTD-XML library.

My XML file contains more than 60,000 lines, and VTD-XML parses it in:

Nexus 5: 2055 ms

Galaxy Note 4: 2498 ms

You can find more benchmark reports at this link: VTD-XML Benchmark

Short example of XML file

<database name="products">
    <table name="category">
        <column name="catId">20</column>
        <column name="catName">Fruit</column>
    </table>
    <table name="category">
        <column name="catId">31</column>
        <column name="catName">Vegetables</column>
    </table>
    <table name="category">
        <column name="catId">45</column>
        <column name="catName">Rice</column>
    </table>
    <table name="category">
        <column name="catId">50</column>
        <column name="catName">Potatoes</column>
    </table>
</database>

Configuration of "build.gradle" file

dependencies {
    // 'compile' was removed in Gradle 7; current builds use 'implementation'
    implementation files('libs/vtd-xml.jar')
}

Source code example:

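import android.util.Log;

import com.ximpleware.AutoPilot;
import com.ximpleware.VTDGen;
import com.ximpleware.VTDNav;

// Run the code below inside a method that declares or handles
// XPathParseException, XPathEvalException and NavException.
final String TAG = "VtdXml"; // Log.d() needs a tag as its first argument
String fileName = "products.xml";

VTDGen vg = new VTDGen();

// The second argument turns namespace awareness on
if (vg.parseFile(fileName, true)) {

     VTDNav vn = vg.getNav();
     AutoPilot table = new AutoPilot(vn);
     table.selectXPath("/database/table");

     // An XPath selection is stepped with evalXPath(); iterate() is for selectElement()
     while (table.evalXPath() != -1) {
        String tableName = vn.toString(vn.getAttrVal("name"));

        if (tableName.equals("category")) {
            vn.push(); // remember the <table> position for the XPath iteration
            AutoPilot column = new AutoPilot(vn);
            column.selectElement("column");

            while (column.iterate()) {
                 String text = vn.toNormalizedString(vn.getText());
                 String name = vn.toString(vn.getAttrVal("name"));

                 if (name.equals("catId")) {
                    Log.d(TAG, "Category ID = " + text);
                 } else if (name.equals("catName")) {
                    Log.d(TAG, "Category Name = " + text);
                 }
            }
            vn.pop(); // restore the cursor before the next evalXPath() call
        }
     }
}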

Result

Category ID = 20
Category Name = Fruit

Category ID = 31
Category Name = Vegetables

Category ID = 45
Category Name = Rice

Category ID = 50
Category Name = Potatoes

It works for me, and I hope it helps you.

Author: Korbi, Software Engineer from Munich

Updated on June 19, 2022

Comments

  • Korbi
    Korbi almost 2 years

    I have to read an XML file with about 4000 lines on Android. First I tried the SimpleXML library because it's the easiest, and it took about 2 minutes on my HTC Desire. So I thought maybe SimpleXML is so slow because of reflection and all the other magic that this library uses. I rewrote my parser and used the built-in DOM parsing method with some special attention to performance. That helped a bit, but it still took about 60 seconds, which is still totally unacceptable. After a bit of research I found this article on developer.com. There are some graphs that show that the other two available methods, the SAX parser and Android's XML Pull-Parser, are equally slow. And at the end of the article you'll find the following statement:

    The first surprise I had was at how slow all three methods were. Users don't want to wait long for results on mobile phones, so parsing anything more than a few dozen records may mandate a different method.

    What might be a "different method"? What to do if you have more than "a few dozen records"?

  • Korbi
    Korbi over 12 years
    thanks Graham, but there is no database involved. I just read every tag once, make some string comparisons and create some objects. That's it. Can't believe that the SAX parser is that much faster. But I'll give it a try.
  • Korbi
    Korbi over 12 years
    Hi Rob. Thanks for your tip, but I don't parse any dates. Just strings.
  • Rob
    Rob over 12 years
    I noticed in one of your other comments that you are doing string comparisons. These can be expensive if you are doing them a lot, so it might be worth investigating the use of a HashMap.
  • Korbi
    Korbi over 12 years
    see the link to the developer.com article I posted for a proof.
  • Michael Kay
    Michael Kay over 12 years
    Both your figures and the ones on the developer.com site seem incredibly slow; it's interesting to contrast with the radically different figures (much closer to what I would expect) being given in other responses to this post. It would be really nice to know what's going on here.
  • Michael Kay
    Michael Kay over 12 years
    I'm inclined to back ng's theory, at least until proven otherwise: it's the download that's taking the time, not the parsing.
  • Julian Suarez
    Julian Suarez almost 12 years
    Hi Steve, thanks for putting together this comparison. Any chance you could point us to some guide on how to install or reuse the tests you made? I'm very familiar with Android, Eclipse and Ant but I have little experience with Maven.
  • Stevie
    Stevie almost 12 years
    If you just want to run the tests on an Android device you can download the pre-built apk from GitHub. If you want to tweak the tests it's a bit more work, but there aren't many dependencies, so you could easily enough re-jig the Eclipse classpath and not need Maven.
  • lujop
    lujop over 11 years
    Thanks for the test, Stevie. What Android version did you use for the test? Do you have any idea why the Pull parser is so slow on Android? It seems there has to be a bug somewhere, because I expected performance similar to SAX. Could it be a bug in XmlPullParser prior to ICS? android-developers.blogspot.com.es/2011/12/…
  • Stevie
    Stevie over 11 years
    @lujop At the time I was testing on 2.3.4 - I haven't tried re-running on newer versions. If I find some time I'll re-test on the same device running ICS and post the results.
  • Stevie
    Stevie about 8 years
    @vtd-xml-author vtd-xml sounds interesting! Do you have perf comparisons specifically on Android? (since that is what this question is about, and the big surprise here was the massive difference between SAX and everything else). Given that vtd-xml is "memory-based" my naive assumption would be that it would not outperform SAX on Android, but I'd love to be proven wrong ...
  • Stevie
    Stevie about 8 years
    The benchmarks on desktop are really impressive! It would be really great to see a comparison between, say, vtd-xml, SAX, and pull parsing on Android. Without that it's hard to understand the true benefit to Android apps: benchmarks taken on desktops don't necessarily reflect the reality of parsing on a device.
  • Almighty
    Almighty almost 6 years
    but be aware of the license (it's GNU)