Which Haskell XML library to use?


Solution 1

I would recommend:

  1. xml, if your task is simple
  2. HaXml, if your task is complex
  3. hxt, if you like arrows
  4. hexpat, if you need high performance
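
To give a feel for the "simple" end of that list, here is a minimal sketch using the xml package (module Text.XML.Light). The sample document and the names sampleDoc and books are made up for illustration; only the library calls (parseXMLDoc, findElements, findChild, findAttr, strContent, unqual) come from the package.

```haskell
-- Requires the "xml" package from Hackage.
import Text.XML.Light

-- Hypothetical sample document, invented for this example.
sampleDoc :: String
sampleDoc =
  "<library>\
  \<book year=\"2011\"><title>Learn You a Haskell</title></book>\
  \<book year=\"2008\"><title>Real World Haskell</title></book>\
  \</library>"

-- Collect (year, title) pairs for every <book> element.
-- Books lacking a year attribute or a <title> child are skipped.
books :: String -> [(String, String)]
books src =
  case parseXMLDoc src of
    Nothing   -> []   -- input did not contain a root element
    Just root ->
      [ (year, strContent title)
      | book       <- findElements (unqual "book") root
      , Just year  <- [findAttr (unqual "year") book]
      , Just title <- [findChild (unqual "title") book]
      ]

main :: IO ()
main = mapM_ print (books sampleDoc)
```

Everything here is ordinary pure code over a plain Element tree, which is most of the appeal when the task really is simple.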

Solution 2

HXT's main problem, aside from the unusual arrow syntax, is performance and memory usage. I have an app that spends 1.2 seconds processing about 1.5MB of XML, consuming about 2.3GB (!) of memory in the process. Libxml2 takes a few milliseconds on the same data. Extracting data via the css function and arrow predicates also seems very slow compared to Libxml2.

Solution 3

I would personally recommend HXT because it uses arrows, which are a very useful and powerful tool to learn, and an XML parsing library is the perfect use for arrows (they were first invented to solve various parsing problems that monads couldn't). Arrows are also starting to be used outside of pure functional programming, such as Arrowlets in JavaScript.
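
For anyone wondering what the arrow style looks like in practice, here is a small sketch with hxt (module Text.XML.HXT.Core). The sample document and the names sampleDoc, bookInfo and books are my own inventions for illustration; the combinators (readString, runX, deep, isElem, hasName, getAttrValue, getChildren, getText) are from the library.

```haskell
-- Requires the "hxt" package from Hackage.
import Text.XML.HXT.Core

-- Hypothetical sample document, invented for this example.
sampleDoc :: String
sampleDoc =
  "<library>\
  \<book year=\"2011\"><title>Learn You a Haskell</title></book>\
  \<book year=\"2008\"><title>Real World Haskell</title></book>\
  \</library>"

-- An arrow from a document tree to (year, title) pairs.
-- 'deep' searches the tree top-down for <book> elements;
-- '&&&' pairs up the results of two arrows run on the same element.
bookInfo :: ArrowXml a => a XmlTree (String, String)
bookInfo =
  deep (isElem >>> hasName "book")
    >>> ( getAttrValue "year"
          &&& (getChildren >>> hasName "title" >>> getChildren >>> getText)
        )

-- Parse a string (DTD validation off) and run the arrow over it.
books :: String -> IO [(String, String)]
books src =
  runX (readString [withValidate no, withWarnings no] src >>> bookInfo)

main :: IO ()
main = books sampleDoc >>= mapM_ print
```

The selection logic is one composed pipeline rather than nested pattern matches, which is what people tend to either love or find hard to read.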

Author by

sastanin

Programmer, applied mathematician, Open Source enthusiast.

Updated on June 06, 2022

Comments

  • sastanin
    sastanin about 2 years

    I see that there are a few XML processing libraries in Haskell.

    • HaXml seems to be the most popular (according to dons)
    • HXT seems to be the most advanced (but also the most difficult to learn thanks to arrows)
    • xml seems to be just a basic parser
    • HXML seems to be abandoned
    • tagsoup and tagchup
    • libXML and libXML SAX bindings

    So, which library should I choose if I want it

    • to be reasonably powerful (to extract data from XML and to modify XML)
    • likely to be supported long time in the future
    • to be a “community choice” (default choice)

    And while most of the above seem to be sufficient for my current needs, what are the reasons to choose one of them over the others?

    UPD 20091222:

    Some notes about licenses:

  • sastanin
    sastanin almost 15 years
    Thanks, Will! That's why I started learning HXT, but I am also afraid that code written with HXT and arrows is less friendly for potential contributors. Also, it alarms me that HaXml is much more popular.
  • sastanin
    sastanin almost 15 years
    Thank you, Don. That's the kind of suggestion I was looking for.
  • Don Stewart
    Don Stewart almost 15 years
    "likely to be supported long time in the future" I would definitely use Haxml. It is 10 years old, and the authors are very active.
  • Tim Stewart
    Tim Stewart about 13 years
    I've really benefited from the tutorial at: haskell.org/haskellwiki/HXT/Practical. Unlike most of the other tutorials I found, this one started with a basic XML document, showed you how to parse it and then added complexities slowly.
  • Stephan Kulla
    Stephan Kulla over 10 years
    Another good hxt tutorial explaining also the concept of arrows very well: adit.io/posts/2012-04-14-working_with_HTML_in_haskell.html
  • Carbon
    Carbon almost 7 years
    Is this still true? I feel like I'm not smart enough to use HXT.
  • Julia Path
    Julia Path almost 3 years
    Dunno if that's the problem here, but whether or not optimisation (-O2) is enabled can make a huge difference in some cases.