Which Haskell XML library to use?
Solution 1
I would recommend:
- xml, if your task is simple
- HaXml, if your task is complex
- hxt, if you like arrows
- hexpat, if you need high performance
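To give a rough idea of what the "simple task" case can look like, here is a minimal sketch using the xml package (module Text.XML.Light). The sample document and the helper name `titles` are made up for illustration; only `parseXMLDoc`, `findElements`, `unqual` and `strContent` come from the library.

```haskell
import Text.XML.Light (findElements, parseXMLDoc, strContent, unqual)

-- Hypothetical sample input, invented for this example.
sample :: String
sample =
  "<library>\
  \<book id=\"1\"><title>Learn You a Haskell</title></book>\
  \<book id=\"2\"><title>Real World Haskell</title></book>\
  \</library>"

-- Collect the text of every <title> element below the root.
-- Returns an empty list if the input cannot be parsed at all.
titles :: String -> [String]
titles src =
  case parseXMLDoc src of
    Nothing   -> []
    Just root -> map strContent (findElements (unqual "title") root)

main :: IO ()
main = mapM_ putStrLn (titles sample)
```

Everything here is pure apart from the final printing, which is part of why this library is comfortable for small extraction jobs.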
Solution 2
HXT's main problem, aside from the unusual arrow syntax, is performance and memory usage. I have an app that spends 1.2 seconds processing about 1.5MB of XML, consuming about 2.3GB (!) of memory in the process. Libxml2 takes a few milliseconds on the same data. Extracting data via the css function and arrow predicates also seems very slow compared to Libxml2.
Solution 3
I would personally recommend HXT because it uses arrows, which are a very useful and powerful tool to learn, and an XML parsing library is the perfect use for arrows (they were first invented to solve various parsing problems that monads couldn't). Arrows are also starting to be used outside of pure functional programming, such as Arrowlets in JavaScript.
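For readers who have not seen the arrow style, the following is a small sketch of an HXT pipeline, assuming the hxt package and its Text.XML.HXT.Core module. The sample document and the name `extractTitles` are invented for illustration, and this is only one of several ways to write such a query.

```haskell
import Text.XML.HXT.Core

-- Hypothetical sample input, invented for this example.
sample :: String
sample =
  "<library>\
  \<book id=\"1\"><title>Learn You a Haskell</title></book>\
  \<book id=\"2\"><title>Real World Haskell</title></book>\
  \</library>"

-- Each stage is an arrow; (>>>) feeds the results of one stage into the next.
extractTitles :: String -> IO [String]
extractTitles src =
  runX $
    readString [withValidate no] src        -- parse without DTD validation
      >>> deep (isElem >>> hasName "title") -- find <title> elements at any depth
      >>> getChildren                       -- step down to their child nodes
      >>> getText                           -- keep only the text nodes

main :: IO ()
main = extractTitles sample >>= mapM_ putStrLn
```

The pipeline reads left to right like a query, which is the main appeal of the arrow style. The cost is that every stage lives in HXT's own arrow type rather than in ordinary functions.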
sastanin
Updated on June 06, 2022
Comments
-
sastanin, about 2 years ago
I see that there are a few XML processing libraries in Haskell.
- HaXml seems to be the most popular (according to dons)
- HXT seems to be the most advanced (but also the most difficult to learn thanks to arrows)
- xml, which seems to be just a basic parser
- HXML seems to be abandoned
- tagsoup and tagchup
- libXML and libXML SAX bindings
So, which library should I choose if I want it
- to be reasonably powerful (to extract data from XML and to modify XML)
- likely to be supported for a long time in the future
- to be a “community choice” (default choice)
And while most of the above seem to be sufficient for my current needs, what are the reasons to choose one of them over the others?
UPD 20091222:
Some notes about licenses:
- BSD or MIT: hexpat, hxt, libxml, tagsoup, xml
- LGPL: HaXml
- GPLv2:
- GPLv3: libxml-sax, tagchup, tagsoup-ht
-
sastanin, almost 15 years ago: Thanks, Will! That's why I started learning HXT, but I am also afraid that code written with HXT and arrows is less friendly for potential contributors. Also, it alarms me that HaXml is much more popular.
-
sastanin, almost 15 years ago: Thank you, Don. That's the kind of suggestion I was looking for.
-
Don Stewart, almost 15 years ago: "likely to be supported long time in the future": I would definitely use HaXml. It is 10 years old, and the authors are very active.
-
Tim Stewart, about 13 years ago: I've really benefited from the tutorial at haskell.org/haskellwiki/HXT/Practical. Unlike most of the other tutorials I found, this one started with a basic XML document, showed you how to parse it and then added complexities slowly.
-
Stephan Kulla, over 10 years ago: Another good HXT tutorial, which also explains the concept of arrows very well: adit.io/posts/2012-04-14-working_with_HTML_in_haskell.html
-
Carbon, almost 7 years ago: Is this still true? I feel like I'm not smart enough to use HXT.
-
Julia Path, almost 3 years ago: Dunno if that's the problem here, but whether or not optimisation (-O2) is enabled can make a huge difference in some cases.