Best XML Parser for PHP

Solution 1

I would have to say SimpleXML takes the cake. First, it is an extension written in C, so it is very fast. Second, the parsed document takes the form of a PHP object, so you can "query" it like $root->myElement.
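As a rough illustration of that object-style access, here is a minimal sketch. The sample document and its element names (config, myElement, item) are made up for the example and are not from the original answer.

```php
<?php
// Hypothetical sample document; element and attribute names are illustrative only.
$source = '<config><myElement id="42">hello</myElement><item>a</item><item>b</item></config>';

$root = simplexml_load_string($source);
if ($root === false) {
    throw new RuntimeException('Could not parse the XML string');
}

// Child elements are exposed as object properties.
$text = (string) $root->myElement;

// Attributes are read with array syntax.
$id = (string) $root->myElement['id'];

// Repeated elements can be iterated like a list.
$items = array();
foreach ($root->item as $item) {
    $items[] = (string) $item;
}
```

The (string) casts matter: without them you get SimpleXMLElement objects rather than plain strings.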

Solution 2

Have a look at PHP's available XML extensions.

The main difference between XML Parser and SimpleXML is that the latter is not a streaming parser. SimpleXML, like the DOM extension, is built on libxml and will load the entire XML file into memory. A streaming parser such as XML Parser or XMLReader only holds the current node in memory. With XML Parser you define handlers that get triggered when the parser encounters specific nodes; with XMLReader you pull nodes one at a time in a loop. That is faster and saves memory. You pay for that by not being able to use XPath.
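To make the two streaming styles concrete, here is a small sketch under stated assumptions: the catalog/book/title document is invented for the example, and a real use case would read a large file (for instance with XMLReader::open()) rather than a short string.

```php
<?php
// Hypothetical sample data; names are illustrative only.
$source = '<catalog><book><title>A</title></book><book><title>B</title></book></catalog>';

// Style 1: XML Parser pushes events into handlers that you register.
$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
$seen = array();
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attributes) use (&$seen) {
        $seen[] = $name; // called for every start tag
    },
    function ($parser, $name) {
        // called for every end tag; nothing to do in this sketch
    }
);
xml_parse($parser, $source, true);

// Style 2: XMLReader lets you pull one node at a time in a loop.
$reader = new XMLReader();
$reader->XML($source);
$titles = array();
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'title') {
        $titles[] = $reader->readString();
    }
}
$reader->close();
```

In both cases only the current node is available at any moment, which is why memory use stays low and why document-wide queries such as XPath are not on offer.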

Personally, I find SimpleXML quite limiting (hence "simple") in what it offers over DOM. You can switch between DOM and SimpleXML easily, but I usually don't bother and go the DOM route directly. DOM is an implementation of the W3C DOM API, so you might be familiar with it from other languages, for instance JavaScript.
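Switching between the two APIs can be sketched with dom_import_simplexml() and simplexml_import_dom(). The layout/block document below is a made-up example, not something from the original answer.

```php
<?php
// Hypothetical sample document; names are illustrative only.
$source = '<layout><block name="header"/></layout>';

// SimpleXML -> DOM: get the DOMElement for the same underlying node.
$sxml = simplexml_load_string($source);
$domElement = dom_import_simplexml($sxml);
$domElement->setAttribute('version', '2');

// The change made through DOM is visible through the SimpleXML object.
$version = (string) $sxml['version'];

// DOM -> SimpleXML: wrap an existing DOMDocument.
$doc = new DOMDocument();
$doc->loadXML($source);
$fromDom = simplexml_import_dom($doc);
$blockName = (string) $fromDom->block['name'];
```

This lets you read with SimpleXML's shorter syntax and drop down to DOM only for the operations SimpleXML lacks.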

Solution 3

This is a useful function for quick and easy XML parsing when SimpleXML or DOM is not available (it relies only on the XML Parser functions):

<?php
/**
 * Convert XML to an Array
 *
 * @param string  $XML
 * @return array
 */
function XMLtoArray($XML)
{
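    $xml_parser = xml_parser_create();
    xml_parse_into_struct($xml_parser, $XML, $vals);
    xml_parser_free($xml_parser);
    // Initialize the result and bookkeeping arrays so that an empty
    // document does not trigger "Undefined variable" notices.
    $xml_array = array();
    $multi_key = array();
    $multi_key2 = array();
    $level = array();
    // Find the tags that repeat at the same level
    $_tmp='';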
    foreach ($vals as $xml_elem) {
        $x_tag=$xml_elem['tag'];
        $x_level=$xml_elem['level'];
        $x_type=$xml_elem['type'];
        if ($x_level!=1 && $x_type == 'close') {
            if (isset($multi_key[$x_tag][$x_level]))
                $multi_key[$x_tag][$x_level]=1;
            else
                $multi_key[$x_tag][$x_level]=0;
        }
        if ($x_level!=1 && $x_type == 'complete') {
            if ($_tmp==$x_tag)
                $multi_key[$x_tag][$x_level]=1;
            $_tmp=$x_tag;
        }
    }
    // Walk through the element list and build the nested array
    foreach ($vals as $xml_elem) {
        $x_tag=$xml_elem['tag'];
        $x_level=$xml_elem['level'];
        $x_type=$xml_elem['type'];
        if ($x_type == 'open')
            $level[$x_level] = $x_tag;
        $start_level = 1;
        $php_stmt = '$xml_array';
        if ($x_type=='close' && $x_level!=1)
            $multi_key[$x_tag][$x_level]++;
        while ($start_level < $x_level) {
            $php_stmt .= '[$level['.$start_level.']]';
            if (isset($multi_key[$level[$start_level]][$start_level]) && $multi_key[$level[$start_level]][$start_level])
                $php_stmt .= '['.($multi_key[$level[$start_level]][$start_level]-1).']';
            $start_level++;
        }
        $add='';
        if (isset($multi_key[$x_tag][$x_level]) && $multi_key[$x_tag][$x_level] && ($x_type=='open' || $x_type=='complete')) {
            if (!isset($multi_key2[$x_tag][$x_level]))
                $multi_key2[$x_tag][$x_level]=0;
            else
                $multi_key2[$x_tag][$x_level]++;
            $add='['.$multi_key2[$x_tag][$x_level].']';
        }
        if (isset($xml_elem['value']) && trim($xml_elem['value'])!='' && !array_key_exists('attributes', $xml_elem)) {
            if ($x_type == 'open')
                $php_stmt_main=$php_stmt.'[$x_type]'.$add.'[\'content\'] = $xml_elem[\'value\'];';
            else
                $php_stmt_main=$php_stmt.'[$x_tag]'.$add.' = $xml_elem[\'value\'];';
            eval($php_stmt_main);
        }
        if (array_key_exists('attributes', $xml_elem)) {
            if (isset($xml_elem['value'])) {
                $php_stmt_main=$php_stmt.'[$x_tag]'.$add.'[\'content\'] = $xml_elem[\'value\'];';
                eval($php_stmt_main);
            }
            foreach ($xml_elem['attributes'] as $key=>$value) {
                $php_stmt_att=$php_stmt.'[$x_tag]'.$add.'[$key] = $value;';
                eval($php_stmt_att);
            }
        }
    }
    return $xml_array;
}
?>

Solution 4

I think SimpleXML is very useful, and I use XPath with it:

$xml = simplexml_load_file("som_xml.xml");

$blocks  = $xml->xpath('//block'); // gets all <block/> tags
$blocks2 = $xml->xpath('//layout/block'); // gets all <block/> tags whose parent is a <layout/> tag

I use many XML configs and this helps me parse them really fast. SimpleXML is written in C, so it's very fast.
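For a version of the two queries that runs without an external file, here is a sketch against an inline document. The config/layout/sidebar structure and the id attributes are assumptions made for the example.

```php
<?php
// Hypothetical config document; structure and ids are illustrative only.
$source = '<config>'
    . '<layout><block id="a"/><block id="b"/></layout>'
    . '<sidebar><block id="c"/></sidebar>'
    . '</config>';

$xml = simplexml_load_string($source);

// Every <block/> anywhere in the document.
$all = $xml->xpath('//block');

// Only <block/> elements whose parent is a <layout/> element.
$inLayout = $xml->xpath('//layout/block');

// xpath() returns an array of SimpleXMLElement objects; collect their ids.
$ids = array();
foreach ($inLayout as $block) {
    $ids[] = (string) $block['id'];
}
```

Here the first query matches three elements and the second only the two inside layout.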

Solution 5

It depends on what you are trying to do with the XML files. If you are just trying to read an XML file (like a configuration file), The Wicked Flea is correct in suggesting SimpleXML, since it creates what amounts to nested ArrayObjects. For example, the text of a <child> element nested inside a <root> element is accessible as $xml->root->child.

If you are looking to manipulate the XML files, you're probably best off using the DOM extension.
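A minimal sketch of what manipulation with DOM can look like, assuming a tiny root/child document invented for the example:

```php
<?php
// Hypothetical document; element names are illustrative only.
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->loadXML('<root><child>value</child></root>');

// Change the text and add an attribute on an existing element.
$child = $doc->getElementsByTagName('child')->item(0);
$child->nodeValue = 'updated';
$child->setAttribute('lang', 'en');

// Create a new element and append it to the document element.
$extra = $doc->createElement('extra', 'new node');
$doc->documentElement->appendChild($extra);

// Serialize only the document element (no XML declaration).
$result = $doc->saveXML($doc->documentElement);
```

The same document could then be written back with $doc->save() and a file path of your choosing.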

Author: Anders Martinsson

Updated on July 08, 2022

Comments

  • Anders Martinsson
    Anders Martinsson almost 2 years

    I have used the XML Parser before, and even though it worked OK, I wasn't happy with it in general. It felt like I was using workarounds for things that should be basic functionality.

    I recently saw SimpleXML but I haven't tried it yet. Is it any simpler? What advantages and disadvantages do both have? Any other parsers you've used?

    • Shog9
      Shog9 over 11 years
      Suggestion for anyone reading this: ask a question describing what you need to do with the XML (beyond simply parsing it) and you'll probably get a much better answer.
    • hakre
      hakre over 11 years
      Please see the following general reference question for the PHP tag: How to parse and process HTML/XML with PHP?
  • pleasedontbelong
    pleasedontbelong almost 14 years
    SimpleXML is the best, but it is not that good at working with namespaces; it can get tricky sometimes.
  • Vahan
    Vahan over 12 years
    Yes I think it's best too. And I use xpath with it. $xml->xpath("//block");//THIS IS SUPER :)
  • Karol
    Karol about 12 years
    I don't think it's the best. It doesn't support XML version="1.1" and throws a warning about this (my PHP version is 5.3.6). I know that you can disable warnings and it works fine, but I don't think that's a good solution. Imagine what will happen if your API provider changes the XML document version from 1.0 to 1.1. The second thing is what @Gordon pointed out: SimpleXML loads the entire document into memory. It's a good solution, but certainly not the best.
  • Jake Wilson
    Jake Wilson over 11 years
    Dealing with namespaced XML with SimpleXML sucks.
  • Brad Larson
    Brad Larson over 10 years
    You'll probably want to disclose that you are the author of this class.
  • Adam Pietrasiak
    Adam Pietrasiak over 9 years
    SimpleXML creates different structure when some node has one child and different when it has more children. It makes me sick!
  • Bet Lamed
    Bet Lamed over 8 years
    Do not use SimpleXml if you might, at any point in the future, have to change the XML. I have seen the resulting code... it is not a pretty sight.
  • Robert K
    Robert K over 8 years
    @BetLamed I agree nowadays, but if you've tried to write a parser with DOMDocument you'll be in for 10x the code and lots of complexity.
  • Phillip Harrington
    Phillip Harrington over 7 years
    PHPClasses.org is still a thing? Edit: Oh, I guess it was still back in '11
  • E Ciotti
    E Ciotti over 7 years
    worked like a charm, where simpleXml failed in a couple of scripts I'm working on, thanks
  • shfkktm
    shfkktm over 7 years
    Getting an error: Notice: Undefined variable: xml_array?
  • jim smith
    jim smith over 7 years
    @jake-wilson It's SimpleXML - the clue is in the name
  • Pratik
    Pratik over 5 years
    so what do you use mostly?
  • Vilk
    Vilk almost 5 years
    Thanks, this solved my problem with SimpleXML!
  • agoldev
    agoldev almost 5 years
    @BetLamed alternatives?
  • Bet Lamed
    Bet Lamed almost 5 years
    @agoldev Sorry I don't remember, I wrote this 4 years ago. In the project I was on, I think I ended up using SimpleXML and coding around the issues, because changing it would have been too much work...
  • Nigel Ren
    Nigel Ren about 4 years
    Sorry, I was just looking for info about the difference between the APIs and came here. Both of the devzone links are dead, and I'm not sure whether they should be removed or updated.