PHP convert XML to JSON
Solution 1
I figured it out. json_encode handles objects differently than strings. I cast the object to a string and it works now.
foreach($xml->children() as $state)
{
$states[]= array('state' => (string)$state->name);
}
echo json_encode($states);
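As a minimal, self-contained sketch of the same idea, the snippet below inlines the states XML from the question (so no states.xml file is assumed) and also reads the id attribute, which the question wanted to keep. The output key names 'id' and 'state' are illustrative choices, not part of the original answer.

```php
<?php
// Inline copy of the states XML from the question, so the sketch runs without a file.
$xml = simplexml_load_string(
    '<states>'
    . '<state id="AL"><name>Alabama</name></state>'
    . '<state id="AK"><name>Alaska</name></state>'
    . '</states>'
);

$states = array();
foreach ($xml->children() as $state) {
    $states[] = array(
        // Array-style access reads an attribute; the (string) cast turns the
        // SimpleXMLElement into a plain string before json_encode sees it.
        'id'    => (string) $state['id'],
        'state' => (string) $state->name,
    );
}

$json = json_encode($states);
echo $json, "\n";
```

Without the casts, json_encode receives SimpleXMLElement objects and serializes them as objects, which is where the {"0":"Alabama"} shape in the question comes from.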
Solution 2
JSON and array from XML in 3 lines:
$xml = simplexml_load_string($xml_string);
$json = json_encode($xml);
$array = json_decode($json,TRUE);
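The sketch below probes what this three-line conversion produces in a few cases raised in the comments on this page: where attributes land, how the shape changes between one and several same-named siblings, and the LIBXML_NOCDATA flag for CDATA sections. The XML snippets and variable names are made up for illustration; the behaviour should be checked on your own PHP version.

```php
<?php
// 1) Attributes of an element that has child elements land under "@attributes".
$states = simplexml_load_string(
    '<states><state id="AL"><name>Alabama</name></state></states>'
);
$statesArray = json_decode(json_encode($states), true);
echo $statesArray['state']['@attributes']['id'], "\n";

// 2) One <a> child becomes a plain string, several become a list,
//    so consumers must handle both shapes.
$list = simplexml_load_string(
    '<list><item><a>123</a><a>456</a></item><item><a>123</a></item></list>'
);
$listJson = json_encode($list);
echo $listJson, "\n";

// 3) LIBXML_NOCDATA merges CDATA sections into ordinary text nodes,
//    so their content survives the conversion.
$doc = simplexml_load_string(
    '<doc><note><![CDATA[hi <b>there</b>]]></note></doc>',
    'SimpleXMLElement',
    LIBXML_NOCDATA
);
$docArray = json_decode(json_encode($doc), true);
echo $docArray['note'], "\n";
```

Note that attributes on elements that contain only text are not covered by case 1; several comments on this page report that those are dropped.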
Solution 3
Sorry for answering an old post, but this article outlines an approach that is relatively short, concise and easy to maintain. I tested it myself and it works pretty well.
http://lostechies.com/seanbiefeld/2011/10/21/simple-xml-to-json-with-php/
<?php
class XmlToJson {
public function Parse ($url) {
$fileContents= file_get_contents($url);
$fileContents = str_replace(array("\n", "\r", "\t"), '', $fileContents);
$fileContents = trim(str_replace('"', "'", $fileContents));
$simpleXml = simplexml_load_string($fileContents);
$json = json_encode($simpleXml);
return $json;
}
}
?>
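The class above does not check whether parsing succeeded; simplexml_load_string returns false on malformed XML, and later calls then fail in confusing ways. A hedged sketch of one way to surface the parse errors follows. The function name xml_string_to_json is hypothetical and not part of the article; it takes an XML string rather than a URL to keep the example self-contained.

```php
<?php
// Hypothetical helper (not from the linked article): convert an XML string
// to JSON and throw a descriptive exception when the XML cannot be parsed.
function xml_string_to_json($xmlString) {
    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $simpleXml = simplexml_load_string($xmlString);

    if ($simpleXml === false) {
        $messages = array();
        foreach (libxml_get_errors() as $error) {
            $messages[] = trim($error->message);
        }
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        throw new InvalidArgumentException('Invalid XML: ' . implode('; ', $messages));
    }

    // Restore the previous libxml error-handling mode.
    libxml_use_internal_errors($previous);
    return json_encode($simpleXml);
}

echo xml_string_to_json('<a><b>1</b></a>'), "\n";

try {
    xml_string_to_json('<a><b>1</a>'); // mismatched tags
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```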
Solution 4
I guess I'm a bit late to the party, but I have written a small function to accomplish this task. It also handles attributes, text content, and multiple sibling nodes with the same node name.
Disclaimer: I'm not a PHP native, so please bear with simple mistakes.
function xml2js($xmlnode) {
$root = (func_num_args() > 1 ? false : true);
$jsnode = array();
if (!$root) {
if (count($xmlnode->attributes()) > 0){
$jsnode["$"] = array();
foreach($xmlnode->attributes() as $key => $value)
$jsnode["$"][$key] = (string)$value;
}
$textcontent = trim((string)$xmlnode);
if (strlen($textcontent) > 0)
$jsnode["_"] = $textcontent;
foreach ($xmlnode->children() as $childxmlnode) {
$childname = $childxmlnode->getName();
if (!array_key_exists($childname, $jsnode))
$jsnode[$childname] = array();
array_push($jsnode[$childname], xml2js($childxmlnode, true));
}
return $jsnode;
} else {
$nodename = $xmlnode->getName();
$jsnode[$nodename] = array();
array_push($jsnode[$nodename], xml2js($xmlnode, true));
return json_encode($jsnode);
}
}
Usage example:
$xml = simplexml_load_file("myfile.xml");
echo xml2js($xml);
Example Input (myfile.xml):
<family name="Johnson">
<child name="John" age="5">
<toy status="old">Trooper</toy>
<toy status="old">Ultrablock</toy>
<toy status="new">Bike</toy>
</child>
</family>
Example output:
{"family":[{"$":{"name":"Johnson"},"child":[{"$":{"name":"John","age":"5"},"toy":[{"$":{"status":"old"},"_":"Trooper"},{"$":{"status":"old"},"_":"Ultrablock"},{"$":{"status":"new"},"_":"Bike"}]}]}]}
Pretty printed:
{
"family" : [{
"$" : {
"name" : "Johnson"
},
"child" : [{
"$" : {
"name" : "John",
"age" : "5"
},
"toy" : [{
"$" : {
"status" : "old"
},
"_" : "Trooper"
}, {
"$" : {
"status" : "old"
},
"_" : "Ultrablock"
}, {
"$" : {
"status" : "new"
},
"_" : "Bike"
}
]
}
]
}
]
}
Quirks to keep in mind: Several tags with the same tag name can be siblings. Other solutions will most likely drop all but the last sibling. To avoid this, every single node, even if it has only one child, is an array which holds an object for each instance of the tag name. (See the multiple <toy> elements in the example.)
Even the root element, of which only one should exist in a valid XML document, is stored as an array with an object of the instance, just to have a consistent data structure.
To be able to distinguish between XML node content and XML attributes, each object's attributes are stored in the "$" child and the content in the "_" child.
Edit: I forgot to show the output for your example input data
{
"states" : [{
"state" : [{
"$" : {
"id" : "AL"
},
"name" : [{
"_" : "Alabama"
}
]
}, {
"$" : {
"id" : "AK"
},
"name" : [{
"_" : "Alaska"
}
]
}
]
}
]
}
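To illustrate how the structure described above might be consumed, here is a small sketch that decodes the family example output and walks it. Because every node is stored as an array, each level is indexed with [0] or iterated, attributes are read from "$", and text from "_". The variable names are illustrative only.

```php
<?php
// JSON copied from the "Example output" of this answer.
$json = '{"family":[{"$":{"name":"Johnson"},"child":[{"$":{"name":"John","age":"5"},'
      . '"toy":[{"$":{"status":"old"},"_":"Trooper"},{"$":{"status":"old"},"_":"Ultrablock"},'
      . '{"$":{"status":"new"},"_":"Bike"}]}]}]}';

$data = json_decode($json, true);

// Every node is a list, so single elements are reached through index 0.
$family = $data['family'][0];
$child  = $family['child'][0];

// Collect the text ("_") of every toy whose status attribute ("$") is "new".
$newToys = array();
foreach ($child['toy'] as $toy) {
    if ($toy['$']['status'] === 'new') {
        $newToys[] = $toy['_'];
    }
}

echo $family['$']['name'], ': ', implode(', ', $newToys), "\n";
```

The extra [0] lookups are the price of the consistent shape: the consuming code never has to check whether a key holds one element or several.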
Solution 5
A common pitfall is to forget that json_encode() does not respect elements with both a text value and attribute(s). It will choose one of those, meaning data loss.
The function below solves that problem. If one decides to go the json_encode/json_decode way, the following function is advised.
function json_prepare_xml($domNode) {
foreach($domNode->childNodes as $node) {
if($node->hasChildNodes()) {
json_prepare_xml($node);
} else {
if($domNode->hasAttributes() && strlen($domNode->nodeValue)){
$domNode->setAttribute("nodeValue", $node->textContent);
$node->nodeValue = "";
}
}
}
}
$dom = new DOMDocument();
$dom->loadXML( file_get_contents($xmlfile) );
json_prepare_xml($dom);
$sxml = simplexml_load_string( $dom->saveXML() );
$json = json_decode( json_encode( $sxml ) );
By doing so, <foo bar="3">Lorem</foo> will not end up as {"foo":"Lorem"} in your JSON.
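The sketch below contrasts the plain conversion with the prepared one on a tiny document. The json_prepare_xml function is repeated from this answer only so the snippet runs on its own; the <root> wrapper and the variable names are assumptions made for the example.

```php
<?php
// Repeated from the answer above so that this sketch is self-contained.
function json_prepare_xml($domNode) {
    foreach ($domNode->childNodes as $node) {
        if ($node->hasChildNodes()) {
            json_prepare_xml($node);
        } else {
            if ($domNode->hasAttributes() && strlen($domNode->nodeValue)) {
                // Move the text into a "nodeValue" attribute so it survives
                // next to the element's other attributes.
                $domNode->setAttribute("nodeValue", $node->textContent);
                $node->nodeValue = "";
            }
        }
    }
}

$xml = '<root><foo bar="3">Lorem</foo></root>';

// Plain conversion: the bar attribute is silently dropped.
$lossy = json_encode(simplexml_load_string($xml));
echo $lossy, "\n";

// Prepared conversion: text and attributes end up together under "@attributes".
$dom = new DOMDocument();
$dom->loadXML($xml);
json_prepare_xml($dom);
$fixed = json_decode(json_encode(simplexml_load_string($dom->saveXML())), true);
echo json_encode($fixed), "\n";
```

The trade-off is that the element's text is no longer a plain string value; consumers read it from the nodeValue entry instead.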
Bryan Hadlock
Updated on February 24, 2022
Comments
-
Bryan Hadlock about 2 years
I am trying to convert XML to JSON in PHP. If I do a simple conversion using SimpleXML and json_encode, none of the attributes in the XML show up.
$xml = simplexml_load_file("states.xml");
echo json_encode($xml);
So I am trying to manually parse it like this.
foreach($xml->children() as $state)
{
$states[]= array('state' => $state->name);
}
echo json_encode($states);
and the output for state is {"state":{"0":"Alabama"}} rather than {"state":"Alabama"}.
What am I doing wrong?
XML:
<?xml version="1.0" ?>
<states>
<state id="AL">
<name>Alabama</name>
</state>
<state id="AK">
<name>Alaska</name>
</state>
</states>
Output:
[{"state":{"0":"Alabama"}},{"state":{"0":"Alaska"}}]
var dump:
object(SimpleXMLElement)#1 (1) { ["state"]=> array(2) { [0]=> object(SimpleXMLElement)#3 (2) { ["@attributes"]=> array(1) { ["id"]=> string(2) "AL" } ["name"]=> string(7) "Alabama" } [1]=> object(SimpleXMLElement)#2 (2) { ["@attributes"]=> array(1) { ["id"]=> string(2) "AK" } ["name"]=> string(6) "Alaska" } } }
-
nikc.org over 12 years
Please include a snippet of the XML and the final array structure you have after parsing it. (A var_dump works fine.)
-
Bryan Hadlock over 12 years
added input, output and var_dump
-
Peter Krauss over 7 years
Some applications need a "perfect XML-to-JSON map", that is jsonML, see solution here.
-
Bryan Hadlock over 12 years
looks like the attributes are arrays but not $state->name
-
ethree over 10 years
This will not work if you have multiple instances of the same tag in your XML; json_encode will end up serializing only the last instance of the tag.
-
Sabbir almost 10 years
The best I've found. BTW, how about a large XML file of around 150MB? How much memory will it take?
-
iXcoder almost 10 years
Split the big file into smaller ones.
-
Richard Kiefer over 9 years
Does not compile and does not produce the described output if syntax errors are corrected.
-
Jake Wilson over 9 years
This solution is not flawless. It completely discards XML attributes. So <person my-attribute='name'>John</person> is interpreted as <person>John</person>.
-
Jake Wilson over 9 years
What is $dom? Where did that come from?
-
useless about 9 years
Jackobud, well then you are talking about a specific structure. For general purposes, what Antonio provided is just great.
-
txyoji almost 9 years
Use $xml = simplexml_load_string($xml_string,'SimpleXMLElement',LIBXML_NOCDATA); to flatten CDATA elements.
-
Scott over 8 years
$dom = new DOMDocument(); is where it comes from
-
Lawrence Cooke about 8 years
Last line of code: $json = json_decode( json_encode( $sxml ) ) ); should be: $json = json_decode( json_encode( $sxml ) );
-
Volatil3 about 8 years
Can it parse large XML data?
-
Octavio Perez Gallegos almost 8 years
It is a small and universal solution based on an array of data can be a JSON transformed json_decode ...lucky
-
Dan R almost 8 years
In what way does this answer the original question? Your answer seems more complicated than the original question, and also doesn't seem to even mention JSON anywhere.
-
Peter Krauss over 7 years
This solution is better because it does not discard XML attributes. See also why this complex structure is better than simplified ones at xml.com/lpt/a/1658 (see "Semi-Structured XML"). Oops, for CDATA, flatten the CDATA elements as @txyoji suggested: $xml = simplexml_load_file("myfile.xml",'SimpleXMLElement',LIBXML_NOCDATA);
-
Peter Krauss over 7 years
@AntonioMax and others, try <states> <state>Alabama</state> <p>John</p> <state>Alaska</state> </states>: it loses the tag order, so it is a bug. The solution is to change the representation map, see stackoverflow.com/a/39889010/287948
-
TheStoryCoder over 7 years
I have made an improved version of this which also works with namespaces. See answer further below (stackoverflow.com/a/40866796/2404541)
-
Alex over 7 years
@JakeWilson maybe it's the 2 years that have passed, and various version fixes, but on PHP 5.6.30, this method produces ALL of the data. Attributes are stored in the array under the @attributes key, so it works absolutely flawlessly, and beautifully. 3 short lines of code solve my problem beautifully.
-
Alex over 7 years
One does not use Regex to parse XML, unless it's a simple XML with trivial structure and very predictable data. I can't stress enough how bad this solution is. This BREAKS DATA. Not to mention that it's incredibly slow (you parse with regex, and then you re-parse again?) and doesn't handle self-closing tags.
-
TheStoryCoder over 7 years
I don't think you really looked at the function. It doesn't use regex to do the actual parsing, only as a simple fix to deal with namespaces - which has been working for all my xml cases - and that it is working is the most important, rather than being "politically correct". You're welcome to improve it if you want, though!
-
TheStoryCoder over 7 years
Very unusual xml structure that I doubt would have real life use cases.
-
Alex about 7 years
The fact that it has worked for you doesn't mean it's right. It's code like this that generates bugs that are immensely hard to diagnose, and generates exploits. I mean even looking superficially at XML specs on sites like this w3schools.com/xml/xml_elements.asp show a lot of reasons why this solution wouldn't work. Like I said, it fails to detect self-closing tags like <element/>, fails to address elements that start with, or contain underscores, which is allowed in XML. Fails to detect CDATA. And as I've said, it's SLOW. It's an O(n^2) complexity because of inner parsing.
-
Alex about 7 years
The thing is that dealing with namespaces wasn't even asked here, and there are PROPER ways to deal with namespaces. Namespaces exist as a helpful construction, NOT to be parsed like that and turned into an abomination that won't be processed by any reasonable parser. And all you needed to do for that is not to create the contender for the prize of "slowest algorithm of 2016", but to do a bit of searching, to come up with a myriad of actual solutions, like this one stackoverflow.com/questions/16412047/… And to call this an improvement? Wow.
-
jirislav over 6 years
This doesn't work if you have multiple namespaces, you can choose only one, which will pass into the $json_string :'(
-
ryabenko-pro over 6 years
I used this approach, but JSON is empty. XML is valid.
-
nanocv over 6 years
@AlexanderMP Not flawless, sorry. 3v4l.org/S3jP8 This solution requires attributes to be only on parent to work well.
-
Klesun over 5 years
Keep in mind that with this solution, when there may be multiple nodes with the same name, one node will result in a key just pointing to an element, but multiple nodes will result in a key pointing to an array of elements: <list><item><a>123</a><a>456</a></item><item><a>123</a></item></list> -> {"item":[{"a":["123","456"]},{"a":"123"}]}. A solution at php.net by ratfactor solves that issue by always storing elements in an array.
-
TheStoryCoder about 5 years
@AlexanderMP I'm running 7.1.15 and it still doesn't include the attributes in <logentry revision="7"><paths><path action="M" text-mods="true" kind="file">module.php</path><path action="A" text-mods="true" kind="file">js/module.js</path></paths></logentry>. It includes the one in <logentry> but not in <path>!
-
Marc Pope over 4 years
@txyoji This answer about stripping out CDATA was something I had been looking for for hours. Excellent answer.
-
lucifer63 over 4 years
Many thanks for a custom function! It makes tuning pretty easy. Btw, added an edited version of your function that parses XML in a JS way: every entry has its own object (entries aren't stored in a single array if they have equal tagnames), thus the order is preserved.
-
KingRider over 4 years
Error: Fatal error: Uncaught Error: Call to a member function getName() on bool. I think my PHP version is failing :-( Please help!
-
aaron almost 4 years
This actually works for multi-namespace cases, better than other solutions. Why did it get a downvote...
-
G Chris DCosta about 3 years
After trying tens of solutions this one is the only one that worked for me, thank you so much!
-
Coreus almost 3 years
To everyone looking at this old answer: Please bear in mind the times this was written in, and perhaps consider more modern approaches.