php: SimpleXML Load File Invalid Character Error
Stripping the invalid characters before parsing would be the easiest fix:
function utf8_for_xml($string)
{
    // Replace each run of characters outside the ranges listed here with a single space.
    return preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', ' ', $string);
}
From: PHP generated XML shows invalid Char value 27 message
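As a quick check of the function above, here is a minimal sketch of what the pattern does to a character like the one in the question (char value 11 is a vertical tab). The sample title string and the variable names are assumptions made for illustration; the pattern is inlined only so the snippet runs on its own.

```php
<?php
// Same pattern as utf8_for_xml() above, inlined so this snippet is self-contained.
$pattern = '/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u';

// Hypothetical sample: an element whose text contains a vertical tab ("\x0B", char value 11).
$dirty = "<title>Data Driven\x0BBest Practices</title>";

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$before = simplexml_load_string($dirty);        // parse the raw string
$clean  = preg_replace($pattern, ' ', $dirty);  // replace the invalid character with a space
$after  = simplexml_load_string($clean);        // parse the cleaned string

var_dump($before === false); // whether the raw string was rejected
var_dump((string) $after);   // text content after cleaning
```

Note that the pattern also replaces characters above U+FFFD, although XML 1.0 permits U+10000 to U+10FFFF, so data containing such characters (emoji, for example) would lose them.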
IrfanClemson
Updated on June 09, 2022
-
IrfanClemson almost 2 years
I have a PHP application that sometimes fails, depending on the data I load, with errors like:
parser error : PCDATA invalid Char value 11
Warning: simplexml_load_file(): ath>/datadrivenbestpractices/Data-driven Best Practices in
Warning: simplexml_load_file(): ^ in
I am certain that some values in the data are causing the problem, and I have no control over the data. I have tried the solutions from "Error: 'Input is not proper UTF-8, indicate encoding !' using PHP's simplexml_load_string", "How to handle invalid unicode with simplexml", and "How to skip invalid characters in XML file using PHP", but they have not helped.
The culprit strings are: 'Data Driven - Best Practices' and 'Data-driven Best Practices to Recruit and Retain Underrepresented Graduate Students May 12, 2011 - 1:30-3:00 p.m., EST' (may be dashes or return characters).
What can I do? My test environment is PHP on Windows, but the live environment will be a LAMP one, where I can't touch the .ini files.
Thanks.
-
K-Gun over 11 years: I think you should show your XML source too.
-
IrfanClemson over 11 years: Not sure how this will work in my code. Here is how I load the XML: $xml_apicheck = simplexml_load_file($serveraddress.$myparam);
-
Admin over 11 years: It should work if you do something like:
simplexml_load_string(utf8_for_xml(file_get_contents($serveraddress.$myparam)));
-
IrfanClemson over 11 years: Okay, I have this: $xml_apicheck = simplexml_load_file(utf8_for_xml(file_get_contents($serveraddress.$myparam))); but I am now getting an error: action.php on line 100 PHP Notice: Trying to get property of non-object in
-
IrfanClemson over 11 years: The problem may be that what file_get_contents returns is not XML anymore?
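The notice in the last comments is likely explained by simplexml_load_file() being handed the cleaned XML text where it expects a filename or URL: the call then returns false, and reading a property of false gives "Trying to get property of non-object". A minimal sketch of the file_get_contents plus simplexml_load_string route, with the failure cases checked, might look like the following. The helper name load_xml_without_invalid_chars is hypothetical and not from the thread, and the pattern is the one from the answer, inlined so the snippet runs on its own.

```php
<?php
// Hypothetical helper (name is an assumption): fetch the XML as text,
// strip characters the pattern does not allow, then parse the resulting string.
function load_xml_without_invalid_chars(string $source)
{
    // Read the raw XML text from a path or URL.
    $raw = file_get_contents($source);
    if ($raw === false) {
        return false; // could not read the source
    }

    // Same pattern as in the answer; with the /u flag, preg_replace()
    // returns null when the input is not valid UTF-8.
    $pattern = '/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u';
    $clean = preg_replace($pattern, ' ', $raw);
    if ($clean === null) {
        return false; // input was not valid UTF-8
    }

    // Parse the cleaned text; simplexml_load_string() takes XML text, not a path.
    return simplexml_load_string($clean);
}
```

With this, the call from the comments would become something like $xml_apicheck = load_xml_without_invalid_chars($serveraddress.$myparam); followed by a check for false before any property is read.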