DOMDocument::loadHTML error


Solution 1

Header, Nav and Section are HTML5 elements. Because the HTML5 authors felt that Public and System Identifiers were too hard to remember, the DOCTYPE declaration is just:

<!DOCTYPE html>

In other words, there is no DTD to check, so DOM falls back to the HTML4 Transitional DTD. That DTD doesn't contain those elements, hence the warnings.

To suppress the warnings, put

libxml_use_internal_errors(true);

before the call to loadHTML and

libxml_use_internal_errors(false);

after it.
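
Put together, the pattern might look like the sketch below. The sample markup in $source is made up for illustration. Instead of hard-coding false, the sketch restores whatever setting was active before, and it reads and clears the collected errors so they do not pile up across calls; both are optional refinements on the two calls above.

```php
<?php
// Hypothetical input; any markup containing HTML5 elements will do.
$source = '<!DOCTYPE html><html><head><title>t</title></head>'
        . '<body><header><nav>menu</nav></header><section>text</section></body></html>';

$dom = new DOMDocument();

// Collect libxml errors internally instead of emitting PHP warnings.
// The return value is the previous setting, kept so it can be restored.
$previous = libxml_use_internal_errors(true);

$loaded = $dom->loadHTML($source);

// Fetch whatever the parser complained about, then empty the buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore the earlier error-handling mode.
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    // Each entry is a LibXMLError with level, code, message and line.
    error_log(sprintf('libxml [%d] line %d: %s', $error->code, $error->line, trim($error->message)));
}
```

The errors are still available for logging, so a genuinely broken document does not go unnoticed.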

An alternative would be to use https://github.com/html5lib/html5lib-php.

Solution 2

With a DOMDocument object, you can place an @ before the load call to suppress all warnings it raises.

$dom = new DOMDocument;
@$dom->loadHTML($source);

And carry on.

Solution 3

HTML5 elements are still not supported, but you can silence libxml errors completely with the $options parameter.

Just set

$doc = new DOMDocument();
$doc->loadHTMLFile("html5.html", LIBXML_NOERROR);

This option is preferable to @, which silences every PHP error raised by the call, not just the libxml ones.
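
The same flag works when the markup is already in a string, as in the question. A minimal sketch, assuming $source holds the page (the sample markup here is made up); LIBXML_NOWARNING is added on the assumption that you also want libxml's lower-severity messages silenced:

```php
<?php
// Hypothetical page content with HTML5 elements.
$source = '<!DOCTYPE html><html><head><title>t</title></head>'
        . '<body><header><nav>menu</nav></header></body></html>';

$doc = new DOMDocument();

// The second argument of loadHTML() takes libxml option flags (PHP 5.4+).
$ok = $doc->loadHTML($source, LIBXML_NOERROR | LIBXML_NOWARNING);

// The unknown elements still end up in the tree as ordinary nodes.
$navCount = $doc->getElementsByTagName('nav')->length;
```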

But be careful: libxml is very forgiving and will parse a broken HTML document. If you silence libxml errors, you might not even be aware that the HTML is malformed.
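
One way to keep an eye on malformed input is to collect the errors and drop only the unknown-tag ones. The sketch below is an illustration, not part of any answer above: the helper name is hypothetical, it needs PHP 7.4+ for the arrow function, and it assumes libxml2 reports unknown tags with error code 801 (XML_HTML_UNKNOWN_TAG in libxml2's xmlerror.h), for which PHP defines no constant.

```php
<?php
// Assumed value of XML_HTML_UNKNOWN_TAG from libxml2's xmlerror.h.
const UNKNOWN_TAG_ERROR_CODE = 801;

/**
 * Hypothetical helper: load HTML into $doc and return only those libxml
 * errors that are not "unknown tag" complaints.
 *
 * @return LibXMLError[]
 */
function loadHtmlReportingRealErrors(DOMDocument $doc, string $html): array
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc->loadHTML($html);

    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Keep everything except the unknown-tag errors.
    return array_values(array_filter(
        $errors,
        fn (LibXMLError $e): bool => $e->code !== UNKNOWN_TAG_ERROR_CODE
    ));
}

$doc = new DOMDocument();
$remaining = loadHtmlReportingRealErrors(
    $doc,
    '<!DOCTYPE html><html><body><section><p>ok</p></section></body></html>'
);

foreach ($remaining as $error) {
    error_log(sprintf('line %d: %s', $error->line, trim($error->message)));
}
```

Anything left in $remaining points at a problem other than an unrecognised element name.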

Author by

user1079160

Updated on November 01, 2020

Comments

  • user1079160
    user1079160 over 3 years

    I built a script that combines all the CSS on a page so I can use it in my CMS. It worked fine for a long time, but now I get this error:


    Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: Tag header invalid in Entity, line: 10 in css.php on line 26

    Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: Tag nav invalid in Entity, line: 10 in css.php on line 26

    Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: Tag section invalid in Entity, line: 22 in css.php on line 26

    This is the PHP script:

    <?php
    header('Content-type: text/css');
    include ('../global.php');
    
    if ($usetpl == '1') {
        $client = New client();
        $tplname = $client->template();
        $location = "../templates/$tplname/header.php";
        $page = file_get_contents($location);
    } else {
        $page = file_get_contents('../index.php');
    }
    
    class StyleSheets extends DOMDocument implements IteratorAggregate
    {
    
        public function __construct ($source)
        {
            parent::__construct();
            $this->loadHTML($source);
        }
    
        public function getIterator ()
        {
            static $array;
            if (NULL === $array) {
                $xp = new DOMXPath($this);
                $expression = '//head/link[@rel="stylesheet"]/@href';
                $array = array();
                foreach ($xp->query($expression) as $node)
                    $array[] = $node->nodeValue;
            }
            return new ArrayIterator($array);
        }
    }
    
    foreach (new StyleSheets($page) as $index => $file) {
        $css = file_get_contents($file);
        echo $css;
    }
    
  • user1079160
    user1079160 over 12 years
    Did that, now I get a blank page.
  • Thomas Decaux
    Thomas Decaux over 11 years
    @user1079160 that is another problem! Gordon has the right answer, thanks!
  • CodeGuru
    CodeGuru over 9 years
    @Gordon how do you fix the blank page issue?
  • ndm13
    ndm13 almost 7 years
    I had the same blank-page issue. My mistake was using print $document->saveXML() instead of $document->saveHTML(). The HTML version doesn't make certain formatting conversions that the XML version does. If that's not the issue, try checking the source of the output to see what tags, if any, are present. It should clue you in to what's happening under the hood. Also, don't forget var_dump!
  • Ahmad
    Ahmad over 3 years
    This is a terrible solution as you will make errors on this line a nightmare to debug. @Gordon's solution is much better.