Find div with class using PHP Simple HTML DOM Parser


Solution 1

The right code to get a div with class is:

$ret = $html->find('div.foo');
//OR
$ret = $html->find('div[class=foo]');

Basically, you can select elements as if you were using a CSS selector.

source: http://simplehtmldom.sourceforge.net/manual.htm
How to find HTML elements? section, tab Advanced
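
A minimal sketch of the two selector forms above, run against a small made-up snippet. It assumes `simple_html_dom.php` sits in the same directory as the script; the sample markup and variable names are hypothetical and only there for illustration.

```php
<?php
// Assumption: the library file is located next to this script.
include 'simple_html_dom.php';

// Hypothetical sample markup, not taken from any real page.
$markup = '<div class="foo">first</div>'
        . '<div class="bar">second</div>'
        . '<p class="foo">third</p>';

$html = str_get_html($markup);

// Without an index argument, find() returns an array of matching nodes.
$byClass = $html->find('div.foo');        // CSS-style class selector
$byAttr  = $html->find('div[class=foo]'); // attribute selector

foreach ($byClass as $div) {
    echo $div->innertext, PHP_EOL;        // inner content of each matching div
}
```

Both selectors are restricted to `div` elements, so the `<p class="foo">` in the sample is not part of either result.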

Solution 2

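$html = new simple_html_dom();
$html->load($output); // $output holds the HTML source as a string
// Take the first div with the class, then its second child (indexes are zero-based)
$items = $html->find('div.yourclassname', 0)->children(1)->outertext;
print_r($items);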
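
Since this solution chains several calls in one line, the sketch below splits the chain into steps and guards against a missing match. It assumes `simple_html_dom.php` is in the same directory; the sample markup and the class name `box` are hypothetical.

```php
<?php
// Assumption: the library file is located next to this script.
include 'simple_html_dom.php';

// Hypothetical sample markup standing in for the fetched page.
$output = '<div class="box"><p>first</p><p>second</p></div>';

$html = new simple_html_dom();
$html->load($output);

// With an index argument, find() returns a single node (or null if nothing matches).
$box = $html->find('div.box', 0);

$result = null;
if ($box !== null) {
    // children() is zero-indexed, so 1 is the second child element.
    $second = $box->children(1);
    if ($second !== null) {
        // outertext is the element's markup including its own tags.
        $result = $second->outertext;
    }
}

echo $result, PHP_EOL;
```

Checking for `null` at each step avoids a fatal error when the page does not contain the expected div or the div has fewer children than assumed.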
Author by

Owl

Updated on July 09, 2022

Comments

  • Owl
    Owl almost 2 years

    I am just starting out with the parser mentioned above and am running into problems right at the beginning.

    Referring to this tutorial:

    http://net.tutsplus.com/tutorials/php/html-parsing-and-screen-scraping-with-the-simple-html-dom-library/

    I simply want to find, in a page's source code, the content of a div with the class ClearBoth Box.

    I retrieve the code with curl and create a simple html dom object:

    $cl = curl_exec($curl);  
    $html = new simple_html_dom();
    $html->load($cl);
    

    Then I wanted to put the content of the div into an array called divs:

    $divs = $html->find('div[.ClearBoth Box]');
    

    But when I print_r $divs, the output contains much more than that, even though the source code has nothing else inside the div.

    Like this:

    Array
    (
        [0] => simple_html_dom_node Object
            (
                [nodetype] => 1
                [tag] => br
                [attr] => Array
                    (
                        [class] => ClearBoth
                    )
    
                [children] => Array
                    (
                    )
    
                [nodes] => Array
                    (
                    )
    
                [parent] => simple_html_dom_node Object
                    (
                        [nodetype] => 1
                        [tag] => div
                        [attr] => Array
                            (
                                [class] => SocialMedia
                            )
    
                        [children] => Array
                            (
                                [0] => simple_html_dom_node Object
                                    (
                                        [nodetype] => 1
                                        [tag] => iframe
                                        [attr] => Array
                                            (
                                                [id] => ShowFacebookButtons
                                                [class] => SocialWeb FloatLeft
                                                [src] => http://www.facebook.com/plugins/xxx
                                                [style] => border:none; overflow:hidden; width: 250px; height: 70px;
                                            )
    
                                        [children] => Array
                                            (
                                            )
    
                                        [nodes] => Array
                                            (
                                            )
    

    I do not understand why $divs does not simply contain the code from the div.

    Here is an example of the source code at the site:

    <div class="ClearBoth Box">
              <div>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    
                  <strong class="AlignMiddle LeftSmallPadding">gute peppige Qualität</strong> <span class="AlignMiddle">(17.03.2013)</span>
              </div>
              <div class="BottomMargin">
                gute Verarbeitung, schönes Design,
              </div>
            </div>
    

    What am I doing wrong?

  • Owl
    Owl about 11 years
    Thank you so much! Now I am a small step further! In my case, because the class name has two parts, "ClearBoth Box", I have to use div[class=ClearBoth Box], because div.ClearBoth Box searches for an element Box after the div, and div.ClearBoth alone returns more matches than I need.
  • amitchhajer
    amitchhajer almost 9 years
    What if my div has no class name and I want all the divs on the page?
  • Kevin Gagnon
    Kevin Gagnon about 8 years
    @amitchhajer Either find an element with a unique ID above or below the div in question and then move to it with the child and parent methods, or print the outertext of where you are (the DOM object), count how many divs come before the one you need, and access it by its index. 4th div = dom->find('div',3);
  • NomanJaved
    NomanJaved over 7 years
    How can I print the HTML?
  • Trenton McKinney
    Trenton McKinney over 4 years
    Here are some guidelines for How do I write a good answer?. This provided answer may be correct, but it could benefit from an explanation. Code only answers are not considered "good" answers. From review.
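
The index-based access described in the comments, and printing an element's HTML, could look roughly like the sketch below. It assumes `simple_html_dom.php` is in the same directory; the sample markup is hypothetical.

```php
<?php
// Assumption: the library file is located next to this script.
include 'simple_html_dom.php';

// Hypothetical page with four divs that have no class or id.
$markup = '<div>one</div><div>two</div><div>three</div><div>four</div>';
$html = str_get_html($markup);

// All divs on the page: find() without an index returns an array.
$all = $html->find('div');

// The fourth div: indexes are zero-based, so pass 3.
$fourth = $html->find('div', 3);

echo count($all), PHP_EOL;
if ($fourth !== null) {
    echo $fourth->outertext, PHP_EOL; // the element's HTML, including its tags
    echo $fourth->innertext, PHP_EOL; // only what is inside the tags
}
```

Counting by position is brittle if the page layout changes, so anchoring on a nearby element with a unique ID, as the comment suggests, is usually the more stable choice.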