How to parse contents from an HTML file using cURL?

Solution 1

Try Simple HTML DOM from http://simplehtmldom.sourceforge.net/.

If you don't mind using Python or Perl, you can use Beautiful Soup or WWW::Mechanize.

Solution 2

I would use the Document Object Model rather than writing your own parsing code or (God forbid!) regular expressions.

Here's an example in PHP: PHP Parse HTML code
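
The DOM approach could look roughly like the following. This is a minimal sketch using PHP's built-in DOMDocument and DOMXPath, assuming the table layout shown in the question (each data row holds a label cell followed by a value cell). The function name parseTransactionTable is hypothetical, and the inline sample markup stands in for the $output string that curl_exec() returns.

```php
<?php
// Sample markup copied (condensed) from the question's transactions.php.
// In real use this string would be the $output returned by curl_exec().
$html = <<<'HTML'
<table border=0 cellspacing=0 width=100%>
  <tr><td colspan="2">&nbsp;</td></tr>
  <tr><td width="30%" class="Mellemrubrikker">Transaction Number::</td><td width="70%">24752734576547IN</td></tr>
  <tr><td width="30%" class="Mellemrubrikker">Weight:</td><td width="70%">0.85 kg</td></tr>
  <tr><td width="30%" class="Mellemrubrikker">Length:</td><td width="70%">543 mm.</td></tr>
  <tr><td width="30%" class="Mellemrubrikker">Height:</td><td width="70%">156 mm.</td></tr>
  <tr><td width="30%" class="Mellemrubrikker">Width:</td><td width="70%">61 mm.</td></tr>
  <tr><td colspan="2">&nbsp;</td></tr>
</table>
HTML;

// Hypothetical helper (not from the original answers):
// returns an array of label => value pairs.
function parseTransactionTable(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings about loose markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $result = [];
    // Only rows with exactly two cells hold a label/value pair;
    // the spacer rows with a single colspan cell are skipped.
    foreach ($xpath->query('//table//tr[count(td) = 2]') as $row) {
        $cells = $xpath->query('td', $row);
        // Strip surrounding whitespace and the trailing colon(s) from the label.
        $label = rtrim(trim($cells->item(0)->textContent), ':');
        $value = trim($cells->item(1)->textContent);
        $result[$label] = $value;
    }
    return $result;
}

$data = parseTransactionTable($html);
print_r($data);
```

With the sample markup, the array keys are the row labels with their trailing colons stripped, such as "Weight" and "Width", and the values are the text of the second cell in each row.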

Author: Balaji Kandasamy

PHP

Updated on July 25, 2022

Comments

  • Balaji Kandasamy
    Balaji Kandasamy almost 2 years

    I want to parse XHTML content fetched with cURL. How can I scrape the transaction number, weight, length, height, and width from between the <table> tags? How can I extract only these values from the HTML document and get them as an array?

    transactions.php
    
     <table border=0 cellspacing=0 width=100%>
           <tr> 
            <td colspan="2">&nbsp;</td>
          </tr>
          <tr> 
            <td width="30%" class="Mellemrubrikker">Transaction Number::</td>
            <td width="70%">24752734576547IN</td>
          </tr>
          <tr> 
            <td width="30%" class="Mellemrubrikker">Weight:</td>
            <td width="70%">0.85 kg</td>
          </tr>
          <tr> 
            <td width="30%" class="Mellemrubrikker">Length:</td>
            <td width="70%">543 mm.</td>
          </tr>
          <tr> 
            <td width="30%" class="Mellemrubrikker">Height:</td>
            <td width="70%">156 mm.</td>
          </tr>
          <tr> 
            <td width="30%" class="Mellemrubrikker">Width:</td>
            <td width="70%">61 mm.</td>
          </tr>
          <tr> 
             <td colspan="2">&nbsp;</td>
          </tr>    
        </table>
    

    index.php

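    <?php
    // Fetch the raw HTML of transactions.php with cURL.
    $url = "http://localhost/htmlparse/transactions.php";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the response body as a string instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    $output = curl_exec($ch);
    if ($output === false) {
        // curl_exec() returns false on failure; report why and stop.
        die('cURL error: ' . curl_error($ch));
    }
    $info = curl_getinfo($ch);
    curl_close($ch);
    //print_r($output);
    echo $output;
    ?>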
    

    This code gets the whole HTML content from transactions.php. How can I get the data inside the <table> as an array?

  • iHaveacomputer
    iHaveacomputer almost 13 years
    came here to suggest the same. :)