Something faster than get_headers()


Solution 1

You can try the cURL library. You can send multiple requests in parallel at the same time with curl_multi_exec().

Example:

$ch = curl_init('http_url');                     // the URL you want to check
curl_setopt($ch, CURLOPT_HEADER, 1);             // include headers in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);     // return the response instead of echoing it
$c = curl_exec($ch);
$info = curl_getinfo($ch, CURLINFO_HTTP_CODE);   // the HTTP status code
print_r($info);

UPDATED

See this example: http://www.codediesel.com/php/parallel-curl-execution/
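
For instance, a minimal sketch of that parallel approach using the curl_multi_* functions might look like the following (the $urls array is a placeholder; timeouts and error handling are kept deliberately simple):

<?php
// Check the HTTP status of several URLs in parallel with curl_multi_*.
$urls = array('http://example.com', 'http://example.org'); // placeholder list

$mh = curl_multi_init();
$handles = array();

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD-style request, we only need the status
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // don't echo the response
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // don't wait forever on a slow server
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Run all handles until every transfer has finished
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);

foreach ($handles as $url => $ch) {
    echo $url . ' = ' . curl_getinfo($ch, CURLINFO_HTTP_CODE) . "\n";
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);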

Solution 2

I don't know if this is an option you can consider, but you could run all of them almost at the same time using a fork; this way the script will take only a bit longer than a single request: http://www.php.net/manual/en/function.pcntl-fork.php

You could add this to a script that is run in CLI mode and launch all the requests at the same time, for example as in the sketch below.
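
A minimal sketch of that idea, assuming the CLI environment has the pcntl extension enabled and reusing a get_httpcode() helper like the one in the question ($urls is a placeholder list):

<?php
// Fork one child per URL; each child performs a single request and exits.
$urls = array('example.com', 'example.org'); // placeholder list

$children = array();
foreach ($urls as $url) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die("Could not fork\n");
    } elseif ($pid === 0) {
        // Child process: do one request, print the result and exit
        echo $url . ' = ' . get_httpcode('http://' . $url) . "\n";
        exit(0);
    }
    // Parent process: remember the child and keep launching
    $children[] = $pid;
}

// Parent waits for all children to finish
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
}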

Edit: You say that you have 200 calls to make, so one thing you might run into is losing the database connection. The problem is caused by the fact that the link is destroyed when the first child script completes. To avoid that, you can create a new connection for each child. I see that you are using the standard mysql_* functions, so be sure to pass the 4th parameter to mysql_connect() so that a new link is created each time. Also check the maximum number of simultaneous connections allowed on your server.
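
For example, inside each forked child you might open a fresh link like this (host and credentials are placeholders; the 4th argument is what forces a new connection):

// Legacy mysql_* API, as used in the question
$link = mysql_connect('localhost', 'user', 'password', true); // true = new link for this child
mysql_select_db('database_name', $link);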




Comments

  • Clarkey
    Clarkey almost 2 years

    I'm trying to make a PHP script that will check the HTTP status of a website as fast as possible.

    I'm currently using get_headers() and running it in a loop over 200 random URLs from a MySQL database.

    Checking all 200 takes an average of 2m 48s.

    Is there anything I can do to make it (much) faster?

    (I know about fsockopen - it can check port 80 on 200 sites in 20s - but it's not the same as requesting the HTTP status code, because the server may be responding on the port but might not be serving the website correctly, etc.)

    Here is the code..

    <?php
      function get_httpcode($url) {
        $headers = get_headers($url, 0);
        // Return http status code
        return substr($headers[0], 9, 3);
      }
    
      ###
      ## Grab task and execute it
      ###

        // $sql is assumed to be the result of a mysql_query() call (elided in the original)
        // Loop through task
        while($data = mysql_fetch_assoc($sql)):
    
          $result = get_httpcode('http://'.$data['url']);   
          echo $data['url'].' = '.$result.'<br/>';
    
        endwhile;
    ?>
    
  • Clarkey
    Clarkey about 12 years
    Hi, I've also tried using cURL - like the code you posted. But it's the same, in fact it takes a little longer than get_headers();
  • safarov
    safarov about 12 years
    Try making multiple requests in parallel, as in the link I gave above. For example, 10 requests at a time.
  • nnichols
    nnichols about 12 years
    +1 Nice one! I did not know the curl extension could process requests in parallel.
  • Clarkey
    Clarkey about 12 years
    In parallel? So is that effectively another thread running at the same time?
  • Clarkey
    Clarkey about 12 years
    This is what I'm after - I'll have a look at your link, thanks.
  • safarov
    safarov about 12 years
    @MattClarke I also have applications that use parallel requests. I can say that 50 parallel requests take nearly the same time as 1-2 normal sequential requests.
  • mishu
    mishu about 12 years
    @MattClarke ok, I am glad you find it useful. You will need to run the fork in the iteration where you get the results, then ping the site if you are in the child, or continue if you are in the parent.
  • Clarkey
    Clarkey about 12 years
    @safarov - Nice! Do you mean you used PHP forking or CURL_MULTI_EXEC?
  • Clarkey
    Clarkey about 12 years
    I'm not following what you're saying - this fork business is completely new to me, I didn't even know it was possible.
  • mishu
    mishu about 12 years
    @MattClarke I updated the answer to mention a common problem when using fork. It is normal for it to seem a bit complicated the first time. If you decide you want to use this option (to learn about these systems), you will find good resources on the PHP manual page (the link in the answer); a lot of good ideas can be found in the comments on that page.
  • safarov
    safarov about 12 years
    @MattClarke if you are familiar with PHP daemons you can use fork (or better, a C++ program running in the background to check continuously); otherwise cURL is what you are looking for.
  • Clarkey
    Clarkey about 12 years
    Thanks - I'd +1 you but don't have enough rep to!
  • Clarkey
    Clarkey about 12 years
    Thanks - I'd +1 you but don't have enough rep to!
  • mishu
    mishu about 12 years
    :) no problem, glad if helps in a way
  • Brent Baisley
    Brent Baisley about 12 years
    You can do all 200 requests at once with multi-curl. It will take as long as the slowest server to respond: if one of them takes 60 seconds, then the entire request will take 60 seconds. But you can set a timeout in curl.