Is there an API to force Facebook to scrape a page again?

47,966

Solution 1

Page metadata isn't the sort of thing that should change very often, but you can manually clear the cache by going to Facebook's Debug Tool and entering the URL you want to scrape.

There's also an API for doing this, which works for any OG object:

curl -X POST \
     -F "id={object-url OR object-id}" \
     -F "scrape=true" \
     -F "access_token={your access token}" \
     "https://graph.facebook.com"

An access_token is now required. This can be an app or page access_token; no user authentication is required.
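For anyone calling this from a Node.js backend, here is a minimal sketch of the same request. It assumes Node 18+ (global fetch and URLSearchParams), and the helper names buildScrapeRequest and forceScrape are made up for illustration. It sends the fields URL-encoded rather than as multipart form data, which is my assumption about what the endpoint accepts, based on the PHP variant in this thread doing the same.

```javascript
// Sketch for Node.js 18+ (global fetch and URLSearchParams).
// buildScrapeRequest and forceScrape are hypothetical helper names.
const GRAPH_ENDPOINT = 'https://graph.facebook.com';

// Build the same three form fields that the curl command sends.
function buildScrapeRequest(pageUrl, accessToken) {
  const body = new URLSearchParams({
    id: pageUrl,
    scrape: 'true',
    access_token: accessToken,
  });
  return { endpoint: GRAPH_ENDPOINT, method: 'POST', body };
}

// Send the request and return the parsed JSON response.
async function forceScrape(pageUrl, accessToken) {
  const { endpoint, method, body } = buildScrapeRequest(pageUrl, accessToken);
  const response = await fetch(endpoint, { method, body });
  const json = await response.json();
  if (!response.ok || json.error) {
    throw new Error('Scrape failed: ' + JSON.stringify(json.error || json));
  }
  return json;
}
```

Keeping the request construction separate from the network call makes it easy to check what would be sent without actually hitting Facebook.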

Solution 2

If you'd like to do this in PHP without waiting for a reply, the following function will do it:

//Provide a URL in $url to empty the OG cache
function clear_open_graph_cache($url, $token) {
  $vars = array('id' => $url, 'scrape' => 'true', 'access_token' => $token);
  $body = http_build_query($vars);

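  // Open a TLS socket; bail out if the connection cannot be established
  $fp = fsockopen('ssl://graph.facebook.com', 443, $errno, $errstr, 5);
  if (!$fp) {
    return;
  }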
  fwrite($fp, "POST / HTTP/1.1\r\n");
  fwrite($fp, "Host: graph.facebook.com\r\n");
  fwrite($fp, "Content-Type: application/x-www-form-urlencoded\r\n");
  fwrite($fp, "Content-Length: ".strlen($body)."\r\n");
  fwrite($fp, "Connection: close\r\n");
  fwrite($fp, "\r\n");
  fwrite($fp, $body);
  fclose($fp);
}

Solution 3

This is a simple AJAX implementation. Put this on any page you want Facebook to scrape immediately:

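var url = "your url here";
$.ajax({
    type: 'POST',
    // Encode the page URL so its own '?' and '&' are not parsed as Graph API parameters.
    // Per Solution 1, an access_token parameter is now required as well.
    url: 'https://graph.facebook.com?id=' + encodeURIComponent(url) + '&scrape=true',
    success: function(data) {
        console.log(data);
    }
});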
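One thing to watch when the page URL goes into the query string, as in the snippet above: it needs to be percent-encoded, otherwise any query parameters on the page URL itself get split off. A small illustration, using a made-up example.com URL:

```javascript
// Hypothetical page URL that carries its own query string.
var pageUrl = 'https://example.com/item?id=42&lang=en';

// Unencoded: '&lang=en' would be read as a separate Graph API parameter.
var unsafe = 'https://graph.facebook.com?id=' + pageUrl + '&scrape=true';

// Encoded: the whole page URL travels as the single 'id' value.
var safe = 'https://graph.facebook.com?id=' + encodeURIComponent(pageUrl) + '&scrape=true';

console.log(safe);
// https://graph.facebook.com?id=https%3A%2F%2Fexample.com%2Fitem%3Fid%3D42%26lang%3Den&scrape=true
```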

Solution 4

If you're using the JavaScript SDK, the version of this you'd want to use is

FB.api('https://graph.facebook.com/', 'post', {
            id: [your-updated-or-new-link],
            scrape: true
        }, function(response) {
            //console.log('rescrape!',response);
        });

I happen to like promises, so an alternate version using jQuery Deferreds might be

function scrapeLink(url){
    var masterdfd = $.Deferred();
    FB.api('https://graph.facebook.com/', 'post', {
        id: url,
        scrape: true
    }, function(response) {
        if(!response || response.error){
            masterdfd.reject(response);
        }else{
            masterdfd.resolve(response);
        }
    });
    return masterdfd;
}

then:

scrapeLink([SOME-URL]).done(function(){
    //now the link should be scraped/rescraped and ready to use
});

Note that the scraper can take varying amounts of time to complete, so there are no guarantees that it will be quick. Nor do I know what Facebook thinks about repeated or automated use of this method, so it probably pays to be judicious and conservative about using it.
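If you do need to rescrape a batch of links, one conservative approach is to dedupe them and space the calls out rather than firing them all at once. A rough sketch follows; planScrapes and runScrapePlan are hypothetical helpers, and the interval you pick is a guess on your part, since nothing here states an official rate limit.

```javascript
// Hypothetical helper: dedupe URLs and give each one a start offset so
// rescrapes are spaced intervalMs apart instead of fired all at once.
function planScrapes(urls, intervalMs) {
  var unique = Array.from(new Set(urls));
  return unique.map(function (url, index) {
    return { url: url, delayMs: index * intervalMs };
  });
}

// Run the plan. scrapeFn is whatever performs one rescrape and returns a
// promise or Deferred (for example the scrapeLink function above).
// Failures are logged so one bad URL does not stop the rest.
function runScrapePlan(plan, scrapeFn) {
  plan.forEach(function (entry) {
    setTimeout(function () {
      scrapeFn(entry.url).then(null, function (err) {
        console.error('rescrape failed for', entry.url, err);
      });
    }, entry.delayMs);
  });
}
```

Usage would look like runScrapePlan(planScrapes(listOfUrls, 2000), scrapeLink), with 2000 ms being an arbitrary starting point.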

Solution 5

An alternative solution from within a Drupal node update, using curl, could be something like this:

<?php
function your_module_node_postsave($node) {
    if($node->type == 'your_type') {
        $url = url('node/'.$node->nid,array('absolute' => TRUE));
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, 'https://graph.facebook.com/v1.0/?id='. urlencode($url). '&scrape=true');
        // CURLOPT_HTTPHEADER expects full "Name: value" header lines
        $auth_header = 'Authorization: OAuth YOUR-ACCESS-TOKEN';
        curl_setopt($ch, CURLOPT_HTTPHEADER, array($auth_header));
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        $r = curl_exec($ch);
        curl_close ($ch);
    }
}

Notice the hook_node_postsave() implementation, which is not supported by standard Drupal core. I had to use www.drupal.org/project/hook_post_action so that the Facebook scrape picks up the latest changes made to the node, since hook_node_update() is not triggered after the database has been updated.

Facebook now requires an access token in order to get this done. Guidelines for acquiring a token can be found here: https://smashballoon.com/custom-facebook-feed/access-token/
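Several comments on this question suggest that an app access token is enough. As far as I know, Facebook accepts the app ID and app secret joined by a pipe character as an app access token, but treat that as an assumption and check it against the current documentation. A tiny sketch, with appAccessToken as a hypothetical helper name:

```javascript
// Hypothetical helper; assumes the "{app-id}|{app-secret}" token form.
// Keep this on the server: the app secret must never reach a browser.
function appAccessToken(appId, appSecret) {
  if (!appId || !appSecret) {
    throw new Error('Both app ID and app secret are required');
  }
  return appId + '|' + appSecret;
}
```

The resulting string would then be passed as the access_token value in the scrape request.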


Author: Felipe Brahm

Software Engineer http://www.linkedin.com/in/felipebrahm

Updated on February 07, 2020

Comments

  • Felipe Brahm
    Felipe Brahm over 4 years

    I'm aware you can force update a page's cache by entering the URL on Facebook's debugger tool while being logged in as an admin for that app/page: https://developers.facebook.com/tools/debug

    But what I need is a way to automatically call an API endpoint or something from our internal app whenever somebody from our Sales department updates the main image of one of our pages. It is not an option to ask thousands of sales people to login as an admin and manually update a page's cache whenever they update one of our item's description or image.

    We can't afford to wait 24 hours for Facebook to update its cache because we're getting daily complaints from our clients whenever they don't see a change showing up as soon as we change it on our side.

  • Felipe Brahm
    Felipe Brahm over 11 years
    It's not that it's changing all the time, but sometimes we need to change an image right before something goes "live" (the URL is already on production, but not seen by the general public before that) and clients get upset if they don't see the new image right away. Thanks for your response!
  • rusllonrails
    rusllonrails over 10 years
    Cool, thanks. It really works, and without Facebook authentication
  • Igy
    Igy over 8 years
    It should work with an App Access Token, so you still don't need any user to authenticate to make this call
  • steve
    steve over 8 years
    Igy, is there a limit imposed to how many scrapes can be done using the api? In the manual debugger, if I debug back to back, I get an error saying link is blocked -- does the same thing happen with the API?
  • Igy
    Igy over 8 years
    Yes, the same rate limits apply in each place as far as I know, it shouldn't cause issues unless you're trying to do it in bulk for many URLs together - in that case, you should spread them out slowly
  • andrewtweber
    andrewtweber over 8 years
    @Igy yes that's correct. Any access token will do, but it still needs one :)
  • steve
    steve over 8 years
    @Igy, I noticed that sometimes Facebook doesn't display the images even though they've been scraped and shown in the debugger: stackoverflow.com/questions/35125241/… Why does this happen?
  • Roman Marusyk
    Roman Marusyk almost 8 years
    While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes
  • Felipe Brahm
    Felipe Brahm almost 8 years
    That link is already in the question... As long as you own the app you don't need to add ?fbrefresh or anything. This question is about doing this automatically (in case you want to force a refresh after updating an image, for example, or if you want to force update hundreds or thousands of pages).
  • Tobia
    Tobia almost 8 years
    @steve I'm trying to figure that out too. Got any insight?
  • Igy
    Igy almost 8 years
    if the first request to fetch the image failed you may see the blank image cached on Facebook's end - try changing the image's URL and rescraping
  • Andy Hayden
    Andy Hayden over 7 years
    Does this actually work? Is there a facebook source saying this is the case, as it doesn't seem to work for me.
  • AGamePlayer
    AGamePlayer over 7 years
    It's weird that Facebook doesn't bind the app id as the website's ownership.
  • cmarrero01
    cmarrero01 almost 7 years
    I know that this post is 3 years old, but I had this problem and this solution doesn't apply anymore. Does someone have a solution for this for API 2.10?
  • codebusta
    codebusta over 6 years
    no, it doesn't work anymore, here is the error log: {"error":{"message":"An access token is required to request this resource.","type":"OAuthException","code":104
  • Igy
    Igy over 6 years
    Use an access token, any should work, even your app access token should work
  • Chittaranjan
    Chittaranjan over 6 years
    This works :) but you need to send access token as well. So please update your answer.
  • Jonathan Anctil
    Jonathan Anctil about 6 years
    It doesn't seem to work anymore. I've tried it with access_token, but if I post on Facebook, I see the old image.
  • Stijn Haus
    Stijn Haus over 5 years
    developers.facebook.com/docs/graph-api/reference/v3.1/url , on the bottom of the page the "Updating" part. Apparently, access_token is no longer required.
  • Yuval A.
    Yuval A. about 5 years
    Yes. It should, but... see @merkushin answer above with the quote from Facebook OG documentation, today it also requires an access_token as a parameter...
  • Franz
    Franz over 2 years
    @Chittaranjan how to get access token?