Facebook not able to scrape my url
Solution 1
The Facebook documentation includes details on the Open Graph Protocol and how to include the correct meta tags so that Facebook can scrape your URL accurately.
https://developers.facebook.com/docs/opengraphprotocol/
Essentially, you'll want to include some special og: meta tags instead of (or in addition to) your existing meta tags.
<head>
<title>Ninja Site</title>
<meta property="og:title" content="The Ninja"/>
<meta property="og:type" content="movie"/>
<meta property="og:url" content="http://www.nin.ja"/>
<meta property="og:image" content="http://nin.ja/ninja.jpg"/>
<meta property="og:site_name" content="Ninja"/>
<meta property="fb:admins" content="USER_ID"/>
<meta property="og:description"
content="Superhuman or supernatural powers were often
associated with the ninja. Some legends include
flight, invisibility and shapeshifting..."/>
...
</head>
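Before handing a page to Facebook, it can help to check locally that the tags are actually present in the markup. The following is only a minimal sketch in Python, not a Facebook tool: the names `OgCollector`, `missing_og_properties` and `REQUIRED` are made up for illustration, and the list of required properties is taken from the debugger messages quoted in the question (og:url, og:type, og:title).

```python
from html.parser import HTMLParser

# Properties the debugger output in the question reports as required.
# This tuple is an assumption for illustration; adjust it to your needs.
REQUIRED = ("og:title", "og:type", "og:url")


class OgCollector(HTMLParser):
    """Collects <meta property="og:..."> and <meta property="fb:..."> tags."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta .../> tags also end up here via handle_startendtag.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property")
        if name and name.startswith(("og:", "fb:")):
            self.properties[name] = attributes.get("content") or ""


def missing_og_properties(html_text, required=REQUIRED):
    """Return the required properties that do not appear in html_text."""
    collector = OgCollector()
    collector.feed(html_text)
    return [name for name in required if name not in collector.properties]
```

Calling `missing_og_properties(page_source)` on the HTML your server returns gives a list of the properties that still need to be added. Note that this only inspects the markup; it says nothing about whether Facebook's crawler can reach the page.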
If you have an .htaccess file redirecting things and making it difficult for Facebook to scrape your URL, you might be able to get away with detecting Facebook's crawler in your .htaccess and feeding it the correct tags. I believe the user agent that the Facebook crawler provides is this:
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
The documentation also has a section on making sure that their crawlers can access your site.
Depending on your configuration, you can test this by looking at your server's access log. On a UNIX system running Apache, the access log is often located at /var/log/httpd/access_log, although the exact path depends on the distribution and the server configuration.
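To see how your server answered the crawler, you can filter the access log for the user agent string above. Here is a rough Python sketch, assuming the log uses Apache's common or combined log format; the function name `facebook_hits` and the regular expression are illustrative, not part of any Apache or Facebook tooling.

```python
import re

# Matches the request line and status code in Apache's common/combined log format,
# e.g. "GET /some/path HTTP/1.1" 404
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def facebook_hits(log_lines, agent="facebookexternalhit"):
    """Return (path, status) pairs for log lines that mention the crawler's user agent."""
    hits = []
    for line in log_lines:
        if agent not in line:
            continue
        match = REQUEST_RE.search(line)
        if match:
            hits.append((match.group("path"), int(match.group("status"))))
    return hits
```

For example, `facebook_hits(open("/var/log/httpd/access_log"))` would list every path the crawler requested together with the status code it received, which makes a 404 or 500 returned only to Facebook easy to spot.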
So you could use an entry similar to this in your .htaccess file:
RewriteCond %{HTTP_USER_AGENT} ^facebookexternalhit
RewriteRule ^(.*)$ ogtags.php?$1 [L,QSA]
The [L,QSA] flags that I placed there state that this is the Last rule that will be enforced on the current request (L), and QSA (Query String Append) states that any query string given will be passed along when the URL is rewritten. For example, a URL such as:
https://example.com/?id=foo&action=bar
will be passed to ogtags.php like this: ogtags.php?id=foo&action=bar. Your ogtags.php file will have to generate dynamic og: meta tags according to the parameters that were passed.
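The logic such a script needs is small: read the parameters, look up the page they refer to, and print escaped meta tags. The sketch below shows that logic in Python purely for illustration (the answer's ogtags.php would do the equivalent in PHP). `describe_page` is a hypothetical stand-in for whatever lookup your site performs, and the titles and URLs it returns are placeholders.

```python
from html import escape
from urllib.parse import parse_qs, urlencode


def describe_page(params):
    """Hypothetical lookup: map request parameters to og: properties.

    A real site would query its database or router here.
    """
    page_id = params.get("id", "home")
    return {
        "og:title": f"Page {page_id}",
        "og:type": "website",
        "og:url": "https://example.com/?" + urlencode({"id": page_id}),
    }


def og_meta_tags(query_string):
    """Build og: meta tags for the page identified by the query string."""
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    properties = describe_page(params)
    # Escape values so request data cannot break out of the attribute.
    return "\n".join(
        f'<meta property="{escape(name)}" content="{escape(value)}"/>'
        for name, value in properties.items()
    )
```

With the example URL above, `og_meta_tags("id=foo&action=bar")` produces tags for the page with id foo. Escaping matters here because the values come straight from the request.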
Now whenever your .htaccess file detects the Facebook user agent, it will serve the crawler the ogtags.php file (which can contain the correct og: meta information). Please be aware of any other rules you have in your .htaccess and how they might affect new rules.
From the .htaccess entries that you have detailed, I would recommend placing this new "Facebook rule" as the very first rule.
Solution 2
I had the same problem, with the error "Bad Response Code: URL returned a bad HTTP response code."
Oddly, this is what solved it: I added
<meta property="og:locale" content="en_US" />
to my site's HEAD tag, and it worked.
Also, don't forget that in your application dashboard (where you get your APP ID) you must have at least "Website with Facebook Login" enabled and enter the URL of the website. Otherwise it won't work, even if you are not using any Facebook Login on your site.
Comments
-
Ninja almost 2 years
I have the HTML structure for my page as given below. I have added all the meta og tags, but still facebook is not able to scrape any info from my site.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:fb="http://www.facebook.com/2008/fbml">
<head>
<meta http-equiv="Content-Type" content="text/html;" charset=utf-8"></meta>
<title>My Site</title>
<meta content="This is my title" property="og:title">
<meta content="This is my description" property="og:description">
<meta content="http://ia.media-imdb.com/images/rock.jpg" property="og:image">
<meta content="<MYPAGEID>" property="fb:page_id">
.......
</head>
<body>
.....
When I input the URL in facebook debugger(https://developers.facebook.com/tools/debug), I get the following messages:
Scrape Information
Response Code: 404
Critical Errors That Must Be Fixed
Bad Response Code: URL returned a bad HTTP response code.
Errors that must be fixed
Missing Required Property: The 'og:url' property is required, but not present.
Missing Required Property: The 'og:type' property is required, but not present.
Missing Required Property: The 'og:title' property is required, but not present.
Open Graph Warnings That Should Be Fixed
Inferred Property: The 'og:url' property should be explicitly provided, even if a value can be inferred from other tags.
Inferred Property: The 'og:title' property should be explicitly provided, even if a value can be inferred from other tags.
Why is facebook not reading the meta tags info? The page is accessible and not hidden behind login etc.
UPDATE
OK, I did a bit of debugging and this is what I found. I have an .htaccess rule set in my directory: I am using the PHP CodeIgniter framework and have an .htaccess rule to remove index.php from the URL.
So, when I feed the URL to the Facebook debugger (https://developers.facebook.com/tools/debug) without index.php, Facebook shows a 404, but when I feed it the URL with index.php, it is able to parse my page.
Now how do I make Facebook scrape the content when the URL doesn't have index.php?
This is my htaccess rule:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

#Removes access to the system folder by users.
#Additionally this will allow you to create a System.php controller,
#previously this would not have been possible.
#'system' can be replaced if you have renamed your system folder.
RewriteCond %{REQUEST_URI} ^system.*
RewriteRule ^(.*)$ /index.php?/$1 [L]

#When your application folder isn't in the system folder
#This snippet prevents user access to the application folder
#Submitted by: Fabdrol
#Rename 'application' to your applications folder name.
RewriteCond %{REQUEST_URI} ^application.*
RewriteRule ^(.*)$ /index.php?/$1 [L]

#Checks to see if the user is attempting to access a valid file,
#such as an image or css document, if this isn't true it sends the
#request to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L]
</IfModule>

<IfModule !mod_rewrite.c>
# If we don't have mod_rewrite installed, all 404's
# can be sent to index.php, and everything works as normal.
# Submitted by: ElliotHaughin
ErrorDocument 404 /index.php
</IfModule>
-
Ninja about 12 years
Hi Lix, thanks a lot for the update. I have an issue though: in the rewrite rule, you have mentioned that I load ogtags.html, but the meta tags will have dynamic content, based on the page that is being requested. I can't give a static HTML page there. I tried replacing ogtags.html with this rule: RewriteRule ^(.*)$ index.php?/$1 [L] but it didn't help. Any thoughts on how to achieve this?
-
sergio almost 11 years
@Lix: do you have any idea why I get a 500 error from the Facebook debugger tool when I use your two rules? Thanks in advance...
-
Lix almost 11 years
Hey there @ser - Have you checked your server logs for requests from Facebook that are being denied? I just added this link to my answer here, it might be useful to you too.
-
sergio almost 11 years
@Lix: thank you very much for your reply! The strange thing is: the Facebook debug tool can access mysite.dom/ogtags.php, but for mysite.dom it returns 500... From the server logs I get 206 for mysite.dom/ogtags.php and 500 for all URIs within ogtags.php (e.g., og:image)... I see now that there could be an infinite recursion going on...
-
DS9 almost 10 years
@Lix: I have the same problem. Here
-
Lix almost 10 years
@DS9 - in the future - please don't attempt to contact users in this manner. You are getting views on your post (and even an "answer"). You left this comment only 10 minutes after you answered it... Try to exercise a little patience next time.
-
DS9 almost 10 years
OK, no problem :). Thanks for the advice. But I have actually been facing this problem for hours. I tried to find a solution but failed, which is why I tried to contact you. And in your profile you write: "Feel free to leave me a comment!" :)
-
Lix almost 10 years
@DS9 - yea... leave me a comment - but that's not what you did. You wanted to redirect my attention to your new question that was not relevant to that post at all... Comments are not meant to be used for instant messaging.