Redirecting bots and crawlers, but not humans, to another site via .htaccess
What you are trying to do could technically be classified as cloaking, which is a violation of Google's terms and can result in your site being removed from the Google index. Google is very strict about what it classes as cloaking; the basic rule is that whatever the end user sees, the crawler has to see as well. If you are trying to block malicious bots, the easiest thing to do is block their user agent strings in .htaccess. If you try cloaking with a legitimate crawler such as Googlebot, however, it will be detected and will result in penalties and manual action notices, which can severely affect your SERP ranking.
Google not only crawls with the known Googlebot user agent but also uses other bots that send the user agent strings of real browsers from IP addresses not affiliated with Google, as a way to detect cloaking. There is therefore no way to avoid being caught doing this.
Now having given that warning...
You mention the Facebook crawler specifically. Facebook has three different user agents for crawling: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) and facebookexternalhit/1.1, which are used when a user shares your website to their wall, and Facebot, which is used to help improve advertising performance. Of these, only Facebot respects robots.txt rules; the others are only triggered by a user action and so are, in effect, treated the same as a web browser.
If you want to block any Facebook crawling, add a .htaccess rule that detects these user agent strings and, when one is detected, either blocks the request or returns an error page stating that crawlers are not permitted. Trying to forward them to an alternate site with different content will simply complicate matters and could reduce your SERP ranking, because the pages the bots can access will not have context-appropriate content.
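As a rough illustration of the blocking approach described above, the following .htaccess fragment is a minimal sketch that refuses requests from the Facebook user agents named in this answer. It assumes mod_rewrite is enabled and that the server's AllowOverride setting permits rewrite rules in .htaccess; answering with 403 Forbidden is one reasonable choice, not the only one.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Match either Facebook crawler name anywhere in the User-Agent header (case-insensitive)
    RewriteCond %{HTTP_USER_AGENT} (facebookexternalhit|Facebot) [NC]
    # "-" leaves the URL unchanged; [F] answers with 403 Forbidden
    RewriteRule ^ - [F]
</IfModule>
```

Because the rule only refuses the request and serves no alternate content, it avoids the redirect-to-different-content pattern warned about earlier. Note that blocking facebookexternalhit will also stop link previews from being generated when users share your pages.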
Sergio santa
Updated on September 18, 2022
Comments
-
Sergio santa over 1 year
I would like to apply this diagram via .htaccess. I tried a lot of code but failed every time, so I need to redirect bots and crawlers, especially those from Facebook, via .htaccess.
-
Stephen Ostermiller over 7 years
No need for "technically". As far as Google is concerned, that is a "sneaky redirect" and explicitly against their webmaster guidelines: support.google.com/webmasters/answer/2604723?hl=en
-
MrWhite almost 7 years
This won't work as-is. You don't check an environment variable with mod_rewrite like that - the RewriteCond condition will always fail since env=bad_bot is seen as a literal string and compared against the HTTP_USER_AGENT server variable (again, not what you are trying to do). (It looks like you are trying to borrow syntax from mod_auth_...?!). The RewriteCond directive should read something like RewriteCond %{ENV:bad_bot} 1 instead. (Although, as already stated, trying to redirect the bot is probably a bad idea to begin with - if anything it should simply be blocked.)
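
The rule this comment criticises is not reproduced on this page, so the following is only a sketch of what the corrected form might look like. It assumes the bad_bot variable is set with SetEnvIfNoCase from mod_setenvif (an assumption, since the original code is not shown) and it blocks the request rather than redirecting it, as the comment recommends.

```apache
# Flag requests from the Facebook crawlers; with no explicit value, bad_bot is set to "1"
SetEnvIfNoCase User-Agent "facebookexternalhit|Facebot" bad_bot

<IfModule mod_rewrite.c>
    RewriteEngine On
    # Test the environment variable itself, not the User-Agent header
    RewriteCond %{ENV:bad_bot} 1
    # Refuse flagged requests with 403 Forbidden
    RewriteRule ^ - [F]
</IfModule>
```

Setting the flag in one place and testing it in another keeps the list of user agents separate from the rule that acts on it, which can be convenient if the same flag is reused elsewhere.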