How to prevent robots from automatically filling up a form?

Solution 1

An easy-to-implement but not foolproof (especially against targeted attacks) way to fight spam is to track the time between page load and form submission.

Bots request a page, parse the page and submit the form. This is fast.

Humans type in a URL, load the page, wait until the page is fully loaded, scroll down, read the content, decide whether to comment or fill in the form, take time to fill it in, and then submit.

The difference in time can be subtle, and tracking it without cookies requires some kind of server-side storage, such as a database, which may affect performance.
You also need to tune the threshold time.

Solution 2

I actually find that a simple Honey Pot field works well. Most bots fill in every form field they see, hoping to get around required field validators.

http://haacked.com/archive/2007/09/11/honeypot-captcha.aspx

If you create a text box, hide it with JavaScript, and then verify on the server that its value is blank, this weeds out 99% of the robots out there and doesn't cause 99% of your users any frustration at all. The remaining 1% who have JavaScript disabled will still see the text box, but you can add a message like "Leave this field blank" for such cases (if you care about them at all).

(Also note that if you put style="display:none" on the field, it's way too easy for a robot to see that and discard the field, which is why I prefer the JavaScript approach.)
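
For what it's worth, the server-side half of the honeypot idea can be sketched in a few lines. This is only a rough illustration in plain JavaScript (the thread itself mostly talks about PHP), and the field name "website" and the helper name looksLikeBot are made up for the example. It assumes the form body has already been parsed into a plain object of strings.

```javascript
// Hypothetical honeypot field name; pick something a bot is likely to fill in.
const HONEYPOT_FIELD = "website";

// Returns true when the submission should be rejected as a likely bot.
// `fields` is the parsed form body as a plain object of strings.
function looksLikeBot(fields) {
  const value = fields[HONEYPOT_FIELD];
  // A real user never sees the field, so it should arrive empty.
  // A missing field is treated as "not filled in" in this sketch.
  return typeof value === "string" && value.trim() !== "";
}

console.log(looksLikeBot({ name: "Ann", website: "" }));            // false
console.log(looksLikeBot({ name: "Bot", website: "http://spam" })); // true
```

Whether a missing field should count as suspicious is a judgment call; this sketch lets it through so that a client which drops empty fields is not punished.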

Solution 3

What if the bot does not find any form at all?

3 examples:

  1. Insert your form using AJAX
  • If you are OK with users who have JS disabled not being able to see or submit the form, you can notify them and ask them to enable JavaScript first using a noscript element:
<noscript>
  <p class="error">
    ERROR: The form could not be loaded. Please enable JavaScript in your browser to fully enjoy our services.
  </p>
</noscript>
  • Create a form.html and place your form inside a <div id="formContainer"> element.

  • Inside the page where you need to call that form use an empty <div id="dynamicForm"></div> and this jQuery: $("#dynamicForm").load("form.html #formContainer");

  2. Build your form entirely using JS

// THE FORM
var $form = $("<form/>", {
  appendTo : $("#formContainer"),
  class    : "myForm",
  submit   : AJAXSubmitForm
});

// EMAIL INPUT
$("<input/>",{
  name        : "Email", // Needed for serialization
  placeholder : "Your Email",
  appendTo    : $form,
  on          : {        // Yes, the jQuery's on() Method 
    input : function() {
      console.log( this.value );
    }
  }
});

// MESSAGE TEXTAREA
$("<textarea/>",{
  name        : "Message", // Needed for serialization
  placeholder : "Your message",
  appendTo    : $form
});

// SUBMIT BUTTON
$("<input/>",{
  type        : "submit",
  value       : "Send",
  name        : "submit",
  appendTo    : $form
});

function AJAXSubmitForm(event) {
  event.preventDefault(); // Prevent Default Form Submission
  // do AJAX instead:
  var serializedData = $(this).serialize();
  alert( serializedData );
  $.ajax({
    url: '/mail.php',
    type: "POST",
    data: serializedData,
    success: function (data) {
      // log the data sent back from PHP
      console.log( data );
    }
  });
}

.myForm input,
.myForm textarea{
  font: 14px/1 sans-serif;
  box-sizing: border-box;
  display:block;
  width:100%;
  padding: 8px;
  margin-bottom:12px;
}
.myForm textarea{
  resize: vertical;
  min-height: 120px;
}

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="formContainer"></div>
  3. Bot-bait input
  • Bots like (really like) saucy input elements like:
<input 
  type="text"
  name="email"
  id="email"
  placeholder="Your email"
  autocomplete="nope"
  tabindex="-1"
>
They will be happy to enter some value such as
`[email protected]`
  • After using the above HTML you can also use CSS to not display the input:
input[name=email]{ /* bait input */
  /* do not use display:none or visibility:hidden
     that will not fool the bot*/
  position:absolute;
  left:-2000px;
}
  • Now that the input is not visible to the user, expect $_POST["email"] to be empty (without any value) in PHP! Otherwise, don't submit the form.
  • Finally, all you need to do is create another input, like <input name="sender" type="text" placeholder="Your email">, after (!) the "bot-bait" input for the user's actual email address.

Acknowledgments:

Developer.Mozilla - Turning off form autocompletion
StackOverflow - Ignore Tabindex

Solution 4

What I did was put a timestamp in a hidden field and then compare it with the timestamp on the server using PHP.

If the form was submitted in less than 15 seconds (the threshold depends on how big or small your form is), it was a bot.
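
One caveat with a plain hidden timestamp is that the client can edit it. A possible variation, sketched here in JavaScript for Node.js rather than PHP, is to sign the timestamp with a server-side secret so the server can tell whether the value it gets back is the one it issued. Everything named below (SECRET, issueToken, isHumanPaced) is an assumption made for the example, not something from the answer; only the 15-second threshold comes from it.

```javascript
const crypto = require("crypto");

// Assumptions: SECRET never leaves the server; 15 s is the threshold from the answer.
const SECRET = "change-me";
const MIN_SECONDS = 15;

// HMAC-SHA256 of the timestamp string, hex encoded.
function sign(stamp) {
  return crypto.createHmac("sha256", SECRET).update(stamp).digest("hex");
}

// Value to place in the hidden field when the page is rendered.
function issueToken(nowMs) {
  const stamp = String(nowMs);
  return stamp + "." + sign(stamp);
}

// Returns true only when the token is genuine and enough time has passed.
function isHumanPaced(token, nowMs) {
  const [stamp, mac] = String(token).split(".");
  const issuedMs = Number(stamp);
  if (!Number.isFinite(issuedMs) || typeof mac !== "string") return false;
  const given = Buffer.from(mac, "utf8");
  const wanted = Buffer.from(sign(stamp), "utf8");
  // timingSafeEqual requires equal lengths, so check that first.
  if (given.length !== wanted.length || !crypto.timingSafeEqual(given, wanted)) {
    return false;
  }
  return nowMs - issuedMs >= MIN_SECONDS * 1000;
}
```

On page load the server would write issueToken(Date.now()) into the hidden field, and on submit it would call isHumanPaced(postedValue, Date.now()). This still does nothing against a bot that simply waits before submitting.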

Hope this helps.

Solution 5

A very effective way to virtually eliminate spam is to have a text field that has text in it such as "Remove this text in order to submit the form!" and that text must be removed in order to submit the form.

Upon form validation, if the text field contains the original text, or any random text for that matter, do not submit the form. Bots can read form names and automatically fill in Name and Email fields but do not know if they have to actually remove text from a certain field in order to submit.
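
A minimal sketch of that check, written in JavaScript purely for illustration. The field name "challenge" and both helper names are hypothetical; the prompt text is the one quoted above. It assumes the form body arrives as a plain object of strings.

```javascript
// Prompt text from the answer; the field name is a made-up example.
const PROMPT_TEXT = "Remove this text in order to submit the form!";
const FIELD_NAME = "challenge";

// Markup for the pre-filled field that the user has to clear.
function renderChallengeField() {
  return '<input type="text" name="' + FIELD_NAME + '" value="' + PROMPT_TEXT + '">';
}

// Accept only when the field came back and is empty. The original text,
// any other text, or a missing field are all rejected.
function challengeCleared(fields) {
  const value = fields[FIELD_NAME];
  return typeof value === "string" && value.trim() === "";
}

console.log(challengeCleared({ challenge: "" }));          // true
console.log(challengeCleared({ challenge: PROMPT_TEXT })); // false
```

Rejecting a missing field is a deliberate choice here, so that a bot which drops the field entirely does not pass by accident.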

I implemented this method on our corporate website and it totally eliminated the spam we were getting on a daily basis. It really works!

Author: Gal

Updated on July 08, 2022

Comments

  • Gal
    Gal almost 2 years

    I'm trying to come up with a good enough anti-spamming mechanism to prevent automatically generated input. I've read that techniques like captcha, 1+1=? stuff work well, but they also present an extra step impeding the free quick use of the application (I'm not looking for anything like that please).

    I've tried setting some hidden fields in all of my forms, with display: none; However, I'm certain a script can be configured to trace that form field id and simply not fill it.

    Do you implement/know of a good anti automatic-form-filling-robots method? Is there something that can be done seamlessly with HTML AND/OR server side processing, and be (almost) bulletproof? (without JS as one could simply disable it).

    I'm trying not to rely on sessions for this (i.e. counting how many times a button is clicked to prevent overloads).

    • Mike
      Mike over 9 years
      Thanks for not wanting captcha solutions! IMO, form spam is a problem for site owners and preventing it isn't a burden the user should bear. There are far too many alternative ways we can address spam on the site end, as evidenced by the replies here. Methods requiring user interaction should only be used by the lazy or the novice.
    • Abhishek Choudhary
      Abhishek Choudhary almost 3 years
      There is an alternative CAPTCHA that is only triggered on suspicion, which lets normal users submit but stops spam.
    • WHO's NoToOldRx4CovidIsMurder
      WHO's NoToOldRx4CovidIsMurder over 2 years
      Starred and upvoted, esp. because of what Mike said. Accessibility and the WCAG (Web Content Accessibility Guidelines) are another reason to avoid CAPTCHA - even when there's an audio workaround, that helps only some disabled folks.
  • Brian
    Brian over 14 years
    As a user I find recaptcha to be hard to figure out often times. Some of the words are so hard to read, that you end up having to try 3 or 4 times. Although this definitely will help with the robots problem.
  • Gal
    Gal over 14 years
    Thanks, though javascript can be easily disabled in any browser, thus annihilating my "anti spam mechanism", so I'm looking for something more global.
  • Neil Aitken
    Neil Aitken over 14 years
    Good idea. I wouldn't use colour as the criteria though, as this may exclude colourblind users
  • Brian
    Brian over 14 years
    Yes, good point. Actually a problem with images in general is that they are not accessible, and by making them "accessible" with alt tags, robots can easily figure them out.
  • Gal
    Gal over 14 years
    I may be wrong, but wouldn't this tell every JS-disabled user 'you are a bad client, go away pls.'?
  • smirkingman
    smirkingman over 14 years
    Smart robots can execute javascript. By doing a javascript solution you're blocking 99% of robots though
  • Gal
    Gal over 14 years
    Thanks! this is a great idea, and close to what I was looking for.
  • SF.
    SF. over 14 years
  • snowflake
    snowflake over 14 years
    Watch out if you want to allow end users to use automatic form fillers such as addons.mozilla.org/en-US/firefox/addon/1882, which may allow very fast submission. As with captcha, anything that annoys the end user is generally not good, especially when it prevents a person in a hurry from going (very) fast.
  • Pindatjuh
    Pindatjuh over 14 years
    Good point, but it all depends on the context. If the form is a login-form, I completely agree with you. But why disable login from bots? If the context is a comment box, like this one on StackOverflow, I know for sure: if you use auto-fill on a comment box then you are a spammer. Note that if you use auto-fill for signatures, you still require time to actually type content.
  • John Himmelman
    John Himmelman over 14 years
    Gal, it's a trivial example, merely demonstrating how to validate against a request var set by client-side js.
  • Gal
    Gal over 14 years
    Do you think bots actually go through the CSS file and figure out it's display:none;? I'd really rather not use a JS-based solution, since it can be easily disabled.
  • snowflake
    snowflake over 14 years
    It seems to be an old trick used by webmasters who included tons of irrelevant keywords to boost their web ranking. I think search crawler bots such as Google's can figure out it's display:none. Why would other bots not be able to do that?
  • smirkingman
    smirkingman over 14 years
    The bot would have to execute javascript, that's the point. Gal - for the tiny tiny percentage of your users with javascript turned off, you simply have a label that says "Leave this blank". No harm done.
  • Jakob Borg
    Jakob Borg about 14 years
    Note that SO does something like this. Edit a comment too fast or too many times in a row and you will get presented with the "Are you a human?" page.
  • nirvdrum
    nirvdrum almost 13 years
    I've used this technique now on two sites that were getting hammered and bot signups are now zero on both. It won't help against targeted attacks, but most are just looking for exploits or for spamming anyway.
  • totallyNotLizards
    totallyNotLizards over 12 years
    Small point here: to get around the JS issue, just use CSS to position your honeypot input above the page top - this way it will be ok to have js disabled, and to get around it the bot will have to be able to parse CSS absolute positioning and make a common sense decision on whether it's a honeypot or not. a little more bullet-proof this way :)
  • Austin Henley
    Austin Henley almost 12 years
    Interesting, do you know if it is more effective than the other answers... a hidden textbox or tracking the time it takes to fill the form?
  • Rob
    Rob over 11 years
    Add this as comment please when you get more reputation instead of an answer ;)
  • alexyorke
    alexyorke over 11 years
    @jammypeach or more simply, display: none
  • totallyNotLizards
    totallyNotLizards over 11 years
    @alexy13 yes it's more simple but as noted in the answer, it's also a lot easier for a bot to figure out what you're trying to do, just test for one CSS property. If, however, you use the absolute positioning strategy, the bot has to parse all of your positioning rules and the rules of most of the element's parents to be able to figure out if the input would be visible or not, and then figure out whether or not to act on that information - which is all more trouble than it's worth for most (if not all) bots.
  • Will
    Will about 11 years
    I know this is a late comment, but a site I am working on used the 'display: none' method and is now receiving spam, so the bot can find it. I'm just in the process of testing other ways of doing it, like setting the input off the screen rather than hiding it.
  • crafter
    crafter almost 11 years
    Hackers won't always request the form. Sometimes, a carefully crafted URL (using GET or POST) will be sufficient to post the form multiple times with little effort.
  • spirytus
    spirytus over 10 years
    As silly as it sounds I created honey pot input and simply made it type="hidden". All the dumb robots fall for it and no spam at all. I'm having trouble understanding why everyone goes with captcha which most of the time gives horrible user experience. My vote definitely goes for honey pots.
  • Kiren S
    Kiren S over 10 years
    @Pindatjuh I am trying to implement the same but have some problems stackoverflow.com/questions/20781673/…
  • San Bluecat
    San Bluecat about 10 years
    I've been using this approach several months now and it works just fine, Easiest anti-bot implementation i know.
  • San Bluecat
    San Bluecat about 10 years
    @jammypeach See css-tricks.com/places-its-tempting-to-use-display-none-but-dont for visually hiding elements without display:none;
  • totallyNotLizards
    totallyNotLizards about 10 years
    @SanBluecat yes it's the same strategy I've advocated due to the disadvantages of using display:none, but there are a couple of different approaches there, thanks for the link.
  • Kayla
    Kayla almost 10 years
    Good idea, however, I'd set the limit to about 3 to 5 seconds to allow fast/power users. I use this same approach, and setting a limit on my forms to 3 seconds filtered out 99% of the bots.
  • Otterfan
    Otterfan almost 10 years
    This presents accessibility problems. The honeypot index will not be hidden from users with screen readers.
  • CoffeDeveloper
    CoffeDeveloper over 9 years
    If the bot is a browser plugin it will be able to execute javascript and see things the user see (even if you are doing some flash or webgl rendering)
  • valicu2000
    valicu2000 over 9 years
    Images are always a bad idea ... the text can barely be read, I faced this issue with other websites
  • Andris
    Andris over 9 years
    @adnhack Do you mean something like: 1) on page load with php get server time and create session. 2) user or bot fills form, clicks Submit, with $.post send all to external php file. 3) in external php again get server time and compare with session time?
  • Mike
    Mike over 9 years
    As a user, I hate that crap. I get that spam is an issue, but how is it my problem, as a site user? Comment spam is an issue for the site owner, and as such, the user shouldn't take the burden of preventing it. If you walked into a store and were asked to put protective booties over your shoes because they didn't want to mop, what would your thoughts be then? It only takes a few seconds, but it's not your burden to bear.
  • Paul
    Paul about 9 years
    One further suggestion here would be to position the field beneath another absolutely positioned section of the screen using the z-order - that way it's still within the visible bounds, but not visible to the user. You could also use tab key prevention so the user can't accidentally tab to the hidden control. Belt and braces!
  • pablito.aven
    pablito.aven about 9 years
    @Pindatjuh What do you mean by tweak the threshold-time ?
  • JohnnyFaldo
    JohnnyFaldo almost 9 years
    I've found myself on this page because CAPTCHA / reCAPTCHA doesn't currently stop bot form submission. This is 5 years later and it's a new technique than when this answer was given
  • Mihai P.
    Mihai P. over 8 years
    This is just captcha with a very small twist that makes it harder for users. Also it is not accessible at all.
  • Mihai P.
    Mihai P. over 8 years
    @Miki spam makes a site owner waste time. Time is money, so what I sell will be more expensive for you. Your argument can easily be used to say "I do not care that you have to pay rent, I want to pay cost of production +1$. How is you paying rent my problem?" When you purchase something you pay for hosting, transportation, time etc.
  • Admin
    Admin over 8 years
    @John Himmelman Captchas are solvable and not necessarily the best defense against spam. There are pay-for-services like anti-captcha.com that will solve form captchas for a low fee.
  • Chewie The Chorkie
    Chewie The Chorkie over 8 years
    Just use CSS to place the text field above the page if you are worried about people having JavaScript disabled.
  • towi_parallelism
    towi_parallelism over 8 years
    I'm amazed that this answer does not have more upvotes. Whether or not the user likes it, this is a great solution. Especially if it is only used for the registration form.
  • Jimbo Jonny
    Jimbo Jonny over 8 years
    @Mike - It's your problem because you want the form to work (obviously, since you're using it). Machines find even the most obscure sites and will spam tens of thousands of submissions a day, making those forms unusable. So next time you submit a question to a small business using a form on their website and you have to add 9+3 to do it...and ask yourself "why do I have to do this?" your answer can be "because I actually want an answer to my question".
  • Mike
    Mike about 8 years
    @JimboJonny You completely missed my point. Spam is an issue (like I stated), but there are ways to address it on the backend that don't taint the user experience. I currently have contact forms deployed on dozens (hundreds, even) of websites, and spam is minimal (a few spam messages a month, per form) because I've addressed spam programmatically, not by making users jump through hoops. My point wasn't that spam is not an issue; it IS an issue. My point was that there are ways to address it without fudging with the user's experience.
  • Mike
    Mike about 8 years
    @JimboJonny Case in point, look at the highest ranked (and accepted) answers on this question. None involve any sort of user input. That's the way spam mitigation should be.
  • Jimbo Jonny
    Jimbo Jonny about 8 years
    @Mike - yikes...so when you've got hundreds of forms on hundreds of websites that rely on a honeypot hidden field and suddenly the more popular bot program makers just add a new feature to detect if a field is hidden via CSS (wouldn't really be that hard) you're going to have, what...a million or so spam messages per hour to suddenly have to clean out and hundreds of websites that are unusable until you get a new solution? I see you like to live dangerously, my friend.
  • Mike
    Mike about 8 years
    @JimboJonny I never said I rely solely on a honeypot, but there are a combination of similar methods that are effective. You realize I could have written your exact same comment about the user input method above, right? A computer would never be able to provide the answer to a simple math problem...
  • Jimbo Jonny
    Jimbo Jonny about 8 years
    @Mike - Simple math isn't great either (more complex user input is desirable), but it is still a lot harder to build the AI to recognize math on a page and its relevance to a field (that's the hard part). I used the honeypot example, but the idea of these simple yet obscure fixes (like the accepted answer) is what I'm going at. Those answers work SOLELY because that method is not prevalent enough for bot makers to bother spending the couple hours it would take to get around it. It's security through obscurity...which works until it's not obscure anymore and then blows up catastrophically.
  • Mike
    Mike about 8 years
    @JimboJonny AI like that has been around since the 1960's... which is the reason why virtually all credible captcha type utilities rely on more than just human input. But it's clear you aren't getting my point, so I digress. Keep on thinking that dropping a steaming pile on the user experience is the way to handle something that should (and can) be addressed programmatically.
  • Jimbo Jonny
    Jimbo Jonny about 8 years
    @Mike - The math is not the hard part. Very few can render a page and spatially/linguistically analyze visual relationships reliably across websites to figure out what fields need what from a human perspective. Humans do that innately. That is the main difference between a bot and human. Others can be too easily faked. That is why input methods work so well. On the other hand every non-input method listed on this page I could program a workaround for quite easily, now that I know people are using it. Obscurity was all they had. If you have one that is not so obscurity dependent then share.
  • wilbbe01
    wilbbe01 about 8 years
    This would also catch those users who cannot follow directions, which may not be desired.
  • wilbbe01
    wilbbe01 about 8 years
    Could a legitimate user's browser potentially see the bait input field as an email field and autofill it automatically when the user chooses to autofill the rest of the form? The user wouldn't see a field far off screen had been filled, and they would still look like a bot.
  • Parham Doustdar
    Parham Doustdar about 8 years
    I'm a blind user, and I found a form field like this once, and the label above it read: "If you can see this, leave this blank." Very effective IMO.
  • Parham Doustdar
    Parham Doustdar about 8 years
    The problem with this approach is that I have seen a lot of bots using PhantomJS. This would allow them to get through.
  • Parham Doustdar
    Parham Doustdar about 8 years
    Wouldn't bots who use stuff like PhantomJS easily get around this?
  • Gras Double
    Gras Double about 8 years
    As it's a full browser engine, that loads assets and such, yeah that should be possible. Still, I'm not sure it is often used for a spam bot, as it's probably much slower than cURL scripts.
  • skybondsor
    skybondsor almost 8 years
    Interesting idea! Have you used this in the real world at all?
  • John Himmelman
    John Himmelman almost 8 years
    @ParhamDoustdar Agreed, I answered this question about a year before PhantomJS was released :(.
  • Norbert Norbertson
    Norbert Norbertson almost 8 years
    It won't work. These days spammers are using software that runs in the browser. So they can mimic the user experience which creates the cookie and then run it x number of times using different content that is generated by the software.
  • xenoterracide
    xenoterracide over 7 years
    any reason this would be better than a CSRF token?
  • Gras Double
    Gras Double over 7 years
    a CSRF token won't stop a bot at all. 1st request, GET the form, which includes the token. 2nd request, POST the form, including the token.
  • nmit026
    nmit026 over 7 years
    I like this! Until the bot starts trying different combinations of blank and filled-in fields... best way to test is implement this and scan with one of these: sectoolmarket.com/…
  • rogerdpack
    rogerdpack over 7 years
    These days recaptcha starts as a simple checkbox, perhaps it's not as painful as it used to be? ...
  • Kenny Johnson
    Kenny Johnson over 7 years
    I know this answer is nearly 7 years old but I feel like this is worth commenting on. Many bots can be programmed to ignore fields with a style="display:none" to avoid this type of protection.
  • Kenny Johnson
    Kenny Johnson over 7 years
    This wouldn't work if the user was not using a mouse. If your form is set up properly, the user should be able to fill in the entire form using the keyboard. You can tab to the next fields, use space bar to select radio buttons, and use space bar (or enter) when you tab onto the submit button.
  • Corbin Miller
    Corbin Miller about 7 years
    link is invalid.
  • Talha Awan
    Talha Awan about 7 years
    Effective so far as the person managing the bot doesn't find out and tweaks the code.
  • Yashovardhan99
    Yashovardhan99 almost 7 years
    Implement this with captcha. If the form was submitted too fast, present a captcha to let genuine users through.
  • handle
    handle about 6 years
    I suspect autocomplete=nope would default to on ;-) MDN: input#attr-autocomplete
  • Roko C. Buljan
    Roko C. Buljan about 6 years
    @handle it doesn't matters, it's a bot bait input. You can write autocomplete="oh sunny day" for that matter.
  • Synchro
    Synchro almost 6 years
    reCaptcha also (by design) leaks data to google. Not a good look for privacy.
  • icefront
    icefront over 5 years
    Tried this, but it isn't a good idea at all. When the user fills the form, and receives an error, the correction of the error may take several seconds (eg. correcting the e-mail address). Upon the second submit attempt the form is already completed and it is submitted again in short time. Also there are autofill browser extensions which again will produce false positives.
  • SF.
    SF. over 4 years
    There are dozens of methods of obscuring inputs, using Javascript, displaying dummy elements on top of them, moving them out of visible area, styling them to blend with background or layout decorations perfectly etc. Randomizing (hashing) input names (and keeping the mapping of hashed=>original in session server-side) will help against using names as hints and manually mapping which inputs are valid. Regardless, there is no defense against manual spam.
  • Hawkeye
    Hawkeye about 4 years
    It seems to me that the very fact you have to put the alternate text means that your two image solution is just as susceptible to scripting as the other alternatives. And for the "I am not a spammer" button: can't that be scripted too?
  • Hawkeye
    Hawkeye about 4 years
    Out of curiosity, what's your feeling on reCAPTCHA? You're the first to mention other paid services, but how do those compare to reCAPTCHA, and/or why would you recommend those OVER the free service?
  • Adam
    Adam about 4 years
    @Hawkeye My answer was that a headless browser can emulate anything: javascript, delays, mouse move, hidden fields, ... The term "beautiful" before my examples was kind of "sarcastic". But those examples illustrate that understanding English, and having to make a simple choice, is harder for a spambot than waiting 10 seconds, handling CSS or javascript, knowing that a field is hidden, emulating mouse move or emulating keyboard typing, ...
  • Hawkeye
    Hawkeye about 4 years
    I see your point now. Maybe add last statement "But those examples illustrate..." etc. to your answer. Because that helped me understand what you mean. It seemed at first to be a self contradicting argument that "we can't assume bots can't..." but then list things that we still can't assume bots can't do. But the crux of your point is that your example (having to make a choice on which submit button) is harder --which (now that I understand) is a brilliant answer. +1
  • Michael Moriarty
    Michael Moriarty about 4 years
    Could you elaborate on some of these steps valicu2000? Are they still valid in 2020? Thanks.
  • avia
    avia over 3 years
    @RokoC.Buljan creative anti-bot solutions, thanks for sharing
  • Noodles
    Noodles about 3 years
    This is not a great idea. A bot could easily implement a delay, or the user could use auto form fillers.
  • Abhishek Choudhary
    Abhishek Choudhary almost 3 years
    These methods only stop general spam bots, which crawl internet and spam every form, if a bot is made specifically for your form, by a human, it will be useless.
  • Abhishek Choudhary
    Abhishek Choudhary almost 3 years
    Stopping general spam bots is easy anyways, if it's a targeted bot, CAPTCHA is the only solution