Need Pure/jQuery Javascript Solution For Cleaning Word HTML From Text Area
Solution 1
David Archer's answer pretty much covers it. I have used a similar solution in the past:
$("textarea").change(function() {
    // Convert any opening and closing angle brackets to their HTML-encoded equivalents.
    var strClean = $(this).val().replace(/</g, '&lt;').replace(/>/g, '&gt;');
    // Remove any double and single quotation marks.
    strClean = strClean.replace(/"/g, '').replace(/'/g, '');
    // Put the data back in.
    $(this).val(strClean);
});
If you are looking for a way to completely REMOVE HTML tags:
$("textarea").change(function() {
    // Completely strips tags. Taken from Prototype library.
    var strClean = $(this).val().replace(/<\/?[^>]+>/gi, '');
    // Remove any double and single quotation marks.
    strClean = strClean.replace(/"/gi, '').replace(/'/gi, '');
    // Put the data back in.
    $(this).val(strClean);
});
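One thing to watch for with Word content: Word usually converts straight quotes to curly ("smart") quotes, which the two straight-quote patterns above will not match. Below is a minimal sketch of a standalone function that strips tags and removes both kinds of quotation marks. The name stripTagsAndQuotes is made up for this example, and the set of curly-quote code points is an assumption about what your pasted text contains.

```javascript
// Hypothetical helper (not part of jQuery or Prototype): strips tags, then
// removes straight and curly quotation marks from the remaining text.
function stripTagsAndQuotes(text) {
  return text
    // Same tag-stripping pattern as in the snippet above.
    .replace(/<\/?[^>]+>/g, "")
    // Straight quotes plus the curly quotes U+2018, U+2019, U+201C, U+201D.
    .replace(/["'\u2018\u2019\u201C\u201D]/g, "");
}

console.log(stripTagsAndQuotes('<p class="MsoNormal">He said \u201Chi\u201D</p>'));
// prints: He said hi
```

Keeping the cleaning logic in a plain function like this means it can be called from a change, blur, or paste handler without changes.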
Solution 2
You could check out Word HTML Cleaner by Connor McKay. It is a pretty strong cleaner, in that it removes a lot of stuff that you might want to keep, but if that's not a problem it looks pretty decent.
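If a full cleaner is more than you need, the rough idea can be sketched in a few lines. This is not Connor McKay's code, only an illustration of the kind of cleanup involved, and the function name stripWordMarkup is hypothetical. It assumes the pasted string may contain HTML comments (including Word's conditional comments) and style, script, or xml blocks, whose contents a simple tag-stripping regex would leave behind as text.

```javascript
// Hypothetical sketch, not taken from any existing library.
function stripWordMarkup(html) {
  return html
    // Remove comments, including conditional comments such as <!--[if gte mso 9]>...<![endif]-->.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Remove whole blocks whose contents are never user-visible text.
    .replace(/<(style|script|xml)\b[^>]*>[\s\S]*?<\/\1>/gi, "")
    // Remove any remaining tags, including namespaced ones like <o:p>.
    .replace(/<\/?[^>]+>/g, "")
    // Turn non-breaking space entities into ordinary spaces.
    .replace(/&nbsp;/gi, " ")
    // Collapse runs of spaces and tabs, then trim the ends.
    .replace(/[ \t]+/g, " ")
    .trim();
}

var pasted =
  "<!--[if gte mso 9]><xml><o:p>x</o:p></xml><![endif]-->" +
  "<style>p{margin:0}</style>" +
  "<p class=MsoNormal>Hello&nbsp;<o:p></o:p>world</p>";

console.log(stripWordMarkup(pasted)); // prints: Hello world
```

Regex-based cleanup like this is only suitable for producing plain text; it is not an HTML sanitizer.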
Solution 3
It might be useful to use the blur event which would be triggered less often:
$("textarea").blur(function() {
    // check input ($(this).val()) for validity here
});
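For a version without jQuery, the same idea can be sketched with the standard addEventListener method. The helper name cleanOnBlur and its clean callback are assumptions made for this example; plug in whatever cleaning function you settle on.

```javascript
// Hypothetical helper: runs `clean` over the field's value each time it loses focus.
function cleanOnBlur(field, clean) {
  field.addEventListener("blur", function () {
    field.value = clean(field.value);
  });
}

// In a browser, assuming the page contains a textarea:
// cleanOnBlur(document.querySelector("textarea"), function (s) {
//   return s.replace(/<\/?[^>]+>/g, "");
// });
```

Because the helper only relies on value and addEventListener, it works with any textarea or text input element.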
Solution 4
What about something like this:
function cleanHTML(pastedString) {
    var cleanString = "";
    var insideTag = false;
    for (var i = 0, len = pastedString.length; i < len; i++) {
        // A "<" starts a tag: stop copying characters.
        if (pastedString.charAt(i) == "<") insideTag = true;
        if (pastedString.charAt(i) == ">") {
            // Leave the tag unless another tag follows immediately.
            if (pastedString.charAt(i + 1) != "<") {
                insideTag = false;
                i++; // skip past the ">" itself
            }
        }
        if (!insideTag) cleanString += pastedString.charAt(i);
    }
    return cleanString;
}
Then just use the event listener to call this function and pass in the pasted string.
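To show the scanning approach in action, here is a simplified variant written as a small state machine. It is a sketch and not a drop-in replacement for the function above: the name stripTagsByScanning is made up, and unlike the original it keeps a stray ">" that appears outside a tag. A stray "<" in ordinary text (as in "a < b") will still be treated as the start of a tag, so the text after it is dropped until the next ">".

```javascript
// Hypothetical variant of the character scanner: copies every character
// that is not between a "<" and the following ">".
function stripTagsByScanning(input) {
  var out = "";
  var insideTag = false;
  for (var i = 0; i < input.length; i++) {
    var ch = input.charAt(i);
    if (ch === "<") {
      insideTag = true;   // start skipping
    } else if (ch === ">" && insideTag) {
      insideTag = false;  // tag closed; resume copying with the next character
    } else if (!insideTag) {
      out += ch;          // ordinary text
    }
  }
  return out;
}

console.log(stripTagsByScanning("<p>Hello <b>world</b></p>")); // prints: Hello world
```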
Alex Racho
Just your average UI designer with a lust for jQuery and various other code ventures.
Updated on June 05, 2022
I know this issue has been touched on here, but I have not found a viable solution for my situation yet, so I'd like to put the brain trust back to work and see what can be done.
I have a textarea in a form that needs to detect when something is pasted into it, and clean out any hidden HTML & quotation marks. The content of this form is getting emailed to a 3rd party system which is particularly bitchy, so sometimes even encoding it to the html entity characters isn't going to be a safe bet.
Unfortunately, I cannot use something like FCKEditor, TinyMCE, etc.; it's gotta stay a regular textarea in this instance. I have attempted to dissect FCKEditor's paste-from-Word function but have not had luck tracking it down.
I am however able to use the jQuery library if need be, but haven't found a jQuery plugin for this just yet.
I am specifically looking for information geared towards cleaning the information pasted in, not how to monitor the element for change of content.
Any constructive help would be greatly appreciated.