JavaScript library to read doc and docx on client
You can use docxtemplater for this. Although it is normally used for templating, it can also simply extract the text of a document:
// content: the binary content of the .docx file (e.g. the FileReader result)
var zip = new JSZip(content);
var doc = new Docxtemplater().loadZip(zip);
var text = doc.getFullText();
console.log(text);
See the documentation for installation instructions (I'm the maintainer of this project).
However, it only handles .docx files, not .doc files.
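To connect this with the FileReader from the question, here is a minimal sketch. It assumes JSZip 2.x and a docxtemplater 3.x release that provides loadZip() are already loaded on the page. The names extractDocxText, readDocx and onText are made up for illustration and are not part of either library. The two constructors are passed in as parameters so the functions do not rely on particular global names.

```javascript
// Sketch only: pull the plain text out of a .docx that was read as a binary string.
// Assumes JSZip 2.x (its constructor accepts the file data directly) and
// docxtemplater 3.x (loadZip / getFullText, as in the answer above).
function extractDocxText(content, JSZip, Docxtemplater) {
  var zip = new JSZip(content);
  var doc = new Docxtemplater().loadZip(zip);
  return doc.getFullText();
}

// Hypothetical wiring to the FileReader from the question.
// onText is a caller-supplied callback that receives the extracted text.
function readDocx(file, JSZip, Docxtemplater, onText) {
  var reader = new FileReader();
  reader.onload = function (e) {
    // e.target.result holds the binary string produced by readAsBinaryString
    onText(extractDocxText(e.target.result, JSZip, Docxtemplater));
  };
  reader.onabort = function () {
    alert('File read canceled');
  };
  reader.readAsBinaryString(file);
}
```

In a page you would call something like readDocx(fileInput.files[0], JSZip, Docxtemplater, function (text) { console.log(text); }), where fileInput is your own file input element.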
Torben
Updated on June 05, 2022
Comments
-
Torben almost 2 years
I am searching for a JavaScript library that can read .doc and .docx files. The focus is only on the text content; I am not interested in pictures, formulas or other special structures in the MS Word file. It would be great if the library worked with the JavaScript FileReader, as shown in the code below.
function readExcel(currfile) {
    var reader = new FileReader();
    reader.onload = (function (_file) {
        return function (e) {
            // here should the magic happen
        };
    })(currfile);
    reader.onabort = function (e) {
        alert('File read canceled');
    };
    reader.readAsBinaryString(currfile);
}
I searched through the internet, but I could not get what I was looking for.
-
Torben almost 7 years
Thanks, that is what I was looking for. You did great work.
-
slwr about 6 years
I get an error when I use this as a zip file:
zip.file('yo.docx', element.data, {base64: true});
-
edi9999 about 6 years
What kind of error? Are you using JSZip version 2? If you are using JSZip version 3, it will fail.
-
Ramtin about 3 years
It is customary to inform the users that you are the author of the library mentioned in your answer.
-
Jeon almost 3 years
@edi9999 where can I find documentation for the doc object?
-
edi9999 almost 3 years