Reading or converting Word .doc files on iOS

Solution 1

I don't know whether you are still looking for a solution or have figured it out yourself, but I am answering in the hope that it helps someone else looking for the same thing.

I was looking for a way to convert a Word file to a text file. I came across this question after some googling, followed the link in @TJD's answer, and from there found this link.

Since my requirement was to convert a Word file to a text file, I used the second link as the basis for my solution.

A docx file uses the Office Open XML file format and, as that page explains, the file is really a zip archive, so I need to unzip it.

For zipping and unzipping, Google provides code here. After unzipping the docx file into our documents directory there are, as the Wikipedia link describes, three directories and one XML file in the root.

For my solution I chose the word directory, since the link says the original content of the file is stored there (I haven't looked into any other directory or file so far). Under your extraction path there is a file word/document.xml; this is where the content of your docx file is stored in XML format.

There are lots of tags in that XML file and I don't know the meaning of all of them yet, but after looking through the file I found that the tag that contains my text is w:t.

After that, everything is a piece of cake. I just used NSXMLParser to parse the XML file, targeting the w:t tag, and got my whole string.
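The parsing step could look roughly like the sketch below. It is written in Swift, where NSXMLParser is exposed as XMLParser, and it is not the code from the original answer. The names WordTextExtractor and extractText(fromDocumentXML:) are made up for illustration. It assumes the XML uses the usual w: prefix for WordprocessingML elements, and it adds a line break at the end of each w:p (paragraph) element, which goes slightly beyond the w:t-only approach described above.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives here on Linux; not needed on iOS/macOS
#endif

/// Collects the text found inside <w:t> elements of word/document.xml.
/// (Hypothetical helper class, named here only for illustration.)
final class WordTextExtractor: NSObject, XMLParserDelegate {
    private var insideTextRun = false
    private(set) var text = ""

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        // Namespace processing is off by default, so the name includes the prefix.
        if elementName == "w:t" { insideTextRun = true }
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // May be called several times per element, so append rather than assign.
        if insideTextRun { text += string }
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "w:t" { insideTextRun = false }
        // Assumption beyond the answer: treat the end of a paragraph as a line break.
        if elementName == "w:p" { text += "\n" }
    }
}

/// Returns the plain text of a document.xml payload, or nil if parsing fails.
func extractText(fromDocumentXML data: Data) -> String? {
    let extractor = WordTextExtractor()
    let parser = XMLParser(data: data)
    parser.delegate = extractor
    return parser.parse() ? extractor.text : nil
}
```

To use it, read the unzipped word/document.xml into a Data value and pass it to extractText(fromDocumentXML:). Unzipping the docx itself still needs a separate zip library, as described above.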

Note: I will update my answer as soon as I understand the other files and tags. Also, this solution does not work with doc files: as far as I know, the Office Open XML format was only introduced in MS Office 2007. I will update my answer if I find a solution for doc files.

I know this is not complete, since it doesn't cover creating doc files and so on, but I hope it will help a lot of us.

Solution 2

The "trick" most apps use to read Word files is UIWebView: it can read them. This doesn't allow for writing docs, but that is a much harder problem for which I don't believe an easy solution exists.

Solution 3

Modern versions of Office use an open-standard XML format: http://en.wikipedia.org/wiki/Office_Open_XML

Solution 4

libOPC!

An ISO/IEC 29500 standard-conformant, cross-platform, open source, standard C99-based implementation of Part II (OPC) and Part III (MCE) of the ISO/IEC 29500 specification (OOXML). And it works on iOS as well: http://www.nooxml.com/video/libopc_iphone.wmv

Solution 5

Here's how to read the Office Open XML (OOXML) format in iOS: http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2011/05/09/147049.aspx

The link leads to a tutorial that shows how to get the metadata of an OOXML file. It's not the text or the formatting, but it's a start.

Legacy .doc files use a proprietary binary format, not a zip archive, so they cannot simply be renamed and unzipped.

.docx files, however, are zip archives containing many files related to text and formatting (if you want to see what's inside, rename the extension to .zip in the Finder and decompress the file). Those files are filled with very large amounts of verbose XML markup, most of which is of little use if you only want the text. Thanks to the adoption of the OOXML standard, .docx files can be opened and converted fairly easily. See the link.

Author by

casey

Updated on June 05, 2022

Comments

  • casey
    casey almost 2 years

    How are other apps on iOS able to read and write word docs? I see some other questions related to this and accepted answers are along the lines of "it can't be done." I don't want to just display a word doc, I want to read it along with its formatting. How are other apps doing it, are they writing the parsing themselves using the published standard put out by Microsoft? Are they using some kind of bundled utility to convert the file to some other format like XML or HTML before processing it? Is there an open source way of doing this? Looking for ideas.

  • casey
    casey over 12 years
    Yeah, most of my worries were around the previous doc format, which is documented, but doesn't seem to be openly implemented, at least in C / or compatible with iOS.
  • CodaFi
    CodaFi over 12 years
    Yes, earlier versions of MS Word used the proprietary .doc and .dot formats. Now (I think 2007 and onward) it uses the open standards of .docx and .dotx. That is why so many earlier questions were answered so negatively.
  • casperOne
    casperOne over 12 years
    I've been deleting your comments as they have not been constructive to the question and answer in general. I've suggested that you flesh out your answer (to address concerns over why you are being downvoted, as well as why this is not a good answer). See my answer for what would more than likely be considered a good answer.
  • CodaFi
    CodaFi over 12 years
    CasperOne, specific edits would be a nice suggestion. Since I don't seem to have the foggiest of what you're getting at here.
  • casperOne
    casperOne over 12 years
    That's up to you, you've been shown the guideline, and given an example. If you have any specific questions about either, please feel free to ask.
  • casey
    casey over 12 years
    They use it for displaying, not reading. Once a file is opened in the UIWebView there is no way to access its content. And the data access methods available on the Mac are not available on iOS, so I was looking to see if anyone knew "the secret" or if everyone is rolling their own. Maybe this needs to be turned into a Google Code project.
  • BDGapps
    BDGapps almost 12 years
    Thank you so much, that helped a lot. I am just getting started with XML on iOS. Could you provide the code in the phaser that finds the w:t? That is my only question. Thank you so much.
  • Kapil Choubisa
    Kapil Choubisa almost 12 years
    Sorry, I didn't get you. What exactly do you mean by Phaser? Do you want to know how to parse the document.xml file, or something else?
  • Ameer
    Ameer about 10 years
    @Kapil: Have you found a solution for the doc format?