Apache POI and Aspose?


Solution 1

Can anyone suggest which one I should go with? And what are the limitations of Apache POI and Aspose?

This is a very difficult and general question, and it can only have very general answers.

Every software project has different requirements and features, and the feasibility of using third-party components is most probably also different for each project. Choosing between third-party components is difficult because you need to do more or less all of the following:

  • Requirements evaluation (which product meets your requirements, or comes closest)
  • Check how good the customer support is, both before and after purchase
  • Feature comparison of the third-party products
  • Find out how stable the products are. Check how many versions have been released and whether new versions bring bug fixes and new features
  • Any awards from independent sources
  • Overall usability of the API and documentation
  • License terms
  • Cost and benefit

For comparisons of Aspose with Apache POI and other alternatives, see the links below:

Overall, it's very difficult to find limitations and compare features of popular file format components. Why? Because both MS Office and Adobe PDF are very old, mature and stable products, and you can put a vast variety of content in these files.

One tip is to get hold of the largest and most complex files you can find (PDF, DOC, XLS, etc.) and load them using both Aspose and Apache POI. Test for your worst case.
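The worst-case test could be automated with a small timing harness. The following is only a sketch using the Java standard library: `WorstCaseLoadBenchmark`, `DocumentLoader` and `Result` are hypothetical names invented for this illustration, and the two loaders in `main` are empty placeholders where the real Apache POI or Aspose calls would go.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

/** Times several document loaders against the same worst-case file. */
final class WorstCaseLoadBenchmark {

    /** Hypothetical wrapper around one library's "open this document" call. */
    @FunctionalInterface
    interface DocumentLoader {
        void load(byte[] fileBytes) throws Exception;
    }

    /** Outcome of a single run. */
    static final class Result {
        final long elapsedMillis;
        final Exception failure; // null when the load succeeded

        Result(long elapsedMillis, Exception failure) {
            this.elapsedMillis = elapsedMillis;
            this.failure = failure;
        }

        boolean succeeded() {
            return failure == null;
        }
    }

    /** Runs one loader and records wall-clock time plus any exception thrown. */
    static Result time(DocumentLoader loader, byte[] fileBytes) {
        long start = System.nanoTime();
        Exception failure = null;
        try {
            loader.load(fileBytes);
        } catch (Exception e) {
            failure = e;
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(elapsedMillis, failure);
    }

    /** Runs every loader on the same bytes, keeping the insertion order of the map. */
    static Map<String, Result> compare(Map<String, DocumentLoader> loaders, byte[] fileBytes) {
        Map<String, Result> results = new LinkedHashMap<>();
        for (Map.Entry<String, DocumentLoader> entry : loaders.entrySet()) {
            results.put(entry.getKey(), time(entry.getValue(), fileBytes));
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        byte[] fileBytes = Files.readAllBytes(Paths.get(args[0]));

        Map<String, DocumentLoader> loaders = new LinkedHashMap<>();
        // Placeholders (assumption): replace each body with the real library call.
        loaders.put("apache-poi", data -> { /* open the document with Apache POI here */ });
        loaders.put("aspose", data -> { /* open the document with Aspose here */ });

        for (Map.Entry<String, Result> entry : compare(loaders, fileBytes).entrySet()) {
            Result r = entry.getValue();
            System.out.println(entry.getKey() + ": " + r.elapsedMillis + " ms, "
                    + (r.succeeded() ? "ok" : "failed: " + r.failure));
        }
    }
}
```

Note that this only catches `Exception`. A very large file may instead fail with `OutOfMemoryError`, which is itself a useful result of a worst-case test.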

PS. I am a Developer Evangelist at Aspose.

Solution 2

We evaluated both tools and wrote up a review. It is mainly about Aspose.Words, because that works better for our needs, but it also covers Apache POI. I'm pasting the review here for your reference.

We are a company that develops an online word processor. One big challenge is converting Microsoft Word DOC, DOCX and RTF content to and from our proprietary data model. Due to the limitations of the thin client and the complex nature of Microsoft Word documents, we must handle the conversion on the server side.

Our server-side technology is Java/Spring/Hibernate. We realized that there aren't many options in the Java space that deal with DOC(X) processing, and we only look for proven and mature products. We evaluated the open-source Apache POI. One main problem we found is that it consists of many seemingly independent components under the hood, and we must use two different components to handle DOC and DOCX. The POI component that handles DOCX is fairly new and doesn't have many features yet. As for RTF, Apache POI simply doesn't support it.

Knowing that Apache POI wasn't a good choice for our application, we checked out Aspose.Words for Java. In fact, it is the only commercial product in this space, as far as our search goes. The evaluation was very smooth. We easily created a Maven artifact for the Aspose library and integrated it into our backend web application. Based on our experience, we believe Aspose.Words for Java is the top product in this space and is far superior to any other solution. Due to space limitations, we can only share the two features that are most valuable to us from a technology perspective.
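To illustrate what "two different components" means in practice: code built on POI has to decide, per file, whether to hand it to the DOC component (`HWPFDocument`) or the DOCX component (`XWPFDocument`). A minimal sketch of that decision, using only the standard library and the well-known leading bytes of each container, might look like the following. `WordFormatSniffer` and its `Format` enum are hypothetical names for this example. The check identifies the container only: legacy XLS/PPT files share the OLE2 signature, and any ZIP archive shares the DOCX one.

```java
import java.nio.charset.StandardCharsets;

/** Guesses the container format of a Word-like file from its first bytes. */
final class WordFormatSniffer {

    enum Format {
        OLE2,    // legacy binary container: .doc (also .xls, .ppt)
        ZIP,     // ZIP container: .docx (also .xlsx, .pptx, or any other ZIP)
        RTF,     // Rich Text Format, which starts with "{\rtf"
        UNKNOWN
    }

    // Signature of the OLE2 compound file container.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header: "PK\3\4".
    private static final byte[] ZIP_MAGIC = {'P', 'K', 3, 4};

    private static final byte[] RTF_MAGIC = "{\\rtf".getBytes(StandardCharsets.US_ASCII);

    /** Classifies a file by the leading bytes supplied in {@code header}. */
    static Format sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2; // would go to the DOC code path
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP;  // would go to the DOCX code path
        }
        if (startsWith(header, RTF_MAGIC)) {
            return Format.RTF;  // needs a separate RTF solution
        }
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Passing the first eight or more bytes of a file to `sniff` is enough, since the longest signature checked here is eight bytes.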

First, Aspose.Words uses a consistent, intuitive and well-documented DOM model as its underlying document structure. This DOM model is straightforward and easy to understand, and it turns out to be quite expressive and powerful. It is different from OOXML's DOM model, and we like Aspose's a lot better. It reminds us of the difference between JDOM and the W3C model for XML, where JDOM's model is far simpler and more intuitive, yet powerful enough for most manipulations a business application ever needs. To our surprise, one single DOM model is used across all formats supported by Aspose.Words, including but not limited to DOC, DOCX and RTF. This design greatly lowers the effort on our side, because we only need to develop one code base to handle all three formats our application currently needs, as well as other formats (such as PostScript) that may be needed in the future. We found this architecture to be the key technology strength of Aspose.Words, in addition to its rich features and APIs.

Second, Aspose.Words is able to preserve all OLE components of the original Word document in an open/close round trip. That is, when Aspose.Words loads an existing Word document into its in-memory DOM model and immediately exports the DOM model back to a Word document, it generates a lossless copy of the original. This feature is crucial to our application, and as far as we know no other product, commercial or open source, claims to provide it.

We would like to share two screenshots to conclude this review. One screenshot (http://s26.postimg.org/lfc1skz8n/screenshot_rtf.jpg) is a complex table generated by Aspose.Words for us. The other (http://s26.postimg.org/5v4o21p47/screenshot_converted.jpg) shows some content (converted from a Word document by Aspose.Words) displayed in our online editor.
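A round-trip claim like this can be spot-checked for DOCX files without either library, because a DOCX is a ZIP archive. The sketch below is not how Aspose.Words works internally. It is a hypothetical checker (`EmbeddedObjectRoundTripCheck` is an invented name) that assumes embedded objects are stored under `word/embeddings/`, which is where Word usually places them, and compares their checksums before and after the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Compares the embedded objects of a DOCX before and after a load/save round trip. */
final class EmbeddedObjectRoundTripCheck {

    // Assumption: embedded OLE objects live under this folder inside the DOCX archive.
    static final String EMBEDDINGS_PREFIX = "word/embeddings/";

    /** Maps each embedded-object entry name to the CRC-32 of its uncompressed content. */
    static Map<String, Long> embeddedObjectChecksums(byte[] docxBytes) throws IOException {
        Map<String, Long> checksums = new TreeMap<>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(docxBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().startsWith(EMBEDDINGS_PREFIX)) {
                    continue; // not an embedded object
                }
                CRC32 crc = new CRC32();
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    crc.update(buffer, 0, read);
                }
                checksums.put(entry.getName(), crc.getValue());
            }
        }
        return checksums;
    }

    /** True when both files contain the same embedded entries with identical content. */
    static boolean embeddedObjectsPreserved(byte[] original, byte[] roundTripped)
            throws IOException {
        return embeddedObjectChecksums(original).equals(embeddedObjectChecksums(roundTripped));
    }
}
```

Because the comparison is keyed by entry name, a library that keeps the content but renames the parts on save would be reported as a mismatch. Comparing only the checksum values would relax that.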

Solution 3

So the best way to evaluate both frameworks is to test them against the specific problem you mentioned ("can't create borders for table"; by the way, I believe this is fixed in Aspose.Slides. See: http://www.aspose.com/community/forums/thread/320218/borders-are-not-shown-in-aspose.slides-2.6.0.aspx).

Simply get a demo license for Aspose.Slides for Java, download the latest version and implement the solution to your problem with it. In your case this would only take a few lines.

After that, do the same with POI (or Tika, as Gagravarr mentioned). Then you at least know whether Aspose.Slides for Java can handle your problem, and you can decide whether to spend the money on Aspose or use the POI library, which is free.

We have worked with Aspose.Slides for Java for 3 years now. There were many bugs, but they were all fixed after we reported them in the forum. We also did all those PowerPoint things with POI before we bought Aspose. I would say both frameworks are almost equivalent in functionality, stability and reliability.

The only big disadvantage of Aspose is that you have to code everything twice: once for the old PowerPoint format (PPT 97-2003) and once for the new PPTX format. That can really get on your nerves when you have to support all formats.

Author: anshul

I am Lead Engineer at Zebra technologies. Currently working on Android.

Updated on June 09, 2022

Comments

  • anshul (about 2 years ago)

    Hi, I am creating an app that can read files such as pdf/doc/docx/xls/ppt and display them to the user. I have read that if a doc contains images and a table, Apache POI can't help because it can't create borders for the table. Going with Aspose is not a problem, but I need a strong reason to use Aspose instead of Apache POI, which is open source.

    Can anyone suggest which one I should go with? And what are the limitations of Apache POI and Aspose?