Tuesday, September 25, 2007

PCLinuxOS - Using antiword to Read and Process MS Word documents

Most Linux (and PCLinuxOS) users have to read MS Word documents at some point. They use Abiword or OOo Writer for that purpose, but get annoyed by OpenOffice's sluggish behavior and Abiword's poor rendering of the document. I browsed through many Linux forums and got suggestions to disable Java loading or enable ooquickstart. I applied those tips to my PCLinuxOS. There was some performance boost, but it was not satisfactory. Then a wise man replied to my forum post: "Hey, why don't you try Antiword." And I got some command line freedom to view and process those *.doc files. It's small, fast, easy and simple.

You can easily get the rpm of Antiword. Just google "antiword rpm". Whatever version of Antiword I tried, it never asked for dependencies. But I would suggest installing the latest stable version in order to get maximum compatibility with MS Word. If you don't use PCLinuxOS, no matter: you can get the source tgz package and install it.

After installation I issued "man antiword", but it did not return any readable help. Simply entering "antiword" in the console, however, gave me some usable help.



To read a plain MS Word document (one with minimal formatting, tables, forms etc.) just issue "antiword file.doc". You can read the document quite well. For documents with complex formatting, Antiword offers numerous options to create a plain text file, pdf document, ps file or xml file straight from a doc file.

To convert a document into a text file, issue "antiword -t file.doc > file.txt"
To convert a document into a pdf file, issue "antiword -a letter file.doc > file.pdf"
To convert a document into a ps file, issue "antiword -p letter file.doc > file.ps"
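If you have a whole folder of doc files, the three commands above can be wrapped in a small loop. This is only a rough sketch: the helper names out_name and convert_doc are my own invention, and it assumes the files sit in the current directory and that "letter" is the paper size you want.

```shell
#!/bin/sh
# out_name (hypothetical helper): swap the .doc suffix for a new extension.
# $1 = input path, $2 = new extension (txt, pdf or ps)
out_name() {
    printf '%s.%s\n' "${1%.doc}" "$2"
}

# convert_doc (hypothetical helper): write text, PDF and PostScript
# versions of one .doc file, using the same options as the commands above.
convert_doc() {
    antiword -t "$1" > "$(out_name "$1" txt)"
    antiword -a letter "$1" > "$(out_name "$1" pdf)"
    antiword -p letter "$1" > "$(out_name "$1" ps)"
}

# Only run the loop when antiword is actually installed.
if command -v antiword >/dev/null 2>&1; then
    for f in ./*.doc; do
        [ -e "$f" ] || continue   # nothing matched the pattern, skip
        convert_doc "$f"
    done
fi
```

Each output file lands next to its source, so report.doc gives report.txt, report.pdf and report.ps.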

The conversions work fine, except for reproducing the images. Use the "-i" option followed by an image level (for example "-i 2") to control how images are reproduced in your output file. You can add the "-s" option to include the hidden text of the doc file.
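Here is a rough sketch of a wrapper around the image option. The function names valid_level and doc_to_ps are hypothetical, and the accepted range of 0 to 3 for the image level is my assumption; check the usage text printed by "antiword" on your version before relying on it.

```shell
#!/bin/sh
# valid_level: succeed only for image levels 0-3
# (assumed range, verify against the antiword usage text).
valid_level() {
    case "$1" in
        0|1|2|3) return 0 ;;
        *) return 1 ;;
    esac
}

# doc_to_ps (hypothetical wrapper): $1 = .doc file, $2 = image level.
# Falls back to level 2 when no level is given.
doc_to_ps() {
    level="${2:-2}"
    if ! valid_level "$level"; then
        echo "bad image level: $level" >&2
        return 1
    fi
    antiword -i "$level" -p letter "$1" > "${1%.doc}.ps"
}
```

With that in place, "doc_to_ps file.doc 2" writes file.ps, while a level outside the assumed range is rejected before antiword is called.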

4 comments:

robin said...

good blog. plz post something on pico, the good old text editor.

mirza said...

informative

Dhiren Gala said...

Hey your blog looks great
i don't have much idea about linux but then to i liked the style of your blog

Anonymous said...

how to read complex documents having pictures and illustrations on antiword?

How about this