Hi!
I'd like to know if there's any class or library that can open office documents and read them.
With the conventional method (creating a StreamReader and so on), strange characters are sometimes read, for example if the document contains tables or images.
What I want to do is, for example, open an existing file (.doc, .xls, .pub, .ppt, etc., even .pdf), read the whole document,
and then count how many times each word appears.
Is this possible? If so, how?
I found a library called iTextSharp, which lets me create documents, but I'm not sure whether it can read them, and if it can, I don't know how, because there's no documentation.
Thank you very much, I hope there's someone who can help me,
Susana

How to read any office document
Pat Brenner MSFT
Hi,
I don't think it's possible to do this the way you plan. The reason is that each document format follows a different specification for saving data and text. Unlike plain text files, *.doc, *.xls, *.ppt, etc. use specific formats that are not directly readable through a text-based file stream. You need to use libraries that know how to read and interpret those formats. Hence you will have to find a specific library for each of the file formats.
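To illustrate the idea, here is a minimal sketch in Python (not C#, and not tied to any real Office or PDF library). It assumes you keep one text-extraction function per file extension and run the same word count on whatever text comes back. The names EXTRACTORS, extract_text and count_words are made up for this example, and only plain text is handled; each real format would need its own function built on a library that understands that format.

```python
import re
from collections import Counter
from pathlib import Path


def extract_plain_text(path):
    # Plain text needs no special library; undecodable bytes are replaced.
    return Path(path).read_text(encoding="utf-8", errors="replace")


# Hypothetical registry: file extension -> function that returns the text.
# A .doc, .xls or .pdf entry would wrap a format-specific library.
EXTRACTORS = {
    ".txt": extract_plain_text,
}


def extract_text(path):
    # Pick the extractor by extension and fail loudly for unknown formats
    # instead of reading raw bytes as if they were text.
    ext = Path(path).suffix.lower()
    if ext not in EXTRACTORS:
        raise ValueError(f"no extractor registered for {ext!r} files")
    return EXTRACTORS[ext](path)


def count_words(text):
    # Lower-case the text and count every run of word characters.
    return Counter(re.findall(r"\w+", text.lower()))
```

With this layout the counting code never changes; supporting a new format only means registering one more extractor, e.g. count_words(extract_text("notes.txt")).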
I hope this gives you some idea to solve your problem.
Regards,
Saurabh Nandu
www.MasterCSharp.com
www.AksTech.com