You could always use OIVT (Outside In Viewer Technology, I think), now owned by Oracle.
I'll be honest, it's not a cheap solution. The product is mainly meant to let you view, print, etc., but if I remember correctly they also offer an option to extract the content to text, or they have another product that does that. It can do this for pretty much any document type, including doc, docx, and pdf (just to name a few), without needing the "original" application installed, since they have their own set of filters.
Here's a link to get you started:
Outside In Viewer Technology
Good luck