Is it possible to read and write Word (2003 and 2007) files in PHP without using a COM object? I know that I can:
$file = fopen(\'c:\\file.doc\', \'w+\');
fw
I don't know what you are going to use it for, but I needed .doc support for search indexing; What I did was use a little commandline tool called "catdoc"; This transfers the contents of the Word document to plain text so it can be indexed. If you need to keep formatting and stuff this is not your tool.
phpLiveDocx is a Zend Framework component and can read and write DOC and DOCX files in PHP on Linux, Windows and Mac.
See the project web site at:
http://www.phplivedocx.org
Most probably you won't be able to read Word documents without COM.
Writing was covered in this topic
Reading binary Word documents would involve creating a parser according to the published file format specifications for the DOC format. I think this is no real feasible solution.
You could use the Microsoft Office XML formats for reading and writing Word files - this is compatible with the 2003 and 2007 version of Word. For reading you have to ensure that the Word documents are saved in the correct format (it's called Word 2003 XML-Document in Word 2007). For writing you just have to follow the openly available XML schema. I've never used this format for writing out Office documents from PHP, but I'm using it for reading in an Excel worksheet (naturally saved as XML-Spreadsheet 2003) and displaying its data on a web page. As the files are plainly XML data it's no problem to navigate within and figure out how to extract the data you need.
The other option - a Word 2007 only option (if the OpenXML file formats are not installed in your Word 2003) - would be to ressort to OpenXML. As databyss pointed out here the DOCX file format is just a ZIP archive with XML files included. There are a lot of resources on MSDN regarding the OpenXML file format, so you should be able to figure out how to read the data you want. Writing will be much more complicated I think - it just depends on how much time you'll invest.
Perhaps you can have a look at PHPExcel which is a library able to write to Excel 2007 files and read from Excel 2007 files using the OpenXML standard. You could get an idea of the work involved when trying to read and write OpenXML Word documents.
Would the .rtf format work for your purposes? .rtf can easily be converted to and from .doc format, but it is written in plaintext (with control commands embedded). This is how I plan to integrate my application with Word documents.
www.phplivedocx.org is a SOAP based service that means that you always need to be online for testing the Files also does not have enough examples for its use . Strangely I found only after 2 days of downloading (requires additionaly zend framework too) that its a SOAP based program(cursed me !!!)...I think without COM its just not possible on a Linux server and the only idea is to change the doc file in another usable file which PHP can parse...