Is it possible to extract images from MS Word documents using PHP? And if so, how?
Requirement: Definitely old-school .doc support, but preferably both old and new.
If you are extracting images from older files, you have a couple of options:
1. Run a converter to update all files to DocX, then use IntermediateHacker's code.
2. Find the VBA code necessary to extract the images, then either create a macro and call it via PHP's COM interface functions, or call the same code yourself via those functions.
The first thing to do, though, is to find out how to do it in VBA; that will make it much easier to do in PHP.
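As a rough sketch of the first option, something like the following could re-save an old .doc as .docx through PHP's COM interface. It assumes a Windows server with the COM extension enabled and Word installed; `docxTargetPath()` and `convertDocToDocx()` are hypothetical helper names, and 12 is the value of Word's `wdFormatXMLDocument` constant. I have not run it against every Word version, so treat it as a starting point.

```php
<?php
// Hypothetical helper: derive the .docx target path for a given .doc path.
function docxTargetPath(string $docPath): string
{
    return preg_replace('/\.doc$/i', '', $docPath) . '.docx';
}

// Hypothetical helper: ask Word (via COM) to re-save a .doc file as .docx.
// Assumption: Windows, the COM extension loaded, and Word installed.
function convertDocToDocx(string $docPath): string
{
    if (!class_exists('COM')) {
        throw new RuntimeException('The COM extension is not available.');
    }
    // Word expects absolute paths.
    $source = realpath($docPath);
    if ($source === false) {
        throw new RuntimeException('File not found: ' . $docPath);
    }
    $target = docxTargetPath($source);

    $word = new COM('Word.Application');
    $word->Visible = false;
    try {
        $doc = $word->Documents->Open($source);
        $doc->SaveAs($target, 12); // 12 = wdFormatXMLDocument (.docx)
        $doc->Close(false);        // close without saving changes to the original
    } finally {
        $word->Quit();
    }
    return $target;
}

// The path helper works anywhere; the conversion itself needs Word.
echo docxTargetPath('report.doc'), PHP_EOL;
```

Once the file is a .docx, the ZIP-based code from the other answers applies unchanged.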
Create a new PHP file named extract.php and add the following code to it.
<?php
/* Name of the document file */
$document = 'attractive_prices.docx';

/* Function to extract images */
function readZippedImages($filename) {
    /* Create a new ZIP archive object */
    $zip = new ZipArchive;
    /* Open the received archive file */
    if (true === $zip->open($filename)) {
        /* Loop through all the files to check for image files */
        for ($i = 0; $i < $zip->numFiles; $i++) {
            $zip_element = $zip->statIndex($i);
            /* Check for images by file extension (case-insensitive) */
            if (preg_match('/\.(jpe?g|png|gif|bmp)$/i', $zip_element['name'])) {
                /* Display images if present by using display.php */
                $src = 'display.php?filename=' . urlencode($filename) . '&index=' . $i;
                echo '<img src="' . htmlspecialchars($src) . '" /><hr />';
            }
        }
        $zip->close();
    }
}

readZippedImages($document);
?>
Now create another PHP file named display.php and add the following code to it.
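<?php
/* Only open files that sit next to this script, and treat the index as an integer */
$filename = isset($_GET['filename']) ? basename($_GET['filename']) : '';
$index = isset($_GET['index']) ? (int) $_GET['index'] : -1;
/* Content types for the image extensions that extract.php links to */
$types = array('jpg' => 'image/jpeg', 'jpeg' => 'image/jpeg', 'png' => 'image/png', 'gif' => 'image/gif', 'bmp' => 'image/bmp');
/* Create a new ZIP archive object */
$zip = new ZipArchive;
/* Open the received archive file */
if ($filename !== '' && $index >= 0 && true === $zip->open($filename)) {
    /* Look up the entry name so the browser is told the right image type */
    $extension = strtolower(pathinfo((string) $zip->getNameIndex($index), PATHINFO_EXTENSION));
    if (isset($types[$extension])) {
        header('Content-Type: ' . $types[$extension]);
        /* Get the content of the specified index of ZIP archive */
        echo $zip->getFromIndex($index);
    }
    /* Close the archive only if it was opened */
    $zip->close();
}
?>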
Source(s): Extracting Images from DocX using PHP
If you are using the newer docx format, this is easy, because a .docx file is no more than a ZIP archive. See the following link:
http://www.botskool.com/geeks/how-extract-images-docx-files-using-php
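To illustrate the "it's just a zip" point: Word keeps embedded pictures under the word/media/ folder inside the archive, so you can list them with ZipArchive alone. A minimal sketch follows; `listDocxMedia()` is a hypothetical helper name, and the demo archive is a stand-in built on the fly rather than a real document.

```php
<?php
// Hypothetical helper: return the entries stored under word/media/,
// keyed by their index in the archive.
function listDocxMedia(string $docxPath): array
{
    $zip = new ZipArchive;
    if ($zip->open($docxPath) !== true) {
        return [];
    }
    $media = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        // Keep files inside word/media/, skipping directory entries.
        if (strpos($name, 'word/media/') === 0 && substr($name, -1) !== '/') {
            $media[$i] = $name;
        }
    }
    $zip->close();
    return $media;
}

// Build a tiny stand-in archive so the sketch runs without a real .docx.
$demo = tempnam(sys_get_temp_dir(), 'docx');
$zip = new ZipArchive;
$zip->open($demo, ZipArchive::CREATE | ZipArchive::OVERWRITE);
$zip->addFromString('word/document.xml', '<document/>');
$zip->addFromString('word/media/image1.png', 'not a real image');
$zip->addFromString('docProps/app.xml', '<properties/>');
$zip->close();

print_r(listDocxMedia($demo));
```

Filtering on the folder rather than the extension also picks up formats such as .emf or .wmf, which Word sometimes uses for embedded graphics.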
Hope this helps. You can also adapt it to your needs.
<?php
/**
* Created by PhpStorm.
* User: khalid
* Date: 04/26/2015
* Time: 10:32 AM
*/
class DocxImages {
    private $file;
    private $indexes = [ ];
    /** Local directory name where images will be saved */
    private $savepath = 'docimages';

    public function __construct( $filePath ) {
        $this->file = $filePath;
        $this->extractImages();
    }

    /** Record the ZIP index of every image entry, keyed by its file name */
    function extractImages() {
        $ZipArchive = new ZipArchive;
        if ( true === $ZipArchive->open( $this->file ) ) {
            for ( $i = 0; $i < $ZipArchive->numFiles; $i ++ ) {
                $zip_element = $ZipArchive->statIndex( $i );
                if ( preg_match( '/\.(jpe?g|png|gif|bmp)$/i', $zip_element['name'] ) ) {
                    $imagename = explode( '/', $zip_element['name'] );
                    $imagename = end( $imagename );
                    $this->indexes[ $imagename ] = $i;
                }
            }
            $ZipArchive->close();
        }
    }

    /** Write every image found in the document into the save directory */
    function saveAllImages() {
        if ( count( $this->indexes ) == 0 ) {
            echo 'No images found';
            return;
        }
        $dir = dirname( __FILE__ ) . '/' . $this->savepath;
        /* Create the target directory if it does not exist yet */
        if ( ! is_dir( $dir ) && ! mkdir( $dir, 0755, true ) ) {
            return;
        }
        $zip = new ZipArchive;
        /* Open the archive once rather than once per image */
        if ( true === $zip->open( $this->file ) ) {
            foreach ( $this->indexes as $key => $index ) {
                file_put_contents( $dir . '/' . $key, $zip->getFromIndex( $index ) );
            }
            $zip->close();
        }
    }

    /** Save the images, then print an <img> tag for each one */
    function displayImages() {
        $this->saveAllImages();
        if ( count( $this->indexes ) == 0 ) {
            return;
        }
        $images = '';
        foreach ( $this->indexes as $key => $index ) {
            $path = 'http://' . $_SERVER['HTTP_HOST'] . '/' . $this->savepath . '/' . rawurlencode( $key );
            $images .= '<img src="' . htmlspecialchars( $path ) . '" alt="' . htmlspecialchars( $key ) . '"/> <br>';
        }
        echo $images;
    }
}
$DocxImages = new DocxImages( "doc.docx" );
/** It will save and display images*/
$DocxImages->displayImages();
/** It will only save images to local server */
#$DocxImages->saveAllImages();
?>