Making IPTC data searchable

别那么骄傲 2021-01-17 05:43

I have a question about IPTC metadata. Is it possible to search images that aren't in a database by their IPTC metadata (keywords) and display the matches, and how would I go about doing it?

2 answers
  • 2021-01-17 06:42

    It is not clear what in particular is giving you problems, but perhaps this will give you some ideas:

    <?php
    # Images we're searching
    $images = array('/path/to/image.jpg', 'another-image.jpg');
    
    # IPTC field names mapped to the values to search for (names from exiv2, see below)
    $query = array('Byline' => 'Some Author');
    
    # Perform the search
    $result = select_jpgs_by_iptc_fields($images, $query);
    
    # Display the results
    foreach ($result as $path) {
        echo '<img src="', htmlspecialchars($path), '">';
    }
    
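    function select_jpgs_by_iptc_fields($jpgs, $query) {
        $matches = array();
        foreach ($jpgs as $path) {
            $iptc = get_jpg_iptc_metadata($path);
            # Skip files that carry no IPTC data at all
            if (!is_array($iptc))
                continue;
            foreach ($query as $name => $values) {
                if (!is_array($values))
                    $values = array($values);
                # Every requested value must be present in the field;
                # array_diff() leaves the requested values that are missing
                if (!isset($iptc[$name]) || count(array_diff($values, $iptc[$name])) > 0)
                    continue 2;
            }
            $matches[] = $path;
        }
        return $matches;
    }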
    
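    function get_jpg_iptc_metadata($path) {
        # getimagesize() fills $info with the APPn segments it finds;
        # IPTC data lives in APP13
        $info = array();
        $size = @getimagesize($path, $info);
        if ($size !== false && isset($info['APP13'])) {
            # iptcparse() returns false when the block cannot be parsed
            $iptc = iptcparse($info['APP13']);
            if (is_array($iptc))
                return human_readable_iptc($iptc);
        }
        # Unreadable file or no IPTC block: report no metadata
        return array();
    }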
    
    function human_readable_iptc($iptc) {
    # From the exiv2 sources
    static $iptc_codes_to_names =
    array(    
    // IPTC.Envelope-->
    "1#000" => 'ModelVersion',
    "1#005" => 'Destination',
    "1#020" => 'FileFormat',
    "1#022" => 'FileVersion',
    "1#030" => 'ServiceId',
    "1#040" => 'EnvelopeNumber',
    "1#050" => 'ProductId',
    "1#060" => 'EnvelopePriority',
    "1#070" => 'DateSent',
    "1#080" => 'TimeSent',
    "1#090" => 'CharacterSet',
    "1#100" => 'UNO',
    "1#120" => 'ARMId',
    "1#122" => 'ARMVersion',
    // <-- IPTC.Envelope
    // IPTC.Application2 -->
    "2#000" => 'RecordVersion',
    "2#003" => 'ObjectType',
    "2#004" => 'ObjectAttribute',
    "2#005" => 'ObjectName',
    "2#007" => 'EditStatus',
    "2#008" => 'EditorialUpdate',
    "2#010" => 'Urgency',
    "2#012" => 'Subject',
    "2#015" => 'Category',
    "2#020" => 'SuppCategory',
    "2#022" => 'FixtureId',
    "2#025" => 'Keywords',
    "2#026" => 'LocationCode',
    "2#027" => 'LocationName',
    "2#030" => 'ReleaseDate',
    "2#035" => 'ReleaseTime',
    "2#037" => 'ExpirationDate',
    "2#038" => 'ExpirationTime',
    "2#040" => 'SpecialInstructions',
    "2#042" => 'ActionAdvised',
    "2#045" => 'ReferenceService',
    "2#047" => 'ReferenceDate',
    "2#050" => 'ReferenceNumber',
    "2#055" => 'DateCreated',
    "2#060" => 'TimeCreated',
    "2#062" => 'DigitizationDate',
    "2#063" => 'DigitizationTime',
    "2#065" => 'Program',
    "2#070" => 'ProgramVersion',
    "2#075" => 'ObjectCycle',
    "2#080" => 'Byline',
    "2#085" => 'BylineTitle',
    "2#090" => 'City',
    "2#092" => 'SubLocation',
    "2#095" => 'ProvinceState',
    "2#100" => 'CountryCode',
    "2#101" => 'CountryName',
    "2#103" => 'TransmissionReference',
    "2#105" => 'Headline',
    "2#110" => 'Credit',
    "2#115" => 'Source',
    "2#116" => 'Copyright',
    "2#118" => 'Contact',
    "2#120" => 'Caption',
    "2#122" => 'Writer',
    "2#125" => 'RasterizedCaption',
    "2#130" => 'ImageType',
    "2#131" => 'ImageOrientation',
    "2#135" => 'Language',
    "2#150" => 'AudioType',
    "2#151" => 'AudioRate',
    "2#152" => 'AudioResolution',
    "2#153" => 'AudioDuration',
    "2#154" => 'AudioOutcue',
    "2#200" => 'PreviewFormat',
    "2#201" => 'PreviewVersion',
    "2#202" => 'Preview',
    // <--IPTC.Application2
          );
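       $human_readable = array();
       foreach ($iptc as $code => $field_value) {
           # Keep the raw code as the key when it is not in the table above
           $name = isset($iptc_codes_to_names[$code]) ? $iptc_codes_to_names[$code] : $code;
           $human_readable[$name] = $field_value;
       }
       return $human_readable;
    }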
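
    Since the question asks about keywords specifically, here is a minimal sketch of a keyword check that works on the array shape human_readable_iptc() returns (field name => list of values). The helper name iptc_has_keywords and the sample metadata are made up for illustration, and the case-insensitive comparison is an assumption about what you want; strtolower() only folds ASCII letters.

```php
<?php
# Hypothetical helper: true when every wanted keyword appears in the
# image's Keywords field, ignoring case and surrounding whitespace.
# $iptc is assumed to be: field name => list of string values.
function iptc_has_keywords(array $iptc, array $wanted) {
    if (!isset($iptc['Keywords']) || !is_array($iptc['Keywords']))
        return false;
    $normalize = function ($s) { return strtolower(trim($s)); };
    $have = array_map($normalize, $iptc['Keywords']);
    $want = array_map($normalize, $wanted);
    # array_diff() leaves the wanted keywords that the image lacks
    return count(array_diff($want, $have)) === 0;
}

# Made-up metadata for illustration
$sample = array(
    'Keywords' => array('Beach', 'sunset '),
    'Byline'   => array('Some Author'),
);
var_dump(iptc_has_keywords($sample, array('beach')));            # bool(true)
var_dump(iptc_has_keywords($sample, array('beach', 'forest')));  # bool(false)
```

    With something like this, the search loop could call iptc_has_keywords(get_jpg_iptc_metadata($path), $keywords) for each path instead of comparing values exactly.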
    
  • 2021-01-17 06:44

    If you haven't already extracted the IPTC data from your images, then every time someone runs a search you'll have to:

    • loop over every image
    • for each image, extract the IPTC data
    • see if the IPTC data for the current image matches

    If you have more than a handful of images, this will be really bad for performance, I'd say.


    So, in my opinion, it would be far better to:

    • add a couple of fields in your database
    • extract the relevant IPTC data when the image is uploaded / stored
    • store the IPTC data in those DB fields
    • search in those DB fields
      • Or use some search engine like Lucene or Sphinx -- but that is another problem.
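
    The steps above could be sketched roughly like this. Everything here is an assumption made for illustration: the image_keywords table, the function names, and the use of an in-memory SQLite database through PDO (INSERT OR IGNORE is SQLite syntax; other databases spell it differently). Keywords are read from IPTC dataset 2#025.

```php
<?php
# Assumed schema: one row per (keyword, image path) pair. Putting the
# keyword first in the primary key lets lookups by keyword use the index.
function create_keyword_index(PDO $db) {
    $db->exec('CREATE TABLE IF NOT EXISTS image_keywords (
        keyword TEXT NOT NULL,
        path TEXT NOT NULL,
        PRIMARY KEY (keyword, path)
    )');
}

# Read the IPTC keywords (dataset 2#025) straight from the file.
function read_iptc_keywords($path) {
    $info = array();
    if (@getimagesize($path, $info) === false || !isset($info['APP13']))
        return array();
    $iptc = iptcparse($info['APP13']);
    return (is_array($iptc) && isset($iptc['2#025'])) ? $iptc['2#025'] : array();
}

# Call once when an image is uploaded or stored.
function index_image(PDO $db, $path, array $keywords) {
    $insert = $db->prepare(
        'INSERT OR IGNORE INTO image_keywords (keyword, path) VALUES (?, ?)');
    foreach ($keywords as $keyword)
        $insert->execute(array(strtolower(trim($keyword)), $path));
}

# A search is then one query instead of a scan over every file.
function find_images_by_keyword(PDO $db, $keyword) {
    $select = $db->prepare(
        'SELECT path FROM image_keywords WHERE keyword = ? ORDER BY path');
    $select->execute(array(strtolower(trim($keyword))));
    return $select->fetchAll(PDO::FETCH_COLUMN);
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
create_keyword_index($db);
# In real use: index_image($db, $path, read_iptc_keywords($path));
index_image($db, 'a.jpg', array('Beach', 'Sunset'));
index_image($db, 'b.jpg', array('beach'));
print_r(find_images_by_keyword($db, 'BEACH'));  # lists a.jpg and b.jpg
```

    The extraction cost is paid once per image at upload time, and each search touches only the database.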

    It'll mean a bit more work for you right now: you have more code to write...

    ... But it also means your website will have a better chance of surviving once there are many images and many users running searches.
