问题
I recently started using Google's vision API. I am trying to annotate a batch of images and therefore issued the 'batch image annotation offline' guide from their documentation.
However, it is not clear to me how I can annotate MULTIPLE images from one API call. So let's say I have stored 10 images in my google cloud bucket. How can I annotate all these images at once and store them in one JSON file? Right now, I wrote a program that calls their example function and it works, but to put it simple, why can't I say: 'Look in this folder and annotate all images in it.'?
Thanks in advance.
from batch_image_labeling import sample_async_batch_annotate_images
counter = 0
for file in os.listdir('my_directory'):
filename = file
sample_async_batch_annotate_images('gs://my_bucket/{}'.format(filename), 'gs://my_bucket/{}'.format(counter))
counter += 1
from google.cloud import vision_v1
from google.cloud.vision_v1 import enums
import six
def sample_async_batch_annotate_images(input_image_uri, output_uri):
"""Perform async batch image annotation"""
client = vision_v1.ImageAnnotatorClient()
if isinstance(input_image_uri, six.binary_type):
input_image_uri = input_image_uri.decode('utf-8')
if isinstance(output_uri, six.binary_type):
output_uri = output_uri.decode('utf-8')
source = {'image_uri': input_image_uri}
image = {'source': source}
type_ = enums.Feature.Type.LABEL_DETECTION
features_element = {'type': type_}
type_2 = enums.Feature.Type.IMAGE_PROPERTIES
features_element_2 = {'type': type_2}
features = [features_element, features_element_2]
requests_element = {'image': image, 'features': features}
requests = [requests_element]
gcs_destination = {'uri': output_uri}
# The max number of responses to output in each JSON file
batch_size = 2
output_config = {'gcs_destination': gcs_destination, 'batch_size': batch_size}
operation = client.async_batch_annotate_images(requests, output_config)
print('Waiting for operation to complete...')
response = operation.result()
# The output is written to GCS with the provided output_uri as prefix
gcs_output_uri = response.output_config.gcs_destination.uri
print('Output written to GCS with prefix: {}'.format(gcs_output_uri))
回答1:
It's somewhat unclear from that example, but your call to async_batch_annotate_images
takes a requests
parameter which is a list of multiple requests. So you can do something like this:
rom google.cloud import vision_v1
from google.cloud.vision_v1 import enums
import six
def generate_request(input_image_uri):
if isinstance(input_image_uri, six.binary_type):
input_image_uri = input_image_uri.decode('utf-8')
if isinstance(output_uri, six.binary_type):
output_uri = output_uri.decode('utf-8')
source = {'image_uri': input_image_uri}
image = {'source': source}
type_ = enums.Feature.Type.LABEL_DETECTION
features_element = {'type': type_}
type_2 = enums.Feature.Type.IMAGE_PROPERTIES
features_element_2 = {'type': type_2}
features = [features_element, features_element_2]
requests_element = {'image': image, 'features': features}
return requests_element
def sample_async_batch_annotate_images(input_uri, output_uri):
"""Perform async batch image annotation"""
client = vision_v1.ImageAnnotatorClient()
requests = [
generate_request(input_uri.format(filename))
for filename in os.listdir('my_directory')
]
gcs_destination = {'uri': output_uri}
# The max number of responses to output in each JSON file
batch_size = 1
output_config = {'gcs_destination': gcs_destination, 'batch_size': batch_size}
operation = client.async_batch_annotate_images(requests, output_config)
print('Waiting for operation to complete...')
response = operation.result()
# The output is written to GCS with the provided output_uri as prefix
gcs_output_uri = response.output_config.gcs_destination.uri
print('Output written to GCS with prefix: {}'.format(gcs_output_uri))
sample_async_batch_annotate_images('gs://my_bucket/{}', 'gs://my_bucket/results')
This can annotate up to 2,000 images in a single request. The only downside is that you can only specify a single output_uri
as a destination, so you won't be able to use counter
to put each result in a separate file, but you can set batch_size = 1
to ensure each response is written separately if this is what you want.
来源:https://stackoverflow.com/questions/58732768/how-to-annotate-multiple-images-from-a-single-call-using-googles-vision-api-py