Fastest AWS DB service for large media nested metadata querying

Question

I'm trying to determine which database service is best suited, mainly in terms of query speed, for querying metadata of media content (images, video, audio) stored on AWS S3. I'm currently looking at DynamoDB and Redshift, but there may be better alternatives I haven't considered.

Example use case:

I have millions of images (and cropped sections of images) run through a web of machine learning models: full-image classification, bounding-box object detection, and pixel segmentation (RLE pixel-labeled). These models predict nested labels and assign attributes/scores, and the nested structure is continually evolving. For example, an image may be given the tag "outside" by a full-image classifier, then sent to an object detector that detects the bounding-box locations of multiple "person" tags with x/y/width/height coordinates; these crops may then be sent to a further full (small) image classifier that labels each predicted person crop as "sitting" or "standing". I'd like to be able to quickly query the nested metadata to get the image IDs of all images with particular combinations of labels.
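To make the nesting concrete, one image's metadata might look roughly like the sketch below. The field names (`image_id`, `s3_uri`, `task`, `children`, and so on), the bucket name, and the scores are all assumptions made up for illustration, not an existing schema.

```python
# Hypothetical record layout; every field name and value is illustrative only.
record = {
    "image_id": "img-000001",
    "s3_uri": "s3://example-bucket/images/img-000001.jpg",
    "predictions": [
        # Whole-image classification result.
        {"task": "classification", "label": "outside", "score": 0.97, "children": []},
        # Object detections; each crop carries its own nested classification.
        {"task": "detection", "label": "person", "score": 0.91,
         "bbox": {"x": 34, "y": 50, "width": 120, "height": 260},
         "children": [
             {"task": "classification", "label": "sitting", "score": 0.88, "children": []},
         ]},
        {"task": "detection", "label": "person", "score": 0.84,
         "bbox": {"x": 210, "y": 40, "width": 110, "height": 300},
         "children": [
             {"task": "classification", "label": "standing", "score": 0.79, "children": []},
         ]},
    ],
}


def iter_labels(predictions, depth=0):
    """Yield (depth, label) for every prediction, walking children depth-first."""
    for prediction in predictions:
        yield depth, prediction["label"]
        yield from iter_labels(prediction.get("children", []), depth + 1)


labels = list(iter_labels(record["predictions"]))
```

Because the structure keeps evolving, the sketch leaves the nesting depth open: any prediction can carry further `children`.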

Specific query example:

What are the S3 locations of all images tagged with the whole-image classification label "outside", with two or more instances of the object detection label "person", and where at least one person object has been further classified as "sitting"?
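Expressed as a plain in-memory filter, the query above could be sketched as follows. This only pins down the intended semantics; it is not a proposal for how a database should run it. The record layout, field names, and S3 URIs are hypothetical.

```python
# Hypothetical records; field names and URIs are assumptions for illustration.
def person(pose):
    """Build a 'person' detection whose crop was classified with the given pose."""
    return {"task": "detection", "label": "person",
            "children": [{"task": "classification", "label": pose, "children": []}]}


OUTSIDE = {"task": "classification", "label": "outside", "children": []}

records = [
    {"s3_uri": "s3://example-bucket/a.jpg",
     "predictions": [OUTSIDE, person("sitting"), person("standing")]},
    {"s3_uri": "s3://example-bucket/b.jpg",   # only one person
     "predictions": [OUTSIDE, person("sitting")]},
    {"s3_uri": "s3://example-bucket/c.jpg",   # two people, nobody sitting
     "predictions": [OUTSIDE, person("standing"), person("standing")]},
]


def matches(record):
    """True if the image is 'outside', has >= 2 persons, and >= 1 is sitting."""
    predictions = record["predictions"]
    is_outside = any(p["task"] == "classification" and p["label"] == "outside"
                     for p in predictions)
    persons = [p for p in predictions
               if p["task"] == "detection" and p["label"] == "person"]
    any_sitting = any(child["label"] == "sitting"
                      for p in persons for child in p["children"])
    return is_outside and len(persons) >= 2 and any_sitting


matching_uris = [r["s3_uri"] for r in records if matches(r)]
```

The three conditions (a top-level label, a count over detections, and an existence check one level deeper) are the kind of combination the chosen service would need to evaluate quickly across millions of records.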

I've been browsing this AWS DB offering page and am not sure what is best suited to this task. Of course, if there's a far superior non-AWS/S3 solution, I'd certainly like to know that. Any suggestions are greatly appreciated!

Edit: Updated the example slightly to describe the nesting structure more clearly.

Source: https://stackoverflow.com/questions/56844046/fastest-aws-db-service-for-large-media-nested-metadata-querying

Tags

database

amazon-web-services

amazon-s3

nested-queries

imagedata