How to Filter by Query on Nested Fields in AWS AppSync

问题

Problem and Expected Results

I'm using a proof of concept schema and DynamoDB Table setup to filter on nested field values. I've followed the ideas very generally here as well as the documentation for $utils.transform.toDynamoDBFilterExpression (here).

The basic idea is this: using the same sort of principles, I'd like to filter by any arbitrarily deep nested field (short of the 32 document path length limit in DynamoDB). The relevant setup looks like this:

AppSync schema (apologies for the naming conventions; was supposed to be a quick and dirty PoC):

query {
    listActiveListingsBySubAndFilter(
        filter: TableTestMasterDataTable_ImportV1FilterInput!,
        limit: Int,
        nextToken: String
    ): TestMasterDataTable_ImportV1Connection
}

input TableBooleanFilterInput {
    ne: Boolean
    eq: Boolean
}

input TableDataObjectFilterInput {
    beds: TableFloatFilterInput
    baths: TableFloatFilterInput
}

input TableFloatFilterInput {
    ne: Float
    eq: Float
    le: Float
    lt: Float
    ge: Float
    gt: Float
    contains: Float
    notContains: Float
    between: [Float]
}

input TableIDFilterInput {
    ne: ID
    eq: ID
    le: ID
    lt: ID
    ge: ID
    gt: ID
    contains: ID
    notContains: ID
    between: [ID]
    beginsWith: ID
}

input TableIntFilterInput {
    ne: Int
    eq: Int
    le: Int
    lt: Int
    ge: Int
    gt: Int
    contains: Int
    notContains: Int
    between: [Int]
}

input TableStringFilterInput {
    ne: String
    eq: String
    le: String
    lt: String
    ge: String
    gt: String
    contains: String
    notContains: String
    between: [String]
    beginsWith: String
}

input TableTestMasterDataTable_ImportV1FilterInput {
    id: TableStringFilterInput
    status: TableStringFilterInput
    sub: TableStringFilterInput
    data: TableDataObjectFilterInput
}

type TestMasterDataTable_ImportV1 {
    id: String!
    status: String!
    sub: String!
    data: AWSJSON
}

type TestMasterDataTable_ImportV1Connection {
    items: [TestMasterDataTable_ImportV1]
    nextToken: String
}

input UpdateTestMasterDataTable_ImportV1Input {
    id: String!
    status: String
    sub: String!
    data: AWSJSON
}

VTL request and response resolvers:

## Request resolver

#set( $filter = $ctx.args.filter )
#set( $path = $filter.data )

{
    "version" : "2017-02-28",
    "operation" : "Query",
    "index" : "listings-index",  ## GSI on table with HASH: status, RANGE: sub
    "query" : {
        "expression": "#status = :status and #sub = :sub",
        "expressionNames" : {
            "#status" : "status",
            "#sub" : "sub"
        },
        "expressionValues" : {
            ":status" : $util.dynamodb.toDynamoDBJson("Active"),
            ":sub" : $util.dynamodb.toDynamoDBJson($filter.sub.eq)
        }
    },
    "filter" : $util.transform.toDynamoDBFilterExpression($path),
    "limit": $util.defaultIfNull($ctx.args.limit, 20),
    "nextToken": $util.toJson($util.defaultIfNullOrEmpty($ctx.args.nextToken, null))
}


## Response resolver

{
    "items": $util.toJson($ctx.result.items),
    "nextToken": $util.toJson($util.defaultIfNullOrBlank($context.result.nextToken, null))
}

Example DynamoDB Table element:

{
  "_meta": {
    "exposure": 0.08,
    "lastActive": 1557800000,
    "lastUpdated": 1557878400,
    "lastView": 1557878500,
    "numViews": 63,
    "posted": 1557878400
  },
  "buildingID": "325-5th-Ave,-New-York,-NY-10016,-USA",
  "data": {
    "agent": [
      {
        "agentID": "daeo@gmail.com"
      },
      {
        "agentID": "ben@gmail.com"
      }
    ],
    "amenities": [
      "hot tub",
      "time machine"
    ],
    "baths": 2,
    "beds": 2
  },
  "id": "325-5th-Ave,-New-York,-NY-10016,-USA#37C:1557878400",
  "status": "Active",
  "sub": "new-york/manhattan/listings",
  "unitNum": "37C",
  "unitRefID": "325-5th-Ave,-New-York,-NY-10016,-USA#37C"
}

Based on all of this, if I run the following query:

listActiveListingsBySubAndFilter(filter: {
    "sub" : {
      "eq" : "new-york/manhattan/listings"
    },
    "data": {
      "beds": {
        "eq": 2.0
      }
    }) {
    items {
      id
      status
    }
    nextToken
}

I would expect to get something like this in return:

{
  "data": {
    "listActiveListingsBySubAndFilter": {
      "items": [
          {
              "id": "325-5th-Ave,-New-York,-NY-10016,-USA#37C:1557878400",
              "status": "Active"
          }
      ],
      "nextToken": null
    }
  }
}

Note: this is the only expected return since there's only one item matching these requirements in the database at this time.

Actual Results

All of that said, the results I'm getting (or lack thereof) aren't making much sense. No matter the query (data.beds, data.baths), if the field is nested in data the return is the same:

{
  "data": {
    "listActiveListingsBySubAndFilter": {
      "items": [],
      "nextToken": null
    }
  }
}

I've verified the query is working as expected and the filter expression is formatted appropriately (it works on other non-nested fields like id). What's perplexing is that the filter just doesn't seem to get applied (or maybe is being applied in some non-intuitive way?). For reference, here's a snippet of a typical CloudWatch log for the above:

{
    "context": {
        "arguments": {
            "filter": {
                "sub": {
                    "eq": "new-york/manhattan/listings"
                },
                "data": {
                    "beds": {
                        "eq": 2
                    }
                }
            },
            "limit": 200
        },
        "stash": {},
        "outErrors": []
    },
    "fieldInError": false,
    "errors": [],
    "parentType": "Query",
    "graphQLAPIId": "q7ueubhsorehbjpr5e6ymj7uua",
    "transformedTemplate": "\n\n{\n    \"version\" : \"2017-02-28\",\n    \"operation\" : \"Query\",\n    \"index\" : \"listings-index\",\n    \"query\" : {\n        \"expression\": \"#status = :status and #sub = :sub\",\n        \"expressionNames\" : {\n        \t\"#status\" : \"status\",\n            \"#sub\" : \"sub\"\n    \t},\n        \"expressionValues\" : {\n            \":status\" : {\"S\":\"Active\"},\n            \":sub\" : {\"S\":\"new-york/manhattan/listings\"}\n        }\n    },\n    \"filter\" : {\"expression\":\"(#beds = :beds_eq)\",\"expressionNames\":{\"#beds\":\"beds\"},\"expressionValues\":{\":beds_eq\":{\"N\":2.0}}},\n    \"limit\": 200,\n    \"nextToken\": null\n}"
}

Notice the filter expressionValues value in transformedTemplate: { "N" : 2.0 } (sans $util.toDynamoDBJson formatting) and compare it to the value in the object in DynamoDB on that field.

I've tried everything, including changing the fields themselves to strings and doing various filter operations like eq and contains to see if this was some odd type inconsistency, but no luck.

As of now, I have two backup solutions that involve either "pulling up" all the relevant fields I might want to filter on (cluttering my records with attributes I'd rather keep nested) or creating a new nested type containing only high-level fields for filtering on -- i.e., effectively split the records into a record reference and a record filter reference. In this scenario, we'd get some "Listing" record that has as its data field value something like ListingFilterData -- e.g.:

type Listing {
    id: String!
    sub: String!
    status: String!
    data: ListingFilterData!
}

type ListingFilterData {
    beds: Float!
    baths: Float!
}

Both are doable, but I'd rather try to solve the current issue instead of adding a bunch of extra data to my table.

Any thoughts?

Update 9/17/19

After some more fiddling, I happened upon the solution implied here. Following from what I could glean about solution, I successfully implemented a hardcoded nested query filter using the following VTL request resolver (and changing the filter expression keyname to avoid a reserved word conflict on data):

#set( $filter = $ctx.args.filter )
#set( $path = $filter.filterData ) ## currently, unused

{
    "version" : "2017-02-28",
    "operation" : "Query",
    "index" : "listings-index",
    "query" : {
        "expression": "#status = :status and #sub = :sub",
        "expressionNames" : {
            "#status" : "status",
            "#sub" : "sub"
        },
        "expressionValues" : {
            ":status" : $util.dynamodb.toDynamoDBJson("Active"),
            ":sub" : $util.dynamodb.toDynamoDBJson($filter.sub.eq)
        }
    },
    "filter" : {
        "expression" : "#filterData.beds = :beds",
        "expressionValues" : {
            ":beds" : $util.dynamodb.toDynamoDBJson(2.0)
        }
    },
    "limit": $util.defaultIfNull($ctx.args.limit, 20),
    "nextToken": $util.toJson($util.defaultIfNullOrEmpty($ctx.args.nextToken, null))
}

This returns my expected result:

{
  "data": {
    "listActiveListingsBySubAndFilter": {
      "items": [
        {
          "id": "325-5th-Ave,-New-York,-NY-10016,-USA#37C:1557878400",
          "status": "Active"
        }
      ],
      "nextToken": null
    }
  }
}

Seems like progress, but any ideas on how to dynamically create the docpath and create an expression name for a nested attribute? Running more ideas and will report back if anything interesting shows up...

Update #2 from 9/17/19

After further playing around with the request resolver, I think I've found a quick and dirty way to dynamically grab the path and target vars for creating a filter expression for my nested attributes. Note: The whole thing still returns an empty result set and it assumes there's only one filter key (for now), but the reserved keyword bit seems to have been solved. Still wondering why the results aren't showing up as expected though.

#set( $filter = $ctx.args.filter )
#foreach( $parent in $filter.keySet() )
    #set( $path = $parent )
#end
#set( $target = $filter[$path] )
#foreach( $ff in $target.keySet() ) ## should only contain one Map key-value pair
    #set( $fp = $ff )
#end
#set( $fv = $target[$fp] )

{
    "version" : "2017-02-28",
    "operation" : "Query",
    "index" : "listings-index",
    "query" : {
        "expression": "#status = :status and #sub = :sub",
        "expressionNames" : {
            "#status" : "status",
            "#sub" : "sub"
        },
        "expressionValues" : {
            ":status" : $util.dynamodb.toDynamoDBJson("Active"),
            ":sub" : $util.dynamodb.toDynamoDBJson($filter.sub.eq)
        }
    },
    "filter" : {
        "expression" : "#ffp = :$fp",  ## filter path parent.target = :target
        "expressionNames" : {
            "#ffp" : "${path}.${fp}"
        },
        "expressionValues" : {
            ":$fp" : $util.dynamodb.toDynamoDBJson(${fv.eq}), ## :target : value to filter for
        }
    },
    "limit": $util.defaultIfNull($ctx.args.limit, 200),
    "nextToken": $util.toJson($util.defaultIfNullOrEmpty($ctx.args.nextToken, null))
}

Inspecting the CloudWatch log transformedTemplate shows the expression names and values are being substituted appropriately:

"filter" : {
    "expression\" : "#ffp = :beds",
    "expressionNames" : {
        "#ffp" : "filterData.beds"
    },
    "expressionValues" : {
        ":beds" : { "N": 2.0 }
    }
}

Update from 09/18/19

I may have finally discovered the root of the problem: it seems that the way in which expressionNames are evaluated does not allow for a key to be a docpath. If I run either of the following filters (notice the use of a non-reserved DynamoDB keyword to illustrate the problem is with expression name substitution), I'll get the result I'm looking for:

"filter" : {
    "expression" : "filterData.beds = :beds",  ## filter path parent.target = :target
        "expressionValues" : {
            ":beds" : $util.dynamodb.toDynamoDBJson(${fv.eq}) ## :target : value to filter for
        }
    }

"filter" : {
    "expression" : "filterData.beds = :${fp}",  ## filter path parent.target = :target
        "expressionValues" : {
            ":{fp}" : $util.dynamodb.toDynamoDBJson(${fv.eq}) ## :target : value to filter for
        }
    }

Now, if I make a minor change, only attempting to substitute with an expression name value

"filter" : {
    "expression" : "#filterData.beds = :${fp}",  ## filter path parent.target = :target
        "expressionNames": {
            "#filterData.beds" : "filterData.beds"
        },
        "expressionValues" : {
            ":{fp}" : $util.dynamodb.toDynamoDBJson(${fv.eq}) ## :target : value to filter for
        }
    }

I get the following error message:

"ExpressionAttributeNames contains invalid key: Syntax error; key: \"#filterData.beds\" (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException"

Even with a hardcoded path substitution, VTL seems to read the path as a single key name. Same issue when swapping the values of the expressions dynamically so there're no hardcoded strings.

回答1:

SOLVED

I happened upon this gem by accident, and it gave me the little bit extra I needed to find a workable solution with dynamic key names!

Here's what the filter expression looks like now:

"filter" : {
        "expression" : "#path.#filter = :${fp}",  ## filter path parent.target = :target
        "expressionNames": {
            "#path" : "${path}",
            "#filter" : "${fp}"
        },
        "expressionValues" : {
            ":${fp}" : $util.dynamodb.toDynamoDBJson(${fv.eq}) ## :target : value to filter for
        }
    }

The hold up here was that while expression attribute names are generally interpreted as document paths, with the introduction of the substituted name, the interpreter treats the key name as a scalar attribute and NOT as a document path. You need to individually identify the path elements and substitute for each.

来源：https://stackoverflow.com/questions/57963008/how-to-filter-by-query-on-nested-fields-in-aws-appsync

标签

filter

nested

graphql

amazon-dynamodb

aws-appsync