Question
Our dates are stored as "2/22/2008 12:00:00 AM". We need to filter results so that we only get documents between two dates.
If we compare two queries, one using a UDF and the other not, the one with the UDF is orders of magnitude slower.
With:
SELECT DISTINCT
    c.eh, c.wcm, w AS wt
FROM
    c
    JOIN w IN c.wt
WHERE
    (udf.toValue(w.ced) BETWEEN udf.toValue('03/02/2023') AND udf.toValue('09/02/2023'))
    AND w.ty = 'FW'
OFFSET 0
LIMIT 10
And without:
SELECT DISTINCT
    c.eh, c.wcm, w AS wt
FROM
    c
    JOIN w IN c.wt
WHERE
    w.ty = 'FW'
OFFSET 0
LIMIT 10
Here's the UDF:
function userDefinedFunction(datestr) {
    return new Date(datestr).getTime();
}
According to the second answer here (by one of the MS employees who works on Cosmos) I should just be able to do a direct compare:
(w.ced BETWEEN '03/02/2023' AND '09/02/2023')
But that returns 0 results. I'm extremely new to Cosmos. How can this query be optimized? I should add that there is already an index on wt/ced.
Answer 1:
Generally speaking, if you can use a system function instead of a UDF, the performance will be much better.
In this scenario, however, you should persist your dates in a consistent format to avoid having to use UDFs to fix them up each time.
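As a rough illustration of why the direct string comparison in the question returns nothing, the sketch below compares date strings as plain text in JavaScript. It assumes Cosmos DB orders strings character by character in the same way; the helper `betweenStrings` and the sample value "5/15/2023 12:00:00 AM" are made up for the example and are not from the original data.

```javascript
// Hypothetical helper that mirrors an inclusive BETWEEN on plain strings.
function betweenStrings(value, low, high) {
  return value >= low && value <= high;
}

// Made-up stored value in the M/D/YYYY style shown in the question.
// As a calendar date it lies inside the requested range.
const storedAsText = "5/15/2023 12:00:00 AM";
const matchesAsText = betweenStrings(storedAsText, "03/02/2023", "09/02/2023");
// false: the first character "5" sorts after "0", so the value is
// above the upper bound "09/02/2023" before any later character is read.

// The same date in a fixed-width, year-first format sorts chronologically.
const storedAsIso = "2023-05-15T00:00:00.000Z";
const matchesAsIso = betweenStrings(
  storedAsIso,
  "2023-03-02T00:00:00.000Z",
  "2023-09-02T00:00:00.000Z"
);
// true: text order and date order agree for this format.

console.log(matchesAsText, matchesAsIso);
```

Any stored value without a leading zero starts with a digit from 1 to 9, so as text it always sorts above a bound that begins with "0", which would explain the empty result.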
I recommend looking at our best practices for working with dates.
If you can store the dates in a consistent format in Cosmos DB (the recommended format follows the ISO 8601 UTC standard), you avoid having to convert the format within the query itself, which is costly. Any data format conversions, if needed, should be done in the app. For example, convert the date "03/02/2023" to the ISO 8601 UTC standard before running the query, and then use that text in the query.
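The app-side conversion described above might look like the following minimal sketch. It assumes the input is month/day/year and should be read as midnight UTC (the question does not state a time zone), and that `w.ced` has already been re-saved as ISO 8601 strings in exactly the same format the helper produces. `toIsoUtc` and `querySpec` are names invented for this example.

```javascript
// Hypothetical helper: turns "MM/DD/YYYY" (any trailing time such as
// "12:00:00 AM" is ignored) into an ISO 8601 UTC string at midnight.
// Assumption: the date is meant as UTC.
function toIsoUtc(usDate) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})/.exec(usDate);
  if (!match) {
    throw new Error(`Unrecognized date: ${usDate}`);
  }
  const month = Number(match[1]);
  const day = Number(match[2]);
  const year = Number(match[3]);
  // Date.UTC uses zero-based months.
  return new Date(Date.UTC(year, month - 1, day)).toISOString();
}

// Parameterized query without the UDF. It only returns correct results
// if the stored w.ced values use the same ISO 8601 format as the parameters.
const querySpec = {
  query:
    "SELECT DISTINCT c.eh, c.wcm, w AS wt FROM c JOIN w IN c.wt " +
    "WHERE w.ty = 'FW' AND w.ced BETWEEN @from AND @to " +
    "OFFSET 0 LIMIT 10",
  parameters: [
    { name: "@from", value: toIsoUtc("03/02/2023") },
    { name: "@to", value: toIsoUtc("09/02/2023") },
  ],
};

console.log(querySpec.parameters);
```

Because the comparison is then a plain string range on an indexed property, the query no longer has to run a UDF against every candidate value.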
Hope this is helpful.
Source: https://stackoverflow.com/questions/61110774/azure-cosmos-db-udf-for-date-time-is-seriously-slowing-down-query