Filtering by ranges in datastore API

女生的网名这么多〃 提交于 2019-12-12 16:19:03

问题


The datastore API documentation at http://docs.ckan.org/en/ckan-2.2/datastore.html describes how methods such as "datastore_delete" or "datastore_search" can include a "filter" parameter to filter results.

filters (dictionary) – filters to apply before deleting (eg {“name”: “fred”}). If missing delete whole table and all dependent views. (optional)

Taking into account the {"name": "fred"} example, it seems that filtering is based in exact matching. However, is it also possible to specify a range, e.g. get results greater/lesser than a given value?

For example, if I have a datastore with a field named "date" with values from "2014-06-12T12:03:10.052000" (oldest) to "2014-06-12T16:08:39.542000" (newest) in my record, how I can get for example the records older to 2014-06-12T14:00:00 or newer to 2014-06-12T15:00:00?

Any example or pointer to documentation is welcome. Thanks!


回答1:


The answer is that, as of CKAN 2.2, this is not possible. the datastore_delete and datastore_search actions only take filters of the type "field = value".

For datastore_search there is an alternative: datastore_search_sql which allows you to provide your own SQL query, and thus implement range searches. (I think the limitations are that this only works on public datasets, and requires a read only database user, which CKAN only supports for Postgres 9.1+).

There is an experimental branch, 1725 currently being developed which will allow plugins to extend the SQL queries ran by the datastore. This will allow plugins to implement new types of filter, such as ranges. If you are a developer you may want to try that branch.




回答2:


The trick is marking the values so that the filter will catch all the values you wish to delete at once.

  • First download the data you're looking to delete via an SQL query via the CKAN API.
  • Then, change two columns in that data to all be the same highly identifiable number where those values will never be found together in the entire table.
  • Next, update the data via "datastore_upsert" using the "update" method to ensure you're not adding new data to the table somehow.
  • Finally, implement "datastore_delete" with a filter that identifies both columns with that highly identifiable number.

Here's an example of the filter:

filter = {"col1": 9876543210, "col2": 9876543210}


来源:https://stackoverflow.com/questions/24206728/filtering-by-ranges-in-datastore-api

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!