I need a CouchDB view where I can get back all the documents that don\'t have an arbitrary field. This is easy to do if you know in advance what fields a document might
Without knowing the possible fields in advance, the answer is easy. You must create a new view to find the missing fields. The view will scan every document, one-by-one.
To avoid disturbing your existing views and design documents, you can use a brand new design document. That way, searching for the missing fields will not impact existing views you may be already using.
This technique is called the Thai massage. Use it to efficiently find documents not in a view if (and only if) the view is keyed on the document id.
function(doc) {
// _view/fields map, showing all fields of all docs
// In principle you could emit e.g. "foo.bar.baz"
// for nested objects. Obviously I do not.
for (var field in doc)
emit(field, doc._id);
}
function(keys, vals, is_rerun) {
// _view/fields reduce; could also be the string "_count"
return re ? sum(vals) : vals.length;
}
To find documents not having that field,
GET /db/_all_docs
and remember all the idsGET /db/_design/ex/_view/fields?reduce=false&key="some_field"
_all_docs
vs the ids from the query.The ids in _all_docs
but not in the view are those missing that field.
It sounds bad to keep the ids in memory, but you don't have to! You can use a merge sort strategy, iterating through both queries simultaneously. You start with the first id of the has list (from the view) and the first id of the full list (from _all_docs).
Depending on your language, that might be difficult. But it is pretty easy in Javascript, for example, or other event-driven programming frameworks.