After testing the Discovery service, it seems useless to me, or I might be missing something.
When I query, it matches a document and returns the whole document. If my document is huge, every query that matches it returns the entire document rather than the relevant part, which is not useful.
Do I have to create a separate document for every query?
If that's the case, API.AI or WIT.AI is a better option.
Please tell me what I am missing here!
For now, with Discovery you would need to break up your documents once before putting them in a collection; any query against the collection will then return results from that set of separated documents. So if your documents don't change, this split should be a one-time action.
That said, automatically identifying the relevant section of a larger document for a query is a good feature for Discovery to consider (note: I work for IBM Watson).
Wit.ai and API.AI are more similar to our Watson Conversation service. Discovery is about finding relevant content in a corpus, while the two you mentioned, and our Conversation service, are more about responding with a dialog, using NLP to understand the query.
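As a rough illustration of the one-time split described above, here is a minimal Python sketch that cuts a large text into smaller section documents before upload. The function name `split_into_sections`, the field names, and the choice to split on blank lines are all assumptions made for illustration; they are not part of the Discovery API, and a real split would more likely follow the document's own headings.

```python
import json
import re


def split_into_sections(doc_id, text, min_chars=1):
    """Split text on blank lines and return one small document per section.

    Hypothetical helper: the field names below are illustrative only.
    """
    # One or more blank lines (possibly containing whitespace) separate sections.
    parts = [p.strip() for p in re.split(r"\n\s*\n", text)]
    sections = []
    for part in parts:
        if len(part) < min_chars:
            continue  # skip empty fragments
        sections.append({
            "parent_id": doc_id,              # lets you trace back to the original
            "section_index": len(sections),   # position among the kept sections
            "text": part,
        })
    return sections


sample = "Intro paragraph.\n\nHow hints work.\n\n\nQuery block syntax."
sections = split_into_sections("manual-1", sample)
for section in sections:
    # Each printed JSON object would be uploaded as its own document.
    print(json.dumps(section))
```

Each resulting object can then be added to the collection as a separate document, so a query returns the matching section rather than the whole original.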
There is now a Document Segmentation option that you can apply to your Discovery configuration. It lets Discovery segment documents when they are initially loaded and indexed. This was added in October 2017. Beware, there are some restrictions, particularly around preservation of custom metadata. Here is a link to the doc:
https://console.bluemix.net/docs/services/discovery/building.html#doc-segmentation
The Watson Discovery service allows cognitive search across hundreds of documents. You can use the Watson Document Conversion service to automatically split each document into PAUs (Possible Answer Units) in JSON format. You can then load the PAUs generated by Watson Document Conversion into the Watson Discovery service. This way, Watson Discovery returns answer-sized units for your queries instead of whole documents.
There is now a passages parameter that can be passed to the query API. It's in beta as of this writing. It provides the location within the document as well as the "passage" text and score.
{
  "document_id": "dd2a7574-c266-4587-812b-69a47aa271d6",
  "passage_score": 23.961884787023948,
  "passage_text": " query block name in many hints to specify the query block to which the hint applies. This syntax lets you specify in the outer query a hint that applies to an inline view.\n\nThe syntax of the query block",
  "start_offset": 404,
  "end_offset": 607
}
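To show how such passage objects might be consumed, here is a small Python sketch that ranks them by score and keeps only the fields needed to display an answer. It assumes the query response has already been parsed into a list of dicts shaped like the one above; the helper name `top_passages` and the second sample passage are made up for illustration.

```python
def top_passages(passages, limit=3):
    """Return the highest-scoring passages with trimmed text and offsets.

    Hypothetical helper: expects dicts with the keys shown in the sample above.
    """
    ranked = sorted(passages, key=lambda p: p["passage_score"], reverse=True)
    return [
        {
            "document_id": p["document_id"],
            "text": p["passage_text"].strip(),
            # Character span of the passage within its source document.
            "span": (p["start_offset"], p["end_offset"]),
        }
        for p in ranked[:limit]
    ]


passages = [
    {   # invented second passage, only here to demonstrate the ordering
        "document_id": "example-doc-2",
        "passage_score": 11.5,
        "passage_text": "A lower-scoring passage.",
        "start_offset": 0,
        "end_offset": 24,
    },
    {   # the passage from the answer above
        "document_id": "dd2a7574-c266-4587-812b-69a47aa271d6",
        "passage_score": 23.961884787023948,
        "passage_text": " query block name in many hints to specify the query block to which the hint applies. This syntax lets you specify in the outer query a hint that applies to an inline view.\n\nThe syntax of the query block",
        "start_offset": 404,
        "end_offset": 607,
    },
]

best = top_passages(passages, limit=1)
for passage in best:
    print(passage["document_id"], passage["span"])
    print(passage["text"])
```

The offsets can be used to highlight the passage in the original document when you still want to show it in context.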
Source: https://stackoverflow.com/questions/41801660/how-to-get-exact-answers-instead-of-the-whole-document-using-watson-discovery