I have an application developed using the MVC pattern and I would like to index now multiple models of it, this means each model has a different data structure.
Jonathan's answer is great. I would just add few other points to consider:
Both the above answers are great!
I am adding an example of several types in an index. Suppose you are developing an app to search for books in a library. There are few questions to ask to the Library owner,
Questions:
How many books are you planning to store?
What kind of books are you going to store in the library?
How are you going to search for books?
Answers:
I am planning to store 50 k – to 70 k books (approximately)
I will have 15 k -20 k technology related books (computer science, mechanical engineering, chemical engineering and so on), 15 k of historical books, 10 k of medical science books. 10 k of language related books (English, Spanish and so on)
Search by authors first name, author last name, year of publish, name of the publisher. (This gives you the idea about what information you should store in the index)
From the above answers we can say the schema in our index should look somewhat like this.
//This is not the exact mapping, just for the example
"yearOfPublish":{
"type": "integer"
},
"author":{
"type": "object",
"properties": {
"firstName":{
"type": "string"
},
"lastName":{
"type": "string"
}
}
},
"publisherName":{
"type": "string"
}
}
In order to achieve the above we can create one index called Books and can have various types.
Index: Book
Types: Science, Arts
(Or you can create many types such as Technology, Medical Science, History, Language, if you have lot more books)
Important thing to note here is the schema is similar but the data is not identical. And the other important thing is the total data you are storing.
Hope the above helps when to go for different types in an Index, if you have different schema you should consider different index. Small index for less data . big index for big data :-)
Although Jonathan's answer was correct at the time, the world has moved on and it now seems that the people behind ElasticSearch have a long term plan to drop support for multiple types:
Where we want to get to: We want to remove the concept of types from Elasticsearch, while still supporting parent/child.
So for new projects, using only a single type per index will make the eventual upgrade to ElasticSearch 6.x be easier.
There are different implications to both approaches.
Assuming you are using Elasticsearch's default settings, having 1 index for each model will significantly increase the number of your shards as 1 index will use 5 shards, 5 data models will use 25 shards; while having 5 object types in 1 index is still going to use 5 shards.
Implications for having each data model as index:
Implications for having each data model as an object type within an index:
If you are asking what is too much data vs small data? Typically it depends on the processor speed and the RAM of your hardware, the amount of data you store within each variable in your mapping for Elasticsearch and your query requirements; using many facets in your queries is going to slow down your response time significantly. There is no straightforward answer to this and you will have to benchmark according to your needs.