Data warehousing principles and NoSQL

拟墨画扇 提交于 2019-12-01 00:44:38

In my estimation OLAP subsets or structures will not go away and may become more common for a few reasons. In no particular order: f) Map-reduce is all you get in many cases. Mongodb is on a steadier foot with their speedier aggregation-pipeline; u) A big gotcha with NoSQL is the lack of joins or relationships. Meaning that your underlying data has to be ugly in order to support many OLAP reports; b) Its worthwhile constructing 'throw away' or volatile data subsets simply to keep a clean master table/collection; a) NoSQL is perfectly suited for redundant datasets: there are no create table or even schemas needed, its dead simple to spin up and kill collections; r) NoSQL is heaps easier to scale for the additional dataset than SQL; d) A fledgling start-up can avoid the cost and resources needed to support two db technologies ( one for OLAP and one for OLTP ); and, b) you'll find your backend / frontend code much much much easier and manageable with massaged data sets; and, c) the unbeatable speed advantage of premade datasets with their own premade indices.

Answer to both of your questions is YES. 1. It is still valid to restructure your transaction data for analysis. 2. You can use NoSQL to do everything you asked.

As you mentioned only about querying/analysis/OLAP, I am assuming the only consideration here is creating a querying/reporting platform. So, OLTP system and whether NoSQL can handle it or not is out of discussion.

It is difficult to answer this question without having a context associated to it. Context as in whether you are creating this platform for a team, department, vertical, business line etc. of an organization or you are creating this platform for whole organization as central repository.

If you are setting it up for a team/department, volume is not humongous, less number of users will query it, querying frequency is not that high then OLAP is still valid. But if volume is humongous and and with high frequency of querying and large number of users and you see that you will need to scale in future then NoSQL will be your bet.

Also, if you create a platform for NoSQL at an enterprise level. Say - you create a enterprise data warehouse or a data lake which caters any and every audience from the organization. But within organization teams/departments might create their own OLAP by creating Data Marts to cater their own needs. So, in this case both OLAP and NoSQL is still valid.

I would say it totally depends on your use case. To reach a decision there are various factors that need to be considered. Pros and cons are always there for any technology in consideration. There is no generic answer for these kind of comparisons. You need to answer questions like - What are your data sources and their format; if they are structured, semi-structured, unstructured? Who are your users and how many; if there are multiple departments with different needs, if they need their separate dashboard, whether they need to access each others data? What is the volume of the data you will be handling? What is the frequency of querying the reporting platform? And many more questions you can ask yourself. After answering these questions then decide what is the best suited option for you.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!