Datamart vs. reporting Cube, what are the differences?

Submitted by 巧了我就是萌 on 2020-04-05 08:07:11

Question


The terms are used all over the place, and I don't know of crisp definitions. I'm pretty sure I know what a data mart is. And I've created reporting cubes with tools like Business Objects and Cognos.

I've also had folks tell me that a datamart is more than just a collection of cubes.

I've also had people tell me that a datamart is a reporting cube, nothing more.

What are the distinctions you understand?


Answer 1:


'Cube' can (and arguably should) mean something quite specific: OLAP artifacts presented through an OLAP server such as MS Analysis Services or Oracle (née Hyperion) Essbase. However, the term also gets used much more loosely. OLAP cubes of this sort are queried with cube-aware tools that use a different API from that of a standard relational database. Typically, OLAP servers maintain their own optimised data structures (known as MOLAP), although they can also be implemented as a front end to a relational data source (known as ROLAP) or in various hybrid modes (known as HOLAP).

I try to be precise and use 'cube' only to refer to cubes on OLAP servers such as SSAS.

Business Objects works by querying data from one or more sources (which could be relational databases, OLAP cubes, or flat files) and creating an in-memory data structure called a MicroCube, which it uses to support interactive slice-and-dice activities. Analysis Services and MSQuery can make a cube (.cub) file that can be opened by the AS client software or Excel and sliced and diced in a similar manner. IIRC, recent versions of Business Objects can also open .cub files.

To be pedantic, I think Business Objects sits in a 'semi-structured reporting' space somewhere between a true OLAP system such as ProClarity and an ad-hoc reporting tool such as Report Builder, Oracle Discoverer or Brio. Round trips to the Query Panel make it somewhat clunky as a pure stream-of-thought OLAP tool, but it does offer a level of interactivity that traditional reports don't. I see the sweet spot of Business Objects as sitting in two places: ad-hoc reporting by staff not necessarily familiar with SQL, and providing scheduled reports delivered in an interactive format that allows some drill-down into the data.

'Data Mart' is also a fairly loosely used term and can mean any user-facing data access medium for a data warehouse system. The definition may or may not include the reporting tools and metadata layers, reporting layer tables or other items such as Cubes or other analytic systems.

I tend to think of a data mart as the database from which the reporting is done, particularly if it is a readily definable subsystem of the overall data warehouse architecture. However, it is quite reasonable to think of it as the user-facing reporting layer, particularly if there are ad-hoc reporting tools such as Business Objects or OLAP systems that allow end users to get at the data directly.




Answer 2:


The term "data mart" has become somewhat ambiguous, but it is traditionally associated with a subject-oriented subset of an organization's information systems. Data mart does not explicitly imply the presence of a multi-dimensional technology such as OLAP and data mart does not explicitly imply the presence of summarized numerical data.

A cube, on the other hand, tends to imply that data is presented using a multi-dimensional nomenclature (typically an OLAP technology) and that the data is generally summarized as intersections of multiple hierarchies (e.g. the net worth of your family vs. your personal net worth, and everything in between). Generally, “cube” implies something very specific, whereas “data mart” tends to be a little more general.
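To make "intersections of multiple hierarchies" a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the `BALANCES` rows, the owner hierarchy (everyone > family > person), and the account hierarchy (all accounts > account type). It only shows the idea of summarizing at every combination of levels, not how any particular OLAP server does it.

```python
from collections import defaultdict

# Hypothetical leaf-level facts: (family, person, account_type, balance).
BALANCES = [
    ("Smith", "Alice", "savings", 12000),
    ("Smith", "Alice", "mortgage", -8000),
    ("Smith", "Bob", "savings", 3000),
    ("Jones", "Carol", "savings", 5000),
]


def roll_up(rows):
    """Sum balances at every intersection of the two hierarchies.

    Keys are (owner_path, account_path); an empty tuple means "all".
    """
    totals = defaultdict(int)
    for family, person, account_type, balance in rows:
        owner_levels = [(), (family,), (family, person)]
        account_levels = [(), (account_type,)]
        for owner in owner_levels:
            for account in account_levels:
                totals[(owner, account)] += balance
    return dict(totals)


totals = roll_up(BALANCES)
personal = totals[(("Smith", "Alice"), ())]        # one person's net worth
family = totals[(("Smith",), ())]                  # the whole family's net worth
family_savings = totals[(("Smith",), ("savings",))]  # family x account type
everyone = totals[((), ())]                        # grand total
print(personal, family, family_savings, everyone)
```

The same leaf rows answer the "personal", "family" and "everything in between" questions; the cube is just the set of those summaries addressed by their position in each hierarchy.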

I suppose in OOP speak you could accurately say that a data mart “has-a” cube, “has-a” relational database, “has-a” nifty reporting interface, etc… but it would be less correct to say that any one of those individually “is-a” data mart. The term data mart is more inclusive.




Answer 3:


As the name suggests, a cube is a structured multidimensional data set (typically pictured with three dimensions, one along each axis of a cube). A data mart is just a container and not a structure in itself, although the data sets it contains are organized flatly (as tables) into dimensions and facts.

The structure of a cube makes it easy to visualize or conceptualize data along various dimensions of a cube. Thus most business analysts or developers find it easy to query and interact with the cube.

Since a data mart is just a container with a bunch of tables, users first need to conceptualize and understand the dimensional structures before querying and analyzing the data.
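As a rough illustration of what "a container with a bunch of tables" looks like in practice, the sketch below builds a tiny star schema in an in-memory SQLite database. The table and column names (`fact_sales`, `dim_product`, `dim_date`) and the data are hypothetical. Note that the person writing the query has to know which table holds the facts and how the dimensions join to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical dimension tables: descriptive attributes.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
    -- Hypothetical fact table: one row per measured event.
    CREATE TABLE fact_sales  (product_id INTEGER, date_id INTEGER, amount REAL);

    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO dim_date    VALUES (10, 2019), (11, 2020);
    INSERT INTO fact_sales  VALUES (1, 10, 100.0), (1, 11, 150.0),
                                   (2, 10, 40.0),  (2, 11, 60.0);
""")

# The user must supply the joins and the grouping themselves.
rows = conn.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date    AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
""").fetchall()

for category, year, total in rows:
    print(category, year, total)
```

A cube-aware client would typically let the user pick "category" and "year" from a list instead of writing those joins by hand.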




Answer 4:


Data mart traditionally has meant static data, usually date/time oriented, used by analysts for statistics, budgeting, performance and sales reporting, and other planning activities.

A cube is an OLAP database that pretty exhaustively converts OLTP data into a static, date/time-oriented schema, queried with a language that is not SQL but is built specifically for answering data-mart-type questions. It uses terms like measures, dimensions, star schema, etc. rather than tables, columns, and rows. The most familiar analogy might be pivot tables in a spreadsheet.




Answer 5:


A data mart is a collection of data about a specific business process. How the data is stored is irrelevant. A cube stores data in a special, multidimensional way, unlike a table with rows and columns. A cube in an OLAP database is like a table in a traditional database. A data mart can contain tables or cubes. Cubes make analysis faster because they pre-calculate aggregations ahead of time.
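The "pre-calculates aggregations" point can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `FACTS` rows, the dimension names and the `build_cube` helper are all hypothetical, and real OLAP servers use far more compact storage and usually materialize only some of the combinations.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical dimensions and fact rows.
DIMENSIONS = ("region", "product", "year")

FACTS = [
    {"region": "EU", "product": "Books", "year": 2019, "sales": 100},
    {"region": "EU", "product": "Games", "year": 2019, "sales": 40},
    {"region": "US", "product": "Books", "year": 2020, "sales": 150},
    {"region": "US", "product": "Books", "year": 2019, "sales": 60},
]


def build_cube(facts, dimensions, measure):
    """Pre-compute the sum of `measure` for every subset of dimensions."""
    cube = defaultdict(int)
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for fact in facts:
                # Key is a tuple of (dimension, value) pairs, in DIMENSIONS order.
                key = tuple((d, fact[d]) for d in dims)
                cube[key] += fact[measure]
    return dict(cube)


cube = build_cube(FACTS, DIMENSIONS, "sales")

# At query time each answer is a dictionary lookup, not a scan of FACTS.
grand_total = cube[()]
us_total = cube[(("region", "US"),)]
books_2019 = cube[(("product", "Books"), ("year", 2019))]
print(grand_total, us_total, books_2019)
```

The trade-off is the usual one: the work (and storage) is paid up front when the cube is built, so that interactive slicing afterwards is cheap.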




Answer 6:


Remember:

Data Warehousing is the process of taking data from legacy and transaction database systems and transforming it into organized information in a user-friendly format to encourage data analysis and support fact-based business decision making.

A Data Warehouse is a system that extracts, cleans, conforms, and delivers source data into a dimensional data store and then supports and implements querying and analysis for the purpose of decision making.

Kimball, for example, has consistently defined a data mart as a process-oriented subset of the overall organization’s data, built on a foundation of atomic data, that depends only on the physics of the data-measurement events and not on the anticipated user’s questions.

  • Data marts are based on the source of data, not on a department’s view of data.
  • Data marts contain all atomic detail needed to support drilling down to the lowest level.
  • Data marts can be centrally controlled or decentralized.

CORRECT DEFINITION

  • Process based
  • Atomic Data Foundation
  • Data Measurement

MISGUIDED DEFINITION

  • Department Based
  • Aggregate Data Only
  • User Question Based



Answer 7:


To me, a datamart is just a place where data gets dumped in a relatively flat, unusable format.

A cube takes that data and makes it dance.




Answer 8:


I agree with Matthew. We tend to use the term 'Data Mart' for any data source that stores generic data and mappings used across various applications in an enterprise. We don't store measurable data in a data mart, so I see a data mart as one of multiple data sources for a cube. That, however, is just how we do it. I am sure there is nothing preventing you from storing measurable data in a data mart.



Source: https://stackoverflow.com/questions/360900/datamart-vs-reporting-cube-what-are-the-differences
