The terms are used all over the place, and I don\'t know of crisp definitions. I\'m pretty sure I know what a data mart is. And I\'ve created reporting cubes with tools li
The term "data mart" has become somewhat ambiguous, but it is traditionally associated with a subject-oriented subset of an organization's information systems. Data mart does not explicitly imply the presence of a multi-dimensional technology such as OLAP and data mart does not explicitly imply the presence of summarized numerical data.
A cube, on the other hand, tends to imply that data is presented using a multi-dimensional nomenclature (typically an OLAP technology) and that the data is generally summarized as intersections of multiple hierarchies. (i.e. the net worth of your family vs. your personal net worth and everything in between) Generally, “cube” implies something very specific whereas “data mart” tends to be a little more general.
I suppose in OOP speak you could accurately say that a data mart “has-a” cube, “has-a” relational database, “has-a” nifty reporting interface, etc… but it would be less correct to say that any one of those individually “is-a” data mart. The term data mart is more inclusive.
As the name suggests, a cube is a structured multidimensional data-set, (typically three dimensions each representing three sides of a cube). A data mart is just a container and not a structure by itself, although it contains data-sets flatly organized (as tables) in dimensions and facts.
The structure of a cube makes it easy to visualize or conceptualize data along various dimensions of a cube. Thus most business analysts or developers find it easy to query and interact with the cube.
Since a data mart is just a container with a bunch of tables; users need to first conceptualize and understand dimensional structures before querying and analyzing data.
Data mart traditionally has meant static data, usually date/time oriented, used by analysts for statistics, budgeting, performance and sales reporting, and other planning activities.
A Cube is an OLAP database that pretty exhaustively converts OLTP data into a static, date/time-oriented schema that uses a query language that is not SQL, but built specifically for answering data mart type questions. It uses terms like measures, dimensions, star-schema, etc. rather than tables, columns, and rows. The best familiar analogy might be pivot-tables in a spreadsheet.
Data mart is a collection of data of a specific business process. It is irrelevant how the data is stored. A cube stores data in a special way, multiple-dimension, unlike a table with row and column. A cube in a olap database is like a table to traditional database. A data mart can have tables or cubes. Cubes make the analysis faster because it pre-calculates aggregations ahead of time.
Remember:
Data Warehousing is the process of taking data from legacy and transaction database systems and transforming it into organized information in a user-friendly format to encourage data analysis and support fact-based business decision making.
A Data Warehouse is a system that extracts, cleans, conforms, and delivers source data into a dimensional data store and then supports and implements querying and analysis for the purpose of decision making.
KIMBALL e.g. consistently has defined data mart as a process-oriented subset of the overall organization’s data based on a foundation of atomic data, and that depends only on the physics of the data-measurement events, not on the anticipated user’s questions.
CORRECT DEFINITION
MISGUIDED DEFINITION
To me, a datamart is just place where data gets dumped in a relatively flat, unusable format.
Cube is taking that data and making it dance.