Using Apache Hadoop in a Cognos BI environment

Hadoop as a platform is not aimed at ad-hoc queries or analytic reporting.
Cognos is an IBM product. It can only query IBM's own Hadoop distribution, InfoSphere BigInsights. Over BigInsights, Cognos issues queries through Hive, which eventually translates them into MapReduce jobs.
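
To make that path concrete, here is a minimal sketch, assuming the PyHive client, a reachable HiveServer endpoint on the BigInsights node (host and port are placeholders), and a hypothetical `orders` table; Hive's EXPLAIN prints the MapReduce stages a simple aggregate compiles into:

```python
# A minimal sketch: ask Hive for the execution plan of a simple aggregate.
# The host, port, and `orders` table are assumptions, not from the answer.
from pyhive import hive

conn = hive.Connection(host="biginsights-host", port=10000)  # placeholder endpoint
cur = conn.cursor()
cur.execute("EXPLAIN SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id")
for (line,) in cur.fetchall():
    print(line)  # the plan lists the map and reduce operator trees Hive generated
```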

You say you are using Sybase IQ (that is not the Cognos content store; it is the reporting database your queries run against).
Although I don't know much about Sybase IQ, I work heavily with Vertica, which is also a columnar database.
In order to get good performance, you have to tune everything possible:

  • Cognos Framework model
  • Cognos reports
  • Sybase IQ tuning and structure. Hadoop can certainly help here by preparing data at the correct level of granularity and by precalculating any aggregations your reports require (see the sketch after this list).
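
As one illustration of that last point, here is a minimal sketch, again assuming PyHive, a placeholder endpoint, and a hypothetical raw `sales_events` table, that pre-aggregates event data in Hive down to the daily grain the reports actually need, so the columnar reporting database never scans raw events:

```python
# A minimal pre-aggregation sketch; connection details and table/column
# names are assumptions. The CTAS rolls raw events up to one row per
# product per day before the result is loaded into the reporting DB.
from pyhive import hive

conn = hive.Connection(host="hadoop-master", port=10000)  # placeholder endpoint
cur = conn.cursor()
cur.execute("""
    CREATE TABLE sales_daily AS
    SELECT product_id,
           to_date(event_ts)     AS sale_date,
           SUM(quantity)         AS total_qty,
           SUM(quantity * price) AS total_revenue
    FROM sales_events
    GROUP BY product_id, to_date(event_ts)
""")
```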

Simply put, Hadoop is a distributed platform for manipulating large data sets. It has fault tolerance built in, which makes it appealing to organizations where downtime can impact business processes. Cognos is a business intelligence tool that lets users explore and report on data. So there appears to be a logical fit.

Hadoop, however, does not (yet) lend itself to ad-hoc querying, as the other poster has commented. There is a Hadoop project that promises just that: Hive, which presents a data-warehouse view of your Hadoop data and can be queried with an SQL-like language called HiveQL. Developers have released ODBC connectors for accessing Hive databases, and since Cognos can extract data from an ODBC data source, it stands to reason that Cognos can extract data from Hadoop through Hive.
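
As a rough sketch of that path, the snippet below queries Hive over ODBC the same way a reporting tool would; it assumes a Hive ODBC driver is installed, a DSN named `HiveDSN` is configured, and a hypothetical `web_logs` table exists:

```python
# A minimal sketch of querying Hive through ODBC; the DSN and table
# name are assumptions. Cognos would go through the same ODBC layer.
import pyodbc

conn = pyodbc.connect("DSN=HiveDSN", autocommit=True)  # Hive has no transactions
cur = conn.cursor()
cur.execute("SELECT status_code, COUNT(*) AS hits FROM web_logs GROUP BY status_code")
for status_code, hits in cur.fetchall():
    print(status_code, hits)
```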

The other approach to using Hadoop in your Cognos environment is to transfer data through text files such as CSV: Hadoop generates a data file, which is then imported into Cognos. This is the approach I currently use.
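
A minimal sketch of that hand-off, assuming the standard `hadoop fs -getmerge` command and hypothetical file paths: it merges a job's part-* output files into one local file, then prepends a header row so the CSV imports cleanly into Cognos:

```python
# A minimal sketch of the CSV hand-off; the HDFS path, local file names,
# and header layout are assumptions.
import subprocess

hdfs_dir = "/user/etl/forecast_output"  # assumed Hadoop job output directory

# Concatenate the job's part-* files into a single local file.
subprocess.run(["hadoop", "fs", "-getmerge", hdfs_dir, "body.csv"], check=True)

# Prepend the header row the Cognos import expects.
with open("forecast.csv", "w") as out, open("body.csv") as body:
    out.write("sku,week,forecast_qty\n")
    out.write(body.read())
```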

Yet I have not answered the "why" of using Hadoop. The two applications I have used Hadoop for are inventory forecasting and cash flow/budgeting. If you are trying to perform routine forecasts for hundreds of thousands of SKUs, Hadoop is a wonderful tool. If you are trying to perform a Monte Carlo simulation over a thousand budget items, Hadoop is wonderful. Just export data from your data warehouse, run your Hadoop jobs, and import the resulting CSV files into Cognos. Voila!
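
For a feel of what such a job looks like, here is a minimal Hadoop Streaming mapper for the Monte Carlo case; the input layout (`item<TAB>mean<TAB>stddev`), the trial count, and the normal-distribution model are all illustrative assumptions. Each mapper draws one cost per trial per budget item, and a reducer (not shown) would sum the values per trial key to build a distribution of total spend:

```python
#!/usr/bin/env python
# A minimal Hadoop Streaming mapper sketch; the field layout and trial
# count are assumptions. Emits "trial_id<TAB>simulated_cost" pairs so a
# reducer can total each trial across all budget items.
import random
import sys

TRIALS = 1000

for line in sys.stdin:
    item, mean, stddev = line.rstrip("\n").split("\t")
    for trial in range(TRIALS):
        draw = random.gauss(float(mean), float(stddev))
        print(f"{trial}\t{draw}")
```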

Take care, though: Hadoop is not a panacea. Sometimes old-fashioned SQL and your programming language of choice are just as good, or better. Hadoop comes with a learning curve and real resource demands. I learned by downloading the Hortonworks Sandbox, a preconfigured virtual machine that runs in VMware, VirtualBox, etc., so you do not have to install or configure anything!
