Limit data coming into Spotfire by a different data table

♀尐吖头ヾ 提交于 2019-12-12 23:07:30

问题


I have Table A prompted on Year/Month and Table B. Table B also has a Year/Month column. Table A is the default data table (gets pulled in first). I have set up a relationship between Table A and B on the common Year/Month column.

The goal is to get Table B to only pull through data where the Year/Month matches the Year/Month on Table A (what the user entered). The purpose is to keep the user from entering the Year/Month multiple times.

The issue is Table B contains almost 35 million records. What I do not want to do is have Spotfire pull across all 35 Million records. What is currently happening is Spotfire is pulling all those records, then by setting filtering to include Filtered Rows Only on Table B, I am limiting what is seen in the visualization to under 200,000 rows. I would much rather just pull across 200,000 rows to start with.

The question: Is there a way to force Spotfire to filter the data table (Table B) by another data table (Table A) as it pulls the data table (Table B) across, thus only pulling a small number of records into memory?


回答1:


I'm writing this off the basis that most people utilize information links to get data into Spotfire, especially large data sets where the data is not embedded in the analysis. With that being said, I prefer to handle as much if not all of the joining / filtering / massaging at the data source versus the Spotfire application. Here are my views on the best practices and why.

Tables / Views vs Procedures as Information Links

Most people are familiar with the Table / View structure and get data into Spotfire in one of 2 ways

  1. Create all joins / links in information designer based off data relations defined by the author by selecting individual tables from the data sources avaliable
  2. Create a view (or similar object) at the data source where all joining / data relations are done, thus giving Spotfire a single flat file of data

Personally, option 2 is much easier IF you have access to the data source since the data source is designed to handle this type of work. Spotfire just makes it available but with limited functionality (i.e. complex queries, Intellisense, etc aren't available. No native IDE). What's even better is Stored Procedures IMHO and here is why.

In options 1 and 2 above, if you want to add a column you have to change the view / source code at the data source, or individually add a column in the information designer. This creates dwarfed objects and clutters up your library. For example, when you create an information link there is a folder with all the elements associated with it. If you want to add columns later, you'll have another folder for any columns added, and this gets confusing and hard to manage. If you create a procedure at the data source to return the data you need, and later want to add some columns, you only have to change this at the data source. i.e. change the procedure. Everything else will be inherited by Spotfire... all you have to do is click the "reload data" button in Spotfire. You don't have to change anything in the information designer. Additionally, you can easily add new parameters, set default parameter properties or prompt the user, making this a very efficient method of data retrieval. This is perfect when the data source is an OLTP and not a data-mart/data-warehouse (i.e. the data isn't already aggregated / cleansed) but can also be powerful in data warehouse environments as well.

Ditch the GUI, Edit the SQL

I find managing conditions, parameters, join paths, etc a bit annoying--but that's me. Instead, when possible, I prefer to click "Edit SQL" next to all the elements in my Information Link and alter the SQL there. This will allow database guys to work in an environment which is more familiar.



来源:https://stackoverflow.com/questions/43522260/limit-data-coming-into-spotfire-by-a-different-data-table

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!