I am looking for the best way to structure my database. I have quarterly financial statements for 1000’s of companies from 1997-2012. Each company has three different state
I would probably go with the following structure in one data table:
Company
StatementType
LineItem
FiscalYear
Q1, Q2, Q3, Q4
StatementType would be Income Statement, Balance Sheet or Cash Flow Statement. Line Item would be the coded/uncoded text of the item on the statement, Fiscal Year is 2012, 2011 and so on. You'd still need to make sure that Line Items are consistent across companies.
This structure would let you query for flat statement -
select
LineItem, Q1, Q2, Q3, Q4
from Data
where
Company = 'RoyalBank'
and FiscalYear = 2012
and StatementType = 'Income Statement'
or
QoQ
select
FiscalYear,
Q1
from Data
where
Company = 'Royal Bank'
and
StatementType = 'Income Statement'
and
LineItem = 'Sales'
order by FiscalYear
in addition to aggregates. You'd probably want to have another table for line items with some kind of an index reference to make sure you can pull the statement back in the original order of line items.
Whenever I've had to build a database for fiscal reports by fiscal quarter, month, or year or whatever, I've found it convenient to borrow a concept from star schema design and data warehousing, even if I'm not really building a DW.
The borrowed concept is to have a table, let's call it ALMANAC, that has one row for each date, keyed by the date. In this case a natural key works out well. Dependent attributes in here can be what fiscal month and quarter the date belongs to, whether the date was one where the enterprise was open for business (TRUE or FALSE), and whatever other quirks are in the company calendar.
Then, you need a computer program that just generates this table out of nothing. All the strange rules for the company calendar are embedded in this one program. The ALMANAC can cover a ten year period in a little over 3,650 rows. That's a tiny table.
Now every date in the operational data can be used like a foreign key into the ALMANAC table, provided you consistently use the Date datatype for dates. For example, each sale has a date of sale. Then aggregating by fiscal quarter, or fiscal year, or whatever you like is just a matter of joining operational data with the ALMANAC, and using GROUP BY and SUM() to get the aggregate you want.
It's simple, and it makes generating a whole raft of time period reports a breeze.
My advice to you is to think about not using a SQL database to do this. Instead, think of using something like SQL Server Analysis Services (SSAS). If you want to get a quick start with SSAS, I recommend getting up to speed on PowerPivot for Excel. You can take the model you develop in PowerPivot and import it into SSAS when you're ready.
Why don't I recommend SQL? Because you're going to have a problem aggregating accounts in SQL Server. For example, your balance sheets aren't going to be something you're going to be able to aggregate easily in SQL -- Asking SQL Server for the "Cash" for 2010, for example means that you want to get the entry for the end of December 2010, not that you want to SUM all of the entries for Cash for that year (which would be a nonsense number). On the other hand, with income and expense accounts such as those which would appear on your income statements, you would want to SUM those values up. To make matters worse, some reports are going to have a mix of account types on them, which is going to make reporting quite difficult.
SSAS has provisions inside the product where it "knows" how to aggregate for your reports based on account type, and there are many tutorials out there which can show you how to set this up.
Either way, you're going to need to store your data somewhere before it goes into your reporting system or Analysis Services cube. In order to do that, you should structure your data something like this. Let's say you're storing your data in a table called Reports:
Reports
--------
[Effective Date]
[CompanyID]
[AccountID]
[Amount]
Your Account table would have the description of what you're trying to store (income, expenses, etc). Your [Effective Date] column would link back into a Dates
table which describes to which year, quarter, etc., your data belongs. In essence, what I'm describing is a classic shape for reporting databases, called a star schema.