The worst thing that can happen to your Business Analytics (BA) system is that the data in your Data Warehouse (DW) is not in sync with your transactional data and the user community loses faith in the validity of your BA data. Nothing can relegate your BA system to an earlier grave and oblivion than losing the faith of the user community. IT, being the guardian of data and provider of information compiled from that data, has to ensure the integrity of said data.
I’d like to emphasize here that I’m not talking about operational integrity (someone enters a false transaction) but transactional integrity (the DW data has to match to the penny to the transactional data at any given cut-off point). If you have $1,000,000 worth of open orders in your order entry system and you only show $990,000 of open orders in your DW during the same time period, someone in IT should utter a phrase akin to “Houston, we have a problem”. In order to achieve this, one should have comparative exception reports at their disposal. For near-real-time data warehouses, a certain “quiet” period needs to be established where all pending updates are processed and no new updates will be processed. At that time comparative reports are run and any discrepancies would be looked into. If a “quiet” time is not achievable due to international 24/7 availability requirements, a date and time cutoff in the near past can be used for assuring DW integrity.
Perceived data integrity problems happen when users don’t understand the reports they are using/comparing. Many people choose to assume rather than ask. It is a wise practice to put legends for acronyms and brief description for complex calculations in the report header/footer. Column and field headings should also make sense and not be vague. For example if a report is titled “Unconfirmed Inventory Report” it is within reason to assume that the total at the end of the report is a total of your “unconfirmed” inventory although the field description says “Total Inventory”. Changing your field description to “Total Unconfirmed Inventory” would eliminate any confusion that users might have when comparing two totals that say “total Inventory” from different reports and wondering why they don’t match. Sometimes the simple things create the biggest problems.
Another form of perceived data integrity can happen when users export their reports into Excel (which is a tool I lovingly call “The curse of IT”) and make changes to the excel sheet changing its original structure then complain the data does not add up and assume the source of the inaccuracy is the BA system. I always tell users that once exported into Excel, the report and its content become their responsibility. To prove to me that the BA system has integrity issues the problem has to be shown to me on an original run screen or in a pdf format.
In summary, when it comes to data integrity, dot your I’s and cross your t’s. Establish a standard corporate dictionary (ATS – Available To Sell or Available To Ship). Be specific and descriptive in your naming and labeling on charts and reports and may the spirits of IT protect you from complex Excel documents with totals that do not add up.