Data warehouse testing, from a project manager’s point of view
As a junior project leader, I used to be shocked to see so much “bad software” sold on the market. Even products of the world’s leading companies have defects: word processors, spreadsheet applications, development tools, not to mention ERP systems.
Over the years, I have simply come to accept that there is no perfect software: testing numerous functions and analyzing their different relations would practically last forever and consume endless resources. Therefore, software development also means reaching a compromise.
In recent years, I have worked on many data warehousing projects, where my views on testing have been supplemented by another factor: data. It’s not just functions that have to have the proper combinations but data as well, multiplying the complexity of a system that has to be implemented with a limited budget. As a project leader, I think it is important for data warehouse solutions, too, that
- we accept that time, money and resources are all limited;
- both the client and the supplier know that the received or delivered program might have malfunctions, and they have to work together to reach a compromise;
- after delivery, both the client and the supplier should be satisfied, so there should be no “loser” in the delivery process.
In our experience, the best solution is testing with expected values. This means that we calculate in advance the output of the data warehouse calculations for some test cases, and then we compare the results of the developed tool to these expected values.
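As a minimal sketch of what such a comparison can look like, the Python example below runs a calculation against a handful of precomputed cases and lists every mismatch. Everything in it is a hypothetical illustration rather than part of any real project: the `calculate_fee` function stands in for a data warehouse calculation, and the case names, inputs and expected values are invented.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ExpectedValueCase:
    """One test case: named inputs plus the value the business side expects."""
    name: str
    inputs: dict
    expected: Decimal


def calculate_fee(inputs):
    """Hypothetical stand-in for a data warehouse calculation."""
    # Assumption: the fee is balance * rate, rounded to two decimal places.
    return (inputs["balance"] * inputs["rate"]).quantize(Decimal("0.01"))


def run_expected_value_tests(calculation, cases, tolerance=Decimal("0.00")):
    """Return a list of (case name, expected, actual) for every mismatch."""
    failures = []
    for case in cases:
        actual = calculation(case.inputs)
        if abs(actual - case.expected) > tolerance:
            failures.append((case.name, case.expected, actual))
    return failures


# Invented cases; in practice the client would prepare these in advance.
CASES = [
    ExpectedValueCase(
        "standard account",
        {"balance": Decimal("1000.00"), "rate": Decimal("0.015")},
        Decimal("15.00"),
    ),
    ExpectedValueCase(
        "zero balance",
        {"balance": Decimal("0.00"), "rate": Decimal("0.015")},
        Decimal("0.00"),
    ),
]

if __name__ == "__main__":
    for name, expected, actual in run_expected_value_tests(calculate_fee, CASES):
        print(f"{name}: expected {expected}, got {actual}")
```

An empty result means the calculation reproduces every expected value. If a case turns out to be wrong, only its entry in `CASES` has to change, not the calculation itself.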
This method has several advantages:
- project scope is finalized during the definition of test cases because the client needs to consider and calculate the business implementation of the specification thoroughly;
- the number of misunderstandings is significantly reduced because the actual interpretation of the specification is provided by business people, and not by IT analysts or developers;
- less time is needed for coding because we ask the client to prepare the test cases and expected values by the time programming starts, while we work out the architecture of the system. Our developers can focus on programming instead of consultation and the interpretation of the specification. There might be questions about the test cases too, but less time is needed to discuss them. If it turns out that a particular test case is incorrect, or the expected value is wrong, it is enough to fix that case, and we don’t need to modify the whole application;
- the client’s testers do not have to deal with trivial errors during delivery, because developers can check their own work much more easily. Thus, the quality of testing improves dramatically, and far fewer testing iterations are required between the customer and the supplier;
- testing can be done quickly and with planned resources, because during the process clients only need to check whether the numbers in the report, on the dashboard, or in the database are correct;
- projects have a well-defined end because the delivery of a system that operates according to the test cases and produces the expected results means acceptance of the system at the same time. The client takes over the product, and the project is completed on time.
The biggest downside of this method is that compiling a full set of test cases and determining the expected values is a challenging and laborious task. It is only an illusion that we can save effort on this work. On the contrary, if we do not do it in a structured way and in advance, then we will have to do it during testing and under time pressure. That is harder to plan and results in more work just when everyone is waiting for the outcome of the project.
Errors may remain in the system even after careful testing. If an error occurs, it can be corrected under warranty based on the specification. Still, at least we can be sure that the system operates as expected in the most frequent and important situations covered by the test cases.
Finally, here is an instructive example:
One of our clients had to make a very complex calculation due to a change in legal provisions, where the interpretation of the law was unclear. Together we came up with a solution of manually calculating some typical cases in Excel:
- the client provided the test cases, that is, the cases the program should be prepared for;
- we generated the case-specific input data needed for the test case calculations;
- the client defined the output data as interpreted by law and internal business decisions.
What was interesting was that the customer “did not like” the results of the calculations run over the entire data set, even though they knew the calculation was correct. They took over the product, then interpreted the legal provisions slightly differently, changed the Excel calculation, and ordered this change in the data warehouse as well.
Error-free software is still a dream. But testing based on expected values can help us devote development resources to calculations that are useful for the business.
This is how the dream of commercially useful software becomes a reality.
Zsolt Tüske – Project Director