As a company providing data solutions, we often see customers unsure about how to apply a Machine Learning (ML) based solution to a business problem (or even how to give data-oriented support to an entire department or company). A key characteristic of analytical initiatives is that they are often pilot projects, whose success and added value are difficult to predict at the outset. If such an initiative is not approached carefully and with a proper methodology, it can easily fail and stick in the minds of business decision-makers as a negative experience and a total waste of money. Analytical projects therefore call for an approach in which the phases are defined so that investment grows gradually and the guaranteed ROI (return on investment) grows alongside it. In our experience, the best solution is a Proof of Concept-based approach, which can and should be used at any level of data maturity.
PoC, i.e. Proof Of Concept
By PoC, we mean mini-projects that explore the feasibility of a (Machine Learning) solution and its business benefits quickly and cheaply.
A data project idea often sounds very good in theory: it promises low costs and high benefits, and at a high level there is no sign of any obstacle. In reality, once development has begun, the project may fail for a variety of reasons (e.g., no data, poor data quality, or simply a business problem that cannot be modelled). As a manager, it’s tempting to accept such good ideas with their promise of quick implementation, but it’s far less comfortable to face the fact that the dreamed-of ML app does not meet expectations and that a small (or not so small) amount of money has been thrown out the window. The way to minimize such risks is the PoC-based approach, which means spending extra energy (and money) only on what has been proven to be worth addressing.
The life cycle of ML-based solutions
The figure below illustrates how an idea is integrated into business processes.
- As a first step, we need to know which use cases and data-related business problems are on the table. If none have been identified, it is worth stepping back and reconsidering the main pain points of the company’s operations (e.g., reducing costs, stopping customer churn, increasing margins). Once these pain points are known, testable ideas (hypotheses) can be defined, for example: the risk of churn can be determined using customer data and past behavior patterns.
- Regardless of the sector, there will be several business problems for which a data solution is conceivable, but there will not be enough capacity to implement every idea. Ideas should therefore be prioritized, i.e. it should be determined for which idea(s) it is most worthwhile to start a PoC project. In our experience, there is no obstacle to running multiple PoCs in parallel (if you have the capacity or, for example, want to test multiple suppliers), but you should expect these threads to run independently. The big advantage is that you have two strings to your bow and therefore a better chance of success. The drawback is that attention is divided, so the decision to parallelize should depend on the available capacity. Speed is also worth considering when prioritizing: define PoC projects for those business cases where the possibilities and pitfalls of an analytical solution can be validated relatively quickly.
- If the PoC is successful, i.e., the idea has been tested and delivers the expected benefits, it is worth moving on to implementation and incorporating the results into day-to-day operations. If the result is the opposite, stop and draw conclusions; don’t waste more resources on the idea.
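The prioritization step described above can be made concrete with a simple scoring exercise. The sketch below is only an illustration: the `PocIdea` fields, the scoring formula (benefit weighted by feasibility, divided by the weeks needed to validate) and the example numbers are all assumptions made for this example, not a formula taken from our projects.

```python
from dataclasses import dataclass


@dataclass
class PocIdea:
    # All fields are hypothetical inputs estimated by the business and data teams.
    name: str
    expected_benefit: float   # rough benefit estimate, in any consistent unit
    feasibility: float        # 0..1, subjective chance that data and modelling work out
    weeks_to_validate: float  # estimated PoC lead time


def priority_score(idea: PocIdea) -> float:
    """Risk-weighted benefit per week of PoC effort (an illustrative heuristic)."""
    return idea.expected_benefit * idea.feasibility / idea.weeks_to_validate


def prioritize(ideas: list[PocIdea], capacity: int) -> list[PocIdea]:
    """Return the `capacity` best-scoring ideas, highest score first."""
    ranked = sorted(ideas, key=priority_score, reverse=True)
    return ranked[:capacity]


# Made-up example ideas; the numbers are placeholders, not measurements.
ideas = [
    PocIdea("churn prediction", expected_benefit=200.0, feasibility=0.7, weeks_to_validate=10),
    PocIdea("margin optimization", expected_benefit=500.0, feasibility=0.3, weeks_to_validate=20),
    PocIdea("cost anomaly detection", expected_benefit=80.0, feasibility=0.9, weeks_to_validate=4),
]

# capacity=2 stands for "we can run two PoCs in parallel".
selected = prioritize(ideas, capacity=2)
print([idea.name for idea in selected])
```

Dividing by the validation time is one way to reward ideas that can be checked quickly; a team that values benefit over speed could just as well drop that term or weight it differently.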
Risk reduction is guaranteed by smart phasing: we implement only those ideas that the PoC has shown to make sense.
We often go even further in reducing risk: about a third of the way into a PoC project, there is a GO / NO-GO decision. The process of developing a data-driven solution can be simplified, and its risks reduced, by using the following decision tree from use case idea to implementation.
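The two decision points mentioned here, the GO / NO-GO checkpoint partway through the PoC and the decision at its end, can be sketched as two small functions. This is a minimal sketch under stated assumptions: the function names, the inputs (`data_is_usable`, `early_signal`, `measured_benefit`) and the threshold logic are hypothetical and chosen for illustration; a real decision tree would use criteria agreed with the business area.

```python
from enum import Enum


class Decision(Enum):
    CONTINUE_POC = "continue the PoC"
    IMPLEMENT = "move on to implementation"
    STOP = "stop and record lessons learned"


def checkpoint_decision(data_is_usable: bool, early_signal: bool) -> Decision:
    """GO / NO-GO gate partway through the PoC (criteria are assumptions)."""
    if data_is_usable and early_signal:
        return Decision.CONTINUE_POC
    return Decision.STOP


def final_decision(measured_benefit: float, expected_benefit: float) -> Decision:
    """End-of-PoC decision: implement only if the expected benefit was reached."""
    if measured_benefit >= expected_benefit:
        return Decision.IMPLEMENT
    return Decision.STOP


# Example walk-through with placeholder values.
gate = checkpoint_decision(data_is_usable=True, early_signal=True)
print(gate.value)

if gate is Decision.CONTINUE_POC:
    outcome = final_decision(measured_benefit=120.0, expected_benefit=100.0)
    print(outcome.value)
```

Keeping the gates as explicit functions forces the team to write down, before the PoC starts, what "successful" will mean at each point.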
Principles of the methodology:
- Planning – Let’s spend time selecting an idea for PoC, specifically checking feasibility in terms of data availability and quality.
- Toolbox – There is no need for a cloud data warehouse or sandbox, but a sufficient amount of good-quality data really is needed.
- Open-mindedness – It is crucial that the organization be open to such a solution and not see it as an attack on expertise, as experts may feel that “a machine will not tell me what to do”.
- Expertise – When acquiring business knowledge, it is worth involving experts and asking for their help rather than pitting an algorithm against them.
- Lessons Learned – If a PoC is not successful, draw conclusions and use them in the next idea.
- Multiple attempts – If possible, try several types of problems (e.g. prediction, optimization, anomaly detection).
- Tensions – Be prepared for the fact that a new solution can replace old processes, and you have to deal with these changes.
- Simplicity – To maintain motivation and address expectations, start with an easier idea that is expected to be successful, and then deal with riskier ones.
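The planning principle, checking feasibility in terms of data availability and quality, can be started with a very simple check before any modelling. The sketch below assumes the data has already been loaded as a list of dictionaries; the function name `data_feasibility`, the field names and the thresholds (`min_rows`, `max_missing_share`) are hypothetical and would need to be set per use case.

```python
def data_feasibility(rows: list[dict], required_fields: list[str],
                     min_rows: int = 1000, max_missing_share: float = 0.2) -> dict:
    """Check whether there are enough rows and whether required fields are filled in."""
    issues = []
    if len(rows) < min_rows:
        issues.append(f"only {len(rows)} rows, need at least {min_rows}")
    for field in required_fields:
        # A value counts as missing if the key is absent or explicitly None.
        missing = sum(1 for row in rows if row.get(field) is None)
        share = missing / len(rows) if rows else 1.0
        if share > max_missing_share:
            issues.append(f"{field}: {share:.0%} missing")
    return {"feasible": not issues, "issues": issues}


# Tiny made-up customer sample; real checks would run on the full extract.
customers = [
    {"customer_id": 1, "tenure_months": 12},
    {"customer_id": 2, "tenure_months": None},
    {"customer_id": 3, "tenure_months": 30},
    {"customer_id": 4},
]

result = data_feasibility(customers, ["customer_id", "tenure_months"], min_rows=3)
print(result)
```

A check like this does not replace a proper data quality assessment, but it can flag an idea that is not worth a PoC yet because the data is simply not there.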
Arguments against PoC, difficulties
Although there are many arguments in favor of the PoC approach, there are also some arguments against it, and some difficulties, worth highlighting:
- You can’t just launch PoC projects forever. It is important that the data “project funnel” be strong, so that integrated data applications are actually born and the work does not remain experimentation all the time.
- Scalability. It is not always clear how a successful PoC project can be scaled to an implemented solution. This aspect must be taken into account from the beginning.
- Dealing with excessive expectations. It is important that the experts and business decision-makers involved in the process know what to expect. The result of a PoC is not a final solution and not a classic project, and the financial benefits are typically realized in a follow-up project.
- Take the time. One of the advantages of PoCing is that it is fast, but that doesn’t mean you can rush it. Depending on the complexity of the PoC, the lead time can range from 2 weeks to half a year; in our experience, the typical lead time is 10 weeks.
- Buy-in. Without the commitment and motivation of the related business area, it is not possible to design and operate an ML solution successfully.
- Quantify benefits. It is often unclear how the potential benefits can be demonstrated accurately (e.g. savings due to an unforeseen event). Significant effort needs to be allocated to thorough and fair re-measurement and evaluation.
Szabolcs Biró – Head of Advanced Analytics
Ákos Matzon – Advisory Team Leader