Demand-driven forecasting and the need for clean data

7 December 2015

Rebecca Ameson

Former Consultant

Forecasts. Predictions. Projections. Whatever you want to call them, these insights are a powerful tool in business, particularly in the Consumer Packaged Goods (CPG) industry. Coming from a statistics background, I previously had an entirely theoretical outlook on forecasting and its applications.

However, having recently undertaken a demand forecasting proof of concept for a global CPG company, I have come to realise that there is a significant difference between the considerations of a textbook and those of a demand planner. Getting the forecast right can unlock millions per year in cost savings.

What is a demand-driven forecast?

Very simply put, a demand-driven forecast is the prediction of a future trend or event, taking demand signals (such as price, promotions and seasonality) into account.

Taking the consumer into account

For consumer goods organisations, a demand forecast is critical in managing a supply chain. Previously, companies have focused on supply-based forecasting, but this fails to take the consumer into account. Admittedly, consumer demand is highly volatile and difficult to predict, but cracking this will enable companies to respond proactively to changes in demand: taking reactive lead times out of the equation, reducing safety stock holdings and, ultimately, keeping consumers happy.

So why doesn’t everyone do this?

Anyone can create a forecast. For instance, to forecast sales for the next financial period, the minimum data requirement is some amount of historical sales data. So why do so many consumer products organisations rely so heavily on human intervention instead of system-generated forecasts?

When I was studying forecasting, the biggest challenge was having the technical know-how around data models to boost accuracy. However, we are now in a time where pre-built solutions remove the need for deep statistical knowledge of data modelling (of course, this is not to say that such expertise has lost its value). As a result, over the course of the proof of concept, the biggest challenges stemmed from data quality, not data modelling.

CPG companies have a multitude of data streams to keep track of, both internal (supply chain, sales, stock data) and external (POS, marketing data), so it is no surprise that data quality is a real issue.

A data model is only as good as the data it is provided with

So what kind of data quality issues should you look out for?

The more historical data you can provide, the better

Time-series-based forecasting works by using historical trends to predict future ones. Hence, the longer the history provided, the easier it is for a model to pick up on seasonality, cyclical events and long-term trend to enhance the forecast.
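As a rough illustration of why longer history helps, the sketch below estimates a seasonal profile by simple seasonal averaging. It is not the method used in the proof of concept; the function name and the quarterly figures are made up for the example. With a single year of data the "profile" is just that year's values, noise included, whereas a second year lets each quarter be averaged across cycles.

```python
from typing import List


def seasonal_profile(history: List[float], period: int) -> List[float]:
    """Average the observations that share a position in the seasonal cycle.

    Assumes the history starts at the first position of a cycle.
    """
    if len(history) < period:
        raise ValueError("need at least one full cycle of history")
    profile = []
    for position in range(period):
        same_season = history[position::period]  # e.g. every Q1, every Q2, ...
        profile.append(sum(same_season) / len(same_season))
    return profile


# Hypothetical quarterly sales figures, for illustration only.
one_year = [80.0, 100.0, 120.0, 100.0]
two_years = one_year + [90.0, 110.0, 130.0, 110.0]

profile_short = seasonal_profile(one_year, period=4)   # identical to the raw year
profile_long = seasonal_profile(two_years, period=4)   # averaged over two cycles
```

Each extra year of history adds another observation per season, so a one-off spike in any single year has less influence on the estimated pattern.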

The complete picture is always better than a half finished puzzle

Missing data can cause big problems. Is that a blank value or a zero? Models are able to interpolate blank values in a series, but if these missing values are recorded as zeros rather than blanks, the model will treat them as genuine periods with no sales, and this will skew the data.
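The difference between a blank and a zero can be seen in a small sketch. The example below assumes blanks are represented as None and fills interior gaps by linear interpolation, which is only one of several possible approaches; the function name and the weekly figures are hypothetical.

```python
from typing import List, Optional


def interpolate_blanks(series: List[Optional[float]]) -> List[Optional[float]]:
    """Linearly fill interior None gaps; leading or trailing gaps stay None."""
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


# The same week is missing in both series, but recorded differently.
weekly_sales_blank = [100.0, 110.0, None, 130.0, 140.0]
weekly_sales_zero = [100.0, 110.0, 0.0, 130.0, 140.0]

repaired = interpolate_blanks(weekly_sales_blank)
mean_repaired = sum(repaired) / len(repaired)
mean_zero = sum(weekly_sales_zero) / len(weekly_sales_zero)

print(repaired)                  # the blank week is filled from its neighbours
print(mean_repaired, mean_zero)  # the zero-coded series has a lower average
```

A zero that really means "no record" cannot be distinguished from a genuine week without sales once it is in the series, so it is worth settling the convention at the source.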

Causal factors can make all the difference

A forecast built from sales history alone can be improved by adding in causal factors (such as price changes, POS data, stock quantities, weather changes, etc.) as independent variables in the model. Not all of them will have a significant effect, but any that do can considerably boost forecast accuracy.
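To make the idea of a causal factor concrete, here is a minimal sketch that fits an ordinary least squares line with a single independent variable, a promotion flag. The data are invented and the single-variable model is an assumption chosen for brevity; a real model would typically test several factors and keep only the significant ones.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical weekly data: 1.0 marks a promotion week, 0.0 a normal week.
promo_flag = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
units_sold = [100.0, 104.0, 150.0, 154.0, 96.0, 100.0, 146.0, 150.0]

# Forecast from sales history alone: the plain average of past sales.
history_only_forecast = sum(units_sold) / len(units_sold)

# Forecast that knows next week is a promotion week.
intercept, slope = fit_line(promo_flag, units_sold)
promo_week_forecast = intercept + slope * 1.0

print(history_only_forecast, promo_week_forecast)
```

In this toy data the history-only average sits between normal and promotional weeks, so it would under-forecast any planned promotion; the fitted slope is the estimated uplift.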

Matching data from different sources can be problematic

When including POS data, the biggest challenges were around product SKUs. These may need to be matched between a supplier and a customer, updated as a product is rebranded, or flagged as a discontinued item.

It may be the case that a supplier has one SKU system while a customer has another to meet their own internal standards, so data manipulation may be required here to create a full data set to model and forecast.
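The kind of manipulation involved might look like the sketch below. All SKU codes, the mapping tables and the function names are hypothetical; in practice the mappings would come from master data rather than being hard-coded.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical reference data, for illustration only.
CUSTOMER_TO_SUPPLIER: Dict[str, str] = {"C-1001": "S-A1", "C-1002": "S-A2", "C-1003": "S-A3"}
SUPERSEDED_BY: Dict[str, str] = {"S-A1": "S-A1N"}  # S-A1 was rebranded as S-A1N
DISCONTINUED: Set[str] = {"S-A3"}


def to_current_supplier_sku(customer_sku: str) -> Optional[str]:
    """Translate a customer SKU to the supplier's current SKU, if known."""
    sku = CUSTOMER_TO_SUPPLIER.get(customer_sku)
    seen: Set[str] = set()
    # Follow the rebrand chain, guarding against accidental cycles.
    while sku in SUPERSEDED_BY and sku not in seen:
        seen.add(sku)
        sku = SUPERSEDED_BY[sku]
    return sku


def harmonise(pos_rows: List[Tuple[str, float]]):
    """Total POS units per current supplier SKU; report what could not be matched."""
    totals: Dict[str, float] = defaultdict(float)
    unmatched: List[str] = []
    for customer_sku, units in pos_rows:
        sku = to_current_supplier_sku(customer_sku)
        if sku is None:
            unmatched.append(customer_sku)
        else:
            totals[sku] += units
    flagged = sorted(s for s in totals if s in DISCONTINUED)
    return dict(totals), unmatched, flagged


pos_rows = [("C-1001", 10), ("C-1002", 5), ("C-1001", 7), ("C-9999", 3), ("C-1003", 4)]
totals, unmatched, flagged = harmonise(pos_rows)
print(totals, unmatched, flagged)
```

Keeping unmatched and discontinued SKUs visible, rather than silently dropping them, makes it easier to see how much of the POS feed actually reaches the model.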

Data granularity can help improve overall accuracy

Using product hierarchies allows forecasts to be made at the most granular level of the data and at each level above it; aggregation can then be used to reach the level of choice. Producing a time series at each level of the hierarchy allows the model that best fits the data to be chosen at each level. For example, a regional price change is more likely to show up in regional-level data than in country-level data, so taking it into account means that more accurate forecasts are being ‘rolled up’ to produce the country-level forecast.
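A bottom-up roll-up can be sketched as follows. The regions, sales figures, the moving-average forecast and the price-cut uplift are all assumptions made for illustration; a real hierarchy would fit a separately chosen model at each node.

```python
from typing import Dict, List


def moving_average_forecast(history: List[float], window: int = 2) -> float:
    """Very simple stand-in for a fitted model: mean of the last `window` periods."""
    recent = history[-window:]
    return sum(recent) / len(recent)


# Hypothetical regional sales history.
regional_history: Dict[str, List[float]] = {
    "North": [100.0, 110.0, 120.0],
    "South": [200.0, 200.0, 200.0],
}

# Assumed uplift from a planned price cut that applies to one region only.
regional_uplift: Dict[str, float] = {"North": 1.25}

regional_forecast = {
    region: moving_average_forecast(history) * regional_uplift.get(region, 1.0)
    for region, history in regional_history.items()
}

# Bottom-up aggregation: the country forecast is the sum of its regions.
country_forecast = sum(regional_forecast.values())

print(regional_forecast, country_forecast)
```

Because the price effect is applied where it occurs, the country-level figure inherits it through the roll-up instead of having to detect a diluted signal in the country total.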

It is only once these issues are dealt with, and base models are fitted, that statistical expertise, model fine-tuning and, if necessary, human intervention should be considered.

In summary

With my statistician’s hat on, I can see and understand in theory how clean data can significantly improve the accuracy of a forecast.

As a consultant, I can also see how this benefits consumer goods companies financially: for example, demonstrated double-digit percentage point improvements in forecast accuracy lead to a reduction in safety stock and a tighter supply chain.
