[time series] Using MAPE? Consider these 7 alternatives instead
MAPE (Mean Absolute Percentage Error) is a popular metric for evaluating the accuracy of forecasting models in time series analysis. We'll look at its pros and cons, compare it to sMAPE, walk through several alternatives, and point you to some resources.
First, what is MAPE? It's a way to measure the accuracy of a forecasting model, used widely in time series analysis. Here's the formula:
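MAPE = (100 / n) * Σ |A_t − F_t| / |A_t|

where A_t is the actual value at time t, F_t is the corresponding forecast, and n is the number of observations.

Here's a minimal NumPy sketch of that formula (the function name and interface are my own, not from any particular library):

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, as a percentage.

    Assumes no actual value is zero: division by zero is
    exactly one of the failure modes discussed below.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

print(mape([100, 200], [110, 180]))  # (10% + 10%) / 2 = 10.0
```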
Pros of using MAPE
Interpretability: MAPE is easy to understand. It represents error as a percentage, allowing stakeholders to grasp the model's performance quickly.
Scale-Invariance: Being a percentage-based metric, MAPE is not influenced by the scale of the data, which makes it useful for comparing performance across different time series with varying scales.
Uniform Impact: Each observation contributes uniformly to the overall error measure, unlike squared error metrics where larger errors have disproportionately higher influence.
Emphasizes Relative Error: Because it's percentage-based, MAPE emphasizes the relative size of the error, which can be more important than the absolute size in some applications.
Cons
Undefined for Zero Values: MAPE is undefined when actual values contain zeros, as division by zero is undefined.
Bias Toward Underprediction: MAPE penalizes overprediction and underprediction differently. For the same absolute error of 50, an overprediction (actual 100, forecast 150) scores an APE of 50%, while an underprediction (actual 150, forecast 100) scores only 33%. Underpredictions can never exceed 100% APE (for non-negative forecasts), while overpredictions are unbounded, so minimizing MAPE can bias a model towards underpredicting.
Not Symmetric: Unlike MSE (Mean Squared Error), MAPE is not symmetric in the actual and the forecast: swapping A_t and F_t changes the error, because only the actual appears in the denominator. This is the root of the asymmetric penalty described above.
Not Suitable for Low-Volume Data: MAPE can be skewed or misleading for series with small values, where tiny absolute changes produce huge percentage errors (an error of 2 on an actual of 4 is a 50% APE).
Inconsistent Across Scales: While it's scale-invariant for individual time series, it can be inconsistent when comparing performance across datasets with different units or levels of granularity.
In summary, MAPE has real benefits in interpretability and scale-invariance, but it breaks down when actuals contain zeros and can bias models toward underprediction.
Why MAPE Fails: Bias towards underprediction
Let's double down on "Bias towards underprediction". Here's a quote from Rob Hyndman:
Armstrong (1985, p348) was the first (to my knowledge) to point out the asymmetry of the MAPE saying that "it has a bias favoring estimates that are below the actual values". A few years later, Armstrong and Collopy (1992) argued that the MAPE "puts a heavier penalty on forecasts that exceed the actual than those that are less than the actual". Makridakis (1993) took up the argument saying that "equal errors above the actual value result in a greater APE than those below the actual value". He provided an example where y_t = 150 and ŷ_t = 100, so that the relative error is 50/150 = 0.33, in contrast to the situation where y_t = 100 and ŷ_t = 150, when the relative error would be 50/100 = 0.50.
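We can check Makridakis's numbers with the mape sketch from earlier:

```python
# Same absolute error of 50, opposite directions
print(mape([150], [100]))  # underprediction: 50/150, about 33.3%
print(mape([100], [150]))  # overprediction: 50/100, exactly 50.0%
```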
sMAPE?
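sMAPE (Symmetric Mean Absolute Percentage Error) replaces MAPE's actual-only denominator with the average of the actual and the forecast. In its most common form:

sMAPE = (100 / n) * Σ |F_t − A_t| / ((|A_t| + |F_t|) / 2)

(Beware: several variants circulate in the literature, some without the absolute values in the denominator or without the factor of 2.)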
Differences from MAPE
Symmetry: sMAPE treats overpredictions and underpredictions more symmetrically than MAPE by using the average of the actual and forecasted values as the denominator.
Defined for Zero Values: Unlike MAPE, sMAPE can handle cases where either the actual A_t or the forecast F_t is zero (though not both at once), because the denominator does not become zero.
Still Scale-Invariant: Like MAPE, sMAPE is scale-invariant and can be used to compare forecasts across different scales.
However, it's worth mentioning that sMAPE has criticisms of its own, such as being bounded only when forecasts and actuals have the same sign, and becoming unstable when both values are close to zero. Nonetheless, sMAPE is often favored over MAPE when a more symmetric error metric is desired.
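Here's a minimal sketch of the two-sided form given above (again, the naming and interface are my own):

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric MAPE, as a percentage (common two-sided form)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    return 100.0 * np.mean(np.abs(forecast - actual) / denom)

# The earlier asymmetry example: both directions now score the same
print(smape([150], [100]))  # 50 / 125 = 40.0%
print(smape([100], [150]))  # 50 / 125 = 40.0%
```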
What else can we use?
Hyndman covers a variety of other metrics in the papers linked below. Among them (alongside sMAPE, covered above):
MdAPE: Median Absolute Percentage Error, the median (rather than the mean) of the absolute percentage errors.
sMdAPE: Symmetric Median Absolute Percentage Error, the median version of sMAPE.
MdRAE: Median Relative Absolute Error, the median of |e_t| / |e*_t|, where e*_t is the error of a benchmark (typically naïve) forecast.
GMRAE: Geometric Mean Relative Absolute Error, the geometric mean of those same relative errors.
MASE: Mean Absolute Scaled Error, the mean absolute error scaled by the in-sample mean absolute error of a naïve forecast.
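Before comparing, here's a quick sketch of the relative-error family (MdRAE and GMRAE) against a naïve last-value benchmark; the function names, and the choice of naïve benchmark, are my assumptions:

```python
import numpy as np

def relative_abs_errors(actual, forecast):
    """|e_t| / |e*_t|, where e*_t is the naive (previous value) benchmark error.

    Starts at t=1 since the first point has no naive forecast; assumes
    the series actually changes between steps (nonzero naive errors).
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    e = np.abs(actual[1:] - forecast[1:])      # model errors
    e_star = np.abs(actual[1:] - actual[:-1])  # naive benchmark errors
    return e / e_star

def mdrae(actual, forecast):
    """Median Relative Absolute Error."""
    return np.median(relative_abs_errors(actual, forecast))

def gmrae(actual, forecast):
    """Geometric Mean Relative Absolute Error, computed via logs."""
    r = relative_abs_errors(actual, forecast)
    return np.exp(np.mean(np.log(r)))

actual = [100, 102, 101, 105, 107]
forecast = [99, 101, 103, 104, 106]
print(mdrae(actual, forecast))  # values below 1 beat the naive benchmark
print(gmrae(actual, forecast))
```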
How they compare:
Interpretability:
MAPE, MdAPE, and sMAPE are more interpretable as they use percentage errors.
MASE, GMRAE, and MdRAE are less straightforward to interpret, but they carry a concrete meaning: values below 1 mean the model beat the benchmark forecast.
Sensitivity to Outliers:
Median-based metrics (MdAPE, sMdAPE, MdRAE) are less sensitive to outliers compared to their mean-based counterparts.
Symmetry:
sMAPE and sMdAPE treat overestimations and underestimations more symmetrically compared to MAPE and MdAPE.
Handling Zeros:
MAPE is undefined for zero actual values.
sMAPE, MASE, GMRAE, and MdRAE can handle zero actuals to some extent.
Scale:
MAPE, sMAPE, and their median variants are scale-invariant, but they may not be suitable for comparing models across different data sets.
MASE is explicitly designed to be scale-free, so it can be compared across data sets.
Relative Error:
MdRAE, GMRAE, and MASE use relative errors, taking into account the historical data, which can be useful to gauge performance in the context of data volatility.
Mathematical Properties:
GMRAE involves a geometric mean, making it more complex to compute.
MASE adjusts the error by the mean absolute error of a naïve forecast, providing a normalized measure (sketched below, after this comparison).
Applicability:
MAPE and sMAPE are more general and can be used in many settings.
MdRAE, GMRAE, and MASE are often more specialized and are used when understanding error in the context of historical data volatility or scaling is important.
Computational Complexity:
MAPE, sMAPE, and MASE are computationally simpler.
GMRAE involves a geometric mean (a product and an nth root, or equivalently an average of logs), making it slightly more involved to compute.
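Since MASE keeps coming up, here's a minimal sketch of the non-seasonal version, following Hyndman's definition of scaling by the in-sample MAE of the one-step naïve forecast (the function name and the train/test framing are my own):

```python
import numpy as np

def mase(train, actual, forecast):
    """Mean Absolute Scaled Error (non-seasonal).

    train:    in-sample actuals, used only for the scaling term
    actual:   out-of-sample actuals
    forecast: out-of-sample forecasts
    """
    train = np.asarray(train, dtype=float)
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    # In-sample MAE of the naive "repeat the previous value" forecast
    scale = np.mean(np.abs(train[1:] - train[:-1]))
    return np.mean(np.abs(actual - forecast)) / scale

# MASE < 1: the model beat the naive benchmark on this data
train = [10, 12, 11, 13, 12, 14]
print(mase(train, [13, 15], [13.5, 14.5]))  # 0.5 / 1.6 = 0.3125
```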
Hope you enjoyed this
Now you know a few more metrics. Keep them in your toolkit in case you need them down the road.
Resources
Rob Hyndman: https://robjhyndman.com/hyndsight/smape/
Rob Hyndman (2006): https://www.robjhyndman.com/papers/mase.pdf