On Being Wrong
Wall Street is full of brilliant people with spotless resumes, people who ever since 8th grade have gotten As in school and scored in the top decile on standardized tests like the SAT.
Unfortunately, markets don’t deliver many perfect scores. In investing, beating the market 60% of the time would be a triumphant performance.
Successful investing requires us to confront the problem of mistakes. No investor can avoid them, and we can expect any good strategy to generate a significant number of errors. But is it possible to reduce our errors and still generate high returns? Or is a willingness to be wrong a precondition of beating the market?
Like most other quantitative investment firms, we use linear factor models to rank stocks. The simple premise behind these models is to identify which sensible factors have historically predicted stock returns and then rank today’s stocks according to how they score on each factor. Stocks with higher factor scores will have higher expected returns. However, these factor models tend to explain only about 30% of market movements, meaning they generate a significant number of errors.
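To make the ranking step concrete, here is a minimal sketch of a linear factor score. It is not Verdad's actual model: the tickers, factor values, and weights are all made up, and the choice to standardize each factor before weighting is an assumption for illustration.

```python
from statistics import mean, pstdev

# Hypothetical factor exposures; every number here is made up for illustration.
stocks = {
    "AAA": {"value": 0.9, "momentum": 0.2},
    "BBB": {"value": 0.1, "momentum": 0.8},
    "CCC": {"value": 0.5, "momentum": 0.5},
    "DDD": {"value": 0.3, "momentum": 0.1},
}

# Hypothetical factor weights (in practice these would be estimated from history).
weights = {"value": 0.6, "momentum": 0.4}


def zscores(values):
    """Standardize raw factor values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def linear_scores(stocks, weights):
    """Composite score = weighted sum of standardized factor exposures."""
    names = list(stocks)
    scores = dict.fromkeys(names, 0.0)
    for factor, weight in weights.items():
        standardized = zscores([stocks[name][factor] for name in names])
        for name, z in zip(names, standardized):
            scores[name] += weight * z
    return scores


scores = linear_scores(stocks, weights)
# Highest composite score first: these are the stocks the model ranks best.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In a real universe the sorted list would then be cut into deciles, with decile 10 holding the highest expected returns.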
Take, for example, the linear factor model Verdad uses to rank stocks in the US and Europe. We show a chart below that compares return predictions from the model against actual outcomes. The diagonal line represents the expected return forecast from the model, and the circles represent the realized returns in each decile of stock rank. The distance between each circle and the diagonal line represents the magnitude of the forecast error, or what statisticians call “residuals” (defined as the difference between the realized return and the model’s forecast return).
Figure 1: Return Residuals by Linear Ranking Decile (Jul 1997 to Jun 2018)
Source: S&P Capital IQ and Verdad research
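The residual calculation behind a chart like this can be sketched in a few lines. The rows below are invented for illustration and do not come from our data set; the function name and the row layout are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows of (linear rank decile, forecast return, realized return).
rows = [
    (1, 0.02, 0.03),
    (1, 0.02, 0.00),
    (5, 0.08, 0.10),
    (5, 0.08, 0.05),
    (10, 0.20, 0.35),
    (10, 0.20, -0.05),
]


def decile_residuals(rows):
    """Residual per decile = average realized return minus average forecast."""
    forecasts = defaultdict(list)
    realized = defaultdict(list)
    for decile, forecast, outcome in rows:
        forecasts[decile].append(forecast)
        realized[decile].append(outcome)
    return {d: mean(realized[d]) - mean(forecasts[d]) for d in sorted(forecasts)}


residuals_by_decile = decile_residuals(rows)
for decile, resid in residuals_by_decile.items():
    print(f"decile {decile:>2}: residual {resid:+.3f}")
```

A negative residual means the decile earned less than the model forecast; the further the value is from zero, the further the circle sits from the diagonal line.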
This linear model does an excellent job of ranking stocks. But, curiously, the model’s errors grow larger as the expected return increases: On average, the realized returns in deciles 6 to 10 of linear rank are further from the expected return line than those in deciles 1 to 5.
We wanted to see if we could identify any patterns in what was causing the linear model to make bigger mistakes in higher-ranked stocks. We looked at the major ingredients in our model, plotting forecast errors against four individual factors: leverage, value, size, and momentum.
Figure 2: Return Residuals by Individual Factors (Jul 1997 to Jun 2018)
Source: S&P Capital IQ and Verdad research
These plots show a consistent result: The biggest errors are in the extremes of each factor. The most levered stocks have the highest return dispersion and, therefore, the biggest forecast errors. The cheapest and smallest stocks also have large forecast errors due to higher return dispersion. Finally, firms with extremely low and extremely high returns over the past year (momentum deciles 1 and 10) have a wide dispersion of outcomes over the next year. The factors that predict returns also appear to predict dispersion and model errors.
But what if we could train a separate model to eliminate or reduce these errors? The Holy Grail of investing would be to invest in the extremes of these factors, where dispersion is highest, but avoid the most negative outcomes.
We wanted to figure out exactly what was causing these big errors in the highest-ranked stocks. So we built a data set with all of our linear model’s return forecasts and actual outcomes from 1997 to 2018. We labelled the most negative mistakes (bottom three deciles of residuals) with a “1” because those are the errors we sought to eliminate. Every other residual was labelled with a “0”. The targeted residuals are shaded in green in the chart below.
Figure 3: Distribution of Residuals (Jul 1997 to Jun 2018)
Source: S&P Capital IQ and Verdad research
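The labelling rule described above (a "1" for the bottom three deciles of residuals, a "0" for everything else) can be sketched as follows. The residual values are made up, and the function name and `fraction` parameter are hypothetical.

```python
def label_worst_residuals(residuals, fraction=0.3):
    """Return 1 for residuals in the most negative `fraction`, else 0."""
    n_flagged = int(len(residuals) * fraction)
    # Positions ordered from the most negative residual to the most positive.
    order = sorted(range(len(residuals)), key=lambda i: residuals[i])
    flagged = set(order[:n_flagged])
    return [1 if i in flagged else 0 for i in range(len(residuals))]


# Made-up residuals (realized return minus forecast return) for ten stocks.
residuals = [-0.40, 0.10, -0.05, 0.25, -0.20, 0.02, -0.30, 0.15, 0.00, 0.60]
labels = label_worst_residuals(residuals)
print(labels)
```

With ten residuals and a 30% cutoff, the three most negative values receive a 1. These labels become the target the second model tries to predict.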
We then pulled down every company characteristic we thought might be predictive of model error: valuation, profitability, leverage, momentum, earnings volatility, industry, and dozens of other sensible explanatory variables. Using a subset of this data, we trained a machine learning algorithm to study which quantitative characteristics separate the 1s (mistakes) from the 0s. When we then tested out-of-sample on the remaining data we had originally held out, the machine learning model provided a probability that each new company is a 1 (i.e., a mistake we wish to eliminate). These probabilities were then compared against the actual proportion of 1s in the new data in order to evaluate the predictive performance of the model.
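The evaluation step, comparing predicted probabilities against the actual proportion of 1s, amounts to a calibration check. The sketch below covers only that comparison, not the machine learning model itself. The probabilities and labels are made up, and the equal-sized binning is an assumption about how such a check might be done.

```python
from statistics import mean


def calibration_table(probs, labels, n_bins=5):
    """Sort by predicted probability, split into equal-sized bins, and return
    (mean predicted probability, actual share of 1s) for each bin."""
    pairs = sorted(zip(probs, labels))
    size = len(pairs) // n_bins
    table = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any leftover observations.
        end = (b + 1) * size if b < n_bins - 1 else len(pairs)
        chunk = pairs[start:end]
        predicted = mean(p for p, _ in chunk)
        actual = mean(y for _, y in chunk)
        table.append((predicted, actual))
    return table


# Made-up out-of-sample predictions and true labels (1 = targeted mistake).
probs = [0.1, 0.1, 0.2, 0.2, 0.2, 0.5, 0.6, 0.6, 0.6, 0.7]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]

table = calibration_table(probs, labels, n_bins=2)
for predicted, actual in table:
    print(f"predicted {predicted:.2f}  actual {actual:.2f}")
```

A well-calibrated model produces bins where the two numbers are close, which is what a plot of predicted versus actual probability shows as points near the diagonal.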
Figure 4 presents the out-of-sample results. When presented with completely new data, the machine learning model’s predictions of linear mistakes lined up very closely with reality.
Figure 4: Out-of-Sample Forecasts of Probability of Being Wrong (Jul 1997 to Jun 2018)
Source: S&P Capital IQ and Verdad research
We were thrilled to see these results. The machine learning model can effectively anticipate mistakes from our linear ranking system. We were excited to see whether we could then use this knowledge to improve our investment returns: we’d apparently found the Holy Grail—an ability to reduce the error rates of traditional factor models.
But our enthusiasm quickly turned to surprise when we plotted the probabilities of being wrong against stock returns. The chart below plots returns (expected and realized) by decile of probability of being wrong. The two lines on either side of the bars represent the volatility of realized returns in each decile.
Figure 5: Out-of-Sample Returns by Decile of Probability of Being Wrong (Jul 1997 to Jun 2018)
Source: S&P Capital IQ and Verdad research
As you move from safe decile 1 (13% chance of being wrong) to riskier decile 9 (44% chance of being wrong), the average realized return increases from 13% to 21% per year. This means that in about 90% of cases (deciles 1 through 9), it is not worth trying to eliminate the linear model’s mistakes because doing so would also eliminate stocks with high returns. Being wrong about 40% of the time is actually a good thing as an investor because it means you’re taking a sufficient level of contrarian risk to achieve high returns.
The higher the probability of being wrong, the higher the expected return, and the higher the realized return! We had set out to see if we could improve returns by reducing errors, but our months of quantitative work had produced results that mostly suggested that the only way to improve returns was to take on a higher risk of errors! We had produced either a wonderful proof of market efficiency or a confirmation of Nassim Taleb’s idea of antifragility: “What is antifragile loves randomness and uncertainty, which also means—crucially—a love of errors.”
We had set out to identify the source of our factor model’s errors and thus improve returns, and we discovered that our model produced higher returns precisely because of the errors. The mistakes were the source of return, not its enemy.
And broadly, perhaps this is why quantitative factor investing works. All of the fundamental investors—the straight-A students—don’t want to make an investment with a 40% chance of being wrong. Imagine being an analyst at Viking and pitching stocks with 40% error rates each month: How many disasters would it take before you got fired? In contrast, quantitative investors would look at the base rates and prefer these characteristics.
That is not to say, however, that our efforts were entirely in vain or merely produced a philosophical insight into market efficiency. Look closely at the returns graph, and you will see that in decile 10 of risk—where the average probability of being wrong is about 50%—the realized returns do in fact drop off. Even though the expected return is 20% in the tenth decile, the average realized return is 11% because about half of these stocks experience big price declines over the next year.
Next week, we will discuss how we use these 10th decile probabilities in our investment process. We will talk about how, by eliminating stocks that present uncompensated risk at the extremes, we can take on a heavier dose of sensible factor exposure. We can venture further into deep value and take on more leverage risk because we understand where that risk is best compensated. And crucially, we understand that high error rates are the source, not the enemy, of returns in our investment strategy.