**In this unprecedented time of uncertainty, how can policymakers plan for the future? Jeff Chen explains how it's only possible to predict the future when the phenomenon of interest is sufficiently stable and resilient.**

Planning for the future seems to have become an almost impossible task. Before the pandemic, many people enjoyed some semblance of regularity, whether it was commuting to work and seeing co-workers or going to buy food from the neighbourhood grocer. But the current paradigm feels different and thinking beyond the next day or week presents a challenge – and if even the small things stand on shifting sands, how are decision-makers supposed to plan ahead?

As a computational statistician who has worked in different areas of policy, I look at the world through a mathematical lens. Outlooks about the future are grounded in the principles of forecasting and prediction. If one can predict what will happen next with some degree of certainty, then it stands to reason that it is possible to effect change. This idea of staying ahead of the curve is deeply ingrained in all aspects of public policy, from managing fire risk at the local level to monitoring the economic health of an entire country. It turns out anyone can produce a forecast. Whether that forecast is any good is another matter entirely. So, what is the secret sauce in a useful forecast?

In a word, stability. Social scientists and statisticians have long approached the world in terms of the stability of behaviour and outcomes. Social and economic phenomena are quantified as time series (i.e. data points captured at regular intervals, such as daily or monthly) in order to infer the direction of human events. It turns out we can only predict the future when the phenomenon of interest is sufficiently stable and resilient. This means that a large shock to the system will cause changes, but if the system is stable and resilient, the variables of interest will return to normal levels. In time series terms, such a series is stationary – its average and its variance (around the average) are constant over time.

But what if a phenomenon is not stationary? In practice, this means the course of history is a bit more random, which in turn causes the average and variance to change over time. It becomes harder to predict. A large, exogenous shock (e.g. COVID-19) has effects that are felt in perpetuity rather than fading over time, permanently altering the trajectory of a time series. In these non-stationary cases, a time series is said to have a unit root, a term meaning that the best predictor of what happens tomorrow is what happens today. The reality could turn out to be very different, but we have no better way of predicting it. The figure contrasts time series with and without unit roots.
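The contrast can be made concrete with a minimal sketch. It assumes the simplest textbook model, a first-order autoregression y_t = phi * y_{t-1} + e_t, where phi = 1 gives a unit root and |phi| < 1 gives a stationary series. The model choice and the function names are illustrative assumptions, not something taken from the figure.

```python
def shock_footprint(phi, shock=1.0, horizon=20):
    """Remaining effect of a one-off shock, 0..horizon periods after it hits.

    Assumes y_t = phi * y_{t-1} + e_t, so a shock decays by a factor phi each period.
    """
    effect, path = shock, []
    for _ in range(horizon + 1):
        path.append(effect)
        effect *= phi
    return path


def variance_path(phi, sigma2=1.0, horizon=100):
    """Var(y_t) for t = 0..horizon when y_0 is known (so Var(y_0) = 0).

    Follows from the same model: Var(y_t) = phi^2 * Var(y_{t-1}) + sigma2.
    """
    var, path = 0.0, []
    for _ in range(horizon + 1):
        path.append(var)
        var = phi * phi * var + sigma2
    return path


for phi in (0.5, 1.0):
    print(f"phi = {phi}")
    print("  shock effect after 20 periods:", shock_footprint(phi)[-1])
    print("  variance after 10 and 100 periods:",
          variance_path(phi)[10], variance_path(phi)[100])
```

Under these assumptions, the stationary case (phi = 0.5) shows the shock fading towards zero and the variance settling at a constant level, while the unit root case (phi = 1) keeps the full shock forever and its variance grows with every period.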

This sounds technical, but the practical policy implications of stationarity and unit roots (or their absence) are far-reaching. Central banks, for example, are charged with keeping inflation under control, and in order to do so, they forecast the complex dynamics of the economy to identify which monetary lever they should pull. Municipal services such as fire departments, police, and transit have to forecast demand for their services in order to plan how many staff need to be on hand. In all cases, the policies built upon these forecast models are only valid if the researcher took into account whether the data are stationary or instead follow a unit root process. Failure to detect a unit root can mean policy is built on a house of cards.

So how do we detect this esoteric-sounding yet omnipresent condition? Over many decades, econometricians have developed hypothesis tests (a type of statistical diagnostic) to detect the presence of a unit root in time series data. Yet when these tests matter the most – at the point where a phenomenon stops being stationary and starts to display a unit root – their diagnostic power is only slightly better than a coin toss. Furthermore, many of these tests do not agree on whether a unit root exists, even when looking at the same data. For such an important building block of policy, this is unsatisfactory.
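A small simulation gives a feel for why detection gets hard near the boundary. The sketch below is an illustration only, not the experiment from the working paper: it assumes a first-order autoregression y_t = phi * y_{t-1} + e_t, uses the textbook Dickey-Fuller statistic without intercept or trend, and takes -1.95 as the approximate 5% critical value for that case. The function names, sample size and number of repetitions are assumptions made for the example.

```python
import math
import random

DF_CRITICAL_5PCT = -1.95  # approximate 5% critical value, no intercept or trend


def simulate_ar1(phi, n, rng):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks, y_0 = 0."""
    y = [0.0]
    for _ in range(n):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


def dickey_fuller_t(y):
    """t-statistic on rho in the regression (y_t - y_{t-1}) = rho * y_{t-1} + error."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    resid_var = sum((d - rho * x) ** 2 for x, d in zip(lagged, diffs)) / (len(diffs) - 1)
    return rho / math.sqrt(resid_var / sxx)


def rejection_rate(phi, n=100, reps=500, seed=0):
    """Share of simulated series for which the test rejects a unit root."""
    rng = random.Random(seed)
    rejections = sum(
        dickey_fuller_t(simulate_ar1(phi, n, rng)) < DF_CRITICAL_5PCT
        for _ in range(reps)
    )
    return rejections / reps


for phi in (0.5, 0.9, 0.99, 1.0):
    print(f"phi = {phi}: unit root rejected in {rejection_rate(phi):.0%} of series")
```

With phi = 1 the rejection rate should sit near the nominal 5%. For a clearly stationary series (phi = 0.5) the test should reject almost always, but as phi creeps towards 1 the rejection rate is expected to fall sharply, which is the low-power problem described above. The exact figures depend on the assumed sample size and model and will not match those reported in the paper.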

In our working paper “Standing on the Shoulders of Machine Learning: Can We Improve Hypothesis Testing?”, my co-authors Gary Cornwall (U.S. Bureau of Economic Analysis), Beau Sauley (University of Cincinnati) and I investigate how to close the gap by bringing to bear current-day computational techniques. Most existing hypothesis tests descend from a statistical tradition that traces back to 1925, when Ronald Fisher published Statistical Methods for Research Workers – thus the current state of the art relies on cutting-edge principles from nearly a century ago. Since both stationarity and unit roots have clear mathematical definitions, in our paper we show how to leverage heavy computational infrastructure to generate billions of data points and map out under which conditions the phenomenon exists even when current tests disagree. In essence, we are able to replicate unit roots and stationarity in a lab-like environment inside the computer where we know the “truth”. By applying machine learning techniques, our algorithms take an unprecedentedly close look at hundreds of thousands of scenarios and construct a new type of hypothesis test: a composite test which incorporates information from many of the current tests at once. In so doing, we dramatically reduce any ambiguity in the diagnostic tests.
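The lab-like idea can be sketched in a few lines. The example below is a toy stand-in, not the method from the working paper: it assumes first-order autoregressions, three hand-picked diagnostics (the Dickey-Fuller t-statistic, the Dickey-Fuller normalized bias, and a simple variance ratio), and a plain logistic regression in place of the machine learning models used in the paper. All function names are hypothetical, and the stationary cases are drawn with phi between 0.80 and 0.95, which is easier than the near-unit-root cases the paper targets.

```python
import math
import random


def simulate_ar1(phi, n, rng):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks, y_0 = 0."""
    y = [0.0]
    for _ in range(n):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


def diagnostic_features(y, k=4):
    """Three classical diagnostics used as features (an illustrative choice)."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    resid_var = sum((d - rho * x) ** 2 for x, d in zip(lagged, diffs)) / (len(diffs) - 1)
    tau = rho / math.sqrt(resid_var / sxx)   # Dickey-Fuller t-statistic
    norm_bias = len(diffs) * rho             # Dickey-Fuller normalized bias
    k_diffs = [y[i + k] - y[i] for i in range(len(y) - k)]
    var_1 = sum(d * d for d in diffs) / len(diffs)
    var_k = sum(d * d for d in k_diffs) / len(k_diffs)
    var_ratio = var_k / (k * var_1)          # about 1 for a driftless random walk
    return [tau, norm_bias, var_ratio]


def make_dataset(n_series, n_obs, rng):
    """Label 1 = unit root (phi = 1); label 0 = stationary (phi in 0.80-0.95)."""
    rows = []
    for i in range(n_series):
        has_unit_root = i % 2
        phi = 1.0 if has_unit_root else rng.uniform(0.80, 0.95)
        rows.append((diagnostic_features(simulate_ar1(phi, n_obs, rng)), has_unit_root))
    return rows


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, epochs=300, lr=0.5):
    """Batch gradient descent on standardized features; returns P(unit root | features)."""
    dims, n = len(rows[0][0]), len(rows)
    means = [sum(x[j] for x, _ in rows) / n for j in range(dims)]
    stds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x, _ in rows) / n) for j in range(dims)]

    def scale(x):
        return [(x[j] - means[j]) / stds[j] for j in range(dims)]

    data = [(scale(x), label) for x, label in rows]
    weights, intercept = [0.0] * dims, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dims, 0.0
        for x, label in data:
            err = sigmoid(intercept + sum(w * v for w, v in zip(weights, x))) - label
            grad_b += err
            for j in range(dims):
                grad_w[j] += err * x[j]
        intercept -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return lambda x: sigmoid(intercept + sum(w * v for w, v in zip(weights, scale(x))))


def accuracy(predict_unit_root, rows):
    """Share of series whose unit root label is predicted correctly."""
    return sum(predict_unit_root(x) == bool(label) for x, label in rows) / len(rows)


def run_experiment(seed=0, n_train=400, n_holdout=400, n_obs=200):
    """Train on simulated series with known truth, then score on fresh ones."""
    rng = random.Random(seed)
    train = make_dataset(n_train, n_obs, rng)
    holdout = make_dataset(n_holdout, n_obs, rng)
    prob_unit_root = fit_logistic(train)
    composite = accuracy(lambda x: prob_unit_root(x) > 0.5, holdout)
    single = accuracy(lambda x: x[0] >= -1.95, holdout)  # Dickey-Fuller rule alone
    return composite, single


composite_acc, single_acc = run_experiment()
print(f"composite classifier accuracy: {composite_acc:.1%}")
print(f"single Dickey-Fuller rule accuracy: {single_acc:.1%}")
```

Because the data are simulated, every series carries a known label, so the composite rule can be trained and scored directly against the truth. Whether it beats a single test in this toy setting depends on the assumed features, model and range of phi, so the printed accuracies should be read as an illustration of the workflow rather than as evidence for the paper's results.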

Our paper found that of nine current state-of-the-art unit root hypothesis tests, the best-performing test could detect near-unit roots only 56% of the time. Furthermore, many of the tests rarely agreed in their verdicts. In contrast, our machine learning-based test increased the detection rate of near-unit roots to 92% – a 36-percentage-point increase. One can imagine that in the counterfactual world where our test had been available, countless economic decisions over the last four decades could have unfolded quite differently.

The devil is indeed in the details. If you analyse time series, whether for macroeconomics, epidemiology, or any other inference-driven field, please check out our prototype R package (hypML @ GitHub), which makes running our test as easy as running classical unit root tests. We hope those advising policymakers will use it.