Professor Neil Lawrence and Jess Montgomery discuss the potential of data science to generate new insights for research and policy.
In the classic film Monty Python and the Holy Grail, John Cleese, as Sir Lancelot the Brave, finds a note – a plea from a prisoner of Swamp Castle – beseeching the discoverer to help them escape an unwanted marriage. Responding to this distress call, Sir Lancelot singlehandedly storms Swamp Castle, slaying the guards, terrorising the wedding guests, and fighting his way to the Tall Tower. There, instead of the expected damsel in distress, cruelly imprisoned, Sir Lancelot is surprised to find a wayward groom, Prince Herbert, who sent the note after an argument with his father.
The United Kingdom is considered an international leader in science, and a pioneer in the provision of science advice. The Government has well-established structures for accessing scientific expertise in emergencies through its network of science advisers, the Scientific Advisory Group for Emergencies (SAGE) and departmental advisory committees, including the Science for Pandemic Influenza Groups that provide advice on covid-19 modelling and behavioural science. Together, these structures might call to mind a different Arthurian vision, evoking the works of Thomas Malory: the scientist as Merlin, giving wise counsel to Arthur and honing the Government’s decision-making through deep knowledge of the scientific arts.
Scientists are concerned citizens, and it is perhaps with this vision of adviser as trusted arbiter that many researchers entered into public and policy debates surrounding covid-19. While pursuing the wise Merlin, however, efforts to advise government can easily drift towards Monty Python’s Lancelot. Confident in his knowledge of castle-storming, his individual dedication and his skills in damsel-rescuing, Sir Lancelot enters the fray with only a partial understanding of the challenges and circumstances at hand.
Science policy has long sought ways of bridging the gaps between scientists and policymakers, helping each understand the ways in which evidence can inform policymaking. The UK’s response to the covid-19 pandemic has highlighted the importance of this work, and the long-standing cultural issues that make this mission so challenging. Driven by experiment and theory, scientists often seek definitive answers to a particular question, with each new study prompting more questions and illuminating areas for investigation that stretch into the future. In contrast, policy advice is often rooted to a moment in time. Events cannot wait for definitive scientific understanding. Instead, policymakers need access to high-quality advice that provides actionable insights, based on current understandings, while acknowledging areas of uncertainty.
But what constitutes the best current understanding? Any policy issue can be viewed through multiple lenses: the scientific principles at hand, the economic implications, public acceptability of potential responses, the values embedded in those responses, or operational considerations in policy delivery, amongst others. Each of these lenses is important in considering the evidence available to inform responses to the covid-19 pandemic: the complexity of the pandemic, our relatively limited understanding of the virus, and the practical difficulties of implementing public health policy demand a range of expertise.
These complex and uncertain challenges require a multidisciplinary response.
The unprecedented nature of the pandemic has spurred multiple efforts to bring research expertise to bear on covid-19 policymaking. These have included a call for rapid assistance from the modelling community, a group providing rapid review and literature analysis, and an independently convened SAGE group. Our own experience is of another of these efforts – the Royal Society-convened DELVE Initiative.
In the early stage of the pandemic, as many governments struggled to implement policies that held back the first wave of infections, data scientists began to explore how advanced analytics could complement traditional forms of government science advice. Chaired by the President of the Royal Society and feeding into SAGE, the DELVE Initiative set out to analyse data from countries at a more advanced stage of the covid-19 pandemic, using these insights to inform UK policy responses.
Data science for ‘real world’ policy questions can only be done effectively with access to domain expertise: extracting insights from data is important, but applying these insights to policy development requires the contextual understanding brought by domain experts, including those embedded in the policymaking process. Mapping this onto the attack of Swamp Castle, Professor Lancelot is better advised to consult his colleagues at the Round Table before charging the Tall Tower – while Lancelot has the tools to break down the doors, other knights may know the residents, the routes in, and why the distress call was sent.
Bridging the ‘supply chain of ideas’ between researchers and policymakers has been core to DELVE’s approach. The breadth of covid-19’s impacts and potential policy responses has required that DELVE make connections across public health, economics, behavioural science, immunology, and more, and the value of collaboration can be seen across DELVE reports. For example, in an early report on testing and tracing systems, researchers from public health brought a wealth of insights about how to detect and manage disease outbreaks in communities; data scientists translated this into analysis that quantified the compliance rates needed for testing and tracing efforts to succeed; and economists contextualised this with evidence about what interventions would encourage individuals and organisations to comply with a test, trace, isolate regime. DELVE’s remit became interdisciplinary by default, with data the focal point around which to convene domain experts.
This type of evidence synthesis would traditionally rely on collaborations developed with the luxury of time – time to understand how different disciplines frame an issue and to identify the different types of evidence that might be policy-relevant. Using data as a convenor has offered a short-cut through these discussions, by creating a common focal point from which different domain experts can explore their ideas. Arthur brought his knights together through the convening power of a sword, Excalibur; DELVE convened its scientists through data.
Despite the importance of data in enabling rapid evidence synthesis, access to data has been a consistent barrier to further action in pursuing this ambitious research agenda. To labour the parallels to Arthurian legend, in many cases relevant data was so difficult to identify and access that it may as well have been mythical. But in practice it was the idea that the data might exist and be accessible, rather than its actual availability, that was sufficient to convene expertise through DELVE.
Barriers to government data sharing – whether resulting from perceived legal issues, lack of capability in government, or technical barriers to data use – were well-characterised before the pandemic, but have been thrown into sharp relief in recent months. Where successful data sharing arrangements have been established to support the covid-19 response, these have tended to rely on pre-existing relationships between data scientists and policymakers that foster shared understandings of how to use data in research and policy. In some ways, other disciplines have already learned this lesson – sustained engagement between government and academia has played a central role in major policy changes across government, from the smoking ban to the Climate Change Act. If data science is to find a role in policymaking, it will need to build on these experiences.
A new model of open data science, which capitalises on the power of data in convening multidisciplinary exchanges, will be vital if we are to realise the potential of data science for research and policy. By building a community of researchers at the interface of data science and other disciplines, there is an opportunity to create exciting new research agendas that both advance data science methods and generate new insights for research and policy. Such a community would embed multidisciplinary engagement in its research culture, developing relationships and building capacity for rapid response to future policy challenges. It would seek to create a governance environment in which data can be used safely and rapidly, while ensuring that data analysis tools are made widely available, with clear information about how to use them.
This open data science model will be central to the work of the Accelerate Programme for Scientific Discovery, a new initiative from the Cambridge Computer Lab that will pursue research at the interface of machine learning and the sciences. By operating outside the traditional boundaries that separate disciplines, open data science could bridge the gap between ‘data science’ and the domains that would benefit from its tools and techniques, enabling ideas to spread rapidly and ultimately advancing scientific discovery for the benefit of society.
The views and opinions expressed in this post are those of the author(s).