
Written by Sam Gilbert

Using Google data for policy research

Sam Gilbert explores the revealing data on our collective thoughts and needs easily available from Google – how it has been used, and how you can access it too.

Answer the Public - Public policy question

Every day, we use Google to make a mind-boggling 5.6 billion searches, telling it (in the words of AnswerThePublic’s Sophie Coley) “things we might not tell our partners, our friends, our family…even our doctor”.  Go to Google now and start typing “I’ve just h” into the search box and you will see what Sophie means. As I’m writing this, here are some of the searches Google anticipates, based on what it’s seen people search for before:

  • “I’ve just had enough”
  • “I’ve just had a car accident”
  • “I’ve just had a panic attack”
  • “I’ve just had the craziest week”
  • “I’ve just had a baby, what am I entitled to?”
  • “I’ve just had a period but feel pregnant”
  • “I’ve just had a poo and there was blood”

What this revealing set of searches highlights is that we turn to Google not just for practical information, but also for advice at life’s most significant and intense moments. Because we are so unfiltered when we use Google and other search engines, the data created by our internet searches is hugely powerful. It can be thought of as a vast reservoir of human needs and desires that grows deeper every minute. It’s a profound expression of our collective consciousness.

In the commercial sector, analytically minded marketers have been mining big search data for some time.  With help from the data science company Taxonomics, the DIY hardware retailer Screwfix used search data to discover mismatches between how consumers articulated demand for its products and how it had listed them on its website.  Simple changes like renaming “outdoor lighting” as “security lighting”, and “nailers” as “nail guns”, resulted in a ten-fold increase in online sales from Google search.  The founders of Worldstores built a £100m-turnover business by using the same approach to internet search data, setting up multiple niche websites like ShedsWorld and TrampolineWorld that catered to clusters of demand they had found.  And it was big search data analysis that I used to determine product strategy for the fintech startup Bought By Many.  I found unmet demand among owners of specific dog breeds like pugs and cockapoos: seven years later, Bought By Many is the world’s No.1 provider of insurance for unusual pets.

Search data analysis is no more limited to the commercial realm than Google searches are limited to consumer products.  A literature review I did recently with SAGE Ocean surfaced 265 peer-reviewed academic articles that use search data.  Many focus on public health: search data has been used to predict the spread of the Mayaro virus, identify seasonal patterns in domestic violence, and analyse awareness of rheumatoid arthritis.  But while there are a handful of economics articles that use search data to understand changes in consumer demand and movements in financial markets, barely any articles informed by search data have been published in sociology, psychology, or political science journals.  It seems that social scientists are missing out.

This isn’t because there aren’t any interesting puzzles in the social sciences that search data can help with.  Nate Silver has famously highlighted a correlation between support for Donald Trump in the 2016 Republican primary and a geographic analysis of racist Google searches by the data scientist Seth Stephens-Davidowitz. Sophie Coley has used Google search data to reveal Brexit anxieties about the potential for cancer drug shortages, a house price crash, and even civil war. In my own research, I’ve used search data to qualify Bernard Manin’s theory of “Audience Democracy” by comparing interest in parties and party leaders in 60 states, and to predict the outcome of party leadership contests.

Liberal Democrat leadership graph

Rather, I suspect the reason big search data isn’t more widely used in the social sciences is that many people simply aren’t aware of how they can access it.  So if this blog has piqued your interest, where should you start?  I would suggest with Google Trends. Designed to allow ordinary Google users to draw on the company’s vast reservoir of internet search data, it is free to use and has a straightforward online interface.  Queries will return an indexed volume for any search term or topic over time, going back to 2004.  In a recent training session I ran for the Bennett Institute team, simple Google Trends queries illuminated research questions on post-financial crisis US mortgage demand, the economic value of free software, and UK public interest in climate change.  

Climate change interest graph
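To make “indexed volume” concrete, here is a minimal Python sketch of the idea, under a simplifying assumption: that the index just rescales a series so its busiest point equals 100. Google Trends also adjusts for total search volume and works from samples, which this sketch does not attempt to reproduce. The function name and the monthly counts are hypothetical, invented purely for illustration.

```python
def index_to_peak(volumes):
    """Rescale raw volumes so the largest value becomes 100.

    Simplified illustration of an indexed series; not Google's
    actual calculation.
    """
    peak = max(volumes)
    if peak == 0:
        # No searches at all: every point is indexed to zero.
        return [0 for _ in volumes]
    return [round(100 * v / peak) for v in volumes]


# Hypothetical monthly search counts, invented for illustration only.
raw_counts = [1200, 1500, 3000, 2400, 600]
print(index_to_peak(raw_counts))
```

The practical upshot is that an indexed series tells you when interest peaked and how other periods compare with that peak, not how many searches were actually made.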

Another accessible route into search data is AnswerThePublic, which aggregates the Google autocomplete suggestions with which we began this blog post.  You can run three keyword-based queries per day for free, and AnswerThePublic will return CSV downloads of searches formulated using question words, comparisons, and prepositions, along with visualisations like this one for “democracy”:

Democracy diagram
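If you would rather explore a list of searches in code, the grouping described above can be roughly sketched in Python. This is an illustration only: the word lists, the function name, and the sample queries are my own assumptions, not AnswerThePublic’s actual categories, file layout, or data.

```python
from collections import defaultdict

# Illustrative word lists (assumptions, not AnswerThePublic's own).
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how", "which"}
COMPARISONS = {"vs", "versus", "or", "like"}
PREPOSITIONS = {"for", "in", "with", "without", "to", "near"}


def group_queries(queries):
    """Bucket search phrases by the kind of modifier they contain."""
    groups = defaultdict(list)
    for query in queries:
        words = query.lower().split()
        if words and words[0] in QUESTION_WORDS:
            groups["questions"].append(query)
        elif any(w in COMPARISONS for w in words):
            groups["comparisons"].append(query)
        elif any(w in PREPOSITIONS for w in words):
            groups["prepositions"].append(query)
        else:
            groups["other"].append(query)
    return dict(groups)


# Hypothetical searches, invented for illustration only.
sample = [
    "what is democracy",
    "democracy vs republic",
    "democracy in america",
    "why democracy matters",
    "democracy index",
]
for group, items in group_queries(sample).items():
    print(group, items)
```

Checking the first word for questions, and any word for comparisons and prepositions, is a deliberately crude rule; real search phrases will need more careful handling.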

So, why not spend fifteen minutes using these tools to explore some topics that are interesting for your own research?

  • About the author

    Sam Gilbert, Affiliated Researcher

    Sam is an entrepreneur and researcher working at the intersection of politics and technology.  His interests include the political legitimacy of big tech companies, and methods innovation using internet search data.
