Predictive Analytics in the Time of COVID-19

April 1, 2020 2 min

Could we have made better use of technology to predict the outbreak of COVID-19?

The health, safety, and economic impacts of COVID-19 around the world are still unfolding. It may be months or years before we fully appreciate the devastation of this disease on society, business, culture, and our sense of self.

One question that some are beginning to ask is this: could we have made better use of technology to predict the outbreak of COVID-19?

The creators of the Netflix documentary series Pandemic, released in January 2020 and covering the potential for an influenza outbreak, now seem almost prophetic. Likewise, a widely circulated video of Bill Gates's 2015 TED talk on the dangers of a future global disease outbreak seems prescient in March 2020.

I talk to clients all the time about the massive and disruptive potential for artificial intelligence in business, government, education, and medicine. I am concerned that even with the potential power of AI to detect novel patterns in data, we somehow missed faint signals of an outbreak amidst all of the “noise”. Even though these signals were weak, we must learn to harness the power of predictive analytics more effectively in future. If we can pick up on such faint signals earlier, the benefit to health and safety across society could be enormous.


Professional and citizen data scientists today have a vast array of information sources to tap. Narrowing the choice of data pipelines down to the right streams is part of the art of data science. Data points from disparate sources such as healthcare providers, government websites, Facebook posts, Twitter feeds, trusted news sources, mobile alerts, WhatsApp groups, email, and mobile data all constitute potential sources of unstructured data. This real-time data can then be synthesised with more structured (but less dynamic) information contained in various public datasets to discover possible leading indicators of disease outbreaks.
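To make the idea of a "leading indicator" a little more concrete, here is a minimal sketch of one very simple way to spot a faint signal in a real-time stream: compare each day's count with the baseline of the preceding week. Everything in it is an assumption for illustration only: the function name `flag_unusual_days`, the seven-day window, the threshold of three standard deviations, and the made-up counts of posts mentioning a symptom keyword. It is not a description of how any real surveillance system works.

```python
from statistics import mean, stdev


def flag_unusual_days(daily_counts, window=7, threshold=3.0):
    """Return the indices of days whose count sits far above the trailing baseline.

    A day is flagged when its count is more than `threshold` sample standard
    deviations above the mean of the preceding `window` days.
    """
    flagged = []
    for day in range(window, len(daily_counts)):
        baseline = daily_counts[day - window:day]
        mu = mean(baseline)
        sigma = stdev(baseline)
        # A perfectly flat baseline has zero spread, so no z-score can be computed.
        if sigma == 0:
            continue
        z_score = (daily_counts[day] - mu) / sigma
        if z_score > threshold:
            flagged.append(day)
    return flagged


# Synthetic, made-up daily counts of posts mentioning a symptom keyword.
mentions = [12, 15, 11, 14, 13, 12, 16, 14, 13, 41, 58]
print(flag_unusual_days(mentions))
```

In practice a signal like this would be only one of many inputs, and it would need to be checked against the slower, more structured public datasets before anyone treated it as meaningful.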

There is an emerging field in data science that is becoming more commonly known as “outbreak analytics”. This approach draws on widely disparate sources of data, often thousands of sources in real time, to determine whether the independent variables they supply are beginning to give off faint signals that an outbreak of disease might be imminent. Data scientists can then run these thousands of seemingly unrelated independent variables through their AI toolkits. The data scientist is then in a position to try to solve for the all-important dependent variable: how likely is it that an outbreak of disease is about to occur? Outbreak analytics deserves at least as much expertise and investment as the analytics that companies are developing today to predict buying, selling, fraud, or risk patterns.
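The relationship between independent variables and the dependent variable can be sketched in a few lines. The example below uses plain logistic regression, chosen here only because it is one of the simplest models that turns a set of input signals into a probability; it is not a claim about which technique outbreak analytics teams actually use. The two input signals, the training rows, the labels, and the function names are all hypothetical and synthetic.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def outbreak_probability(signals, weights, bias):
    """Dependent variable: estimated probability of an outbreak given the signals."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, signals)))


def train_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the logistic loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for signals, label in zip(rows, labels):
            error = outbreak_probability(signals, weights, bias) - label
            for j, s in enumerate(signals):
                grad_w[j] += error * s
            grad_b += error
        # Move each parameter a small step against the average gradient.
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


# Synthetic training data (hypothetical). Each row holds two independent
# variables scaled to 0..1: [symptom-search intensity, clinic-visit intensity].
rows = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
        [0.7, 0.8], [0.8, 0.6], [0.9, 0.9], [0.6, 0.7]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = an outbreak followed, 0 = it did not

weights, bias = train_logistic(rows, labels)
print(outbreak_probability([0.9, 0.8], weights, bias))  # strong signals
print(outbreak_probability([0.1, 0.1], weights, bias))  # weak signals
```

A real model would have to cope with thousands of noisy inputs and very few confirmed outbreaks to learn from, which is a large part of why this area needs more expertise and investment.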


Maybe one of the silver linings in all of this is that we could see innovations emerge in this area driven by professional and citizen data scientists alike.

Given the seriousness of the COVID-19 outbreak, perhaps more attention will now be given to the use of predictive analytics for the public good, by big technology companies and smaller organisations, and by professional and citizen data scientists alike. AI approaches like natural language processing, natural language understanding, machine learning, and deep learning have the potential to make sense of the massive amounts of public data at our disposal.