Forecasting not Prediction: A real-time Social Sciences Earth Observatory

I believe the holy grail of political and security risk analysis is accurate forecasting as opposed to prediction, which draws on one’s mystical abilities to see the future like Nostradamus did. The goal is to forecast likely scenarios using available data and a stringent regimen of the good ol’ scientific method. If this can be done accurately and reliably, one should be able to make (semi)informed judgments about what will happen in the future, and where, when and why. Granted, we have been trying to do this for years and we have certainly not figured it out fully yet, but we are well on our way. If you are interested in seeing how far scientists have come, I urge you to read these articles:

*Although some of them call it prediction here, they mean forecasting 😉

In the coming months I will be looking into the field of forecasting, as I believe we are on the precipice (or perhaps already over it) of actually being able to gain valuable information from analysing readily available open-source data on global events. Leading the field, specifically for forecasting human societal-scale behaviour, is the Global Database of Events, Language, and Tone (GDELT). This initiative by Kalev Leetaru, Philip Schrodt, and Patrick Brandt attempts to gather data on all global events, make this data freely available for open research, and provide daily updates to create the first “real-time social sciences earth observatory”.

So what is GDELT? The acronym stands for the Global Database of Events, Language, and Tone (earlier material expands it as Global Data on Events, Location and Tone). The creators describe it as a “CAMEO-coded data set containing more than 200-million geo-located events with global coverage for 1979 to the present. The data are based on news reports from a variety of international news sources coded using the Tabari system for events and additional software for location and tone.” It uses state-of-the-art automated, machine-coded data-gathering methods. What this means is that it is capable of collecting and coding over 26-million news reports in six minutes, leaving the prospective researcher with over 3-million categorised new events to analyse. The same task done by hand would take about 500,000 man hours and cost around $10-million, so yeah.

These events are then divided into four groups (Quad Classes): verbal co-operation, material co-operation, verbal conflict, and material conflict. Each event contains info on what happened, where it happened, who was involved and what the tone of the article was. Each event is also given an impact score, called a Goldstein (1992) score. With this set of data, the fun process of analysis and perhaps even forecasting can now begin. Below is a map illustrating one day’s events and interactions in the world in the middle of the Arab Spring in 2011. The colour of the dots indicates a positive (green) or a negative (red) event and the links show interaction between actors.

GDELT Map

Another interesting map is the ‘heat map’ below, illustrating global security-related activity as measured by GDELT on the same day in 2011.

 GDELT 2013
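To make the event structure described above a bit more concrete, here is a minimal Python sketch of how one might group events by Quad Class and average their Goldstein scores. The field names (`actor1`, `actor2`, `quad_class`, `goldstein`, `tone`) and the sample records are my own simplified, made-up assumptions and not real GDELT rows. The numbering of the four classes (1 to 4) follows how I understand the data set to code them, so treat that as an assumption too.

```python
from collections import defaultdict
from statistics import mean

# Quad Class codes (assumed numbering, 1 to 4) mapped to the labels used above.
QUAD_CLASSES = {
    1: "verbal co-operation",
    2: "material co-operation",
    3: "verbal conflict",
    4: "material conflict",
}

# Hypothetical, hand-made sample events (NOT real GDELT data).
# "goldstein" is the impact score, "tone" the tone of the source article.
SAMPLE_EVENTS = [
    {"actor1": "EGY", "actor2": "USA", "quad_class": 1, "goldstein": 3.0, "tone": 1.2},
    {"actor1": "USA", "actor2": "EGY", "quad_class": 2, "goldstein": 7.0, "tone": 2.5},
    {"actor1": "SYR", "actor2": "TUR", "quad_class": 3, "goldstein": -2.0, "tone": -3.1},
    {"actor1": "LBY", "actor2": "LBY", "quad_class": 4, "goldstein": -10.0, "tone": -6.4},
    {"actor1": "SYR", "actor2": "SYR", "quad_class": 4, "goldstein": -8.0, "tone": -5.9},
]


def summarise_by_quad_class(events):
    """Return the event count and mean Goldstein score per Quad Class label."""
    grouped = defaultdict(list)
    for event in events:
        label = QUAD_CLASSES[event["quad_class"]]
        grouped[label].append(event["goldstein"])
    return {
        label: {"count": len(scores), "mean_goldstein": mean(scores)}
        for label, scores in grouped.items()
    }


summary = summarise_by_quad_class(SAMPLE_EVENTS)
for label, stats in summary.items():
    print(f"{label}: {stats['count']} event(s), mean Goldstein {stats['mean_goldstein']:.1f}")
```

On a real download one would read the daily files rather than a hard-coded list, but the grouping step would look much the same.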

The authors of the GDELT initiative have been gracious enough to make the whole massive data set available for download. Furthermore, they state that new events will be added daily. It should be noted that government agencies across the world have been involved with similar studies for years, but finally this kind of data is now becoming freely available to the public.

A recent, interesting article in the Wall Street Journal examines a similar topic. It describes a US federal intelligence research project, funded by the Intelligence Advanced Research Projects Activity (IARPA) and managed by Virginia Tech professor Naren Ramakrishnan. According to a program announcement: “The goal of the program is to develop continuously automated systems that use information from these sources to predict when and where a disease outbreak, riot, political crisis or mass violence might occur. Currently, the project is focusing on events in Latin America.” The project recently succeeded in forecasting the 7 September socio-economic protests in Brazil.

The goal of my future exploration into this subject will be to find a systematic and simple way to use these databases for political risk analysis. I believe that political risk products could benefit greatly from the forecasting capacity that these studies and programmes could open the door to. However, it should be noted that most of these studies gather their information from current and past media sources, and, as one knows, there are news sources (I won’t name and shame, though… Fox… ugh… Xinhua) that clearly do not follow a double-blind peer-review system before they publish their stories. Still, I believe there lies great opportunity in this field of research and that the real benefit of it is just beginning to be realised.

Just remember: “A good forecaster is not smarter than everyone else, he merely has his ignorance better organised” and “Those who have knowledge, don’t predict. Those who predict, don’t have knowledge”.

Written by Barend Lutz – Political Risk Analyst | Social Media Manager at red24
