Guest post: What 100,000 studies tell us about climate impacts around the world
The rapid growth of climate research provides an unprecedented evidence base for observing the impacts of climate change across the globe.
However, the sheer volume of published studies makes evaluating this evidence as a whole a daunting challenge.
In our new study, published in Nature Climate Change, we used machine-learning approaches to assess, classify and map more than 100,000 peer-reviewed studies on climate impacts.
Our findings show that the influence of human-caused warming on average temperature and rainfall can already be felt for 85% of the world’s population and 80% of the world’s area.
The results also highlight an “attribution gap” between countries in the global north and south, due to a relative lack of research on climate impacts in less-developed countries.
§ Machine learning
Since the Intergovernmental Panel on Climate Change (IPCC) published its first assessment report in 1990, the number of studies relevant to observed climate impacts published per year has increased by more than two orders of magnitude.
The first part of the IPCC’s most recent assessment report – published in August – alone references more than 14,000 scientific papers.
This exponential growth in peer-reviewed scientific publications on climate change is already pushing manual expert assessments to their limits.
In this study, we take a machine-learning approach, developing an algorithm that can recognise not just whether a study is about climate impacts, but also the locations it mentions, the climate impact driver – whether the impacts were caused by temperature or precipitation changes – and the type of impacts described.
To do so, we use a state-of-the-art deep-learning language representation model called “BERT”. The model can capture the context-dependent meaning of text, which allows it to extract the information we are looking for from each study we analysed.
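For readers curious what this looks like in practice, the snippet below sketches a BERT-style classifier for abstracts using the open-source “transformers” library. It is a minimal illustration rather than our actual pipeline: the base model, the two driver labels and the example abstract are all placeholder assumptions, and the probabilities it prints are only meaningful once the model has been fine-tuned on labelled documents.

```python
# Minimal sketch of a BERT-style multi-label classifier for study abstracts.
# Model name, labels and example text are illustrative placeholders, not the
# configuration used in the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["temperature", "precipitation"]  # hypothetical climate-driver labels

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # a study can have several drivers
)

abstract = "Warming winters have shortened the snow season in the Alps."
inputs = tokenizer(abstract, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Independent per-label probabilities via a sigmoid, since the categories are
# not mutually exclusive. Before fine-tuning, these are essentially random.
probs = torch.sigmoid(logits).squeeze()
for label, p in zip(LABELS, probs):
    print(f"{label}: {p:.2f}")
```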
We trained our algorithm using “supervised learning”, which involved our team hand-coding more than 2,000 documents. The algorithm was then able to replicate the classification decisions made by human coders well. Its predictions are, of course, not perfect, but our approach allows us to quantify uncertainty ranges for those predictions explicitly.
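The toy example below shows one way such uncertainty ranges can be constructed – a hypothetical sketch, not the exact procedure from our paper. Instead of counting only hard yes/no classifications, each document’s predicted probability of relevance feeds into a simulated range for the aggregate total:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-document probabilities that a paper is relevant, as a
# fine-tuned classifier might produce. Real probabilities would come from
# the model; these are random placeholders.
p = rng.beta(2, 5, size=10_000)

hard_count = int((p >= 0.5).sum())   # count using hard yes/no decisions
expected = p.sum()                   # expected total under the model

# Simulate the total by sampling each paper's inclusion from its probability,
# giving a simple 95% range for the aggregate count.
draws = rng.binomial(1, np.tile(p, (1000, 1))).sum(axis=1)
low, high = np.percentile(draws, [2.5, 97.5])
print(f"hard count: {hard_count}, expected: {expected:.0f}, "
      f"95% range: [{low:.0f}, {high:.0f}]")
```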
From our experience with documents double-coded by different human coders, we can testify that human classification is not free of errors or disagreement either. How the performance of machine learning compares with that of human coders is an interesting area for further research.
In total, our algorithm identified 100,000 studies that document ways in which human and natural systems have been impacted by climate and weather. These papers are drawn from the journal databases Web of Science and Scopus. We do not filter the studies according to their quality or the prestige of the journals they are published in.
The chart below shows how the number of scientific papers documenting climate impacts is large and growing. The blue shading reflects the uncertainty in our machine-learning predictions.
§ Weight of evidence
Where possible, our approach draws out the location where each study is focused. This allows us to map how these studies are distributed across the globe, as shown below.
Each cell has a “weighted studies” score, with the darker areas of the map indicating where evidence is more dense – that is, where there are more studies referring to each grid cell.
For example, almost every grid cell in Europe has several studies documenting climate impacts. In contrast, there are some areas – particularly in Africa – where the evidence is much sparser.
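To make the “weighted studies” idea concrete, here is a schematic reconstruction – with invented locations and an assumed grid resolution, not our exact aggregation code – of how extracted study locations could be binned into grid cells, with multi-location studies contributing fractional weight to each:

```python
import numpy as np

# Hypothetical study locations (lat, lon) extracted from abstracts; a study
# mentioning several places contributes fractionally to each cell.
studies = [
    [(48.1, 11.6)],                 # one location -> weight 1.0
    [(52.5, 13.4), (-1.3, 36.8)],   # two locations -> weight 0.5 each
]

res = 2.5  # grid resolution in degrees (illustrative)
nlat, nlon = int(180 / res), int(360 / res)
grid = np.zeros((nlat, nlon))

for locations in studies:
    weight = 1.0 / len(locations)
    for lat, lon in locations:
        i = int((lat + 90) // res)   # latitude bin
        j = int((lon + 180) // res)  # longitude bin
        grid[i, j] += weight

print(grid.sum())  # total weight equals the number of studies
```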
Use the filters to switch between climate drivers – temperature or precipitation – and types of impact – such as “mountains, snow and ice” or “rivers, lakes and soil moisture”. These give a sense of how the evidence for different impact types is distributed, but also highlight some limitations of our method.
Because our documents were categorised using a machine-learning model, there will be some that are incorrectly classified. For example, filtering documents to “coastal and marine ecosystems” concentrates the darker patches to coastal areas and ocean regions, but some dark patches remain inland.
When inspecting where our algorithm performed better and worse, we noticed that documents about fish, especially salmon – which migrate to the ocean before returning to freshwater to spawn – were sometimes misclassified between terrestrial and freshwater ecosystems and coastal and marine ecosystems.
Many studies document the impacts of rising temperatures in a particular sector – such as crop yields, human health or biodiversity – without necessarily showing, in the same study, whether that rise in temperature was itself attributable to human influence on the climate.
Our algorithm does not allow for an actual analysis of whether each study formally attributes the observed changes to human-caused climate change. This would probably require a human expert reading the full paper, which is increasingly hard to do for an ever-growing literature base.
In our study, we pursue a very different, data-driven approach to the detection and attribution question. Our algorithm extracts documented impacts and their respective drivers – in our case, temperature and precipitation – at the grid-cell level. We then use methods from physical climate science to assess which trends are detectable and attributable.
Using a well-established method, we assess detectable trends and their attribution to human-caused climate change over the 1950-2018 period, based on observational and climate-model evidence at the grid-cell level.
We can show that temperatures have been rising almost everywhere we have data, and that this warming can be attributed to human influence.
The picture for precipitation is less clear. Fewer grid cells have sufficient data to analyse in this way, fewer cells show trends outside the range of natural variability, and there are some cells – for example, in west Africa – where precipitation has decreased significantly even though climate models projected an increasing trend. We do not count such cells as showing trends attributable to human influence.
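The stylised sketch below, built entirely on synthetic data, illustrates the logic of such a grid-cell test: detection asks whether the observed trend lies outside what internal variability alone can produce, and attribution additionally requires consistency with the model-projected forced response. The actual statistical procedure in our paper differs in its details.

```python
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1950, 2019)

# Synthetic grid-cell series: observations with an imposed warming trend, plus
# unforced control-run segments representing internal variability only.
obs = 0.02 * (years - 1950) + rng.normal(0, 0.3, years.size)
control = rng.normal(0, 0.3, (500, years.size))

def trend(series):
    """Least-squares linear trend per year."""
    return np.polyfit(years, series, 1)[0]

obs_trend = trend(obs)
null_trends = np.array([trend(segment) for segment in control])

# Detection: the observed trend lies outside what variability alone produces
# (here, a simple two-sided 5% test against the control distribution).
detected = abs(obs_trend) > np.percentile(abs(null_trends), 95)

# Attribution additionally requires consistency with the forced response; a
# cell where observations and model projections disagree in sign (as for
# precipitation in west Africa) would fail this check.
forced_sign = 1  # assumed sign of the model-projected trend
attributable = detected and np.sign(obs_trend) == forced_sign

print(f"trend = {obs_trend:.3f} per year, "
      f"detected = {detected}, attributable = {attributable}")
```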
By bringing together our big literature assessment with physical climate information, we can then provide an assessment of where climate impacts linked to temperature or precipitation changes may be attributable to human-caused climate change.
§ ‘Attribution gap’
You can see this in the figure below, in which cells are coloured pink where the selected driver displays an attributable trend, and darker pink cells show where there are more studies referring to that place and that climate driver.
Overall, when considering either average temperature or precipitation, we show that attributable trends are observable for 80% of the world’s land area, covering 85% of the world’s population.
For the majority of grid cells, attributable trends in temperature or precipitation are present, along with large amounts of evidence about how those trends are impacting human and natural systems. However, this is not the case everywhere. The bar chart shows how this varies between countries in different income classifications.
For example, if we consider all impacts driven by temperature, 90% of people living in high-income countries live in an area where trends can be attributed to human influence on the climate. Of those, almost 90% live in areas where there are large numbers of studies referring to the impacts of those trends on human and natural systems.
In low-income countries, the proportion of people living in areas with attributable trends is 72%. However, of these, only 22% live in areas with high levels of evidence about how temperatures are affecting human and natural systems.
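Multiplying those shares together gives a simple sense of the size of the gap – a worked calculation using only the percentages quoted above, with rounding that is ours:

```python
# Shares quoted in the text for temperature-driven impacts: the fraction of
# each income group living where trends are attributable, and, of those, the
# fraction also covered by a dense evidence base.
groups = {
    "high-income countries": (0.90, 0.90),
    "low-income countries": (0.72, 0.22),
}
for name, (attributable, evidenced) in groups.items():
    both = attributable * evidenced
    print(f"{name}: {both:.0%} of people live with both an attributable "
          f"trend and substantial published evidence")
```

That works out at roughly 81% of people in high-income countries, against roughly 16% in low-income countries.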
We refer to this phenomenon as the “attribution gap”. It should be noted that lower levels of evidence do not imply that climate change is not affecting people in low-income countries. Rather, the fact that published evidence is sparse – even where we can observe human-caused changes to temperature or precipitation – shows that there is an urgent need for more scientific study of the impacts of climate change in the global south.
The approach we explore in our paper illustrates the potential for deploying deep-learning techniques and the combination of different strands of “big data” to inform scientific assessments of the available evidence – such as those carried out by the IPCC.
We also hope that it offers a way to combine climate science information across scales more systematically, bringing together physical climate science – which often operates at the global scale – with highly regionalised studies of observed sectoral climate impacts.
Our database – which we intend to make publicly available – can, in theory, be continuously updated, and our algorithm can be improved by investing in additional supervised learning. Furthermore, we can continue to integrate information on detectable and attributable changes in climate impact drivers beyond temperature and precipitation alone.
If science advances by standing on the shoulders of giants, in times of ever-expanding scientific literature, giants’ shoulders become harder to reach. Our computer-assisted evidence mapping approach can offer a leg up.