Spotlight: MaxEnt Modelling the Dotterel Mountains
by Deborah Buehler, originally published in Wader Study 128(3)
It’s hard to ignore climate change these days. Heatwaves are hotter, floods are fiercer, and some part of the planet is always on fire. In the Arctic, ecosystems based on snow, ice and permafrost are disappearing. These habitats are important breeding grounds for many migratory bird species, particularly waders, and they are vanishing before scientists can determine what the birds need most to survive.
Breeding waders are notoriously difficult to study because their nests are well camouflaged and spread out over huge swaths of largely inaccessible tundra. This is where technological advances can help. In this issue of Wader Study, Hoefs and colleagues combine powerful machine-learning algorithms with remote sensing data to build a statistical model that can predict the breeding distribution of the Eurasian Dotterel Charadrius morinellus in northern Sweden.1
Dotterels are medium-sized waders with striking white eye-stripes and a dove-like appearance. They breed on Arctic or alpine tundra from Scotland to western Alaska, then migrate south to non-breeding areas from North Africa to Iran.
Like many wader species, Dotterel populations are declining, and this species is especially vulnerable because it is specialized to a narrow ecological niche. This same quality makes it an ideal species to study using machine-learning algorithms.
Hoefs and colleagues wanted to use such algorithms to build maps showing where suitable habitat for Dotterels is located. They also wanted to determine which environmental variables in freely available remote sensing data best predicted where Dotterels breed. To do this they used a statistical technique called maximum entropy (or MaxEnt) modelling. MaxEnt is a method that can predict where birds and other organisms are most likely to be located (their distribution) using limited field data. While most predictive methods need to be trained on input about both where the organisms usually are (presence) and where they are not (absence), MaxEnt requires presence data only.2 This is good news, because it is easier to gather reliable data on presence than absence. For example, finding a bird or a nest definitively indicates presence, but not finding a nest amid a huge swath of tundra doesn’t necessarily mean the birds aren’t there.
To gather presence data, the researchers surveyed two study areas (~10 km² in total) in the Vindelfjällen Nature Reserve in Swedish Lapland several times during each breeding season from 2016 to 2018. Each visit lasted a full day and a team of three to six people walked through the area in systematic lines regularly scanning with binoculars for birds displaying or showing breeding behavior. This work resulted in a dataset of 23 Dotterel nests and 156 locations where birds were found exhibiting breeding behavior.
In the world of statistical modelling, this is a relatively small dataset. The researchers also knew that they were searching for birds in locations that were, necessarily, more accessible to humans (biased sampling) and therefore tended to have similar characteristics (spatial autocorrelation). Luckily, the MaxEnt method can handle small datasets2 and biased sampling effort, as long as the minimum distance between presence points is set to a coarser scale than the predictor variables.3 To meet this requirement, the researchers excluded presence points that were less than 300 m apart. This avoided overrepresenting the habitat preferences of birds in more intensively searched areas.
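The thinning step can be illustrated with a minimal sketch. It assumes a simple greedy rule (keep a point only if it lies at least 300 m from every point already kept) and made-up coordinates in metres; the article does not say which thinning procedure the researchers actually used, and the function name `thin_points` is hypothetical.

```python
import math

def thin_points(points, min_dist_m=300.0):
    """Greedy spatial thinning (an assumed rule, not the study's stated method):
    keep a point only if it is at least min_dist_m from every kept point."""
    kept = []
    for x, y in points:
        if all(math.hypot(x - kx, y - ky) >= min_dist_m for kx, ky in kept):
            kept.append((x, y))
    return kept

# Hypothetical presence locations in projected metre coordinates
presences = [(0, 0), (100, 0), (350, 0), (350, 250), (1000, 1000)]
thinned = thin_points(presences)
print(thinned)  # [(0, 0), (350, 0), (1000, 1000)]
```

A greedy rule like this depends on the order of the points, so a real analysis might repeat the thinning with shuffled input or use a dedicated thinning tool.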
MaxEnt modelling also requires a set of environmental variables that describe the suitability of the environment for the species. For this, Hoefs and colleagues used Sentinel-2A satellite data. They needed temporal correspondence between the environmental data and the period when parents raise their chicks, so they used data from July 2018, around the egg hatching date of Dotterels in the area. The satellite data yielded vegetation metrics, texture metrics (grey level textures that show similarities or differences in contrast), and topographic variables.
The next step was to determine which of the over 200 environmental variables would provide the best combination of predictors. Often, researchers pre-select variables based on expert knowledge about the biology of the species. But newer techniques, which include everything and then filter to a subset of highly contributing variables, perform better.4 The researchers chose this second approach and started with a model that included all available variables. To prune this model, they first removed variables that contributed less than 2%. Then they identified the best-performing variable and removed all variables that were correlated with that variable (the variables change together in similar ways), then repeated with the second-best performing variable and so forth. This process trimmed the set of 211 environmental variables to seven uncorrelated and highly contributing predictors.
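The pruning procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the percent contributions are taken as given from an initial full model, the correlation cut-off of 0.7 is a placeholder (the article does not give the threshold), and the variable names and data are invented. The researchers' actual workflow may also have refitted the model between steps.

```python
import numpy as np

def select_variables(X, names, contribution, min_contrib=2.0, max_abs_r=0.7):
    """X: (n_samples, n_vars) predictor values; contribution: percent
    contribution of each variable in an initial model with all variables.
    max_abs_r is an assumed correlation cut-off, not taken from the study."""
    # Step 1: drop variables contributing less than min_contrib percent
    candidates = [i for i in range(len(names)) if contribution[i] >= min_contrib]
    # Rank the remainder from highest to lowest contribution
    candidates.sort(key=lambda i: contribution[i], reverse=True)
    corr = np.corrcoef(X, rowvar=False)  # pairwise Pearson correlations
    selected = []
    while candidates:
        best = candidates.pop(0)  # best-performing remaining variable
        selected.append(best)
        # Step 2: discard everything correlated with the variable just kept
        candidates = [i for i in candidates if abs(corr[best, i]) < max_abs_r]
    return [names[i] for i in selected]

# Invented data: the second column is a linear copy of the first
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([a, 2 * a + 1, [1, -1, 1, -1, 1, -1], [3, 1, 4, 1, 5, 9]])
names = ["veg_index", "veg_index_copy", "texture", "noise"]
contribution = [50.0, 30.0, 19.0, 1.0]
chosen = select_variables(X, names, contribution)
print(chosen)  # ['veg_index', 'texture']
```

Here "noise" is dropped for contributing less than 2%, and "veg_index_copy" is dropped because it is perfectly correlated with the top-ranked variable.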
The researchers now had everything they needed to predict the location of suitable habitat for Dotterel. They used MaxEnt to find the probability distribution of maximum entropy (the most spread out) by comparing known locations of presence against 8,000 background samples randomly distributed over the entire nature reserve to represent potentially unsuitable habitats.
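For readers curious about what "maximum entropy" means in practice, here is a stripped-down sketch of the core idea: among all distributions over the background points, find the one that is as spread out as possible while its average environmental values match those at the presence points. The solution has the form p(x) proportional to exp(λ·f(x)), and λ can be found by simple gradient ascent. This is not the MaxEnt software itself, which also applies regularization and derived feature types; the single feature, its scaling to the range 0 to 1, the learning rate, and all data values are assumptions made for illustration.

```python
import numpy as np

def gibbs(lam, f_background):
    """Distribution over background points: p(x) proportional to exp(lam . f(x))."""
    logits = f_background @ lam
    w = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return w / w.sum()

def fit_maxent(f_presence, f_background, lr=1.0, n_iter=2000):
    """Unregularized maximum-entropy fit by gradient ascent.
    Assumes features are scaled to [0, 1] so that lr=1.0 is stable."""
    target = f_presence.mean(axis=0)  # feature means at presence points
    lam = np.zeros(f_background.shape[1])
    for _ in range(n_iter):
        p = gibbs(lam, f_background)
        # Gradient: presence feature means minus model-expected feature means
        lam += lr * (target - p @ f_background)
    return lam

# Invented example: one feature, 8,000 background points, 3 presences
f_background = np.linspace(0.0, 1.0, 8000).reshape(-1, 1)
f_presence = np.array([[0.7], [0.8], [0.9]])
lam = fit_maxent(f_presence, f_background)
p = gibbs(lam, f_background)
expected = (p @ f_background)[0]
print(lam, expected)
```

After fitting, the model's expected feature value matches the presence mean (0.8 in this toy case), and background points with higher feature values receive higher relative suitability.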
The model and maps predicted that about 1% of the nature reserve's total area, or about 60 km², is suitable habitat for Dotterel. The researchers built and tested their model based on a well-surveyed study area of about 10 km² with about 30 breeding pairs of Dotterel, a density of roughly three pairs per km². Extrapolating this density to all predicted suitable habitat gives a breeding population of about 180 pairs in the entire nature reserve.
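The arithmetic behind this estimate can be reproduced as a simple density scaling. The sketch assumes that the 60 km² figure is the area of predicted suitable habitat and that breeding density in the surveyed area applies across all of it; the function name is hypothetical.

```python
def extrapolate_pairs(pairs_surveyed, area_surveyed_km2, suitable_area_km2):
    """Scale the breeding density of the surveyed area to all suitable habitat."""
    density = pairs_surveyed / area_surveyed_km2  # pairs per km²
    return density * suitable_area_km2

estimate = extrapolate_pairs(30, 10, 60)
print(estimate)  # 180.0
```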
Scientists are a skeptical bunch, and the researchers wondered if their model could be trusted. After all, the algorithm chose predictor variables like Chlorophyll Vegetation Index and S08 TM Difference Variance 200 m sd that mean little to humans, whereas other variables like altitude, slope, and exposure were excluded. To check the model’s results against the real world, the authors went back to the field in June 2019 and conducted surveys in areas that the model predicted as suitable Dotterel habitat. They also checked the model against 47 presence locations collected through an independent annual survey conducted in the reserve.
The model performed well in the sense that birds were found where they were predicted to be. This result gives some confidence that the variables chosen, and the habitats predicted, by machine learning are actually important to Dotterels. It also makes sense when we remember that the selected variables were highly correlated with things that humans already know are important to Dotterels. The chosen variables might mean less to humans, but they are better predictors of the habitats that Dotterels truly need.
Could this approach also work with other species? Perhaps, though the authors caution that this type of modelling is particularly well suited to species, like Dotterel, with very specific habitat requirements and that predictive power might be lower for generalist species. That said, we need all the tools we can get. We are facing serious challenges with climate change and we’re going to have to rely on technology to help. With appropriate validation, species distribution models combined with climate change models might allow us to better predict shifts in suitable breeding habitat and changes in population size for Arctic-breeding birds of conservation concern.
This study spotlights the potential power of technology, but also reminds us that technology has limits. The authors were careful to work within those limits and to test the technology against known information before relying on it. We need real-life data to train and ground-truth models. Machines can learn, but humans still need to decide what to teach them, and to interpret what they tell us.
1Hoefs, C., T. van der Meer, P. Antkowiak, J. Hagge, M. Green & J. Gottwald. 2021. Exploring the Dotterel Mountains: Improving the understanding of breeding habitat characteristics of an Arctic-breeding specialist bird. Wader Study 128: 226–237.
2Phillips, S.J., R.P. Anderson & R.E. Schapire. 2006. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190: 231–259.
3Fourcade, Y., J.O. Engler, D. Rödder & J. Secondi. 2014. Mapping species distributions with MAXENT using a geographically biased sample of presence data: a performance assessment of methods for correcting sampling bias. PLoS ONE 9: e97122.
4Zeng, Y., B.W. Low & D.C.J. Yeo. 2016. Novel methods to select environmental variables in MaxEnt: A case study using invasive crayfish. Ecological Modelling 341: 5–13.
Featured Image: Dotterel, Charadrius morinellus, June 2017 ©Hans Norelius from Älvsjö, Sweden.