AI "Bird Census", Cornell University Uses Deep Learning to Analyze the Distribution of North American Warblers

According to statistics from the World Wildlife Fund, the population of representative species in the world decreased by 68% from 1970 to 2016, and biodiversity continued to decline.
To protect biodiversity, we need to accurately analyze local ecological conditions and formulate reasonable ecological protection policies. However, ecological data is too complex and statistical standards are difficult to unify, making large-scale ecological analysis difficult to carry out.
Recently, researchers from Cornell University used deep learning to analyze 9 million sets of bird data and obtained the distribution data of wood warblers in North America, opening a new chapter in ecological data analysis.
Author | Xuecai
Editor | Three Sheep, Iron Tower
This article was first published on HyperAI WeChat public platform~
According to the World Wildlife Fund (WWF), between 1970 and 2016 the average abundance of 20,811 populations of 4,392 species worldwide fell by 68%, and global biodiversity continues to decline.

Figure 1: Average change in abundance of 20,811 populations of 4,392 species worldwide, 1970 to 2016
Protecting biodiversity requires accurate, large-scale analysis of species distribution in the relevant areas. However, because of the sheer volume of data and the lack of a unified statistical method, researchers currently cannot accurately measure the biodiversity (species richness, population size, etc.) or the biological composition (the status of a species in the local ecosystem) of a specific area.
Traditional species richness estimates either overlay the distribution maps of individual species for modeling and prediction, or predict richness directly with macroecological models. Both approaches are limited by the accuracy of the model, and the former is also limited by the accuracy of the maps.
Moreover, these methods have poor temporal resolution: they cannot accurately capture seasonal changes in species distribution, let alone the associations between species, which hampers the formulation of ecological protection policies.
Deep learning offers an effective tool for large-scale spatiotemporal research on biodiversity. Researchers at Cornell University in the United States developed the DMVP-DRNets model by combining the Deep Reasoning Network (DRN) with the Deep Multivariate Probit Model (DMVP). Using 9,206,241 sets of eBird data, they analyzed the temporal and spatial distribution of wood warblers in North America and inferred the associations between the warblers, the environment and other species. The results have been published in "Ecology".

This result has been published in Ecology
Paper link:
https://esajournals.onlinelibrary.wiley.com/doi/10.1002/ecy.4175
Experimental procedures
Dataset: eBird with covariates
The researchers used eBird data recorded from January 1, 2004 to February 2, 2019, between 170°-60° W and 20°-60° N, as the dataset for this study. After duplicates were excluded, 9,206,241 sets of eBird data remained. Each set includes the time, date, location, and all bird species observed.
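The selection step described above can be sketched in a few lines of Python. This is only an illustration: the `Checklist` record, its field names, and the rule of dropping repeated checklist IDs are assumptions made for the example, not the researchers' actual pipeline or the real eBird export format. Only the date window and the bounding box come from the article.

```python
from dataclasses import dataclass
from datetime import date

# Study window and bounding box as stated in the article.
START, END = date(2004, 1, 1), date(2019, 2, 2)
LON_MIN, LON_MAX = -170.0, -60.0  # 170°-60° W written as negative longitudes
LAT_MIN, LAT_MAX = 20.0, 60.0     # 20°-60° N


@dataclass(frozen=True)
class Checklist:
    """Hypothetical, simplified record; real eBird data has many more fields."""
    checklist_id: str
    day: date
    lon: float
    lat: float
    species: frozenset


def in_study_area(c):
    """True if the checklist falls inside the time window and bounding box."""
    return (START <= c.day <= END
            and LON_MIN <= c.lon <= LON_MAX
            and LAT_MIN <= c.lat <= LAT_MAX)


def select_checklists(records):
    """Keep in-range checklists, dropping repeated IDs (assumed duplicate rule)."""
    seen, kept = set(), []
    for c in records:
        if in_study_area(c) and c.checklist_id not in seen:
            seen.add(c.checklist_id)
            kept.append(c)
    return kept


records = [
    Checklist("a", date(2010, 5, 1), -80.0, 40.0, frozenset({"Pine Warbler"})),
    Checklist("a", date(2010, 5, 1), -80.0, 40.0, frozenset({"Pine Warbler"})),  # duplicate
    Checklist("b", date(2003, 5, 1), -80.0, 40.0, frozenset()),  # before the window
    Checklist("c", date(2015, 6, 1), 10.0, 50.0, frozenset()),   # outside the box
]
selected = select_checklists(records)
print([c.checklist_id for c in selected])
```

Only the first checklist survives in this toy input; the other three are rejected as a duplicate, as too early, and as outside the study area respectively.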

Figure 2: eBird data of a group of long-tailed tits
The researchers also introduced 72 covariates: 5 related to the observer, such as activity status, number of observers and observation time; 3 related to time, mainly used to reconcile differences between time zones; and 64 related to topography, such as altitude, coastline and islands.
Model framework: Decoder + Latent Space
This study used a DRN built on DMVP for data analysis and prediction. The model consists of a three-layer fully connected network decoder that analyzes the correlations among input features, and two structured latent spaces that represent the associations among species and between species and the environment.
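To make the idea of a multivariate probit model on top of a small fully connected network more concrete, here is a minimal, untrained sketch in plain Python. It is not the published DMVP-DRNets implementation. The layer sizes, the random weights, and the choice of a residual covariance of the form "species embeddings plus unit noise" are assumptions made for illustration, and only a single species embedding space is modeled here. In a multivariate probit model each species is present when a latent Gaussian variable is positive, so the marginal occurrence probability is the normal CDF of the mean divided by the latent standard deviation.

```python
import math
import random


def relu(v):
    return max(0.0, v)


def dense(x, weights, bias, activation=None):
    """One fully connected layer: activation(W x + b)."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    return [activation(v) for v in out] if activation else out


def init_layer(rng, n_in, n_out):
    """Random, untrained weights; a real model would learn these from data."""
    weights = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def normal_cdf(v):
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def residual_covariance(embeddings):
    """Sigma = E E^T + I: species embeddings plus unit noise (an assumption)."""
    n = len(embeddings)
    return [[sum(a * b for a, b in zip(embeddings[i], embeddings[j])) + (i == j)
             for j in range(n)] for i in range(n)]


def occurrence_probabilities(covariates, layers, embeddings):
    """Marginal P(species present) under a multivariate probit model."""
    h = covariates
    for k, (weights, bias) in enumerate(layers):
        is_last = k == len(layers) - 1
        h = dense(h, weights, bias, None if is_last else relu)
    sigma = residual_covariance(embeddings)
    # Species j is present when mean_j + noise_j > 0, noise ~ N(0, Sigma).
    return [normal_cdf(mu / math.sqrt(sigma[j][j])) for j, mu in enumerate(h)]


rng = random.Random(0)
N_COVARIATES, N_HIDDEN, N_SPECIES, RANK = 72, 16, 4, 2  # only 72 is from the article
layers = [init_layer(rng, N_COVARIATES, N_HIDDEN),
          init_layer(rng, N_HIDDEN, N_HIDDEN),
          init_layer(rng, N_HIDDEN, N_SPECIES)]
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(RANK)] for _ in range(N_SPECIES)]
covariates = [rng.gauss(0.0, 1.0) for _ in range(N_COVARIATES)]
probs = occurrence_probabilities(covariates, layers, embeddings)
sigma = residual_covariance(embeddings)
print([round(p, 3) for p in probs])
```

The off-diagonal entries of `sigma` are what carry the species-to-species associations: two species with similar embeddings get a positive residual covariance, so they tend to be observed together more often than the covariates alone would predict.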

Figure 3: Schematic diagram of DMVP-DRNets model results
Finally, the DMVP-DRNets model outputs three ecologically relevant results through an interpretable latent space:
1. Environment-related features: reflect the connections and interactions between different environmental covariates;
2. Species-related features: reflect the associations between different species through the residual correlation matrix;
3. Biodiversity-related features: such as the abundance and distribution of a species.
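Two of these outputs reduce to simple arithmetic once the model has been fitted. The sketch below, with made-up numbers, shows how a residual covariance matrix can be normalized into a residual correlation matrix, and how the expected species richness at a site follows from per-species occurrence probabilities (the expected number of species present is the sum of those probabilities). The function names and the input values are illustrative assumptions, not the paper's code or data.

```python
import math


def correlation_from_covariance(cov):
    """Normalize a covariance matrix: corr_ij = cov_ij / sqrt(cov_ii * cov_jj)."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]


def expected_richness(probs):
    """Expected number of species present = sum of occurrence probabilities."""
    return sum(probs)


# Made-up residual covariance for three species.
cov = [[2.0, 1.0, -0.5],
       [1.0, 2.0, 0.0],
       [-0.5, 0.0, 1.0]]
corr = correlation_from_covariance(cov)
richness = expected_richness([0.9, 0.5, 0.1])
print(corr[0][1], round(corr[0][2], 3), round(richness, 2))
```

A positive entry such as `corr[0][1]` would be read as two species co-occurring more often than the environment alone explains, and a negative entry such as `corr[0][2]` as the opposite.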
Model Evaluation: Comparison with HLR-S
Before putting the DMVP-DRNets model into large-scale use, the researchers first compared it with the HLR-S model, which is based on spatial Gaussian processes. HLR-S is one of the models most commonly used in ecology to study the joint distribution of multiple species.
First, the two models were trained using 10,000 sets of eBird data. The HLR-S model took more than 24 hours to train, while the DMVP-DRNets model took less than 1 minute.

Table 1: Performance comparison between DMVP-DRNets model and HLR-S model
Subsequently, the two models were compared on eBird data of different scales. The DMVP-DRNets model outperformed the HLR-S model on 11 evaluation criteria, lagging behind only in the calibration loss of species richness.
Experimental Results
Distribution area: Appalachian Mountains
After analyzing the eBird data, the DMVP-DRNets model outputs monthly distribution maps of North American wood warblers at a spatial resolution of 2.9 km². The distribution of the different warbler species is highly dynamic, with different hotspots every month. After superimposing the monthly maps, the researchers found that the Appalachian Mountains are the region with the highest diversity of wood warbler species.
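The superimposing step can be pictured as follows: for each month, sum the per-species occurrence probabilities in every grid cell to get a richness map, then take the maximum over months in each cell. The sketch below uses three grid cells, two species and two months with invented probabilities. The data layout and function names are assumptions for illustration and do not reproduce the researchers' procedure or results.

```python
def monthly_richness(prob_maps):
    """prob_maps[month][species] is a list of per-cell occurrence probabilities."""
    return {month: [sum(cell) for cell in zip(*species_maps.values())]
            for month, species_maps in prob_maps.items()}


def max_richness(richness_by_month):
    """Per-cell maximum of the monthly richness maps."""
    return [max(cells) for cells in zip(*richness_by_month.values())]


# Invented probabilities for three grid cells.
prob_maps = {
    "May": {"Pine Warbler": [0.8, 0.1, 0.0], "Palm Warbler": [0.6, 0.2, 0.1]},
    "Sep": {"Pine Warbler": [0.3, 0.7, 0.0], "Palm Warbler": [0.1, 0.9, 0.2]},
}
by_month = monthly_richness(prob_maps)
peak = max_richness(by_month)
hotspot = peak.index(max(peak))  # index of the cell with the highest peak richness
print({m: [round(v, 2) for v in cells] for m, cells in by_month.items()}, hotspot)
```

In this toy grid the hotspot moves from the first cell in May to the second cell in September, which is the kind of month-to-month shift the article describes.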

Figure 4: Distribution map of wood warblers in North America
a: Distribution of maximum species richness of wood warblers across North America
b: The main distribution area of the wood warbler in North America
The researchers also identified distribution hotspots for warblers at different migration stages. During the pre-breeding migration period, warblers were mainly distributed near the Appalachian Mountains in Ohio, West Virginia and Pennsylvania. After breeding, warblers were most concentrated in the northern Appalachian Mountains.

Figure 5: Distribution of wood warblers during the pre-breeding migration period (a) and the post-breeding migration period (b)
Warbler-Environment: Water, land and season preference
Furthermore, the researchers used the DMVP-DRNets model to analyze the interactions between warblers and the environment in the northeastern United States.
First, the researchers were able to roughly distinguish the preferences of different warblers for aquatic and terrestrial environments. They then found that different warbler species prefer different environments during the breeding season. The water-loving Blue-winged, Northern, and Yellow-throated Warblers are closely associated with one another during the breeding season, while Pine Warblers are more closely associated with other pine-forest species such as the Brown-headed Nuthatch and Red-headed Woodpecker.
The distribution of different warblers also changes with the seasons. Most warblers gather in groups during the post-breeding migration period, while palm warblers migrate later in the fall. Pine warblers and yellow-rumped whitethroats stay in the northeastern United States year-round.

Figure 6: Correlations between warblers during the breeding season and the environment and other species

Figure 7: Correlations between warblers and the environment and other species during post-breeding migration
Interspecies associations: Competition and Cooperation
Warblers show different relationships with other species during the breeding, non-breeding and migratory seasons.
During the breeding season, wood warblers mainly defend their own habitats and have weak associations with other species. Species that share similar habitats and are more aggressive are even negatively correlated, such as the Black-necked Wilson's Warbler and the Orange-tailed Warbler.
During the migration period, most warblers show strong positive correlations with each other and with other forest species. This is consistent with observations that forest warblers form mixed migratory flocks with other species such as red-eyed green cuckoos and black-crowned chickadees.
During this period, the warblers are strongly negatively correlated with predators such as the great-winged buzzard, barred hawk, chicken hawk, and red-shouldered buzzard.

Figure 8: Correlation coefficients between warblers and other species during the breeding period (a) and post-breeding migration period (b)
The above results show that the DMVP-DRNets model can accurately estimate the distribution of warblers in different periods and can infer the associations between warblers, the environment and other species, providing a basis for formulating ecological policies.
AI "Bird Population Census"
In addition to data analysis, data collection is also an important part of ecological research. Unlike plants, birds are highly alert and move quickly, and some species are small, which makes them difficult to observe accurately.
Traditional methods rely on telephoto cameras, high-powered telescopes and stationary cameras to observe birds from a distance. Although this avoids disturbing the birds, it requires a great deal of manpower and equipment, and the observer needs considerable knowledge of ecology and taxonomy.
With deep neural networks, AI can recognize images and sounds efficiently, providing new methods for bird observation. Audio and video recording equipment is deployed in the main activity areas of birds. The equipment uploads its recordings to a server, where AI extracts the information in the audio and video and derives the distribution of birds in the area. This method has been widely used by the National Forestry and Grassland Administration in parks, wetlands and ecological reserves.

Figure 9: Bird smart monitoring system deployed in the Yellow River Delta
This capability can also reduce the workload of researchers. AI can filter out background and noise, focus on the relevant features of an image, and quickly resolve cases that ecologists find difficult to judge. For example, in the photo below, someone without any knowledge of birds would find it difficult to quickly count the chicks among the tangled feathers.

Figure 10: Photo of a nest of chicks. Can you tell how many chicks there are in the picture?
AI is being widely used in bird activity monitoring and bird distribution analysis, building a complete system for bird research from the bottom up and making a "bird census" possible in specific areas. With the help of AI, we can hope to understand ecosystems more thoroughly, formulate ecological policies better suited to local conditions, gradually restore the earth's biodiversity, and protect our home planet.
Reference Links:
[1] https://www.worldwildlife.org/publications/living-planet-report-2020
[2] https://phys.org/news/2023-09-ai-birds-easier.html
[3] https://www.forestry.gov.cn/main/586/20230118/094644604451331.html