HyperAI

Jeff Dean Likes Google's New Research: Whale Bioacoustic Model Can Identify 8 Types of Whales

zhaorui

Whale sound identification is of great significance for protecting marine ecology. By analyzing whale sounds, scientists can determine species, migration routes, breeding habits, and social structures, and thus formulate more effective protection policies.

However, whale sound recognition is not an easy task. First, there are roughly 94 known whale species worldwide, and their acoustic range is extremely wide, from blue whale calls as low as 10 Hz to toothed whale clicks as high as 120 kHz. Second, even recordings of the same species can vary significantly with location and time, which further complicates model development. Finally, researchers know very little about the acoustic characteristics of some rare whales, making their calls hard to distinguish accurately.

To address this, the Google Research team developed a new whale bioacoustic model that can identify eight of the 94 currently known whale species: humpback whales, killer whales, blue whales, fin whales, minke whales, Bryde's whales, North Atlantic right whales, and North Pacific right whales. The researchers also extended the model to cover the "Biotwang" call and used it to label more than 200,000 hours of underwater recordings.

The related research was published on the Google Research blog under the title "Whistles, songs, boings, and biotwangs: Recognizing whale vocalizations with AI".
Research highlights:

* Identifies 8 of roughly 94 known whale species, including multiple call types for 2 of them

* Includes the Biotwang sound which was recently confirmed to be the call of a Bryde's whale

* The model can be invoked independently through the TensorFlow SavedModel API

Blog address:

https://research.google/blog/whistles-songs-boings-and-biotwangs-recognizing-whale-vocalizations-with-ai

The open source project "awesome-ai4s" brings together more than 100 AI4S paper interpretations and provides massive data sets and tools:

https://github.com/hyperai/awesome-ai4s

Datasets: 4 new whale call datasets covering 8 of the approximately 94 whale species

Building on existing whale call recognition data, the researchers created four new whale call datasets: the "boing" calls of minke whales, the "upcalls" and "gunshot" calls of North Pacific right whales, and the calls of blue whales and fin whales.

The "boing" sound of a minke whale

The mysterious Biotwang sound, first recorded decades ago, had never been attributed to any particular whale species. Only recently did new research from the National Oceanic and Atmospheric Administration (NOAA) show that the sound is produced by Bryde's whales.

Minke whale vocalizations have been documented even further back than Bryde's whale vocalizations, dating back to submarine recordings in the 1950s. It wasn't until 2005 that NOAA scientists attributed this specific sound to minke whales.

The label set that the researchers initially obtained from the Pacific Islands Fisheries Science Center (PIFSC) did not include the sound called "boing". When Google researchers used this data for initial model training, the model therefore treated the sound as an error pattern. The researchers then studied these newly discovered sounds in depth, eventually identifying the minke whale calls accurately and incorporating them into the multi-species recognition model.

Spectrogram of a minke whale "boing"

North Pacific right whale "upcall" and "gunshot" calls

The North Pacific right whale (NPRW) is an extremely endangered species found mainly in the waters of the North Pacific Ocean. Whaling hunted it nearly to extinction, and the remaining population is very small: the eastern population is estimated at only 30-35 individuals.

Meanwhile, the North Pacific right whale is the only right whale population known to "sing". While "upcall" sounds can come from right whales, bowhead whales, or even humpback whales, the North Pacific right whale can be distinguished by its distinctive "gunshot" call.

Spectrogram of the North Pacific right whale "upcall"
Spectrogram of the North Pacific right whale "gunshot" call

Blue whale and fin whale call labels

The researchers note that before their initial collaboration with the Pacific Islands Fisheries Science Center (PIFSC) on the humpback whale model, PIFSC had already annotated some of its data for the presence of blue and fin whales, which live not only around the Hawaiian Islands but also in pelagic waters throughout the world's oceans.

In this study, the researchers focused on data collected by the MARS hydrophone operated by the Monterey Bay Aquarium Research Institute (MBARI). Since the MARS data had no ground-truth labels, however, they trained a dedicated blue and fin whale model on PIFSC data and used it to generate pseudo-labels for the MBARI data.
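This kind of pseudo-labeling, where a model trained on a labeled corpus supplies provisional labels for an unlabeled one, typically keeps only confident predictions. A minimal sketch in Python; the threshold and function names are illustrative assumptions, not details from the study:

```python
import numpy as np

def pseudo_label(model_scores, threshold=0.9):
    """Turn per-clip detector scores into provisional labels, keeping only
    confident predictions; uncertain clips are left unlabeled (-1)."""
    labels = np.full(len(model_scores), -1)
    labels[model_scores >= threshold] = 1      # confident positive: whale call
    labels[model_scores <= 1 - threshold] = 0  # confident negative: background
    return labels

scores = np.array([0.97, 0.5, 0.03, 0.92])
print(pseudo_label(scores))  # [ 1 -1  0  1]
```

The confident subset can then be folded into training as if it were ground truth, while the unlabeled remainder is simply dropped.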

Spectrogram of blue whale calls in the central Pacific
Spectrogram of fin whale calls

Model architecture: Classifying spectrograms derived from raw audio

The researchers noted that the model first converts raw audio into spectrogram images, one for each 5-second sound clip. The model's front end uses a mel-scaled frequency axis and log amplitude compression, and normalizes by subtracting the 5th-percentile log amplitude of each frequency band. The model then classifies each image into one of 12 whale species or vocalization-type classes.
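The front end described above can be sketched roughly as follows. The FFT size, hop length, mel band count, and sample rate are illustrative assumptions, since the post does not publish these parameters:

```python
import numpy as np

def spectrogram_frontend(audio, sample_rate=24000, n_fft=1024, hop=256, n_mels=128):
    """Sketch of the described front end: STFT -> mel-scaled frequency axis ->
    log amplitude compression -> per-band 5th-percentile normalization."""
    # Short-time Fourier transform magnitude (time frames x frequency bins).
    n_frames = 1 + (len(audio) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([audio[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))

    # Triangular mel filterbank for the mel-scaled frequency axis.
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = mel_to_hz(np.linspace(0, hz_to_mel(sample_rate / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        if c > l: fbank[m - 1, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c: fbank[m - 1, c:r] = (r - np.arange(c, r)) / (r - c)
    mel_spec = mag @ fbank.T

    # Log amplitude, then subtract each band's 5th-percentile value.
    log_spec = np.log(mel_spec + 1e-6)
    log_spec -= np.percentile(log_spec, 5, axis=0, keepdims=True)
    return log_spec  # (time frames, mel bands) "image" for the classifier
```

Feeding a 5-second clip through this function yields the 2-D image the classifier would consume.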

In addition, the model can be called independently through TensorFlow's SavedModel API. This means it can not only identify the species and sounds included in its training data; its pre-trained embeddings can also be used to search for and identify new sounds or whale species and to quickly build corresponding classifiers.
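As a rough illustration of the SavedModel workflow the post refers to, the sketch below saves and reloads a tiny stand-in classifier. The 24 kHz sample rate, input shape, and model internals are assumptions for illustration; only the 5-second clips and 12 output classes come from the post:

```python
import tempfile
import tensorflow as tf

class TinyClassifier(tf.Module):
    """Stand-in model to demonstrate the SavedModel save/load round trip;
    the real whale model ships with its own trained weights."""
    def __init__(self):
        self.w = tf.Variable(tf.zeros([5 * 24000, 12]))  # 12 class heads

    @tf.function(input_signature=[tf.TensorSpec([None, 5 * 24000], tf.float32)])
    def score(self, waveform):
        # One 5-second clip at an assumed 24 kHz -> 12 per-class scores.
        return tf.nn.sigmoid(tf.matmul(waveform, self.w))

export_dir = tempfile.mkdtemp()
tf.saved_model.save(TinyClassifier(), export_dir)

# A consumer only needs the exported directory, not the Python class.
reloaded = tf.saved_model.load(export_dir)
scores = reloaded.score(tf.zeros([1, 5 * 24000]))
print(scores.shape)  # (1, 12)
```

Because the exported function carries its own signature, downstream users can score clips, or tap intermediate embeddings if the model exposes them, without access to the original training code.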

Model testing: The model has good discrimination performance for each category

Long-term passive acoustic monitoring requires not only the correct classification of species, but also the correct removal of background and non-animal sound events. Therefore, the researchers did not limit the training to positive labels, but also extensively extracted negative data (negative labels) and background data from recordings provided by other partner institutions.

To validate the model, the researchers randomly selected a uniform 20% subset of the available training data as the test set. The figure below shows the model's performance on the test sets for different species.

* A high value of AUC (ROC) indicates that the model is able to distinguish between positive and negative labels well.

* Sensitivity @ 0.99 is the fraction of true positives that score above a threshold excluding 99% of true negatives.

* Precision @ 0.5 is the fraction of correct species predictions at a threshold chosen for reasonable sensitivity (capturing at least 50% of true positives).
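For concreteness, the three metrics can be computed from per-clip scores roughly as follows. This is a sketch, not the study's evaluation code; in particular, "Precision @ 0.5" is read here as precision at a threshold capturing half of the positives:

```python
import numpy as np

def eval_detector(y_true, scores):
    """Sketch of the three reported metrics, computed from per-clip scores."""
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    pos, neg = scores[y_true == 1], scores[y_true == 0]

    # AUC (ROC): probability a random positive outranks a random negative.
    auc = (np.mean(pos[:, None] > neg[None, :])
           + 0.5 * np.mean(pos[:, None] == neg[None, :]))

    # Sensitivity @ 0.99: fraction of positives above the threshold
    # that excludes 99% of negatives.
    sens = np.mean(pos > np.percentile(neg, 99))

    # Precision at a threshold set to catch half of the positives.
    thr = np.percentile(pos, 50)
    detected = scores >= thr
    prec = np.sum(detected & (y_true == 1)) / np.sum(detected)
    return auc, sens, prec

# Perfectly separable scores give 1.0 on all three metrics.
auc, sens, prec = eval_detector([1, 1, 1, 1, 0, 0, 0, 0],
                                [0.9, 0.8, 0.85, 0.95, 0.1, 0.2, 0.15, 0.05])
print(auc, sens, prec)  # 1.0 1.0 1.0
```

Overlapping score distributions would pull all three values below 1, which is the tradeoff the figure visualizes per species.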

Model performance on test sets of different species

Overall, the model can accurately identify any of the eight whale species, including humpback whales, killer whales, blue whales, fin whales, minke whales, Bryde's whales, North Atlantic right whales (NARW), and North Pacific right whales (NPRW). For the minke, North Pacific right, North Atlantic right, and Bryde's whale classes, all three metrics are close to 1, demonstrating excellent performance with little tradeoff between false positives and false negatives. The tradeoff is more pronounced for killer whale echolocation clicks and whistles.

Combining AI and machine learning technologies to contribute to marine life protection

The release of Google Research's latest results is of great significance for understanding, and perhaps one day achieving, cross-species communication. Jeff Dean, chief scientist of Google DeepMind and Google Research, said on social media: "Human-language LLMs are outdated. We should all be excited about this breakthrough!"

A senior manager focused on data science also said: "Finally I can decipher the gossip of whales on the seafloor! Can't wait to see if they are chatting about the latest krill trends or arguing about the best seafloor hotspots!"

Some netizens also believe that "this is an important step towards being able to communicate with other species on Earth, and it is of milestone significance!"

To help scientists better understand how whales communicate, Google began exploring in 2018 how AI and machine learning could be used to analyze and identify whale sounds, with the goal of protecting more endangered marine species and maintaining a healthy marine ecosystem.

In 2018, Google Research partnered with NOAA's Pacific Islands Fisheries Science Center (PIFSC) to develop a convolutional neural network classification model for detecting humpback whale calls, officially launching its research on whale call classification.

The model was used to identify humpback whale calls in more than 187,000 hours of audio collected by NOAA, confirming the spatiotemporal patterns of humpback whale song and discovering a new site on Kingman Reef where humpback whale sounds had not been observed before.
Blog address:
https://research.google/blog/acoustic-detection-of-humpback-whales-using-a-convolutional-neural-network/

In 2019, the researchers collaborated with Google Creative Lab to launch "Pattern Radio", an interactive visualization tool based on this model that shows a year of underwater whale audio collected near Hawaii.

The model annotated the audio, and some of the data came with additional insights from experts, allowing the researchers to more accurately analyze the vocal patterns of whales, especially humpback whale songs.
Pattern Radio tool address:
https://patternradio.withgoogle.com/

In fact, Google is not alone: the Cetacean Translation Initiative (CETI) has long been committed to whale call research. In May of this year, CETI collaborated with MIT researchers to analyze sperm whale recordings with machine learning, confirming that sperm whale sounds are structured and isolating a sperm whale "phonetic alphabet" that shows striking parallels to human language.

Click the link to view the detailed report: MIT/CETI team uses machine learning technology to separate the sperm whale pronunciation alphabet! It is highly similar to the human language system and has a stronger information carrying capacity!

As research continues to deepen, a new way of cross-species communication may become a reality. This prospect will not only change our understanding of marine life, but also redefine the relationship between humans and nature, and usher in a new era of harmonious coexistence between humans and animals.