Google’s new AI model tries to understand dolphins

DolphinGemma will be shared as an open model this summer, Google announced.

Yesterday (14 April) was recognised as National Dolphin Day in the US – not to be confused with World Dolphin Day, which falls on 12 September. It makes sense that we have two separate days to celebrate these fascinating cetaceans.

With a brain-to-body ratio second only to humans', these animals have specialised brain cells associated with advanced abilities such as recognising, remembering, reasoning, communicating and even problem-solving.

Not only that, scientists have learnt that dolphins can process emotions, and that the part of their brain which handles emotions appears to be more complex than our own.

To understand more about these intelligent creatures, especially in relation to how they communicate, Google has struck up a collaboration with researchers at Georgia Institute of Technology and the Wild Dolphin Project (WDP) to build an artificial intelligence model which can learn the patterns of dolphin sounds and generate them back.

Since 1985, the WDP has conducted the world’s longest-running underwater dolphin research project – specifically studying a community of wild Atlantic spotted dolphins. The team has collected several decades of underwater video and audio from the cetaceans, paired with individual dolphin identities, histories and observed behaviours.

Google used this trove of data to create DolphinGemma, a 400m-parameter AI model for dolphin sounds, which runs directly on the phones the WDP team uses in the field.

The model builds on Google’s Gemma, a collection of its lightweight open models, the company said in its progress announcement yesterday, which coincided with the US National Dolphin Day.

DolphinGemma, an audio-in, audio-out model, is trained entirely on WDP’s collection of dolphin sounds. It processes sequences of natural dolphin sounds to identify patterns and predict the likely next sound in the sequence – a process similar to how large language models predict the next word in human text.
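To make the analogy concrete, here is a deliberately toy sketch of next-sound prediction. DolphinGemma itself is a transformer trained on raw audio; this bigram counter (with invented sound labels like "whistle" and "click") only illustrates the idea of learning which sound tends to follow which, as the article describes.

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count how often each sound is followed by each other sound."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(counts, sound):
    """Return the most frequently observed sound following `sound`."""
    if sound not in counts:
        return None
    return counts[sound].most_common(1)[0][0]

# Invented example recordings, labelled by sound type for illustration.
recordings = [
    ["whistle", "click", "click", "burst"],
    ["whistle", "click", "burst"],
    ["click", "click", "burst"],
]
model = train_bigram(recordings)
print(predict_next(model, "whistle"))  # prints "click"
```

A real model works over learned audio tokens rather than hand-labelled categories, but the training objective – predict what comes next in a sequence – is the same in spirit.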

In addition, WDP is working on a potential two-way communication channel using the model, through the Cetacean Hearing Augmentation Telemetry (CHAT) system developed in partnership with Georgia Institute of Technology.

With CHAT, researchers hope to build a simple, shared vocabulary with the dolphins by associating unique synthetic whistles – distinct from natural dolphin sounds – with specific objects the animals enjoy, such as sargassum, seagrass or the scarves the researchers use. They hope the curious dolphins will learn to mimic these whistles, which can then be used to reinforce a positive connection with the objects.
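The CHAT vocabulary described above can be pictured as a simple lookup: a mimicked whistle is matched against the known synthetic whistles and resolved to its associated object. The whistle names and objects below are placeholders for illustration, not the actual CHAT signal set.

```python
# Hypothetical mapping from synthetic whistles to objects, per the
# article's description of CHAT's shared vocabulary.
WHISTLE_TO_OBJECT = {
    "whistle_A": "sargassum",
    "whistle_B": "seagrass",
    "whistle_C": "scarf",
}

def identify_request(detected_whistle):
    """Return the object linked to a recognised whistle, else None."""
    return WHISTLE_TO_OBJECT.get(detected_whistle)

print(identify_request("whistle_B"))  # prints "seagrass"
```

In practice the hard part is the matching itself – deciding that a dolphin's imitation is close enough to a known whistle – which is where a model like DolphinGemma could help.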

WDP is beginning to deploy DolphinGemma this field season, Google said, and the tech giant plans to release it as an open model this summer. Google hopes the model can be used by researchers studying other cetacean species, such as bottlenose or spinner dolphins.
