What is Google's new AI, AudioPaLM?

Jun 26, 2023

Ankita Baidya

AudioPaLM is a large language model for speech generation and understanding.

(Photo Credits: Reuters)

AudioPaLM combines a text-based language model, PaLM-2, and a speech-based language model, AudioLM, into a single multimodal architecture.

This multimodal architecture can process and generate both text and speech for use in speech recognition and speech-to-speech translation applications.

AudioPaLM inherits the linguistic knowledge found only in text-based large language models such as PaLM-2.

AudioPaLM also inherits from AudioLM the capacity to preserve paralinguistic information such as speaker identity and intonation.

The model substantially outperforms existing systems on speech translation tasks, and it can perform zero-shot speech-to-text translation for many languages.

AudioPaLM also demonstrates features of audio language models, such as transferring a voice across languages based on a brief spoken prompt.

Speech-to-speech translation and automatic speech recognition are among the example applications of the AudioPaLM model.
