New AI Models: Making a wider connection

Multimodal AI can improve customer understanding

Multimodal AI integrates and processes multiple modalities simultaneously

By Uma Ganesh

The impact and advantages of AI applications are expanding day by day, and businesses are trying to build unique value propositions that give them an edge in the marketplace. Most current AI systems are unimodal: they use one type of data and algorithms developed for that specific modality. ChatGPT, for instance, is built around text content, and the output it produces is also in text format.
Multimodal AI integrates and processes multiple modalities simultaneously and can produce more than one type of output. This new paradigm of AI sits at the intersection of computer vision, natural language processing (NLP) and audio processing, and promises a radical change in human-machine interaction. It can create new value for businesses by processing and generating insights from multiple data types.
Multimodal AI is essential for the development of robots, enabling them to interact with humans, other robots and the environment. This is made possible because data from multiple devices such as cameras, sensors, GPS and microphones are combined to guide the robots.
With multimodal AI, the understanding of the customer profile could become more refined by combining customer feedback, voice conversations, sentiment analysis of social media interactions and usage patterns on the website. The combination of voice recognition, NLP and generative AI could produce efficient summaries and notes of meeting proceedings. In healthcare, by blending medical images and patient history with genetic information and diagnostic data, a multimodal AI solution could make patient treatment more focussed.
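The idea of combining signals from different channels into one refined customer view can be sketched as a simple late-fusion step: each modality is scored on its own, and the scores are then merged. The function names, word lists and weights below are hypothetical, chosen purely for illustration; real systems would use trained models for each modality.

```python
# A minimal late-fusion sketch (hypothetical names and weights):
# score each modality separately, then combine into one estimate.

def text_sentiment_score(feedback: str) -> float:
    """Toy sentiment scorer: share of positive words among sentiment words."""
    positive = {"great", "good", "love", "helpful"}
    negative = {"bad", "slow", "poor", "broken"}
    words = feedback.lower().split()
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    total = pos + neg
    return 0.5 if total == 0 else pos / total

def behaviour_score(pages_viewed: int, minutes_on_site: float) -> float:
    """Toy website-usage signal, capped at 1.0."""
    raw = pages_viewed * 0.1 + minutes_on_site * 0.02
    return min(raw, 1.0)

def fused_profile(feedback: str, pages_viewed: int, minutes_on_site: float,
                  w_text: float = 0.6, w_behaviour: float = 0.4) -> float:
    """Weighted late fusion of the two unimodal scores."""
    return (w_text * text_sentiment_score(feedback)
            + w_behaviour * behaviour_score(pages_viewed, minutes_on_site))

score = fused_profile("great support but slow checkout",
                      pages_viewed=12, minutes_on_site=30.0)
```

Late fusion keeps each modality's pipeline independent, which is why it is often the easiest starting point; deeper multimodal models instead learn a joint representation of all inputs at once.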
At present, designing multimodal AI applications is more challenging than designing unimodal ones. Firstly, clean datasets that cover multiple modalities are not easily available, and without such datasets, training large-scale models would not be possible. Further, the data from each modality required to construct the models comes in different formats and representation methods. The effort involved in data synchronisation, alignment and ensuring consistent data quality can therefore be complex and time consuming.
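The synchronisation problem mentioned above can be made concrete with a small sketch: records from two modalities arrive with their own timestamps, and each text event must be paired with the sensor reading closest in time. The data and function below are hypothetical, for illustration only.

```python
# A minimal cross-modality alignment sketch (hypothetical data):
# pair each timestamped text event with the nearest sensor reading.

from bisect import bisect_left

def align(text_events, sensor_readings):
    """Pair each (timestamp, text) with the value of the nearest
    (timestamp, value) sensor reading."""
    sensor_readings = sorted(sensor_readings)
    times = [t for t, _ in sensor_readings]
    pairs = []
    for t, text in text_events:
        i = bisect_left(times, t)
        # consider the readings just before and just after t
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        j = min(candidates, key=lambda k: abs(times[k] - t))
        pairs.append((text, sensor_readings[j][1]))
    return pairs

aligned = align(
    text_events=[(1.0, "door opened"), (4.2, "alarm raised")],
    sensor_readings=[(0.9, 20.1), (2.0, 20.3), (4.0, 25.7)],
)
```

Nearest-timestamp matching is only the simplest alignment strategy; continuous streams such as audio and video usually need resampling to a common clock before they can be fused.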
Additionally, while building AI models, the use of multiple algorithms and types of computation, one for each modality, could pose challenges when deploying at scale.
Ongoing research studies are addressing the challenges outlined above, and it is only a matter of time before new tools and techniques are available to migrate from unimodal to multimodal AI. As the models become more sophisticated, innovative applications will emerge. As human-computer interactions improve, recommendations and decision-making support will become more impactful. In particular, multimodal AI that combines generative and predictive AI could make businesses more proactive and resilient.

The author is chairperson, Global Talent Track, a corporate training solutions company



This article was first uploaded on March 10, 2024, at 2:18 pm.