Apple and NVIDIA partner to boost AI language models’ speed and efficiency

Apple has announced a new partnership with NVIDIA aimed at improving the performance of large language models (LLMs).

December 20, 2024 17:14 IST

The collaboration introduces a text generation technique that speeds up how quickly models produce output on NVIDIA GPUs, which benefits industries that rely on large language models in production.

Earlier this year, Apple unveiled a machine learning approach called Recurrent Drafter (ReDrafter). The technique combines two methods: beam search and dynamic tree attention. Beam search evaluates several candidate text sequences at once and keeps the most promising ones, which allows for more accurate and diverse results than committing to a single continuation. Dynamic tree attention then merges the parts those candidates share, so repeated tokens are processed once instead of once per candidate. The result is faster and more efficient text generation.
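
To make the two ideas concrete, here is a minimal sketch in Python. It is not Apple's implementation: the `TOY_MODEL` table, the function names and the beam width are all hypothetical, chosen only to illustrate the general techniques. The first function runs a plain beam search, and the second counts how many distinct prefixes the surviving candidates contain, which is the redundancy a tree-style attention scheme can exploit.

```python
import math

# Hypothetical toy "model": next-token probabilities depend only on the last token.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.7, "dog": 0.3},
    "cat": {"sat": 0.9, "ran": 0.1},
    "dog": {"sat": 0.2, "ran": 0.8},
}


def beam_search(start, steps, beam_width):
    """Keep the beam_width highest-scoring sequences at every step."""
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            for tok, prob in TOY_MODEL[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(prob)))
        # Sort by score, best first, and prune to the beam width.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


def count_unique_prefixes(sequences):
    """Count distinct prefixes, i.e. nodes in the tree formed by the sequences."""
    return len({tuple(seq[:i]) for seq in sequences for i in range(1, len(seq) + 1)})


beams = beam_search("<s>", steps=3, beam_width=3)
sequences = [tokens for tokens, _ in beams]
total_tokens = sum(len(seq) for seq in sequences)
unique_nodes = count_unique_prefixes(sequences)

for tokens, score in beams:
    print(" ".join(tokens), round(math.exp(score), 3))
print(f"tokens across all candidates: {total_tokens}, unique tree nodes: {unique_nodes}")
```

In this toy run the three surviving candidates hold 12 tokens in total but only 9 distinct prefixes, because they share their opening tokens. Processing each shared prefix once, rather than once per candidate, is the kind of saving the article attributes to tree attention.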

Now, Apple has integrated ReDrafter into NVIDIA’s TensorRT-LLM framework. TensorRT-LLM is a tool designed to optimize large language models running on NVIDIA GPUs, which are widely used for AI workloads. According to Apple, the integration delivered a 2.7x increase in the number of tokens (the basic units of text) generated per second. The figure was measured on a production model containing tens of billions of parameters, a common size for complex AI models.
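
As a rough illustration of what a 2.7x gain in tokens per second means for waiting time, the short Python sketch below converts the speed-up into time per token. Only the 2.7x figure comes from Apple's report; the baseline rate of 100 tokens per second is a made-up number used purely for the arithmetic.

```python
def time_per_token_ratio(speedup):
    """Fraction of the original per-token time that remains after a speed-up."""
    return 1.0 / speedup


SPEEDUP = 2.7                     # figure reported by Apple
BASELINE_TOKENS_PER_SEC = 100.0   # hypothetical baseline, for illustration only

new_rate = BASELINE_TOKENS_PER_SEC * SPEEDUP
ratio = time_per_token_ratio(SPEEDUP)

print(f"new rate: {new_rate:.0f} tokens per second")
print(f"time per token: {ratio:.0%} of the original")
```

Put differently, if the speed-up holds, each token takes a little over a third of the time it took before, whatever the real baseline happens to be.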

The improvements brought by ReDrafter do more than just increase speed. By making the text generation process more efficient, the technology also reduces GPU usage and lowers power consumption, which is particularly valuable for large-scale AI applications. This means that developers can now create AI applications that are faster, more energy-efficient, and more cost-effective to run.

Apple highlighted that the integration of ReDrafter into NVIDIA’s framework is a big step forward for LLMs, as these models are becoming an essential part of many production applications. Reducing computational costs and latency is crucial for businesses relying on these models, especially in real-time services where speed is critical.
