On December 6, 2023, Google unveiled Gemini, which it described as its “largest and most capable AI model.” Writing on Google’s official website, Demis Hassabis, co-founder and CEO of Google DeepMind, said that Gemini is the result of large-scale collaborative efforts by teams across Google, including colleagues at Google Research. “It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video,” the website added.
According to the official website, Gemini is Google’s most flexible model and can run on everything from data centers to mobile devices. Google says the model’s “state-of-the-art” capabilities can change how developers and enterprise customers build with AI. Gemini 1.0, the first version, comes in three sizes: Gemini Ultra, the most capable model, for highly complex tasks; Gemini Pro, the best model for a wide range of tasks; and Gemini Nano, the most efficient model, for on-device tasks. According to Google, Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely used academic benchmarks in large language model (LLM) research and development.
“With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities. Our new benchmark approach to MMLU enables Gemini to use its reasoning capabilities to think more carefully before answering difficult questions, leading to significant improvements over just using its first impression. Gemini Ultra also achieves a state-of-the-art score of 59.4% on the new MMMU benchmark, which consists of multimodal tasks spanning different domains requiring deliberate reasoning. With the image benchmarks we tested, Gemini Ultra outperformed previous state-of-the-art models, without assistance from object character recognition (OCR) systems that extract text from images for further processing. These benchmarks highlight Gemini’s native multimodality and indicate early signs of Gemini’s more complex reasoning abilities,” the website highlighted.
In terms of “next-generation capabilities,” Google highlights “sophisticated reasoning,” “understanding text, images, audio and more,” and “advanced coding.” To make the model more reliable, scalable and efficient, Google trained Gemini 1.0 on its in-house designed Tensor Processing Units (TPUs) v4 and v5e. “On TPUs, Gemini runs significantly faster than earlier, smaller and less-capable models. These custom-designed AI accelerators have been at the heart of Google’s AI-powered products that serve billions of users like Search, YouTube, Gmail, Google Maps, Google Play and Android. They’ve also enabled companies around the world to train large-scale AI models cost-efficiently. Today, we’re announcing the most powerful, efficient and scalable TPU system to date, Cloud TPU v5p, designed for training cutting-edge AI models. This next generation TPU will accelerate Gemini’s development and help developers and enterprise customers train large-scale generative AI models faster, allowing new products and capabilities to reach customers sooner,” the website specified.
On responsibility and safety, Google stated that it evaluates Gemini using benchmarks such as Real Toxicity Prompts, a set of 100,000 prompts with varying degrees of toxicity pulled from the web, developed by experts at the Allen Institute for AI. The announcement closed with details of Gemini 1.0’s availability in Google products such as Bard, Pixel, and Search Generative Experience (SGE). From December 13, 2023, developers and enterprise customers will be able to access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI, and Gemini Ultra will be coming soon.