By Siddharth Pai
Last week, Google and OpenAI made groundbreaking announcements. They introduced new Gen AI-powered assistants that can engage in real-time conversations, even adapting when interrupted, mirroring human interaction. These assistants are not just limited to conversation; they can also analyse your surroundings through live video and translate conversations on the spot.
Google announced new tools at its I/O conference last Tuesday, including enhancements to the bewildering array of products under its Gemini AI banner, with a faster “Flash” version and a “Live” version to compete with OpenAI’s new GPT-4o, announced the day before. Google also said it is building Gemini into a “do everything” model that will run across almost all its product suites, as it already does across search and Google’s web-based tools and applications such as Docs and Sheets, much like Microsoft’s Copilot AI assistant for its Office suite of applications.
For its part, OpenAI’s conversational GPT-4o model can supposedly respond with a lag of about 320 milliseconds, roughly the response time of a human in conversation. (Some long-married couples might claim that their partner’s response is 10 times as fast, but that’s a topic for another forum.) It also incorporates humour, sarcasm, song, and other human-like touches in its responses to the user.
GPT-4o will be free, but usage caps will be set. If you need more than the capped level, you can access a significantly faster model for $20 a month. Google’s Gemini Advanced (one among its confusing suite of Gemini products) will have a two-month free trial and cost $20 a month after that.
Both OpenAI and Google claim that their models are well-tested. OpenAI says GPT-4o was evaluated by more than 70 experts in fields like misinformation and social psychology. Google has said that Gemini “has the most comprehensive safety evaluations of any Google AI model to date, including for bias and toxicity”.
These companies are building a future where AI models search, vet, and evaluate the world’s information for us, serving up a quick (and hopefully correct) answer to our questions. But the truth is that these models can “hallucinate” and provide patently wrong answers. By their very design, Gen AI models can make things up as they go along. And that is to say nothing of the biases and other issues these companies claim to have ironed out.
Speaking of which, there was other disturbing news last week. On Friday, Wired magazine reported that OpenAI’s long-term AI risk team had been disbanded and that Ilya Sutskever, OpenAI’s co-founder, had left the company. Sutskever co-led this team with Jan Leike, who has also left, as have many of the team’s top researchers. (bit.ly/3WTvg89)
This isn’t comforting. Although Sutskever had helped CEO Sam Altman start OpenAI in 2015 and set the direction of the research that led to ChatGPT, he was also one of the four board members who fired Altman in November. Altman was restored as CEO after five days of pandemonium, when a mass revolt by OpenAI staff led to a brokered deal under which Sutskever and two other company directors left the board. Sutskever’s staying on at the company at least meant that it retained some of its conscience, since only a few months earlier (in July) he had been asked to co-lead the unit policing the company’s governance and internal research.
With Sutskever, Leike, and many of those key team members now gone, OpenAI will have to work hard to retain credibility in its attempts to regulate itself. It claims that the remainder of the team has been reabsorbed into its other research efforts. To me, this is astonishing, considering that OpenAI formed this team only in July last year, with a promise that it would receive 20% of the company’s computing power. How one absorbs an independent one-fifth of a company’s research capacity into other research departments is unclear, especially since that fifth was put in place precisely to keep the other four-fifths in check.
To be fair to OpenAI, Sutskever, Leike, and their team were focused on “artificial general intelligence”, or AGI: a step up from today’s AI in which a machine equals or eclipses general human-like intelligence, and which, at least for now, seems far off. Other efforts at the firm aim to keep its AI releases responsible until then. OpenAI’s focus on long-term safety has certainly had an impact across the large language model (LLM) and Gen AI space. That said, losing a co-founder tasked with internal regulation and corporate responsibility for its products is not welcome news.
But back to the new tools from Google and OpenAI. We will soon be able to explore them for ourselves and gauge whether we can rely on them for our day-to-day tasks as much as these firms hope we will, or whether they are more like party tricks that eventually lose their charm. The internet is littered with such experiments. And then there is the issue of “hallucination” I referenced earlier. According to IBM, “AI hallucination is a phenomenon wherein a large language model (LLM) — often a generative AI chatbot or computer vision tool — perceives patterns or objects that are nonexistent or imperceptible to human observers, creating outputs that are nonsensical or altogether inaccurate.” (bit.ly/3UMZOpk).
While engaging with these new tools, it’s wise to stay sceptical, even more so than with the simpler chatbots already available.
Siddharth Pai is a technology consultant and venture capitalist.