The way we interact with AI is about to change dramatically. Mira Murati (former CTO of OpenAI) and her new company, Thinking Machines, have unveiled Interaction Models, a new class of AI designed from the ground up for natural, real-time collaboration rather than the familiar back-and-forth prompting we’re used to with ChatGPT, Claude, or Gemini.

Note that Thinking Machines, which currently has a post-money valuation of $12 billion, isn’t the first to introduce something similar. Its rival Google promotes Gemini Live as an AI buddy, while OpenAI offers its GPT-Realtime-2 model. These products are tuned for lower latency, aiming to make the voice versions of regular AI chatbots sound as close as possible to natural human conversation.

However, Murati’s startup has designed its model to be preemptive, i.e., it can cut in while you are speaking and correct you if you are wrong, as demonstrated in the video. This is what’s drawing attention from the AI world.

Traditional AI chatbots: All about waiting for their turn

Most current AI chatbots, including ChatGPT and Claude, are turn-based systems. They arrive at an answer like this:

– You type or speak your full message.

– The AI waits until you finish.

– It thinks and generates a complete response.

– Then it’s your turn again.
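The steps above can be sketched as a simple loop. This is a minimal illustration, not how any particular chatbot is implemented; `turn_based_chat` and `echo_model` are hypothetical names, and `echo_model` merely stands in for a real model.

```python
from typing import Callable, Iterable, List, Tuple

Transcript = List[Tuple[str, str]]


def turn_based_chat(
    user_turns: Iterable[str],
    generate: Callable[[Transcript], str],
) -> Transcript:
    """Run a strictly alternating conversation and return the transcript."""
    transcript: Transcript = []
    for message in user_turns:
        # The model sees nothing until the user's full message has arrived.
        transcript.append(("user", message))
        # The whole reply is produced before the user gets another turn.
        reply = generate(transcript)
        transcript.append(("assistant", reply))
    return transcript


def echo_model(history: Transcript) -> str:
    """Hypothetical stand-in for a model: echoes the last user message."""
    return "You said: " + history[-1][1]
```

Calling `turn_based_chat(["hi", "bye"], echo_model)` yields a transcript in which user and assistant entries strictly alternate. Nothing in the loop lets either side act while the other is still going.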

While the process seems easy, it creates a narrow “chat window” experience. You have to batch your thoughts, phrase everything clearly upfront, and wait for the model to finish before you can correct, interrupt, or show something on your screen. Even voice modes (like GPT-4o real-time or Gemini Live) still rely on external “harnesses”, add-on systems that detect when you stop speaking, which feels artificial and often laggy.
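To see why such a harness can feel laggy, consider a toy end-of-turn detector based on silence. This is a simplified sketch under assumed parameters (per-frame energy values, a fixed threshold, a fixed number of silent frames). It is not the logic used by any specific product, and `detect_end_of_turn` is a hypothetical name.

```python
from typing import Optional, Sequence


def detect_end_of_turn(
    energies: Sequence[float],
    threshold: float = 0.1,
    silence_frames: int = 3,
) -> Optional[int]:
    """Return the frame index at which the harness decides the user has
    stopped speaking, or None if no end of turn is detected."""
    heard_speech = False
    quiet_run = 0
    for i, energy in enumerate(energies):
        if energy >= threshold:
            # Speech frame: any silence counted so far is discarded.
            heard_speech = True
            quiet_run = 0
        elif heard_speech:
            # Silence after speech: wait for enough quiet frames in a row.
            quiet_run += 1
            if quiet_run >= silence_frames:
                return i
    return None
```

With `energies = [0.0, 0.5, 0.6, 0.0, 0.0, 0.0, 0.4]`, speech ends at frame 2 but the function only returns at frame 5. The model cannot start responding during that wait, and a longer natural pause would be mistaken for the end of the turn.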

In essence, the AI has no real sense of time, can’t naturally interrupt, struggles with simultaneous input/output, and has limited awareness of what you’re seeing or doing in the moment.

So where do Interaction Models fit in?

Interaction Models, as introduced by Thinking Machines, change this approach. Instead of reacting turn by turn, they are trained from scratch to handle interaction natively, much as humans do.

Compared to regular AI chatbot models, some key differences include:

Micro-turns (200ms chunks): The model processes tiny slices of audio, video, and text continuously, allowing near-instant reactions.

Full-duplex communication: The AI can listen and speak at the same time (e.g., live translation while you talk).

Natural interruptions: It can jump in when you pause, correct itself, or react to visual cues without waiting for you to finish.

Multimodal awareness: It sees your screen, watches what you’re doing, and reacts in context (e.g., spotting a bug in your code as you type).

Time awareness: The model understands elapsed time and can proactively remind you of things at specific intervals.

Dual-system architecture: A fast “interaction model” keeps the conversation fluid, while a more powerful background model handles deep reasoning, web searches, or complex tasks — feeding results naturally into the ongoing chat.
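The micro-turn and dual-system ideas can be illustrated with a toy loop. This is only a sketch under stated assumptions, not Thinking Machines’ implementation: input arrives as text chunks standing in for 200ms slices, `fast_react` is a hypothetical fast path, and `background_results` simulates a slower model by mapping the chunk index at which a result becomes ready to that result.

```python
from typing import Callable, Dict, List, Optional

CHUNK_MS = 200  # micro-turn length mentioned in the article


def run_micro_turns(
    chunks: List[str],
    fast_react: Callable[[str, int], Optional[str]],
    background_results: Dict[int, str],
) -> List[str]:
    """Process input chunk by chunk and return timestamped output events."""
    events: List[str] = []
    for index, chunk in enumerate(chunks):
        elapsed_ms = (index + 1) * CHUNK_MS
        # Fast path: may react after every chunk, without waiting for a turn to end.
        reaction = fast_react(chunk, elapsed_ms)
        if reaction is not None:
            events.append(f"{elapsed_ms}ms fast: {reaction}")
        # Slow path: a background result that became ready during this chunk
        # is fed into the ongoing interaction.
        if index in background_results:
            events.append(f"{elapsed_ms}ms background: {background_results[index]}")
    return events


def flag_typo(chunk: str, elapsed_ms: int) -> Optional[str]:
    """Hypothetical fast reaction: speak up as soon as a known typo appears."""
    return "did you mean 'print'?" if "pritn" in chunk else None
```

For example, `run_micro_turns(["def f():", "  pritn(1)", "  return 2"], flag_typo, {2: "search finished"})` produces a fast reaction at 400ms and a background result at 600ms, while the user is still typing.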

Put simply, Interaction Models feel much closer to talking and working with a smart human colleague than to messaging a chatbot-type assistant.

Interaction Models vs current-gen ChatGPT: Why does it matter to you?

On paper, Thinking Machines’ Interaction Models outperform traditional turn-based AI chatbots as far as natural conversations and multimodal interactions are concerned.

1. More natural conversations: You won’t need to type perfect prompts or wait awkwardly. You can think out loud, change your mind mid-sentence, show your screen, or point at things, just like in a real conversation.

2. Better for learning & work: Imagine explaining a math problem while the AI draws diagrams on the fly, or coding while it catches errors in real time without you asking.

3. Live assistance: Real-time translation during calls, live sports commentary while watching a match, or step-by-step cooking help while you’re in the kitchen with your camera on.

4. Less mental friction: As a human, you stay in flow. The AI adapts to your pace and style instead of forcing you to adapt to its limitations.

5. Proactive help: It can notice when you’re stuck, offer gentle suggestions, or remind you of things at the right moment. It won’t wait until you find the correct voice prompt. The goal is to remove the need for carefully crafted AI prompts to get work done.