OpenAI releases two open-weight models, GPT OSS 120B and 20B, in surprise move before GPT-5 launch

For the first time since releasing GPT-2 over five years ago, OpenAI has introduced two open-weight AI models to the public.


Named gpt-oss-120b and gpt-oss-20b, the models are now downloadable from Hugging Face. Distributed under the permissive Apache 2.0 license, they are fully open for use by both developers and businesses.

The model is available in two configurations: one with 120 billion parameters and another with 20 billion. The larger variant can run on a single Nvidia GPU and delivers performance on par with OpenAI’s o4-mini model, while the smaller version is optimized for efficiency, requiring only 16GB of memory and offering results comparable to o3-mini.
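The 16GB figure is plausible once the weights are stored at low precision. A back-of-envelope sketch (the ~4-bit quantization figure is our assumption for illustration, not a claim from the article):

```python
# Rough memory estimate for hosting the 20B-parameter model locally.
# Assumption (not from the article): weights are quantized to roughly
# 4 bits per parameter, plus a small overhead for per-block scales.
params = 20e9           # ~20 billion parameters (the smaller variant)
bits_per_param = 4.25   # assumed: ~4-bit weights + scaling metadata

weight_bytes = params * bits_per_param / 8
weight_gb = weight_bytes / 1e9
print(f"approx. weight footprint: {weight_gb:.1f} GB")
```

Under those assumptions the weights occupy roughly 10 to 11 GB, leaving headroom within a 16GB machine for activations and the KV cache.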

These open-weight AI models can be deployed locally on machines that meet the hardware requirements, and because they do not call out to external servers, they can run entirely offline.

These models are built on a Mixture-of-Experts (MoE) framework, which activates just a fraction of the total parameters—approximately 5.1 billion per token for the 120b version—boosting both speed and resource efficiency. After initial training, they undergo an intensive reinforcement learning phase using significant computational power, refining their reasoning skills and aligning their performance with OpenAI’s advanced o-series models.

Availability:

Starting today, both models can be accessed through services such as Hugging Face, Databricks, Azure, and AWS. Released under the Apache 2.0 license, they are open for commercial use and customization.

“These models work seamlessly with our Responses API and are built to integrate smoothly into agent-based systems. They excel at following detailed instructions, performing tasks like executing Python code or conducting web searches, and demonstrating strong reasoning skills. Notably, they can adapt their level of reasoning based on task complexity, making them ideal for low-latency scenarios that don’t demand deep analysis. Fully adaptable, the models support complete chain-of-thought processes and enable generation of structured outputs,” the firm said in a statement.

This article was first uploaded on August 6, 2025, at 11:36 am.