OpenAI Launches New Open-Weight Models for Local and Cloud Use

OpenAI has announced new open-weight language models aimed at local and cloud deployments, enhancing accessibility and functionality for developers.

Key Points

  • OpenAI released two new models: gpt-oss-120b and gpt-oss-20b.
  • Cloudflare is a launch partner, integrating the models into Workers AI.
  • The models support chain-of-thought reasoning and web access.
  • CEO Sam Altman emphasizes accessibility and benefits for all in AI.

On August 5, 2025, OpenAI officially launched two new open-weight language models, gpt-oss-120b and gpt-oss-20b, targeting both local device deployment and integration into platforms such as Cloudflare Workers AI. The release broadens access for developers seeking cost-effective alternatives to traditional cloud services.

The gpt-oss-120b model features 117 billion parameters and can operate on a single GPU with 80 GB of RAM, while the smaller gpt-oss-20b, with 21 billion parameters, requires only 16 GB of RAM, making it suitable for a broader range of devices, including laptops. Both models can perform chain-of-thought reasoning and access the web, providing performance that rivals certain modes available in ChatGPT. OpenAI's CEO, Sam Altman, emphasized the importance of making AI beneficial and accessible for all, stating, ‘We’re excited to make this model, the result of billions of dollars of research, available to the world.’ (Source ID: 22924)
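The stated memory requirements can be sanity-checked with back-of-the-envelope arithmetic. This is a rough sketch only: it counts weight storage alone (real deployments also need memory for activations and the KV cache), and the assumption that weights are held at 4 bits per parameter follows from the FP4 quantization Cloudflare describes.

```python
def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate memory (decimal GB) needed to hold model weights alone."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# gpt-oss-120b: ~117 billion parameters
print(weight_memory_gb(117, 4))   # FP4:  ~58.5 GB -> fits on an 80 GB GPU
print(weight_memory_gb(117, 16))  # FP16: ~234 GB  -> would not fit

# gpt-oss-20b: ~21 billion parameters
print(weight_memory_gb(21, 4))    # FP4:  ~10.5 GB -> within a 16 GB budget
```

The arithmetic lines up with the article's figures: at 4 bits per parameter the larger model's weights fit on a single 80 GB GPU, whereas at FP16 they would not.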

A significant highlight of this release is the collaboration with Cloudflare, which is a Day 0 launch partner. The partnership allows the new models to integrate with Cloudflare Workers AI using a new Responses API format. Features include a Code Interpreter that lets developers execute stateful Python code securely, extending what the models can do. According to Cloudflare, the models deliver improved performance while maintaining a minimal GPU memory footprint, thanks to their Mixture-of-Experts architecture and FP4 quantization, which is more memory-efficient than running at FP16. (Source ID: 22922)
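As a rough illustration of the integration path, a model hosted on Workers AI can be reached over Cloudflare's REST endpoint. The model slug, the Responses-style `input` payload, and the placeholder credentials below are assumptions for the sketch, not confirmed identifiers; consult Cloudflare's Workers AI documentation for the exact names.

```python
# Hypothetical credentials -- replace with real values before use.
ACCOUNT_ID = "your-account-id"
API_TOKEN = "your-api-token"
MODEL = "@cf/openai/gpt-oss-120b"  # assumed Workers AI model slug

# Cloudflare's Workers AI REST endpoint pattern.
url = f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}"
headers = {"Authorization": f"Bearer {API_TOKEN}"}

# Responses-API-style payload (the "input" field shape is an assumption).
payload = {"input": "Summarize chain-of-thought reasoning in one sentence."}

# Uncomment to send (requires the `requests` package and valid credentials):
# import requests
# resp = requests.post(url, headers=headers, json=payload, timeout=60)
# print(resp.json())
```

The request itself is left commented out so the sketch runs without network access or credentials.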

The launch of these open models marks a significant move in OpenAI's strategy: users can download and fine-tune the models, with flexibility in how they are integrated. However, details about the training datasets are not publicly disclosed, raising transparency questions relative to competing open models. Altman has not indicated a timeline for further open-model releases, keeping developer interest piqued regarding the future of OpenAI's open-weight initiatives.