Initial reactions to OpenAI's groundbreaking open-source gpt-oss models are decidedly mixed



OpenAI's long-awaited return to its "open" namesake took place yesterday with the release of two new large language models (LLMs): gpt-oss-120b and gpt-oss-20b.

But despite benchmark results that put the models on par with OpenAI's own powerful proprietary offerings, the response from AI power users and the broader developer community has so far been all over the map. If this release were a film premiere rated on Rotten Tomatoes, we'd be looking at roughly a 50% split, based on my observations.


First, some background: OpenAI released these two text-only language models (no image generation or analysis), both under the permissive Apache 2.0 open-source license, the first time since 2019 (before ChatGPT) that the company has done so with a cutting-edge language model.

The entire ChatGPT era of the past 2.7 years has so far been powered by proprietary, closed-source models: models that OpenAI controlled and that users had to pay to access (or use through a rate-limited free tier), with limited customization options and no way to run them offline or on private hardware.




But all that changed with yesterday's release of the pair of gpt-oss models: a larger, more powerful one designed to run on a single Nvidia H100 GPU in, say, a small or medium-sized data center or server farm, and an even smaller one that can run on a single consumer laptop or desktop PC.

Of course, the models are so new that the AI power-user community has only just begun independently running and testing them against its own individual benchmarks (evaluations) and tasks.

And now we're getting a wave of feedback ranging from optimistic enthusiasm about the potential of these powerful, free, and efficient new models, to dissatisfaction and dismay at what some users consider serious problems and limitations, especially compared to the similarly Apache 2.0-licensed wave of powerful open-source, multimodal LLMs from Chinese startups (which can likewise be taken, adapted, and run locally on U.S. hardware for free, by American companies or by organizations anywhere in the world).

High marks, but still behind China's open-source leaders

Intelligence benchmarks place the gpt-oss models ahead of most American open-source offerings. According to independent third-party AI benchmarking firm Artificial Analysis, gpt-oss-120b is "the most intelligent American open weights model," though it still falls short of Chinese heavyweights such as DeepSeek R1 and Qwen3 235B.

"On reflection, that's all they did. They benchmark-maxxed," wrote self-described DeepSeek "stan" @teortaxesTex. "No good derivative models will be trained … no new use cases …"

That skepticism was echoed by pseudonymous open-source AI researcher Teknium (@Teknium1), co-founder of rival open-source model lab Nous Research, who called the release a "nothing burger" on X and predicted that Chinese models would soon overshadow it. "Overall very disappointed and legit sad I contributed to this," they wrote.

Benchmark-maxxing on math and coding at the expense of writing?

Another line of criticism focused on the apparently narrow utility of the gpt-oss models.

Influencer "Lisan al Gaib" (@scaling01) noted that the models excel at math and coding but "lack taste and common sense." He added: "So it's just a math model?"

In creative-writing tests, some users found the model injecting equations into poetic output. "This is what happens when you benchmaxx," Teknium observed, sharing a screenshot in which the model inserted an integral formula mid-poem.

And @kalomaze, a researcher at decentralized AI model training company Prime Intellect, wrote that "gpt-oss-120b knows less about the world than what a good 32b does. They probably wanted to avoid copyright problems, so they went mostly synthetic. Pretty devastating stuff."

Former Googler and independent AI developer Kyle Corbitt agreed that the pair of gpt-oss models appears to have been trained primarily on synthetic data, that is, data generated by an AI model specifically for training, making it "extremely spiky."

It is "great at the tasks it was trained on, really bad at everything else," Corbitt wrote; in other words, strong on coding and math problems, and weak across a wider variety of language tasks such as creative writing or report generation.

In other words, the accusation is that OpenAI deliberately trained the model on more synthetic data than real-world facts and figures in order to avoid using copyrighted data scraped without permission from websites and other repositories, something OpenAI and many other AI companies have been accused of in the past and are facing ongoing lawsuits over.

Others speculated that OpenAI may have trained the model primarily on synthetic data to avoid safety and security problems, with the result being worse quality than if it had been trained on more real (and presumably copyrighted) data.

Lagging results on third-party benchmarks

In addition, the models' scores on other companies' benchmarks came up short in the eyes of some users.

SpeechMap, which measures LLM compliance with user prompts asking for prohibited, biased, or politically sensitive outputs, showed compliance scores for gpt-oss-120b hovering below 40%, near the bottom of its peer group, indicating resistance to following user requests and a tendency to default to guardrails, potentially at the expense of providing accurate information.

On the Aider Polyglot evaluation, gpt-oss-120b scored only 41.8% in multilingual reasoning, versus competitors such as Kimi-K2 (59.1%) and DeepSeek-R1 (56.9%).

Some users also reported that their tests show the model to be oddly resistant to generating criticism of China or Russia, in contrast to its treatment of the U.S. and the EU, raising questions about bias and training-data filtering.

Other experts applauded the release and what it signals for open-source AI

To be fair, not all the commentary is negative. Software engineer and close AI watcher Simon Willison called the release "really impressive" on X, elaborating in a blog post on the models' performance and their ability to achieve rough parity with OpenAI's proprietary o3-mini and o4-mini models.

He praised their strong results on reasoning and STEM benchmarks, and hailed the new "harmony" prompt-template format, which gives developers more structured model responses, along with support for third-party tool use, as significant contributions.

In a lengthy post of his own, Clem Delangue, CEO and co-founder of AI code-sharing and open-source community Hugging Face, urged users not to rush to judgment, pointing out that inference for these models is complex and that early problems may stem from infrastructure instability and insufficient optimization among hosting providers.

"The power of open source is that there's no cheating," Delangue said. "We will discover all the strengths and limitations … gradually."

Wharton School professor Ethan Mollick of the University of Pennsylvania was more cautious, writing on X that "the US likely has the leading open weights models again (or close to it)," but asking whether this is a one-off from OpenAI. "The lead will evaporate quickly as others catch up," he noted, adding that it is unclear what incentive OpenAI has to keep the models updated.

Nathan Lambert, a leading AI researcher at rival open-source lab the Allen Institute for AI (AI2) and a commentator, praised the symbolic importance of the release on his blog Interconnects, calling it "a phenomenal step for the open ecosystem, especially for the West and its allies, that the most known brand in AI has returned to openly releasing models."

But he warned on X that gpt-oss is "unlikely to meaningfully slow down [Chinese e-commerce giant Alibaba's AI team] Qwen," citing Qwen's usability, performance, and variety.

He argued that the release marks an important shift in the U.S. toward open models, but that OpenAI still has a "long way" to go to catch up in practice.

A divided verdict

The verdict is divided for now.

OpenAI's gpt-oss models are a landmark in terms of licensing and accessibility.

But although the benchmark results look solid, the real-world "vibes," as many users describe them, are so far less convincing.

Whether developers can build strong applications and derivatives on top of gpt-oss will determine whether the release is remembered as a breakthrough or a disappointment.
