A comprehensive new study has revealed that open-source artificial intelligence models consume far more computing resources than their closed-source competitors when performing identical tasks, potentially undermining their cost advantages and reshaping how enterprises evaluate AI deployment strategies.
Research conducted by Nous Research found that open-weight models use between 1.5 and 4 times more tokens than closed models from OpenAI and Anthropic to complete the same tasks. For simple knowledge questions, the gap widened dramatically, with some open models using as many as 10 times more tokens.
Measuring token efficiency in reasoning models: the missing benchmark https://t.co/b1e1rjx6vz
We measured token usage across reasoning models: open weight models output 1.5-4x more tokens than closed models on identical tasks, but with huge variance depending on task type (… pic.twitter.com/ly1083won8
— Nous Research (@NousResearch) August 14, 2025
“Open weight models use 1.5-4x more tokens than closed ones (up to 10x for simple knowledge questions), making them sometimes more expensive per query despite lower per-token costs,” the researchers wrote in their report, published on Wednesday.
The findings challenge a dominant assumption in the AI industry: that open-source models offer clear economic advantages over proprietary alternatives. While open-source models often cost less per token, the study suggests this advantage can be “easily offset if they require more tokens to reason about a given problem.”
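The arithmetic behind that claim is easy to sketch. A minimal illustration in Python, using hypothetical prices and token counts (invented for this example, not figures from the study):

```python
# Sketch: per-token price vs. total tokens per query.
# All prices and token counts below are hypothetical examples,
# not measurements from the Nous Research study.

def query_cost(completion_tokens: int, price_per_million: float) -> float:
    """Cost of one query in dollars, given output tokens and $/1M tokens."""
    return completion_tokens * price_per_million / 1_000_000

# A closed model: pricier per token, but concise.
closed = query_cost(completion_tokens=300, price_per_million=8.00)

# An open model: cheaper per token, but 4x as verbose on the same task.
open_weight = query_cost(completion_tokens=1200, price_per_million=2.50)

print(f"closed: ${closed:.6f}, open: ${open_weight:.6f}")
# With these example numbers the open model is already more expensive
# per query, even though its per-token price is less than a third as high.
```

The point of the sketch is that per-token price alone says nothing about per-query cost until it is multiplied by how many tokens the model actually emits.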
The real cost of AI: why “cheaper” models can break your budget
The study examined 19 different AI models across three categories of tasks: basic knowledge questions, mathematical problems, and logic puzzles. The team measured “token efficiency”, meaning how many computing units models use relative to the complexity of their solutions, a metric that has received little systematic study despite its significant cost implications.
“Token efficiency is a critical metric for several practical reasons,” the researchers noted. “While hosting open weight models may be cheaper, this cost advantage could be easily offset if they require more tokens to reason about a given problem.”
The inefficiency is particularly pronounced in large reasoning models (LRMs), which use extended “chains of thought” to solve complex problems. These models, designed to think through problems step by step, can consume hundreds of tokens pondering simple questions that should require minimal computation.
For basic knowledge questions such as “What is the capital of Australia?”, the study found that reasoning models spend “hundreds of tokens pondering simple knowledge questions” that could be answered in a single word.
Which AI models actually deliver bang for your buck
The study revealed stark differences between model providers. OpenAI’s models, particularly its o4-mini and newly released open-source gpt-oss variants, showed exceptional token efficiency, especially for mathematical problems. The study found that OpenAI models “stand out for extreme token efficiency in math problems,” using up to three times fewer tokens than other commercial models.
Among open-source options, Nvidia’s llama-3.3-nemotron-super-49b-v1 emerged as “the most token efficient open weight model across all domains,” while newer models such as Magistral showed “exceptionally high token usage” as outliers.
The efficiency gap varied significantly by task type. While open models used roughly twice as many tokens for mathematical and logic problems, the difference ballooned for simple knowledge questions, where efficient reasoning should make extended deliberation unnecessary.

What enterprise leaders need to know about AI compute costs
The findings have immediate implications for enterprise AI adoption, where computing costs can scale rapidly with usage. Companies evaluating AI models often focus on benchmark accuracy and per-token pricing, but may overlook the total computing requirements of real-world tasks.
“The better token efficiency of closed weight models often compensates for the higher API pricing of those models,” the researchers found when analyzing total inference costs.
The study also revealed that closed-model providers appear to be actively optimizing for efficiency. “Closed weight models have been iteratively optimized to use fewer tokens to reduce inference cost,” while open-source models have “increased their token usage for newer versions, possibly reflecting a priority toward better reasoning performance.”

How researchers cracked the code on measuring AI efficiency
The research team faced unique challenges in measuring efficiency across different model architectures. Many closed models do not reveal their raw reasoning processes, instead providing compressed summaries of their internal computations to prevent competitors from copying their techniques.
To address this, the researchers used completion tokens, the total computing units billed for each query, as a proxy for reasoning effort. They found that “most recent closed source models will not share their raw reasoning traces” and instead “use smaller language models to transcribe the chain of thought into summaries or compressed representations.”
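In practice, that proxy is simply the completion-token count that every LLM API bill already reports. A minimal sketch of the idea, using invented usage records shaped like typical API metadata rather than any real provider SDK:

```python
# Sketch: completion tokens as a proxy for reasoning effort.
# The response dicts below are invented stand-ins for the usage metadata
# an LLM API typically returns; no real provider SDK is used.

def token_efficiency(responses: list[dict]) -> float:
    """Mean completion tokens per query: lower means more token-efficient."""
    totals = [r["usage"]["completion_tokens"] for r in responses]
    return sum(totals) / len(totals)

# Hypothetical usage records for the same three questions.
closed_model = [{"usage": {"completion_tokens": t}} for t in (120, 95, 140)]
open_model = [{"usage": {"completion_tokens": t}} for t in (480, 510, 450)]

ratio = token_efficiency(open_model) / token_efficiency(closed_model)
print(f"open model emits {ratio:.1f}x the tokens of the closed model")
```

Because billing is based on these same token counts, the proxy doubles as a direct cost measure even when a model hides its raw reasoning trace.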
The study’s methodology included testing with modified versions of well-known problems to minimize the influence of memorized solutions, such as altering the variables in math competition problems from the American Invitational Mathematics Examination (AIME).
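Variable substitution of this kind is straightforward to illustrate. A toy sketch of the general technique (the problem template and numbers are invented for illustration, not items from the study’s benchmark):

```python
import random

# Sketch: perturbing a known competition-style problem so that a memorized
# answer no longer works. The template below is an invented toy problem,
# not one of the AIME items actually used in the study.

TEMPLATE = "Find the sum of all positive integers n <= {limit} divisible by {d}."

def variant(seed: int) -> tuple[str, int]:
    """Return a perturbed problem statement and its recomputed answer."""
    rng = random.Random(seed)          # seeded for reproducible variants
    limit = rng.randint(50, 200)       # swap in fresh numeric parameters
    d = rng.choice([3, 4, 6, 7])
    answer = sum(n for n in range(1, limit + 1) if n % d == 0)
    return TEMPLATE.format(limit=limit, d=d), answer

problem, answer = variant(seed=1)
print(problem, "->", answer)
```

A model that has merely memorized the published answer to the original problem will fail on the perturbed variant, while one that actually reasons through the arithmetic will not.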

The future of AI efficiency: what happens next
The researchers suggest that token efficiency should become a primary optimization target alongside accuracy in future model development. “A more concentrated CoT will also allow for more efficient context use and can counteract context degradation during difficult reasoning tasks,” they wrote.
OpenAI’s release of its open-source gpt-oss models, which demonstrate state-of-the-art efficiency with a “freely available CoT,” could serve as a reference point for optimizing other open-source models.
The full research dataset and evaluation code are available on GitHub, allowing other researchers to verify and extend the findings. As the AI industry races toward ever more powerful reasoning, this study suggests that the real competition may not be about who can build the smartest AI, but who can build the most efficient one.
After all, in a world where every token counts, the most wasteful models may find themselves priced out of the market, no matter how well they can think.
