Singapore-based AI startup Sapient Intelligence has developed a new AI architecture that can match, and in some cases significantly exceed, large language models (LLMs) on complex reasoning tasks, while being much smaller and more data-efficient.
The architecture, called the Hierarchical Reasoning Model (HRM), is inspired by how the human brain uses distinct systems for slow, deliberate planning and fast, intuitive computation. The model achieves impressive results with a fraction of the data and memory required by today's LLMs. This efficiency could have important implications for real-world enterprise AI applications, where data is scarce and computational resources are limited.
The limits of chain-of-thought reasoning
When faced with a complex problem, current LLMs largely rely on chain-of-thought (CoT) prompting, breaking problems down into intermediate text-based steps, essentially forcing the model to "think out loud" as it works toward a solution.
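To make the token-level nature of CoT concrete, here is a minimal illustrative sketch (our example, not from the paper) of what a chain-of-thought prompt looks like. Every reasoning step is serialized into natural-language tokens before the answer appears:

```python
# Illustrative chain-of-thought prompt (hypothetical example, not from the
# paper): the model is asked to spell out every intermediate step in text
# before committing to an answer.
cot_prompt = (
    "Q: A train travels 60 km in 1.5 hours. What is its average speed?\n"
    "A: Let's think step by step.\n"
    "Step 1: Average speed = distance / time.\n"
    "Step 2: 60 km / 1.5 h = 40 km/h.\n"
    "Answer: 40 km/h"
)

# Each step is generated serially as tokens; a single wrong or misordered
# step propagates into the final answer, which is the brittleness the
# researchers criticize.
num_steps = cot_prompt.count("Step")
print(num_steps)  # 2
```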
While CoT has improved the reasoning abilities of LLMs, it has fundamental limitations. In their paper, researchers at Sapient Intelligence argue that "CoT for reasoning is a crutch, not a satisfactory solution. It relies on brittle, human-defined decompositions where a single misstep or a misordering of the steps can derail the reasoning process entirely."
This dependence on generating explicit language tethers the model's reasoning to the token level, often requiring massive amounts of training data and producing long, slow responses. This approach also overlooks the kind of "latent reasoning" that occurs internally, without being explicitly articulated in language.
As the researchers note, "a more efficient approach is needed to minimize these data requirements."
A hierarchical approach inspired by the brain
To move beyond CoT, the researchers explored "latent reasoning," where instead of generating "thinking tokens," the model reasons in an internal, abstract representation of the problem. This is more aligned with how humans think; as the paper states, "the brain sustains lengthy, coherent chains of reasoning with remarkable efficiency in a latent space, without constant translation back to language."
However, achieving this level of deep, internal reasoning in AI is challenging. Simply stacking more layers in a deep learning model often leads to the "vanishing gradient" problem, where learning signals weaken as they propagate back through the layers, making training ineffective. An alternative, recurrent architectures that loop over computations, can suffer from "early convergence," where the model settles on a solution too quickly without fully exploring the problem.
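The vanishing-gradient problem the researchers mention can be shown with a back-of-envelope calculation (our illustration, not from the paper): during backpropagation the learning signal is multiplied by each layer's local derivative, and with saturating activations those derivatives are typically below 1, so the product shrinks geometrically with depth:

```python
# Minimal numerical sketch of vanishing gradients (illustrative, not
# Sapient's analysis). We assume the same local derivative at every layer;
# 0.25 is the maximum slope of the sigmoid activation, so this is an
# optimistic upper bound for a sigmoid network.
def gradient_magnitude(depth, local_derivative=0.25):
    """Gradient magnitude after backpropagating through `depth` layers."""
    return local_derivative ** depth

print(gradient_magnitude(2))   # 0.0625   -> shallow net: signal survives
print(gradient_magnitude(50))  # ~7.9e-31 -> deep net: signal is effectively zero
```

At depth 50 the signal is smaller than floating-point noise, which is why naively adding layers fails to yield deeper reasoning.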
Searching for a better approach, the Sapient team turned to neuroscience for a solution. "The human brain provides a compelling blueprint for achieving the effective computational depth that contemporary artificial models lack," the researchers write. "It organizes computation hierarchically across cortical regions operating at different timescales, enabling deep, multi-stage reasoning."
Inspired by this, they designed HRM with two coupled recurrent modules: a high-level (H) module for slow, abstract planning, and a low-level (L) module for fast, detailed computations. This structure enables a process the team calls "hierarchical convergence." Intuitively, the fast L module tackles a portion of the problem, executing multiple steps until it reaches a stable, local solution. At that point, the slow H module takes this result, updates its overall strategy, and gives the L module a new, refined subproblem to work on. This effectively resets the L module, preventing it from getting stuck (early convergence) and allowing the entire system to carry out a long sequence of reasoning steps with a lean model architecture that does not suffer from vanishing gradients.

According to the paper, "This process enables the HRM to perform a sequence of distinct, stable, nested computations, where the H-module directs the overall problem-solving strategy and the L-module executes the intensive search or refinement required for each step." This nested-loop design allows the model to reason deeply in its latent space without needing long CoT prompts or massive amounts of data.
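The two-timescale loop described above can be sketched in a few lines. This is a heavily simplified reconstruction from the article's description, not Sapient's code: the weights, dimensions, and cycle counts below are all hypothetical, and each module is reduced to a plain tanh recurrence.

```python
# Toy sketch of HRM-style "hierarchical convergence" (our reconstruction,
# not Sapient's implementation). A slow high-level module H updates once per
# cycle; a fast low-level module L runs many recurrent steps per cycle,
# conditioned on H's current plan. Resetting L each cycle is what prevents
# early convergence.
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # hypothetical hidden-state size

W_h = rng.normal(scale=0.3, size=(DIM, DIM))  # slow planner weights
W_l = rng.normal(scale=0.3, size=(DIM, DIM))  # fast worker weights

def step(W, state, context):
    # One recurrent update; tanh keeps states bounded, like a plain RNN cell.
    return np.tanh(W @ state + context)

def hrm_forward(x, n_cycles=4, l_steps_per_cycle=16):
    h = np.zeros(DIM)                  # high-level plan
    for _ in range(n_cycles):
        l = np.zeros(DIM)              # L is reset at the start of each cycle
        for _ in range(l_steps_per_cycle):
            l = step(W_l, l, h + x)    # L iterates toward a stable local solution
        h = step(W_h, h, l)            # H absorbs L's result and revises the plan
    return h

out = hrm_forward(rng.normal(size=DIM))
print(out.shape)  # (8,)
```

Note how the effective computational depth is `n_cycles * l_steps_per_cycle` recurrent steps, yet gradients in the real model need not flow through the entire unrolled chain, which is the efficiency claim the paper makes.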
A natural question is whether this "latent reasoning" comes at the cost of interpretability. Guan Wang, founder and CEO of Sapient Intelligence, pushes back on this notion, explaining that the model's internal processes can be decoded and visualized, much as CoT offers a window into a model's thinking. He also points out that CoT itself can be misleading. "CoT does not truly reflect a model's internal reasoning," Wang told VentureBeat, referencing research showing that models can sometimes give correct answers with flawed reasoning, and vice versa. "It remains essentially a black box."

HRM in action
To test their model, the researchers pitted HRM against benchmarks that demand extensive search and backtracking, such as the Abstraction and Reasoning Corpus (ARC-AGI), extremely difficult Sudoku puzzles, and complex maze-solving tasks.
The results show that HRM learns to solve problems that are intractable even for advanced LLMs. For instance, on the "Sudoku-Extreme" and "Maze-Hard" benchmarks, state-of-the-art CoT models failed completely, scoring 0% accuracy. In contrast, HRM achieved near-perfect accuracy after training on just 1,000 examples per task.
On ARC-AGI, a test of abstract reasoning and generalization, the 27M-parameter HRM scored 40.3%. This surpasses leading CoT-based models, such as the much larger o3-mini-high (34.5%) and Claude 3.7 Sonnet (21.2%). This performance, achieved without a large pre-training corpus and with very limited data, highlights the power and efficiency of its architecture.

While puzzle-solving demonstrates the model's power, the real-world implications lie in a different class of problems. According to Wang, developers should continue using LLMs for language-based or creative tasks, but for "complex or deterministic tasks," an architecture like HRM offers superior performance with fewer hallucinations. He points to "sequential problems requiring complex decision-making or long-term planning," especially in latency-sensitive fields such as embodied AI and robotics, or data-scarce domains such as scientific exploration.
In these scenarios, HRM doesn't just solve problems; it learns to solve them better. "In our master-level Sudoku experiments … HRM needs progressively fewer steps as training advances, akin to a novice becoming an expert," Wang explained.
For the enterprise, this is where the architecture's efficiency translates directly to the bottom line. Instead of the serial, token-by-token generation of CoT, HRM's parallel processing allows for what Wang estimates could be a "100x speedup in task-completion time." This means lower inference latency and the ability to run powerful reasoning on edge devices.
The cost savings are also substantial. "Specialized reasoning engines such as HRM offer a more promising alternative for specific complex reasoning tasks compared to large, costly, and latency-intensive API-based models," Wang said. To put the efficiency in perspective, he noted that training the model for professional-level Sudoku takes roughly two GPU-hours, and for the complex ARC-AGI benchmark, between 50 and 200 GPU-hours, a fraction of the resources needed for massive foundation models. This opens a path to solving specialized business problems, from logistics optimization to complex system diagnostics, in settings where both data and budget are finite.
Looking ahead, Sapient Intelligence is already working to evolve HRM from a specialized problem-solver into a more general-purpose reasoning module. "We are actively developing brain-inspired models built upon HRM," Wang said, highlighting promising initial results in healthcare, climate forecasting and robotics. He teased that these next-generation models will differ significantly from today's text-based systems, notably through the inclusion of self-correcting capabilities.
The work suggests that for a class of problems that stump today's AI giants, the path forward may not be bigger models, but smarter, more structured architectures inspired by the ultimate reasoning engine: the human brain.
