Last year was the year of the AI pilot. Companies purchased LLM subscriptions, managers tracked employee utilization, and coffee chats abounded with "AI wrote my memo" stories.
Looking around today, there is widespread disappointment with the impact of AI pilot projects. Add to that the recent sell-off in SaaS stocks, and the question is no longer "Are we using AI?" but rather "Does it work?"
Artificial intelligence is an invention that is only now becoming an innovation. An invention is a new possibility; it does not become an innovation until there is a business model behind it. In this light, last year's experiments were a smart move.
It is now becoming clear that the innovation will take the form of artificial intelligence systems entrusted with actual decisions – what Peter Drucker would call executives, and what is today called agentic AI.
To answer the question "Does it work?", we turn to one of Drucker's intellectual disciples for a framework to guide us forward. Andy Grove, the legendary former CEO of Intel, turned Drucker's writings into a tough, pragmatic approach to managing knowledge-worker organizations. His book "High Output Management" presents a classic framework for measuring the performance of middle managers. This is not an easy thing to measure, but Grove is adamant that it can and must be measured.

When addressing the question of whether AI agents deliver tangible value, we must shift our focus away from actions, anecdotes, and initiatives. These are inputs.
Grove says organizations must focus on outputs instead. If we think like Grove, we first define the business outcome we want to achieve, and then measure our agentic AI only in terms of whether that performance metric improves.
Mathematical approach
When we began working on this across our entire software portfolio a few years ago, I was fortunate to meet Dario Fanucchi, a mathematician who was using artificial intelligence to solve real-world problems in a very similar way. He is also the co-founder and CTO of Isazi, a ten-year-old company whose team of more than 70 mathematicians and engineers has completed hundreds of projects for leading firms around the world.
His approach focuses solely on improving core business metrics.
Isazi arrived at the same idea of measuring performance, though starting from mathematics rather than organizational behavior. The idea is to approach AI projects as mathematical optimization problems: define a goal metric (such as throughput or working capital), ask what variables influence that metric, and model the mechanism by which the goal metric moves.
All initiatives are then aligned to this goal metric, and success is measured by its improvement. This fits well with how we build and improve AI models: benchmarking and evaluation are always the primary measure of success. In this case, those scores are directly linked to business metrics.
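To make the optimization framing concrete, here is a minimal sketch in Python. All names, numbers, and modeled effects are invented for illustration; this is not Isazi's actual model. It shows the shape of the idea: one goal metric, several candidate interventions, and a selection rule that keeps whichever intervention moves the metric most.

```python
# Illustrative sketch only: the metric model and the interventions are invented.
# Frame the AI project as: maximize goal_metric over candidate interventions.

def goal_metric(auto_resolution_rate: float, error_rate: float) -> float:
    """Toy business metric: exceptions auto-resolved per 1,000 invoices,
    penalized for rework caused by errors."""
    return 1000 * auto_resolution_rate * (1 - 5 * error_rate)

# Hypothetical interventions and their modeled effects on the two variables
# (auto_resolution_rate, error_rate).
interventions = {
    "baseline": (0.20, 0.02),
    "better prompts": (0.45, 0.02),
    "retrain on failed cases": (0.60, 0.01),
}

# Success is defined purely as movement in the goal metric.
best = max(interventions, key=lambda name: goal_metric(*interventions[name]))
print(best)
```

The point of the sketch is the selection rule: every initiative is scored against the same business metric, so "does it work?" has one unambiguous answer.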
You must have an output you want to measure. Then you watch that output like a meter: how long it takes for the reading to change, how much it changes, in what direction, and whether the change holds.
The time it takes to see (and sustain) material movement is what we call "Production Time." Our theory for why so many pilot projects fail is that firms tend to pick an AI tool and a pilot duration, and then have a qualitative conversation with users at the end of that period.
While we at Strattam and Isazi appreciate experiments and pilot programs, we have found that the best results come when this process is reversed. We pick the outcome we want to improve, change the AI tools until we move the dial, and measure the time it takes to change the outcome positively and sustainably. The shorter the Production Time, the better.
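One way to operationalize "Production Time" as described above is as the first point at which the outcome metric rises above baseline and stays there. The function name, thresholds, and data below are all hypothetical, not drawn from Strattam or Isazi.

```python
# Sketch of "Production Time": how long until the outcome metric shows a
# positive, sustained change. All names and data here are illustrative.

def production_time(readings, baseline, lift=0.10, sustain=3):
    """Return the index (e.g. week number) at which the metric first rises
    at least `lift` above baseline and holds for `sustain` consecutive
    readings, or None if it never does."""
    target = baseline * (1 + lift)
    for i in range(len(readings) - sustain + 1):
        if all(r >= target for r in readings[i:i + sustain]):
            return i
    return None

# Weekly count of exceptions resolved without human intervention (toy data).
weekly_resolved = [100, 102, 99, 118, 121, 125, 130]
print(production_time(weekly_resolved, baseline=100))  # prints 3
```

Note the `sustain` parameter: a one-week spike does not count, which matches the requirement that the movement be sustained, not anecdotal.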
Real world example
Let me share an example.
One of Strattam's portfolio companies, Trax Technologies, helps very large international firms manage their global shipping. A key part of the offering is ensuring that freight invoices are complete, consistent with the contract, approved for payment, and properly accounted for.
Trax operates across all geographies and modes of transportation with thousands of carriers. Discrepancies between an invoice and the freight forwarder's contract are common. Handling these "exceptions" at scale is a key part of the service, and historically Trax has maintained a large internal team to deal with them.
In 2024, Trax identified AI's ability to handle some of those exceptions as a key opportunity and built the AI Audit Optimizer tool internally. The initial goal was clear: exceptions resolved without human intervention.
In the first quarter after release, the Trax AI Audit Optimizer resolved roughly 826,000 exceptions that would otherwise have required human intervention. A good start, but not worth writing about yet.
In the second quarter, however, instead of improving, the system plateaued at the same level. So Trax quickly ran experiments to see what would improve the results. In the third quarter, the company discovered that prompt-engineering interaction with the system made a huge difference. As a result, the number of resolved exceptions tripled to 2.5 million in the fourth quarter.
Now we’re talking.
With the performance metric in mind, Trax is moving forward by adjusting the points of interaction between the prompt engineers and the system. Data from successful and failed resolutions was used to retrain the system. The company also set quarterly goals; next quarter, the aim is for the AI Audit Optimizer to resolve more exceptions than in any previous quarter.
This story shows how focusing on the performance metric allowed the company to fine-tune and adapt its AI tools to deliver results that really matter. Trax aims to solve its customers' problems in order to gain market share. Artificial intelligence has helped it do so, and the performance measurements prove the real value of the AI innovation.
Measure what matters
Amid all the noise, all of us want our companies to truly adapt, to truly deliver value to customers, and to truly succeed. We know that we cannot continue doing what we have been doing, and that our future may depend on our ability to adapt. But wanting to adapt is different from actually adapting.
To adapt effectively, resist the urge to buy tools, run pilots, tell anecdotes, and report on your activities. These are just inputs. Instead, identify the outcome metric that matters and watch it like a hawk to see whether AI is delivering irrefutable business results. If it isn't, keep changing your AI until the dial moves. By drawing on the proven wisdom of Drucker and Grove, you'll ensure that artificial intelligence earns its keep in your organization.
