A new enterprise AI startup is emerging from stealth today, promising to deliver “task-optimized” models that offer higher performance at lower cost.
San Francisco-based Fastino is also revealing that it has raised $7 million in a pre-seed funding round from Insight Partners and M12, Microsoft’s venture fund, with participation from GitHub CEO Thomas Dohmke.

Fastino is building a family of enterprise AI models and developer tools. The models are new and are not based on any existing large language model (LLM). Like those of most generative AI providers, Fastino’s models use a transformer architecture, although with some novel techniques aimed at improving accuracy and usability for the enterprise. Unlike most other LLM providers’ models, Fastino’s run well on general-purpose CPUs and do not require expensive GPUs.
The idea for Fastino was born from the founders’ own industry experience and the real-world challenges of deploying AI at scale.
Ash Lewis, the company’s CEO and co-founder, previously built a developer agent technology known as DevGPT. His co-founder, George Hurn-Maloney, was previously the founder of Waterway DevOps, which was acquired by JFrog in 2023. Lewis explained that his previous company’s developer agent used OpenAI under the hood, which led to some issues.
“We were spending almost a million dollars a year on the API,” Lewis said. “We didn’t feel like we had any real control over it.”
Fastino’s approach is a departure from traditional large language models. Instead of making general-purpose AI models, the company has developed task-optimized models that excel at specific enterprise functions.
“The whole idea is that if you narrow the scope of these models, making them less general and more optimized for your task, they will only be able to respond within a certain range,” Lewis explained.
How a task-optimized model approach can improve enterprise AI performance
The concept of using a smaller model optimized for a specific use case is not entirely new. Small language models (SLMs) such as Microsoft’s Phi-2, and vendors such as Arcee AI, have been advocating this approach for some time.
Hurn-Maloney said Fastino calls its models task-optimized rather than SLMs for a few reasons. First, he believes the term “small” is often associated with lower accuracy, which is not the case with Fastino. Lewis said the goal is actually to create a new category of model that is not simply a generic model that is large or small in parameter count.
Fastino’s models are task-optimized rather than general-purpose. The goal is to make models narrower in scope and more specialized for specific enterprise tasks. By focusing on specific tasks, Fastino claims its models can achieve greater accuracy and reliability compared with general language models.
The models are particularly strong at:

- Structuring textual data
- Powering retrieval-augmented generation (RAG) pipelines
- Task planning and reasoning
- Generating JSON responses for function calling
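Function calling of this kind typically means the model emits a machine-parseable JSON object rather than free text. A minimal sketch of what consuming such output might look like — the schema, field names, and function name below are hypothetical illustrations, not Fastino’s actual format:

```python
import json

# Hypothetical output from a task-optimized model asked to produce a
# function call (illustrative schema, not Fastino's actual format).
model_output = (
    '{"function": "get_weather", '
    '"arguments": {"city": "San Francisco", "unit": "celsius"}}'
)

def parse_function_call(raw: str) -> dict:
    """Parse and minimally validate a JSON function-call response."""
    call = json.loads(raw)
    if "function" not in call or "arguments" not in call:
        raise ValueError("malformed function call")
    return call

call = parse_function_call(model_output)
print(call["function"])  # get_weather
```

The appeal of a task-optimized model here is reliability: a model constrained to this task should emit valid, schema-conforming JSON far more consistently than a general-purpose chat model.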
Optimized models mean no GPU is required, lowering AI costs in the enterprise
The key differentiator of Fastino’s models is that they run on CPUs and do not require AI GPU accelerators.
Fastino enables fast inference on CPUs using a number of different techniques.
“In absolutely simple terms, we just multiply less,” Lewis said. “A lot of our techniques in the architecture are focused on doing fewer matrix multiplication tasks.”
He added that the models provide answers in milliseconds, not seconds. This performance extends to edge devices, with successful implementations demonstrated on hardware as humble as a Raspberry Pi.
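A rough back-of-the-envelope calculation illustrates why fewer and smaller matrix multiplications translate into CPU-friendly latency. The dimensions below are purely illustrative and are not Fastino’s actual model sizes:

```python
def matmul_flops(m: int, k: int, n: int) -> int:
    """Floating-point operations for an (m x k) @ (k x n) matrix multiply."""
    return 2 * m * k * n

# Illustrative per-token cost of one dense projection: a large model's
# hidden dimension vs. a much smaller task-optimized one (hypothetical sizes).
large = matmul_flops(1, 8192, 8192)
small = matmul_flops(1, 1024, 1024)
print(f"{large // small}x fewer operations")  # 64x fewer operations
```

Because transformer inference is dominated by such multiplications, shrinking or eliminating them reduces compute roughly quadratically with the hidden dimension, which is what makes millisecond CPU responses plausible.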
“I think many companies pay attention to TCO [total cost of ownership] for embedding AI in their applications,” added Hurn-Maloney. “So I think being able to take expensive GPUs out of the equation is obviously helpful as well.”
Fastino’s models are not yet widely available. That said, the company is already working with industry leaders in consumer devices, financial services and e-commerce, including a major North American manufacturer of devices for home and automotive applications.
“Our ability to operate locally is really useful in industries that are quite sensitive to their data,” Hurn-Maloney explained. “The ability to run these models locally and on existing processors is quite enticing for financial services, healthcare and industries where data is more sensitive.”