Two years after the public release of ChatGPT, conversations about artificial intelligence are unavoidable as companies in every industry look to leverage large language models (LLMs) to transform their business processes. However, although LLMs are powerful and promising, many business and IT leaders have come to over-rely on them and overlook their limitations. That’s why I foresee a future in which specialized language models, or SLMs, will play a larger, complementary role in enterprise IT.
SLMs are more commonly known as “small language models” because they require less data and training time and are “more streamlined versions of LLMs.” However, I prefer the word “specialized” because it better describes these purpose-built solutions’ ability to perform highly specialized work with greater accuracy, consistency and clarity than an LLM. By complementing LLMs with SLMs, organizations can create solutions that leverage the strengths of each model.
Trust and the LLM “black box” problem
LLMs are extremely powerful, but they are also known to sometimes “lose the plot” or offer answers that veer off target because of their general-purpose training and massive datasets. That tendency is made more problematic by the fact that ChatGPT and other OpenAI LLMs are essentially “black boxes” that do not reveal how they arrive at an answer.
The black-box problem will become a bigger issue going forward, especially for enterprises and business-critical applications where accuracy, consistency and compliance are paramount. Think of healthcare, financial services and law as prime examples of professions where incorrect answers can have huge financial and even life-or-death consequences. Regulators are already taking notice and are likely to start demanding explainable AI solutions, especially in industries that depend on data privacy and accuracy.
While companies often implement a human-in-the-loop approach to mitigate these problems, over-reliance on LLMs can create a false sense of security. Over time, complacency sets in and mistakes can slip through unnoticed.
SLMs = greater explainability
Fortunately, SLMs are better suited to addressing many of the limitations of LLMs. Rather than being designed for general-purpose tasks, SLMs are developed with a narrower focus and trained on domain-specific data. This specificity allows them to handle nuanced language requirements in the areas where precision matters most. Instead of relying on massive, heterogeneous datasets, SLMs are trained on targeted information, giving them the contextual intelligence to deliver more consistent, predictable and accurate responses.
This offers several advantages. First, SLMs are more explainable, making it easier to trace the source and rationale behind their outputs. This is crucial in regulated industries where decisions must be traceable back to a source.
Second, their smaller size means they can often run faster than LLMs, which can be a key factor for real-time applications. Third, SLMs give companies greater control over data privacy and security, especially when they are deployed internally or purpose-built for the enterprise.
Furthermore, while SLMs may initially require specialized training, they reduce the risks associated with relying on third-party LLMs controlled by outside providers. This control is invaluable in applications that require rigorous data handling and compliance.
Focus on developing expertise (and watch out for vendors who overpromise)
I want to be clear that LLMs and SLMs are not mutually exclusive. In practice, SLMs can augment LLMs in hybrid solutions where the LLM provides broad context and the SLM delivers precise execution. It’s still early days even for LLMs, which is why I always advise technology leaders to continue exploring the many possibilities and advantages of LLMs.
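To make the hybrid idea concrete, below is a minimal sketch of one way such a pairing could be wired together, assuming a team already has a small domain-tuned classifier and a general-purpose generative model available. The model names, intent labels and routing rule are illustrative placeholders, not a reference architecture.

```python
# Illustrative hybrid routing: a specialized model handles the in-domain
# requests it was trained for; a general-purpose LLM handles everything else.
# Model names and labels below are placeholders, not real checkpoints.
from transformers import pipeline

# Small, domain-specific classifier fine-tuned on the organization's own data.
slm = pipeline("text-classification", model="your-org/support-intent-slm")

# Larger general-purpose model used as the fallback (could also be a hosted API).
llm = pipeline("text-generation", model="your-org/general-purpose-llm")

IN_DOMAIN = {"claim_status", "coverage_question", "billing_dispute"}

def answer(query: str) -> str:
    """Route a query to the SLM's structured workflow or the LLM fallback."""
    prediction = slm(query)[0]  # e.g. {"label": "claim_status", "score": 0.97}
    if prediction["label"] in IN_DOMAIN and prediction["score"] > 0.8:
        # Precise, auditable path: the intent maps to a known business process.
        return f"Routed to the '{prediction['label']}' workflow"
    # Broad-context path: let the general model respond free-form.
    return llm(query, max_new_tokens=64)[0]["generated_text"]
```

The point of the sketch is the division of labor: the specialized model produces a constrained, auditable output, while the general model absorbs everything that falls outside that scope.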
Additionally, while LLMs can scale well across a broad range of problems, SLMs may not perform well for certain use cases. It is therefore important to be clear from the start about which use cases to tackle.
It’s also important that business and IT leaders devote more time and attention to building the distinct skills required to train, tune and test SLMs. Fortunately, there is a great deal of free information and training available from popular sources such as Coursera, YouTube and Huggingface.co. Leaders should make sure their developers have adequate time to learn and experiment with SLMs as the battle for AI expertise intensifies.
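To give a sense of what that skill-building looks like in practice, here is a minimal fine-tuning sketch using the open-source Hugging Face transformers and datasets libraries. The base model and the public ag_news dataset are stand-ins; a real project would substitute the organization’s own domain corpus, labels and evaluation suite.

```python
# Minimal sketch: fine-tuning a small base model into a domain classifier.
# The base model and public dataset are stand-ins for in-house domain data.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base_model = "distilbert-base-uncased"   # small, fast base model
data = load_dataset("ag_news")           # placeholder for domain-specific data

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=4)

def tokenize(batch):
    # Truncate/pad so every example fits the model's input size.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="slm-demo", num_train_epochs=1,
                           per_device_train_batch_size=16),
    # Small subsets keep the demo quick; use the full splits for real training.
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].select(range(500)),
)

trainer.train()
trainer.evaluate()
```

The value of an exercise like this is less about the specific model and more about building a team’s intuition for how data selection, labeling and evaluation shape the behavior of a specialized model.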
I also advise leaders to vet partners carefully. I recently spoke with a company that asked for my opinion on the claims of a certain technology vendor. In my view, the vendor was either exaggerating its claims or simply didn’t understand the technology’s capabilities.
The company wisely took a step back and ran a controlled proof of concept to test the vendor’s claims. As I suspected, the solution simply wasn’t ready for prime time, and the company was able to walk away from it with relatively little time and money invested.
Whether a company is starting with a proof of concept or a live implementation, my advice is to start small, test often and build on early successes. I have personally had the experience of getting good results with a small set of instructions and data, only to see the results diverge from my expectations once I fed the model more information. That’s why slow and steady is the prudent approach.
In summary, while LLMs will continue to offer increasingly valuable capabilities, their limitations become more apparent as enterprises deepen their reliance on AI. Complementing them with SLMs offers a way forward, especially in high-stakes fields that demand accuracy and explainability. By investing in SLMs, companies can future-proof their AI strategies, ensuring that their tools not only drive innovation but also meet requirements for trust, reliability and control.