A recently released large language model (LLM) appeared, during testing, to recognize that it was being evaluated and commented on the relevance of the information it was processing. This led to speculation that the response could be an example of metacognition, an awareness of one's own thought processes. While the episode has sparked debate about AI's potential for self-awareness, the real story lies in the sheer power of the model, exemplifying the new capabilities that emerge as LLMs scale.
As models scale, emergent capabilities grow, but so do costs, which are now reaching astronomical levels. Just as the semiconductor industry has consolidated around a handful of corporations that can afford modern, multi-billion-dollar chip fabrication facilities, the field of artificial intelligence may soon be dominated by the largest tech giants – and their partners – who can cover the costs of developing frontier LLMs such as GPT-4 and Claude 3.
The cost of training the latest models, whose capabilities match and in some cases exceed human-level performance, is skyrocketing. Training costs for the newest models are approaching $200 million, threatening to reshape the industry landscape.
If this exponential growth in performance continues, not only will AI capabilities expand rapidly, but so will the costs. Anthropic is one of the leaders in building language models and chatbots, and benchmark results suggest its flagship Claude 3 is currently the performance leader. Like GPT-4, it is a base model pre-trained on a diverse and extensive range of data to develop a broad understanding of language, concepts and patterns.
Most recently, the company’s co-founder and CEO Dario Amodei discussed the training costs for these models, indicating that training Claude 3 cost on the order of $100 million. He added that the models currently in training, which will likely be introduced later in 2024 or early 2025, “cost closer to a billion dollars.”
To understand why these costs are rising, look at the ever-increasing complexity of the models. Each new generation has more parameters, enabling more sophisticated understanding and task execution, and requires more training data and far greater computational resources. Amodei expects the cost of training the latest models to reach between $5 billion and $10 billion by 2025 or 2026, which would prevent all but the largest corporations and their partners from building these frontier LLMs.
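To get an intuition for why training bills grow with model size, a common back-of-envelope rule estimates training compute as roughly 6 × parameters × training tokens FLOPs. The sketch below applies that rule with assumed GPU throughput and pricing; every figure in it is an illustrative assumption, not a disclosed number from Anthropic, OpenAI or any vendor.

```python
# Back-of-envelope training-cost estimate. All figures are illustrative
# assumptions, not disclosed vendor numbers.

def estimate_training_cost(params: float, tokens: float,
                           flops_per_gpu_hour: float = 1.0e18,  # assumed effective throughput
                           dollars_per_gpu_hour: float = 2.50) -> float:
    """Estimate training cost in dollars using the ~6*N*D FLOPs rule of thumb."""
    total_flops = 6 * params * tokens            # forward + backward pass approximation
    gpu_hours = total_flops / flops_per_gpu_hour
    return gpu_hours * dollars_per_gpu_hour

# Hypothetical scenarios: a ~70B-parameter model vs. a ~1T-parameter model,
# each trained on ~10 trillion tokens.
for name, params in [("~70B model", 70e9), ("~1T model", 1e12)]:
    cost = estimate_training_cost(params, tokens=10e12)
    print(f"{name}: ~${cost / 1e6:,.0f}M")
```

Under these assumptions a roughly 70-billion-parameter model lands in the low tens of millions of dollars, while a trillion-parameter model climbs into the hundreds of millions, which is consistent with the order of magnitude the article describes.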
Artificial intelligence follows the semiconductor industry
In this way, the artificial intelligence industry is following a path similar to the semiconductor industry's. In the second half of the twentieth century, most semiconductor corporations designed and built their own chips. As the industry followed Moore’s Law – which describes the exponential rate of improvement in chip performance – the cost of each new generation of manufacturing equipment and fabrication plants rose accordingly.
For this reason, many corporations eventually decided to outsource manufacturing. AMD is a good example: the company once produced its own leading-edge semiconductors, but in 2008 it decided to spin off its production plants, also known as fabs, to reduce costs.
Due to the capital costs required, only three semiconductor corporations are currently building state-of-the-art fabs on the latest process nodes: TSMC, Intel and Samsung. TSMC recently estimated that building a new fab to produce cutting-edge semiconductors costs about $20 billion. Many corporations, including Apple, Nvidia, Qualcomm and AMD, outsource production of their chips to these fabs.
Implications for AI – LLMs and SLMs
The impact of these rising costs varies across the AI landscape, as not every application requires the latest and most powerful LLM. The same is true in semiconductors: in a computer, the central processing unit (CPU) is typically made with state-of-the-art process technology, but it is surrounded by memory and networking chips that operate at lower speeds and therefore do not need to be built on the most advanced nodes.
The AI analogy is the many smaller LLM alternatives that have emerged, such as Mistral and Llama 3, which have several billion parameters instead of the more than one trillion believed to be in GPT-4. Microsoft recently released its own small language model (SLM), Phi-3. As reported by The Verge, it contains 3.8 billion parameters and is trained on a dataset that is smaller than those used for LLMs like GPT-4.
The smaller size and training dataset help reduce costs, although these models may not offer the same level of performance as larger ones. In this way, SLMs resemble the supporting chips that surround a computer’s processor.
However, smaller models can be well suited to some applications, especially those that do not require broad knowledge across many data domains. For example, an SLM can be fine-tuned on company-specific data and jargon to provide accurate, personalized responses to customer queries. It can also be trained on data from a specific industry or market segment and used to generate tailored research reports and answers to queries.
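For teams considering this route, the following is a minimal sketch of what fine-tuning a small model on in-house text might look like using the Hugging Face Trainer API. The checkpoint name, sample data and hyperparameters are purely illustrative assumptions, not a procedure described in the article.

```python
# Minimal fine-tuning sketch for a small language model on company-specific text.
# Model name, dataset and hyperparameters are illustrative; adapt to your data and hardware.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "microsoft/Phi-3-mini-4k-instruct"   # assumed SLM checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical in-house support transcripts and product jargon.
corpus = [
    "Q: How do I reset the Acme X200? A: Hold the side button for 10 seconds.",
    "Q: What does error E42 mean? A: The coolant sensor needs recalibration.",
]

def tokenize(batch):
    # Convert raw text into token IDs for causal language modeling.
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = Dataset.from_dict({"text": corpus}).map(
    tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="phi3-support-ft",
                           per_device_train_batch_size=1,
                           num_train_epochs=1,
                           learning_rate=2e-5),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("phi3-support-ft")
```

A few hundred or thousand well-curated examples of this kind are often enough to teach a small model domain jargon, which is precisely why the cost profile of SLMs differs so sharply from that of frontier models.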
As Rowan Curran, senior AI analyst at Forrester Research, said recently of the different language model options: “You don’t need a sports car all the time. Sometimes you need a minivan or pickup truck. It won’t be one broad class of models that everybody will use for all use cases.”
Fewer players increases the risk
Just as rising costs have historically limited the number of corporations able to build high-end semiconductors, similar economic pressures are now shaping the landscape for the development of large language models. These rising costs threaten to limit AI innovation to a few dominant players, potentially stifling broader creative solutions and reducing diversity in the field. High barriers to entry may prevent start-ups and smaller corporations from contributing to the development of AI, thus narrowing the scope of ideas and applications.
To offset this trend, the industry must support smaller, specialized language models that, like the essential components of a broader system, provide critical and efficient capabilities for various niche applications. Promoting open-source projects and collaborative efforts is key to democratizing AI development and enabling a wider range of participants to shape this evolving technology. By fostering an inclusive environment now, we can ensure that the future of AI benefits global communities through broad access and equal opportunities to innovate.