Inception emerges from stealth with a new type of AI model

Inception, a new Palo Alto-based company founded by Stanford computer science professor Stefano Ermon, claims to have developed a novel AI model based on "diffusion" technology. Inception calls it a diffusion-based large language model, or "DLM."

The generative AI models receiving the most attention today can be broadly divided into two types: large language models (LLMs) and diffusion models. LLMs, built on the transformer architecture, are used for text generation. Diffusion models, which power AI systems such as Midjourney and OpenAI's Sora, are mainly used to create images, video, and audio.

Inception's model offers the capabilities of traditional LLMs, including code generation and question answering, but, according to the company, with significantly faster performance and reduced computing costs.

Ermon told TechCrunch that he has long studied how to apply diffusion models to text in his Stanford lab. His research was based on the idea that traditional LLMs are relatively slow compared with diffusion technology.

With LLMs, "you can't generate the second word until you've generated the first, and you can't generate the third until you generate the first two," Ermon said.

Ermon searched for a way to apply the diffusion approach to text because, unlike LLMs, which work sequentially, diffusion models start with a rough estimate of the data they're generating (for example, an image) and then bring it into focus all at once.
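The contrast between sequential generation and diffusion-style parallel refinement can be sketched in a few lines of toy Python. Everything here — the vocabulary, the masking scheme, the refinement rule — is an illustrative assumption, not Inception's actual method:

```python
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "on", "mat"]


def autoregressive_generate(n_tokens):
    """Sequential generation: token i cannot exist until tokens 0..i-1 do."""
    out = []
    for _ in range(n_tokens):
        # A real LLM would condition on `out` here; we just pick randomly.
        out.append(random.choice(VOCAB))
    return out


def diffusion_generate(n_tokens, n_steps=3):
    """Diffusion-style generation: start from a fully masked ("noisy") draft
    of the whole sequence, then refine every position in parallel each step."""
    draft = ["[MASK]"] * n_tokens
    for _ in range(n_steps):
        # Each refinement pass may rewrite any subset of positions at once.
        for i in range(n_tokens):
            if draft[i] == "[MASK]" or random.random() < 0.3:
                draft[i] = random.choice(VOCAB)
    return draft
```

The key structural difference: the autoregressive loop has a strict left-to-right dependency, while the diffusion loop touches the entire sequence on every pass, which is what makes parallel hardware utilization easier.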

Ermon hypothesized that generating and modifying large blocks of text in parallel was possible with diffusion models. After years of attempts, Ermon and one of his students achieved a major breakthrough, which they detailed in a research paper published last year.

Recognizing the potential of the advance, Ermon founded Inception last summer, tapping two former students, UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov, to co-lead the company.

While Ermon declined to discuss Inception's funding, TechCrunch understands that the Mayfield Fund has invested.

Ermon said Inception has already secured several customers, including unnamed Fortune 100 companies, by addressing their critical need for lower AI latency and higher speed.

"We have found that our models can utilize GPUs much more efficiently," said Ermon, referring to the hardware commonly used to run models in production. "I think this is a big deal. This will change the way people build language models."

Inception offers an API as well as on-premises and edge-device deployment options, support for model fine-tuning, and a suite of out-of-the-box DLMs for various use cases. The company claims its DLMs can run up to 10 times faster than traditional LLMs while costing 10 times less.

"Our 'small' coding model is as good as [OpenAI's] GPT-4o Mini while more than 10 times as fast," a company spokesperson told TechCrunch. "Our 'mini' model outperforms small open-source models like [Meta's] Llama 3.1 8B and achieves more than 1,000 tokens per second."

"Tokens" is industry parlance for bits of raw data. A thousand tokens per second is indeed an impressive speed, assuming Inception's claims hold up.
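A quick back-of-envelope calculation puts that throughput in perspective. The 0.75 words-per-token ratio below is a common rule of thumb for English text, not a figure from the article:

```python
def generation_time_seconds(n_tokens, tokens_per_second):
    """Time to produce n_tokens at a given throughput."""
    return n_tokens / tokens_per_second


WORDS_PER_TOKEN = 0.75  # rough rule of thumb for English text (assumption)

words = 1500                             # roughly a long blog post
tokens = int(words / WORDS_PER_TOKEN)    # about 2,000 tokens

# At the claimed 1,000 tokens/s, the whole post takes about two seconds.
print(generation_time_seconds(tokens, 1000))
```

In other words, at the claimed rate a model could draft an entire long article in the time a conventional LLM might spend on its opening paragraph.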
