Currently, large language models (LLMs), like ChatGPT and Claude, have become household names around the world. Many people have begun to worry that AI will come for their jobs, so it’s ironic that nearly all LLM-based systems fail at a trivial task: counting the number of “r”s in the word “strawberry”. The failure is not limited to the letter “r”; other examples include counting the “m”s in “mammal” and the “p”s in “hippopotamus”. In this article, I’ll discuss the cause of these failures and offer a simple workaround.
LLMs are powerful artificial intelligence systems trained on huge amounts of text to understand and generate human-like language. They excel at tasks such as answering questions, translating languages, summarizing content, and even writing creative text, anticipating and constructing coherent answers based on the input they receive. LLMs are designed to recognize patterns in text, allowing them to perform a wide range of language tasks with impressive accuracy.
Despite this proficiency, the inability to count the “r”s in the word “strawberry” is a reminder that LLMs are incapable of “thinking” like humans. They don’t process the information we feed them the way a human would.
Almost all current high-performance LLMs are built on the transformer architecture. Transformers do not take raw text as input directly; they use a process called tokenization, which transforms text into numerical representations called tokens. Some tokens are full words (such as “monkey”), while others are fragments of a word (such as “mon” and “key”). Each token is like a code that the model understands. By breaking everything into tokens, the model can better predict the next token in a sentence.
LLMs don’t memorize words; they learn how these tokens fit together in different ways, which makes them good at guessing what comes next. In the case of the word “hippopotamus”, the model may see the tokens for “hip”, “pop”, “o” and “tamus” without ever knowing that the word is made up of the letters “h”, “i”, “p”, “p”, “o”, “p”, “o”, “t”, “a”, “m”, “u”, “s”.
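To make this concrete, here’s a minimal sketch in Python using the open-source tiktoken library, which implements the tokenizer behind several OpenAI models. The exact splits vary from model to model, so the output should be read as illustrative rather than definitive:

    # pip install tiktoken
    import tiktoken

    # cl100k_base is the encoding used by several recent OpenAI chat models
    enc = tiktoken.get_encoding("cl100k_base")

    for word in ["strawberry", "hippopotamus"]:
        token_ids = enc.encode(word)
        pieces = [enc.decode([t]) for t in token_ids]
        print(f"{word} -> {pieces}")

    # The model receives these multi-character chunks as opaque IDs,
    # so the individual letters are never directly visible to it.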
A model architecture that could process individual letters directly, without tokenizing them first, could potentially avoid this problem, but it is not computationally feasible with today’s transformer architectures.
Moreover, consider how LLMs generate output: they predict what the next token will be based on the previous input and output tokens. While this works well for producing contextually coherent text, it is ill-suited to simple tasks such as counting letters. When asked how many “r”s appear in the word “strawberry”, an LLM simply predicts an answer based on the structure of the input sentence, as the toy example below illustrates.
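Here is a toy illustration of that guessing process. The candidate tokens and their scores are invented for the example and don’t come from any real model; the point is only that the “answer” is whichever token scores highest, not the result of counting anything:

    import math

    # Hypothetical scores (logits) for three candidate answer tokens --
    # made-up numbers for illustration, not taken from an actual model.
    vocab = ["2", "3", "4"]
    logits = [2.1, 1.3, 0.2]

    # Softmax converts the raw scores into a probability distribution.
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]

    # The model emits whichever token is most probable -- a guess shaped
    # by its training data, not the result of counting letters.
    answer = vocab[probs.index(max(probs))]
    print(dict(zip(vocab, [round(p, 2) for p in probs])), "->", answer)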
Here’s a workaround
Although LLMs may not be able to “think” or reason logically, they are adept at understanding structured text. A prime example of structured text is computer code. If we ask ChatGPT to use Python to count the number of “r”s in the word “strawberry”, it will most likely arrive at the correct answer. When LLMs need to perform counting or any other task that requires logical reasoning or arithmetic, the broader software can be designed so that the prompts ask the LLM to use a programming language to process the input query, as sketched below.
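The code ChatGPT tends to produce for this looks something like the sketch below. The helper function and its name are my own for illustration; the heavy lifting is just Python’s built-in str.count:

    def count_letter(word: str, letter: str) -> int:
        """Count how many times a single letter appears in a word."""
        return word.lower().count(letter.lower())

    print(count_letter("strawberry", "r"))    # 3
    print(count_letter("hippopotamus", "p"))  # 3
    print(count_letter("mammal", "m"))        # 3

Because the counting happens in deterministic code rather than in the model’s next-token guesswork, the answer comes out exact every time.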
The takeaway
A simple letter-counting experiment exposes a fundamental limitation of LLMs like ChatGPT and Claude. Despite their impressive abilities to generate human-like text, write code, and answer almost any question put to them, these AI models cannot yet “think” like a human. The experiment shows the models for what they are: pattern-matching prediction algorithms, not “intelligence” capable of understanding and reasoning. That said, knowing in advance what types of prompts work well can mitigate the problem to some extent. As artificial intelligence becomes increasingly integrated into our lives, recognizing its limitations is crucial to using these models responsibly and holding realistic expectations of them.