The launch of OpenAI's long-awaited new model, GPT-5, is off to a rocky start, to say the least.
Even setting aside the chart errors and misstated figures in yesterday's live presentation of the new model (actually four separate models, with a "thinking" mode that may be invoked in three of them), a variety of user reports has appeared since the GPT-5 release showing it failing badly at relatively easy problems that prior OpenAI models, and rivals from competing AI labs, answer accurately.
For example, data scientist Colin Fraser published screenshots showing GPT-5 botching a mathematical proof (or claiming that 8.888 repeating is, after all, 9).
It also failed to solve an easy algebra problem that elementary school students could probably nail: 5.9 = x + 5.11.
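For reference, the correct answer is easy to verify. A minimal sketch using Python's `decimal` module (chosen here to sidestep binary floating-point rounding, which would itself muddy the answer):

```python
from decimal import Decimal

# Rearranging 5.9 = x + 5.11 gives x = 5.9 - 5.11.
# Decimal arithmetic keeps the result exact: 5.90 - 5.11 = 0.79.
x = Decimal("5.9") - Decimal("5.11")
print(x)  # 0.79
```

Note that the naive float version, `5.9 - 5.11`, would print a value like `0.7899999999999996`, which is part of why decimal comparison trips up both calculators and language models.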
Asking GPT-5 to assess OpenAI's own erroneous presentation charts also failed to yield helpful or correct answers.
It also failed the tougher math word problem below (which, truthfully, stumped this author at first as well, though Elon Musk's Grok 4 AI answered it accurately. For a hint, consider that the flags in this case can't be divided into smaller portions. They must remain intact, like 80 separate units, so no halves or quarters).
The older model GPT-4o did better for me on at least one of those math problems. Unfortunately, OpenAI is sunsetting these older models, including the previous default GPT-4o and the powerful reasoning model o3, for ChatGPT users, though they'll still be available through the application programming interface (API) for developers for the foreseeable future.
Not nearly as good at coding as benchmarks suggest
Although OpenAI's internal benchmarks and some external ones have shown GPT-5 surpassing all other models at coding, in real-world use the recently updated Claude Opus 4.1 from Anthropic seems to perform better at "one-shotting" some tasks, that is, producing the desired application or software from a single user specification. See the following example from developer Justin Sun, posted to X:
In addition, a report from security firm SPLX found that GPT-5's built-in safety layer left serious gaps in areas such as business alignment and susceptibility to prompt injection and obfuscated logic attacks.
Though anecdotal, a temperature check of how the model is landing with early AI users seems to point to a cool reception.
AI influencer and former Googler Bilawal Sidhu published a poll on X asking for a "vibe check" from his followers and the wider user base, and so far, with 172 votes, the overwhelming response is "kinda mid."
And as a pseudonymous AI leaks and news account wrote: "The overwhelming consensus on GPT-5 across X and the Reddit AMA is predominantly negative."
Tibor Blaho, lead engineer at AIPRM and a popular AI leaks and news poster on X, summed up many of the problems with the GPT-5 rollout in ChatGPT in a lengthy post, noting that one of the release's marquee features, an automatic "router" in ChatGPT that chooses between the thinking and non-thinking modes of the underlying GPT-5 model depending on the difficulty of the query, was one of the principal complaints, given that the model seemed to default to the non-thinking mode for many users.
Competition waiting in the wings
Thus, sentiment toward GPT-5 is far from universally positive, highlighting a significant issue for OpenAI in the face of mounting competition from major American rivals such as Google and Anthropic, as well as a growing list of free, open-source and powerful Chinese LLMs offering features that many American models lack.
Take Alibaba's Qwen team of AI researchers, who just today updated their highly efficient Qwen3 model to support a context window of 1 million tokens, giving users the ability to exchange nearly 4x as much data with the model in a single back-and-forth interaction as GPT-5 offers.
Considering that OpenAI's other big release this week, its new open-source gpt-oss models, also received a mixed reception from early users, this is not currently shaping up to be the week the dedicated AI company was surely hoping for, despite its massive reach (700 million active ChatGPT users as of this month).
Indeed, this is also reflected by bettors on the prediction market Polymarket, where the prevailing odds following the GPT-5 release hold that Google will likely have the best AI model by the end of this month, August 2025.
Other advanced users, like OthersideAI co-founder and CEO Matt Shumer, who received early access to GPT-5 and blogged positively about it in a review here, said their views could change as more people figure out the best ways to use the new model and adapt their integration approaches:
Although it is still early days for GPT-5, and sentiment can change significantly as more users get their hands on it and try various tasks, early indications suggest this is not the "home run" release for OpenAI that earlier editions such as GPT-4, or even the newer 4o and o3, were. And that is a worrying indicator for an organization that has just raised another round of funding but remains unprofitable due to the high costs of research and development.
