When AI Sounds Right — But Gets It Completely Wrong.

In April 2025, OpenAI introduced two advanced models: o3 and o4-mini. These models do more than just generate text; they reason, use tools, write code, analyze images, create visuals, and even browse the web.

They are undeniably powerful.

However, when tested with simple questions, an astonishing finding emerged:

On simple factual questions, up to 79% of their answers were incorrect.

Impressive Machines. Still Vulnerable to Hallucinations.

According to OpenAI’s official system card:

  • o3 hallucinated on 51% of straightforward factual questions (SimpleQA) and 33% of questions about real people (PersonQA).
  • o4-mini performed even worse: 79% and 41%, respectively.

These are basic questions that any assistant, human or AI, should be able to answer easily.

Fluency ≠ Accuracy

Today’s AI is articulate, providing confident and persuasive responses.

However, it’s important not to confuse tone with truth.

The most dangerous mistakes are those that sound correct.

What This Means for You

If you’re using AI for tasks like writing lessons, debugging code, summarizing contracts, or supporting research, there are a few crucial steps to take before placing your trust in it:

  • Verify the facts
  • Challenge the output
  • Be mindful of what AI can — and can’t — accurately understand

Especially for New Users

For students, seniors, or anyone new to AI, hallucination isn’t merely a technical issue; it exposes a gap in digital literacy. That makes education essential.

📎 Access the complete OpenAI system card (33 pages of official testing data):

 o3 & o4-mini System Card – Download PDF

💬 How can we assist new users — particularly those most vulnerable — in identifying hallucinations before they lead to misconceptions?

No AI is infallible. But you can be informed. 

#AI2025 #OpenAI #Hallucination #TrustButVerify #ResponsibleTech #DigitalLiteracy #AIWithHumansInMind