OpenAI's Deep Research Crushes Competition with Unbelievable 26.6% Accuracy!

On Sunday, OpenAI introduced Deep Research, an AI agent designed to carry out multi-step web research for complex tasks. According to the ChatGPT maker, the tool works like a human research analyst and completes in ten minutes what would typically take a person several hours.

Deep Research appears to be living up to those expectations. Only days after its launch, it has posted the top score on the difficult AI benchmark known as Humanity’s Last Exam, outperforming OpenAI’s o3-mini and DeepSeek’s R1 model (as reported by TechRadar).

The exam was written by experts from around the world and consists of highly difficult questions. DeepSeek’s model had earlier outperformed proprietary rivals on it, with an accuracy of 9.4%.

OpenAI’s o3-mini model then unseated the previously leading Chinese model with an accuracy of 10.5%. Things got more interesting at the o3-mini-high setting, which raised accuracy to 13%. The gap between the two settings comes from the latter spending more time processing and analyzing complex questions.

OpenAI’s new Deep Research agent, by contrast, scored 26.6% on Humanity’s Last Exam, a 183% relative improvement over DeepSeek R1’s 9.4%.
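The 183% figure is consistent with a relative gain measured against the 9.4% score quoted earlier, not against either o3-mini result. As a rough check, the short Python sketch below recomputes the relative improvement of the 26.6% score over each baseline mentioned in this article; the function name and the dictionary of scores are illustrative, not taken from any official tooling.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the percentage increase of `new` over `baseline`."""
    return (new - baseline) / baseline * 100


# Accuracy scores (in percent) as quoted in this article.
baselines = {
    "DeepSeek R1": 9.4,
    "o3-mini": 10.5,
    "o3-mini-high": 13.0,
}
deep_research = 26.6

for name, score in baselines.items():
    gain = relative_improvement(score, deep_research)
    print(f"Deep Research vs {name}: +{gain:.0f}%")
```

Measured this way, the gain is about 183% over DeepSeek R1, about 153% over o3-mini, and about 105% over o3-mini-high, so the headline number depends on which baseline is chosen.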

The tool has powerful search features that let it look up answers to many of the exam’s harder trivia-style questions, which gives it an edge over comparable tools on this benchmark.

An OpenAI employee described his experience with Deep Research as a “personal breakthrough in artificial general intelligence (AGI)”:

“Discovering how to utilize Deep Research has felt like a breakthrough in my understanding of AI. In just 10 minutes, it produces comprehensive and reliable competitive and market analysis with sourced data, something I would spend three hours on manually before.”

2025-02-05 15:09
