Will OpenAI's next-generation model score 65% or higher on the GPQA benchmark? | Manifold

GPT-5 Capabilities #AI #Technology #OpenAI #Technical AI Timelines #Science

2027

66% chance

Resolves YES if OpenAI's next-generation language model scores 65% or higher on the GPQA benchmark (extended set).

If an existing OpenAI model reaches 65% or higher through post-training enhancements, that also counts.

Prompt engineering could improve the score after release, but I don't know how long I should wait for that, so I will resolve this question as soon as OpenAI releases its next model.

Related in AI

Will Apple announce a partnership with OpenAI regarding Siri during WWDC 2024?

71% chance (+12% 1d)

Will I find GPT-4o more helpful than Claude 3 Opus for doing web development tomorrow?

62% chance (+13% 1d)

Related in Technology

In early 2028, will an AI be able to generate a full high-quality movie to a prompt?

Will GPT-5 be released before 2025?

67% chance (+4% 1d)

Related in OpenAI

Will a prompt that enables GPT-4 to solve easy Sudoku puzzles be found? (2023)

Will OpenAI hint at or claim to have AGI by 2025 end?

Related in Technical AI Timelines

By the end of 2026, will we have transparency into any useful internal pattern within a Large Language Model whose semantics would have been unfamiliar to AI and cognitive science in 2006?

Will Superalignment succeed? (self assessment)

2% chance (-7% 1d)

Related in Science

Will the average global temperature in 2024 exceed 2023?

Will the LK-99 room temp, ambient pressure superconductivity pre-print replicate before 2025?

More related questions

What will be true of OpenAI's next major LLM release (GPT-4.5 or GPT-5)?

Will "OpenAI" hit 50% of its previous all-time high search interest this week? (US Google Trends)

90% chance (+18% 1d)

Will OpenAI's next-gen math-focused model score at least 95% on the MATH benchmark?

Will OpenAI's next major LLM (after GPT-4) surpass 74% accuracy on the GPQA benchmark?

Will the "OpenAI hint at or claim to have AGI before 2025 end" market go above 60% before 2024 ends?

Will OpenAI offer a higher-tier version of ChatGPT, priced above US$49, by 2025?

Will there be a model that has a 75% win rate against the latest iteration of GPT-4 as of January 1st, 2025?

Will an AI model outperform 95% of Manifold users on accuracy before 2026?

Will a single model achieve superhuman performance on all OpenAI gym environments by 2025?

39% chance (-20% 1d)

Will openAI have the most accurate LLM across most benchmarks by EOY 2024?
