Core Bulletin

    Elon Musk’s New Grok 4 Takes on ‘Humanity’s Last Exam’ as the AI Race Heats Up

    By Liam Porter, July 11, 2025

    Elon Musk has launched xAI’s Grok 4—calling it the “world’s smartest AI” and claiming it can ace Ph.D.-level exams and outpace rivals such as Google’s Gemini and OpenAI’s o3 on tough benchmarks

    By Deni Ellis Béchard, edited by Dean Visser

    Elon Musk released the newest artificial intelligence model from his company xAI on Wednesday night. In an hour-long public reveal session, he called the model, Grok 4, “the smartest AI in the world” and claimed it was capable of getting perfect SAT scores and near-perfect GRE results in every subject, from the humanities to the sciences.

    During the online launch, Musk and members of his team described testing Grok 4 on a metric called Humanity’s Last Exam (HLE)—a 2,500-question benchmark designed to evaluate an AI’s academic knowledge and reasoning skill. Created by nearly 1,000 human experts across more than 100 disciplines and released in January 2025, the test spans topics from the classics to quantum chemistry and mixes text with images. Grok 4 reportedly scored 25.4 percent on its own. But given access to tools (such as external aids for code execution or Web searches), it hit 38.6 percent. That jumped to 44.4 percent with a version called Grok 4 Heavy, which uses multiple AI agents to solve problems. The two next best-performing AI models are Google’s Gemini-Pro (which achieved 26.9 percent with the tools) and OpenAI’s o3 model (which got 24.9 percent, also with the tools). The results from xAI’s internal testing have yet to appear on the leaderboard for HLE, however, and it remains unclear whether this is because xAI has yet to submit the results or because those results are pending review. Manifold, a social prediction market platform where users bet play money (called “Mana”) on future events in politics, technology and other subjects, predicted a 1 percent chance, as of Friday morning, that Grok 4 would debut on HLE’s leaderboard with a 45 percent score or greater on the exam within a month of its release. (Meanwhile xAI has claimed a score of only 44.4.)
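
To put the reported percentages in concrete terms, the short sketch below converts each score into an approximate count of correctly answered questions and computes the gap between two entries in percentage points. It assumes every question carries equal weight and that each score was computed over the full 2,500-question set, neither of which the article confirms. The function and variable names are illustrative only.

```python
# Benchmark size as stated in the article.
HLE_TOTAL_QUESTIONS = 2500

# Scores (percent correct) as reported in the article.
reported_scores = {
    "Grok 4 (no tools)": 25.4,
    "Grok 4 (with tools)": 38.6,
    "Grok 4 Heavy": 44.4,
    "Gemini-Pro (with tools)": 26.9,
    "o3 (with tools)": 24.9,
}


def approx_correct(percent, total=HLE_TOTAL_QUESTIONS):
    """Approximate number of questions answered correctly.

    Assumption: equal weight per question, scored over the full set.
    """
    return percent / 100 * total


def lead_in_points(model_a, model_b):
    """Difference between two reported scores, in percentage points."""
    return reported_scores[model_a] - reported_scores[model_b]


if __name__ == "__main__":
    for name, pct in reported_scores.items():
        print(f"{name}: {pct}% is about {approx_correct(pct):.0f} questions")
    gap = lead_in_points("Grok 4 Heavy", "Gemini-Pro (with tools)")
    print(f"Grok 4 Heavy leads Gemini-Pro by {gap:.1f} points")
```

Under these assumptions, the 44.4 percent claimed for Grok 4 Heavy corresponds to roughly 1,110 of the 2,500 questions.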

    During the launch, the xAI team also ran live demonstrations showing Grok 4 crunching baseball odds, determining which xAI employee has the “weirdest” profile picture on X and generating a simulated visualization of a black hole. Musk suggested that the system may discover entirely new technologies by later this year—and possibly “new physics” by the end of next year. Games and movies are on the horizon, too, with Musk predicting that Grok 4 will be able to make playable titles and watchable films by 2026. Grok 4 also has new audio capabilities, including a voice that sang during the launch, and Musk said new image generation and coding tools are soon to be released. The regular version of Grok 4 costs $30 a month; SuperGrok Heavy—the deluxe package with multiple agents and research tools—runs at $300.

    Artificial Analysis, an independent benchmarking platform that ranks AI models, now lists Grok 4 as highest on its Artificial Analysis Intelligence Index, slightly ahead of Gemini 2.5 Pro and OpenAI’s o4-mini-high. And Grok 4 appears as the top-performing publicly available model on the leaderboards for the Abstraction and Reasoning Corpus, or ARC-AGI-1, and its second edition, ARC-AGI-2—benchmarks that measure progress toward “humanlike” general intelligence. Greg Kamradt, president of ARC Prize Foundation, a nonprofit organization that maintains the two leaderboards, says that when the xAI team contacted the foundation with Grok 4’s results, the organization then independently tested Grok 4 on a dataset to which the xAI team did not have access and confirmed the results. “Before we report performance for any lab, it’s not verified unless we verify it,” Kamradt says. “We approved the [testing results] slide that [the xAI team] showed in the launch.”

    According to xAI, Grok 4 also outstrips other AI systems on a number of additional benchmarks that suggest its strength in STEM subjects. Alex Olteanu, a senior data science editor at AI education platform DataCamp, has tested it. “Grok has been strong on math and programming in my tests, and I’ve been impressed by the quality of its chain-of-thought reasoning, which shows an ingenious and logically sound approach to problem-solving,” Olteanu says. “Its context window, however, isn’t very competitive, and it may struggle with large code bases like those you encounter in production. It also fell short when I asked it to analyze a 170-page PDF, likely due to its limited context window and weak multimodal abilities.” (Multimodal abilities refer to a model’s capacity to analyze more than one kind of data at the same time, such as a combination of text, images, audio and video.)

    On a more nuanced front, issues with Grok 4 have surfaced since its release. Several posters on X—owned by Musk himself—as well as tech-industry news outlets have reported that when Grok 4 was asked questions about the Israeli-Palestinian conflict, abortion and U.S. immigration law, it often searched for Musk’s stance on these issues by referencing his X posts and articles written about him. And the release of Grok 4 comes after several controversies with Grok 3, the previous model, which issued outputs that included antisemitic comments, praise for Hitler and claims of “white genocide”—incidents that xAI publicly acknowledged, attributing them to unauthorized manipulations and stating that the company was implementing corrective measures.

    At one point during the launch, Musk commented on how making an AI smarter than humans is frightening, though he said he believes the ultimate result will be good—probably. “I somewhat reconciled myself to the fact that, even if it wasn’t going to be good, I’d at least like to be alive to see it happen,” he said.
