OpenAI’s new benchmark shows where AI still breaks

Plus: Anthropic’s first climate move comes with pressure attached.

Anthropic, Apple, and OpenAI all pointed to the same shift this week: AI is moving from promise to pressure. Anthropic joined a major carbon removal coalition as AI’s energy footprint grows. Apple is preparing price hikes as AI demand pushes chip costs higher. OpenAI launched LifeSciBench to test whether AI can handle real scientific work, not just polished answers. Different stories, same signal: AI’s real cost is becoming harder to hide.

In today’s post:

  • AI science still fails where real work begins

  • Anthropic’s first climate move says more than it seems

  • Apple’s next price hike has an AI problem

What’s Trending Today

LAUNCH

OpenAI’s LifeSciBench shows AI is useful, but not ready

Image Credits: OpenAI

OpenAI introduced LifeSciBench to test AI in real science work. Not clean quizzes. Not simple fact recall. The benchmark tests messy research judgment. That is where AI’s limits become more visible.

Here's everything you need to know:

  • LifeSciBench includes 750 expert-written life science tasks.

  • The tasks span drug discovery, biology, and research workflows.

  • Many questions require evidence handling and multi-step reasoning.

  • Over half require models to interpret attached artifacts.

  • GPT-Rosalind improved over GPT-5.5, but pass rates stayed modest.

  • AI performed better on synthesis and communication tasks.

  • It struggled most with artifacts, exact outputs, and design work.

This benchmark matters because it feels honest. It does not ask whether AI sounds scientific. It asks whether AI can help scientists decide. That is a much harder bar. AI may become a useful research partner. But in science, partial confidence can still be dangerous.

BREAKTHROUGH

AI’s climate problem is now impossible to ignore

Image Credits: Anthropic

Anthropic just joined Frontier, the carbon removal coalition. That makes it the first AI startup in the group. The timing matters. AI companies are buying enormous amounts of energy. Now, they need a cleaner story around that growth.

Here's everything you need to know:

  • Anthropic is contributing to Frontier’s new $915 million funding round.

  • Frontier’s total pledges now reach about $1.8 billion.

  • The group funds projects that remove carbon from the atmosphere.

  • These credits help companies offset emissions they cannot cut today.

  • Anthropic has not yet published a sustainability report.

  • Its “all of the above” energy stance leaves questions open.

  • Frontier will now fund fewer projects with stronger long-term potential.

This move is less about virtue. It is more about pressure. AI companies know their energy use is becoming visible. Carbon removal helps, but it is not a free pass. The real question is simple. Will AI companies reduce emissions, or just account for them better?

STRATEGY

AI is now raising costs far beyond AI companies

Image Credits: Apple

Apple plans to raise prices across its products. The reason is not just inflation. It is the rising cost of memory and storage chips. AI companies are buying these parts aggressively. Now, that pressure is reaching everyday consumers.

Here's everything you need to know:

  • Tim Cook said chip costs are rising sharply.

  • Memory and storage chip prices are becoming harder to predict.

  • AI companies need huge amounts of hardware to scale.

  • That demand is pushing costs across the wider market.

  • Apple may pass some of those costs to customers.

  • This shows how AI growth affects non-AI products too.

  • The real impact may show up at checkout first.

AI costs will not stay hidden. They will move through the supply chain slowly. Then they will appear in familiar places. Phones. Laptops. Cloud tools. Subscriptions. Apple is just an early signal. The bigger question is who absorbs the AI boom’s cost.

🚀 Founders & AI Builders, Listen up!

If you’ve built an AI tool, here’s an opportunity to gain serious visibility.

Nextool AI is a leading tools aggregator that offers:

  • 500k+ page views and a rapidly growing audience.

  • Exposure to developers, entrepreneurs, and tech enthusiasts actively searching for innovative tools.

  • A spot in a curated list of cutting-edge AI tools, trusted by the community.

  • Increased traffic, users, and brand recognition for your tool.

Take the next step to grow your tool’s reach and impact.

That's a wrap:

Reach 150,000+ readers:

Expand your reach and boost your brand’s visibility!

Partner with Nextool AI to showcase your product or service to 140,000+ engaged subscribers, including entrepreneurs, tech enthusiasts, developers, and industry leaders.

Ready to make an impact? Visit our sponsorship website to explore sponsorship opportunities and learn more!