API Performance Benchmark

KushoAI Benchmark Finds AI Coding Tools Struggle With Complex API Bugs

First comparative benchmark of AI agents for API bug detection shows strong performance on simple checks, but major gaps on cross-field and business-logic failures. SAN FRANCISCO, June 3, 2026 ...

MarketWatch

KushoAI Unveils APIEval-20 to Benchmark AI Agents in API Testing

SAN FRANCISCO, April 8, 2026 /PRNewswire/ -- KushoAI, an AI-native platform for API testing and software reliability, has introduced APIEval-20, an open benchmark designed to evaluate how effectively ...

1mon

MiniMax-M3 debuts, eclipsing GPT-5.5 and Gemini 3.1 Pro on key benchmark performance for just 5-10% of the cost

M3 demonstrates that the next phase of agent development will not just be driven by larger datasets, but by efficient architectural choices.

11d

No Claude Fable 5? No problem: Sakana achieves frontier performance with new Fugu multi-model, auto synthesis system

As enterprises increasingly demand fail-safes against single-vendor reliance, Sakana is proving that packaging collective ...

TweakTown

UL announces its 3DMark benchmark suite now runs natively on macOS, using Metal API

UL has just announced that its 3DMark benchmark suite now runs on macOS, and while it was already available on iOS, the company saw that many users were ...

ZDNet

Databricks' TPC-DS benchmarks fuel analytics platform wars

As data sources and volumes grow, and as a data-driven orientation is increasingly deemed to be a competitive necessity, the war between platform vendors to provide the primary repository for our data ...

Anthropic launches Claude Sonnet 5 with improved coding, reasoning and lower API pricing

Anthropic has launched Claude Sonnet 5 with improved coding, reasoning and cybersecurity safeguards, alongside updated API pricing, expanded availability across plans, and enhanced benchmark ...

Hosted on MSN

Grok Voice Agent API sets a new benchmark for real-time audio AI

Today marks an exciting moment for the developer community as xAI officially introduces the Grok Voice Agent API, opening the door for anyone to build powerful, real-time voice agents with ease.

OfficeChai

Sakana AI Launches Sakana Fugu That Matches Fable And Mythos On Some Benchmarks By Coordinating And Orchestrating Multiple Models

While frontier labs are competing on building the best AI models, smaller startups are looking to match their performance through innovative approaches.

Yahoo Finance

KushoAI Benchmark Finds AI Coding Tools Struggle With Complex API Bugs

SAN FRANCISCO, June 3, 2026 /PRNewswire/ -- KushoAI today released the first comparative benchmark study of how leading AI coding and testing agents perform at finding bugs in live APIs. While AI ...
