Programming Language Benchmarks

Quesma Releases OTelBench: Independent Benchmark Reveals Frontier LLMs Struggle with Real-World SRE Tasks

New benchmark shows top LLMs achieve only 29% pass rate on OpenTelemetry instrumentation, exposing the gap between ...

Qwen3-Coder-Next offers vibe coders a powerful open source, ultra-sparse model with 10x higher throughput for repo tasks

On SWE-Bench Verified, the model achieved a score of 70.6%. This performance is notably competitive when placed alongside ...

TMCnet

Bito's AI Architect Achieves Highest Success Rate of 60.8% on SWE-Bench Pro

The evaluation used identical Claude Sonnet 4.5 agents under two conditions. In the baseline condition, the agent relied on native file search and tool-driven exploration to infer repository structure ...

Alibaba Launches A Large Model Trained Inside a Coding Platform

SINGAPORE, SG / ACCESS Newswire / February 3, 2026 / Alibaba today announced the release of Qwen-Coder-Qoder, a large language model custom-trained end-to-end for an agentic coding platform. The model ...

Nous Research's NousCoder-14B is an open-source coding model landing right in the Claude Code moment

NousCoder-14B, an open-source AI coding model trained in four days on Nvidia B200 GPUs, publishing its full reinforcement-learning stack ...

The Official Microsoft Blog

Maia 200: The AI accelerator built for inference

Today, we’re proud to introduce Maia 200, a breakthrough inference accelerator engineered to dramatically improve the economics of AI token generation. Maia 200 is an AI inference powerhouse: an ...

Microsoft Extends Its Phi Models To Physical AI With Rho-Alpha

Physical AI marks a transition from robots as programmed tools to robots as adaptable collaborators. That transition will ...

Open Source Kimi K2.5 Resets the AI Pecking Order

Kimi K2.5 adds Agent Swarm with up to 100 parallel helpers and a 256k window, so teams solve complex work faster.

Evolving Into An AI-Native Product Organization

AI lets product teams turn ideas into working prototypes in hours. When building is easier than it's ever been, the hard part ...
