Researchers from Sakana AI and the University of Tokyo propose DiffusionBlocks. It trains transformer-based networks one block at a time. […]
Category: New Releases
NVIDIA Releases Polar, a Token-Faithful Rollout Framework for GRPO Training Across Codex, Claude Code, and Qwen Code
Reinforcement learning for language agents is growing more complex. Agents now manage multi-turn tool use, long-running contexts, and multi-agent orchestration. […]
Meet EAGLE 3.1: The Speculative Decoding Algorithm That Fixes Attention Drift in LLM Inference
Speculative decoding is a technique for speeding up large language model inference. A small, fast draft model proposes several tokens. […]
Stability AI Releases Stable Audio 3: A Family of Fast Latent Diffusion Models for Audio Generation and Editing
Stability AI has released open weights for Stable Audio 3 along with a technical research paper. Stable Audio 3 is […]
Meet OmniVoice Studio: A Local, Open-Source Alternative to ElevenLabs
ElevenLabs charges between $5 and $330 per month for voice AI services. Every audio file you process goes through their […]
Together AI Open-Sources OSCAR: An Attention-Aware 2-Bit KV Cache Quantization System for Long-Context LLM Serving
Long-context inference makes the KV cache one of the main costs of serving LLMs. During autoregressive decoding, the cache grows […]
WorkOS Releases auth.md: An Open Agent Registration Protocol Built on OAuth Standards
For years, authentication on the web followed one design assumption: a human sits behind a browser. Click a button. Fill […]
StepFun Releases StepAudio 2.5 Realtime: An End-to-End Voice Model with Roleplay-Specific RLHF and Paralinguistic Comprehension
StepFun, the Shanghai-based AI lab, released StepAudio 2.5 Realtime. It is an end-to-end real-time speech large language model with fully […]
Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4’s 33.5%
Most web agents today drive a browser one action at a time. The model receives the current page state — […]
NVIDIA AI Releases Gated DeltaNet-2: A Linear Attention Layer That Decouples Erase and Write in the Delta Rule
Linear attention replaces the unbounded KV cache of softmax attention with a fixed-size recurrent state. This cuts sequence mixing to […]
