Generative AI has reshaped how software gets built. What began as line-by-line autocomplete now spans full application generation, multi-agent build […]
Category: Machine Learning
DFlash Speculative Decoding Drafts Whole Token Blocks in Parallel for Up to 15x Higher Throughput on NVIDIA Blackwell
Autoregressive large language models generate text one token at a time. Each token waits for the one before it. This […]
Datalab Releases lift: A 9B Open-Weights Vision Model That Extracts Structured JSON From PDFs Using Schemas
Datalab has released lift, a 9B open-weights vision model for structured extraction. You pass it a JSON schema, and it […]
Prime Intellect Releases prime-rl 0.6.0 to Train Trillion-Parameter MoE Models on Agentic RL Workloads
Prime Intellect has released prime-rl version 0.6.0. The framework targets reinforcement learning on trillion-parameter Mixture-of-Experts (MoE) models. It focuses on […]
GLM-5.2 OpenAI-Compatible API: A Hands-On Guide to Reasoning Effort, Function Calling, and Long-Context Retrieval
In this tutorial, we work with GLM-5.2 and use its hosted, OpenAI-compatible API instead of running the full model locally. […]
Yandex Open-Sources YaFF: A Zero-Copy Wire Format for Protobuf With Near-Struct Read Speed
TLDR YaFF is Yandex’s open-source zero-copy wire format for Protobuf — Apache 2.0, currently C++, v0.1.0. The .proto file stays […]
NVIDIA AI Introduce SpatialClaw: A Training-Free Agent That Treats Code as the Action Interface for Spatial Reasoning
NVIDIA Research has released SpatialClaw, a training-free framework for spatial reasoning. It targets a persistent weakness in vision-language models (VLMs). […]
Liquid AI Introduces LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M: Dense Bi-Encoder and Late-Interaction Models for Fast Multilingual Search Across 11 Languages
This week, Liquid AI released two new retrieval models. They are LFM2.5-ColBERT-350M and LFM2.5-Embedding-350M. Both hold 350M parameters. Both are […]
The KV Cache Compression Race: TurboQuant vs OSCAR vs EpiCache
Long-context large language models (LLMs) face a memory bottleneck that has nothing to do with model weights. During decoding, transformers […]
OpenAI Releases LifeSciBench, a 750-Task Benchmark Grading AI Models on Real Life-Science Research With Expert-Written Rubric
Most biology benchmarks ask narrow, fact-based questions with clean answers. Scientists weigh imperfect evidence and make decisions. OpenAI released LifeSciBench […]
