AI Paper Summary – Page 14

Meet ‘kvcached’: A Machine Learning Library to Enable Virtualized, Elastic KV Cache for LLM Serving on Shared GPUs

Large language model serving often wastes GPU memory because engines pre-reserve large static KV cache regions per model, even when […]

Google AI Introduces FLAME Approach: A One-Step Active Learning that Selects the Most Informative Samples for Training and Makes a Model Specialization Super Fast

Open vocabulary object detectors answer text queries with boxes. In remote sensing, zero shot performance drops because classes are fine […]

PokeeResearch-7B: An Open 7B Deep-Research Agent Trained with Reinforcement Learning from AI Feedback (RLAIF) and a Robust Reasoning Scaffold

Pokee AI has open sourced PokeeResearch-7B, a 7B parameter deep research agent that executes full research loops, decomposes a query, […]

Google AI Introduces VISTA: A Test Time Self Improving Agent for Text to Video Generation

TLDR: VISTA is a multi agent framework that improves text to video generation during inference, it plans structured prompts as […]

Google AI Research Releases DeepSomatic: A New AI Model that Identifies Cancer Cell Genetic Variants

A team of researchers from Google Research and UC Santa Cruz released DeepSomatic, an AI model that identifies cancer cell […]

DeepSeek Just Released a 3B OCR Model: A 3B VLM Designed for High-Performance OCR and Structured Document Conversion

DeepSeek-AI released 3B DeepSeek-OCR, an end to end OCR and document parsing Vision-Language Model (VLM) system that compresses long text […]

Weak-for-Strong (W4S): A Novel Reinforcement Learning Algorithm that Trains a weak Meta Agent to Design Agentic Workflows with Stronger LLMs

Researchers from Stanford, EPFL, and UNC introduce Weak-for-Strong Harnessing, W4S, a new Reinforcement Learning RL framework that trains a small […]

Microsoft AI Proposes BitNet Distillation (BitDistill): A Lightweight Pipeline that Delivers up to 10x Memory Savings and about 2.65x CPU Speedup

Microsoft Research proposes BitNet Distillation, a pipeline that converts existing full precision LLMs into 1.58 bit BitNet students for specific […]

AutoCode: A New AI Framework that Lets LLMs Create and Verify Competitive Programming Problems, Mirroring the Workflow of Human Problem Setters

Are your LLM code benchmarks actually rejecting wrong-complexity solutions and interactive-protocol violations, or are they passing under-specified unit tests? A […]

Baidu’s PaddlePaddle Team Releases PaddleOCR-VL (0.9B): a NaViT-style + ERNIE-4.5-0.3B VLM Targeting End-to-End Multilingual Document Parsing

How do you convert complex, multilingual documents—dense layouts, small scripts, formulas, charts, and handwriting—into faithful structured Markdown/JSON with state-of-the-art accuracy […]