How do you convert complex, multilingual documents—dense layouts, small scripts, formulas, charts, and handwriting—into faithful structured Markdown/JSON with state-of-the-art accuracy […]
Google AI Releases C2S-Scale 27B Model that Translates Complex Single-Cell Gene Expression Data into ‘cell sentences’ that LLMs can Understand
A team of researchers from Google Research, Google DeepMind, and Yale released C2S-Scale 27B, a 27-billion-parameter foundation model for single-cell […]
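The underlying Cell2Sentence recipe represents a cell as text by ranking its genes by expression and writing out the top gene names in order. A minimal sketch of that conversion (the gene names and counts below are toy values, not the released pipeline):

```python
# Toy illustration of a Cell2Sentence-style "cell sentence": gene names
# ordered by descending expression, which an LLM can read as plain text.
import numpy as np

def cell_to_sentence(expression: np.ndarray, gene_names: list[str], top_k: int = 8) -> str:
    """Rank genes by descending expression; join the top-k expressed names."""
    order = np.argsort(expression)[::-1]                 # highest-expressed first
    ranked = [gene_names[i] for i in order if expression[i] > 0]
    return " ".join(ranked[:top_k])

genes = ["CD3D", "MS4A1", "NKG7", "LYZ", "GNLY", "CD8A"]
counts = np.array([12.0, 0.0, 7.5, 3.2, 9.1, 5.0])       # toy expression vector
print(cell_to_sentence(counts, genes))                   # -> CD3D GNLY NKG7 CD8A LYZ
```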
Anthropic Launches Claude Haiku 4.5: Small AI Model that Delivers Sonnet-4-Level Coding Performance at One-Third the Cost and More Than Twice the Speed
Anthropic released Claude Haiku 4.5, a latency-optimized “small” model that delivers coding performance comparable to Claude Sonnet 4 […]
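For orientation, a minimal sketch of calling the model through Anthropic's Python SDK; the model identifier string is an assumption, so check Anthropic's model list for the exact name:

```python
# Minimal Messages API call (`pip install anthropic`; ANTHROPIC_API_KEY set).
# The model id below is assumed, not confirmed by this article.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = client.messages.create(
    model="claude-haiku-4-5",   # assumed identifier for Claude Haiku 4.5
    max_tokens=512,
    messages=[{"role": "user", "content": "Refactor this loop into a list comprehension: ..."}],
)
print(message.content[0].text)
```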
Meta AI’s ‘Early Experience’ Trains Language Agents without Rewards—and Outperforms Imitation Learning
How would your agent stack change if a policy could train purely from its own outcome-grounded rollouts—no rewards, no demos—yet […]
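One of the paper's reward-free strategies, implicit world modeling, turns the agent's own outcome-grounded transitions into supervised prediction targets. A conceptual sketch (the data layout and prompt format here are illustrative, not Meta's code):

```python
# Reward-free supervision from the agent's own rollouts: each observed
# (state, action, next_state) transition becomes a "predict the outcome"
# training pair. Illustrative only; not the paper's implementation.
from dataclasses import dataclass

@dataclass
class Transition:
    state: str       # serialized observation before acting
    action: str      # action the agent actually took
    next_state: str  # observed outcome: the grounding signal, no reward needed

def to_world_model_example(t: Transition) -> dict[str, str]:
    prompt = f"State: {t.state}\nAction: {t.action}\nPredict the next state:"
    return {"input": prompt, "target": t.next_state}

rollout = [Transition("on homepage", "click 'Sign in'", "login form shown")]
print(to_world_model_example(rollout[0]))
```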
Alibaba’s Qwen AI Releases Compact Dense Qwen3-VL 4B/8B (Instruct & Thinking) With FP8 Checkpoints
Do you actually need a giant VLM when dense Qwen3-VL 4B/8B (Instruct/Thinking) with FP8 runs in low VRAM yet retains […]
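A minimal loading sketch with Hugging Face transformers; the repo id and the image-text-to-text Auto classes are assumptions here, so the model card remains the authority on exact usage:

```python
# Sketch: load a dense Qwen3-VL Instruct checkpoint and run one image query.
# Assumes a recent transformers release; the repo id is an assumption.
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Qwen/Qwen3-VL-4B-Instruct"        # assumed Hugging Face repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto",
)

messages = [{"role": "user", "content": [
    {"type": "image", "url": "https://example.com/chart.png"},  # placeholder image
    {"type": "text", "text": "Describe this chart."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)[0])
```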
Andrej Karpathy Releases ‘nanochat’: A Minimal, End-to-End ChatGPT-Style Pipeline You Can Train in ~4 Hours for ~$100
Andrej Karpathy has open-sourced nanochat, a compact, dependency-light codebase that implements a full ChatGPT-style stack—from tokenizer training to web UI […]
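To give a flavor of the pipeline's first stage, here is the textbook byte-pair-encoding merge loop that tokenizer training builds on (nanochat ships its own optimized tokenizer; this toy version shows only the core idea):

```python
# Toy BPE training: repeatedly merge the most frequent adjacent token pair.
from collections import Counter

def train_bpe(text: str, num_merges: int) -> list[tuple[int, int]]:
    ids = list(text.encode("utf-8"))          # start from raw bytes (0..255)
    merges, next_id = [], 256
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))    # count adjacent token pairs
        if not pairs:
            break
        best = max(pairs, key=pairs.get)      # most frequent pair gets merged
        merges.append(best)
        out, i = [], 0                        # rewrite ids with the new token
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == best:
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids, next_id = out, next_id + 1
    return merges

print(train_bpe("the quick brown fox jumps over the lazy dog the the", 5))
```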
NVIDIA Researchers Propose Reinforcement Learning Pretraining (RLP): Reinforcement as a Pretraining Objective for Building Reasoning During Pretraining
NVIDIA AI has introduced Reinforcement Learning Pretraining (RLP), a training objective that injects reinforcement learning into the pretraining stage rather […]
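The core signal is a verifier-free reward computed from ordinary text: sample a short thought, then score it by how much it lifts the log-likelihood of the observed next token over a no-thought baseline. A conceptual sketch assuming a Hugging Face-style causal LM (the paper uses a separate baseline model; this toy version reuses one model for both passes):

```python
# Conceptual RLP-style reward (not NVIDIA's code): a sampled thought earns
# reward equal to its log-likelihood gain on the true next token.
import torch
import torch.nn.functional as F

def thought_reward(model, context_ids: torch.Tensor,
                   thought_ids: torch.Tensor, next_token_id: int) -> float:
    """Log-prob gain on the observed next token when the thought is prepended."""
    def logp_next(ids: torch.Tensor) -> torch.Tensor:
        logits = model(ids.unsqueeze(0)).logits[0, -1]   # next-token logits
        return F.log_softmax(logits, dim=-1)[next_token_id]

    with torch.no_grad():
        baseline = logp_next(context_ids)                                 # no thought
        with_thought = logp_next(torch.cat([context_ids, thought_ids]))   # with thought
    return (with_thought - baseline).item()   # positive => the thought helped
```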
SwiReasoning: Entropy-Driven Alternation of Latent and Explicit Chain-of-Thought for Reasoning LLMs
SwiReasoning is a decoding-time framework that lets a reasoning LLM decide when to think in latent space and when to […]
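The switching signal is the entropy of the next-token distribution: rising entropy (falling confidence) favors staying in latent exploration, while falling entropy favors committing to explicit chain-of-thought. A toy single-step version of that rule (the paper tracks entropy trends over longer spans; this simplification is mine):

```python
# Toy entropy-driven mode switch (not the authors' code): flip between
# latent and explicit thinking when the confidence trend reverses.
import torch

def token_entropy(logits: torch.Tensor) -> float:
    """Shannon entropy of the next-token distribution."""
    probs = torch.softmax(logits, dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum().item()

def next_mode(prev_entropy: float, curr_entropy: float, mode: str) -> str:
    if mode == "explicit" and curr_entropy > prev_entropy:
        return "latent"    # confidence dropping: explore silently
    if mode == "latent" and curr_entropy < prev_entropy:
        return "explicit"  # confidence rising: verbalize the chosen path
    return mode
```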
Meet OpenTSLM: A Family of Time-Series Language Models (TSLMs) Revolutionizing Medical Time-Series Analysis
OpenTSLM is a new family of Time-Series Language Models (TSLMs) aimed at medical time-series analysis. Researchers at Stanford University, in collaboration with ETH Zurich and […]
Liquid AI Releases LFM2-8B-A1B: An On-Device Mixture-of-Experts with 8.3B Total Params and 1.5B Active Params per Token
How much capability can a sparse 8.3B-parameter MoE with a ~1.5B active path deliver on your phone without blowing latency […]
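The total-vs-active split is just sparse routing arithmetic: every expert's weights are stored, but only the router-selected top-k experts run per token. A toy calculation with illustrative layer sizes (not LFM2-8B-A1B's actual configuration; attention and embedding weights are ignored):

```python
# Why a sparse MoE is cheap per token: you store all experts but only
# compute with the top-k the router selects. Sizes below are illustrative.
def moe_params(d_model: int, d_ff: int, n_experts: int, top_k: int, n_layers: int):
    per_expert = 2 * d_model * d_ff          # up- and down-projection weights
    total = n_layers * n_experts * per_expert
    active = n_layers * top_k * per_expert   # only top-k experts fire per token
    return total, active

total, active = moe_params(d_model=2048, d_ff=6144, n_experts=16, top_k=2, n_layers=24)
print(f"total expert params: {total/1e9:.1f}B")    # capacity you store (~9.7B here)
print(f"active per token:    {active/1e9:.1f}B")   # compute you pay (~1.2B here)
```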