Reasoning language models, or RLMs, are increasingly used to simulate step-by-step problem-solving by generating long, structured reasoning chains. These models […]
Rethinking Toxic Data in LLM Pretraining: A Co-Design Approach for Improved Steerability and Detoxification
In LLM pretraining, the quality of training data is a key determinant of model performance. A common strategy involves […]
Reinforcement Learning, Not Fine-Tuning: Nemotron-Tool-N1 Trains LLMs to Use Tools with Minimal Supervision and Maximum Generalization
Equipping LLMs with external tools or functions has become a popular approach, delivering strong performance across diverse domains. Existing research depends on […]
RL^V: Unifying Reasoning and Verification in Language Models through Value-Free Reinforcement Learning
LLMs have gained outstanding reasoning capabilities through reinforcement learning (RL) on correctness rewards. Modern RL algorithms for LLMs, including GRPO, […]
OpenAI Releases HealthBench: An Open-Source Benchmark for Measuring the Performance and Safety of Large Language Models in Healthcare
OpenAI has released HealthBench, an open-source evaluation framework designed to measure the performance and safety of large language models (LLMs) […]
Multimodal AI Needs More Than Modality Support: Researchers Propose General-Level and General-Bench to Evaluate True Synergy in Generalist Models
Artificial intelligence has grown beyond language-focused systems, evolving into models capable of processing multiple input types, such as text, images, […]
Offline Video-LLMs Can Now Understand Real-Time Streams: Apple Researchers Introduce StreamBridge to Enable Multi-Turn and Proactive Video Understanding
Video-LLMs process whole pre-recorded videos at once. However, applications like robotics and autonomous driving need causal perception and interpretation of […]
PrimeIntellect Releases INTELLECT-2: A 32B Reasoning Model Trained via Distributed Asynchronous Reinforcement Learning
As language models scale in parameter count and reasoning complexity, traditional centralized training pipelines face increasing constraints. High-performance model training […]
AG-UI (Agent-User Interaction Protocol): An Open, Lightweight, Event-based Protocol that Standardizes How AI Agents Connect to Front-End Applications
The current generation of AI agents has made significant progress in automating backend tasks such as summarization, data migration, and […]
This AI Paper Introduces Effective State-Size (ESS): A Metric to Quantify Memory Utilization in Sequence Models for Performance Optimization
In machine learning, sequence models are designed to process data with temporal structure, such as language, time series, or signals. […]