VLMs have become central to building general-purpose AI systems capable of understanding and interacting in digital and real-world settings. By […]
Category: Applications
Rime Introduces Arcana and Rimecaster (Open Source): Practical Voice AI Tools Built on Real-World Speech
The field of Voice AI is evolving toward more representative and adaptable systems. While many existing models have been trained […]
Meta AI Introduces CATransformers: A Carbon-Aware Machine Learning Framework to Co-Optimize AI Models and Hardware for Sustainable Edge Deployment
As machine learning systems become integral to various applications, from recommendation engines to autonomous systems, there’s a growing need to […]
This AI Paper Investigates Test-Time Scaling of English-Centric RLMs for Enhanced Multilingual Reasoning and Domain Generalization
Reasoning language models, or RLMs, are increasingly used to simulate step-by-step problem-solving by generating long, structured reasoning chains. These models […]
Reinforcement Learning, Not Fine-Tuning: Nemotron-Tool-N1 Trains LLMs to Use Tools with Minimal Supervision and Maximum Generalization
Equipping LLMs with external tools or functions has become popular and has shown strong performance across diverse domains. Existing research depends on […]
RL^V: Unifying Reasoning and Verification in Language Models through Value-Free Reinforcement Learning
LLMs have gained outstanding reasoning capabilities through reinforcement learning (RL) on correctness rewards. Modern RL algorithms for LLMs, including GRPO, […]
OpenAI Releases HealthBench: An Open-Source Benchmark for Measuring the Performance and Safety of Large Language Models in Healthcare
OpenAI has released HealthBench, an open-source evaluation framework designed to measure the performance and safety of large language models (LLMs) […]
Multimodal AI Needs More Than Modality Support: Researchers Propose General-Level and General-Bench to Evaluate True Synergy in Generalist Models
Artificial intelligence has grown beyond language-focused systems, evolving into models capable of processing multiple input types, such as text, images, […]
Offline Video-LLMs Can Now Understand Real-Time Streams: Apple Researchers Introduce StreamBridge to Enable Multi-Turn and Proactive Video Understanding
Video-LLMs process whole pre-recorded videos at once. However, applications like robotics and autonomous driving need causal perception and interpretation of […]
AG-UI (Agent-User Interaction Protocol): An Open, Lightweight, Event-based Protocol that Standardizes How AI Agents Connect to Front-End Applications
The current generation of AI agents has made significant progress in automating backend tasks such as summarization, data migration, and […]