Despite recent progress in robotic control via large-scale vision-language-action (VLA) models, real-world deployment remains constrained by hardware and data requirements. […]
This AI Paper Introduces WEB-SHEPHERD: A Process Reward Model for Web Agents with a 40K Dataset and 10× Cost Efficiency
Web navigation focuses on teaching machines how to interact with websites to perform tasks such as searching for information, shopping, […]
Incorrect Answers Improve Math Reasoning? Reinforcement Learning with Verifiable Rewards (RLVR) Surprises with Qwen2.5-Math
In natural language processing (NLP), RL methods, such as reinforcement learning from human feedback (RLHF), have been used to enhance […]
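The core mechanism behind RLVR, named in the headline above, is replacing a learned reward model with a programmatic check of the final answer against a verifiable ground truth. The sketch below is a minimal illustration of that idea; the `\boxed{}` answer convention and the extraction regex are assumptions for demonstration, not the paper's exact setup.

```python
import re

def extract_final_answer(completion: str) -> str | None:
    """Pull the model's final answer, assuming it is wrapped in \\boxed{...}
    (a common math-benchmark convention; the actual format may differ)."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    return match.group(1).strip() if match else None

def verifiable_reward(completion: str, reference_answer: str) -> float:
    """Binary verifiable reward: 1.0 if the extracted answer matches the
    reference exactly, else 0.0. Real pipelines usually add numeric
    normalization or symbolic-equivalence checks on top of this."""
    predicted = extract_final_answer(completion)
    return 1.0 if predicted is not None and predicted == reference_answer else 0.0

# Example usage
print(verifiable_reward("The result is \\boxed{42}", "42"))  # 1.0
print(verifiable_reward("The result is \\boxed{41}", "42"))  # 0.0
```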
LLMs Can Now Reason Beyond Language: Researchers Introduce Soft Thinking to Replace Discrete Tokens with Continuous Concept Embeddings
Human reasoning naturally operates through abstract, non-verbal concepts rather than strictly relying on discrete linguistic tokens. However, current LLMs are […]
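The idea teased here, reasoning in a continuous concept space rather than committing to one discrete token per step, can be illustrated with a small sketch: instead of sampling a single token id and embedding it, the next input is a probability-weighted mixture of all token embeddings. This is a conceptual illustration of the general mechanism, not the authors' implementation; the temperature value and tensor shapes are assumptions.

```python
import torch

def soft_concept_embedding(logits: torch.Tensor,
                           embedding_matrix: torch.Tensor,
                           temperature: float = 1.0) -> torch.Tensor:
    """Build a 'soft' next-step input: a probability-weighted mixture of token
    embeddings, instead of the embedding of a single sampled token.

    logits:           (vocab_size,) next-token logits from the model
    embedding_matrix: (vocab_size, hidden_dim) the model's input embeddings
    returns:          (hidden_dim,) continuous concept embedding
    """
    probs = torch.softmax(logits / temperature, dim=-1)  # (vocab_size,)
    return probs @ embedding_matrix                      # (hidden_dim,)

# Toy example: vocabulary of 5 tokens, hidden size 8
logits = torch.randn(5)
emb = torch.randn(5, 8)
concept = soft_concept_embedding(logits, emb)
print(concept.shape)  # torch.Size([8])
```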
Qwen Researchers Propose QwenLong-L1: A Reinforcement Learning Framework for Long-Context Reasoning in Large Language Models
While large reasoning models (LRMs) have shown impressive capabilities in short-context reasoning through reinforcement learning (RL), these gains do not […]
NVIDIA Releases Llama Nemotron Nano 4B: An Efficient Open Reasoning Model Optimized for Edge AI and Scientific Tasks
NVIDIA has released Llama Nemotron Nano 4B, an open-source reasoning model designed to deliver strong performance and efficiency across scientific […]
NVIDIA AI Introduces AceReason-Nemotron for Advancing Math and Code Reasoning through Reinforcement Learning
Reasoning capabilities represent a fundamental component of AI systems. The introduction of OpenAI o1 sparked significant interest in building reasoning […]
Optimizing Assembly Code with LLMs: Reinforcement Learning Outperforms Traditional Compilers
LLMs have shown impressive capabilities across various programming tasks, yet their potential for program optimization has not been fully explored. […]
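A common way to pose code optimization as an RL problem, which is roughly the setting the teaser above describes, is to reward a candidate rewrite only when it stays functionally correct on a test suite, scaled by how much faster it runs than a compiler baseline. The sketch below is a generic illustration under those assumptions; `run_tests` and `measure_runtime` are hypothetical helpers, not the paper's harness.

```python
from typing import Callable

def optimization_reward(candidate_asm: str,
                        baseline_runtime: float,
                        run_tests: Callable[[str], bool],
                        measure_runtime: Callable[[str], float]) -> float:
    """Reward for an RL-generated assembly rewrite (illustrative only):
    0 if any correctness test fails, otherwise the speedup over the baseline.
    `run_tests` and `measure_runtime` are assumed, hypothetical helpers."""
    if not run_tests(candidate_asm):
        return 0.0  # incorrect code gets no credit
    candidate_runtime = measure_runtime(candidate_asm)
    if candidate_runtime <= 0:
        return 0.0  # guard against bogus measurements
    return baseline_runtime / candidate_runtime  # >1.0 means faster than baseline
```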
Researchers from the National University of Singapore Introduce ‘Thinkless,’ an Adaptive Framework that Reduces Unnecessary Reasoning by up to 90% Using DeGRPO
The effectiveness of language models relies on their ability to simulate human-like step-by-step deduction. However, these reasoning sequences are resource-intensive […]
Microsoft AI Introduces Magentic-UI: An Open-Source Agent Prototype that Works with People to Complete Complex Tasks that Require Multi-Step Planning and Browser Use
Modern web usage spans many digital interactions, from filling out forms and managing accounts to executing data queries and navigating […]
