Large Language Model

FutureHouse Researchers Propose Aviary: An Extensible Open-Source Gymnasium for Language Agents

Artificial intelligence (AI) has made significant strides in developing language models capable of solving complex problems. However, applying these models […]

Meta AI Introduces EWE (Explicit Working Memory): A Novel Approach that Enhances Factuality in Long-Form Text Generation by Integrating a Working Memory

Large Language Models (LLMs) have revolutionized text generation capabilities, but they face the critical challenge of hallucination, generating factually incorrect […]

Meet Android Agent Arena (A3): A Comprehensive and Autonomous Online Evaluation System for GUI Agents

The development of large language models (LLMs) has significantly advanced artificial intelligence (AI) across various fields. Among these advancements, mobile […]

This AI Paper Introduces LLM-as-an-Interviewer: A Dynamic AI Framework for Comprehensive and Adaptive LLM Evaluation

Evaluating the real-world applicability of large language models (LLMs) is essential to guide their integration into practical use cases. One […]

Qwen Researchers Introduce CodeElo: An AI Benchmark Designed to Evaluate LLMs’ Competition-Level Coding Skills Using Human-Comparable Elo Ratings

Large language models (LLMs) have brought significant progress to AI applications, including code generation. However, evaluating their true capabilities is […]

NVIDIA Research Introduces ChipAlign: A Novel AI Approach that Utilizes a Training-Free Model Merging Strategy, Combining the Strengths of a General Instruction-Aligned LLM with a Chip-Specific LLM

Large language models (LLMs) have found applications in diverse industries, automating tasks and enhancing decision-making. However, when applied to specialized […]

This AI Paper from Tencent AI Lab and Shanghai Jiao Tong University Explores Overthinking in o1-Like Models for Smarter Computation

Large language models (LLMs) have become pivotal tools in tackling complex reasoning and problem-solving tasks. Among them, o1-like models, inspired […]
