Large Language Model

Meet EvaByte: An Open-Source 6.5B State-of-the-Art Tokenizer-Free Language Model Powered by EVA

Tokenization, the process of breaking text into smaller units, has long been a fundamental step in natural language processing (NLP). […]

Google AI Releases Gemini 2.0 Flash Thinking model (gemini-2.0-flash-thinking-exp-01-21): Scoring 73.3% on AIME (Math) and 74.2% on GPQA Diamond (Science) Benchmarks

Artificial Intelligence has made significant strides, yet some challenges persist in advancing multimodal reasoning and planning capabilities. Tasks that demand […]

Snowflake AI Research Open-Sources SwiftKV: A Novel AI Approach that Reduces Inference Costs of Meta Llama LLMs up to 75% on Cortex AI

Large Language Models (LLMs) have become pivotal in artificial intelligence, powering a variety of applications from chatbots to content generation […]

DeepSeek-AI Releases DeepSeek-R1-Zero and DeepSeek-R1: First-Generation Reasoning Models that Incentivize Reasoning Capability in LLMs via Reinforcement Learning

Large Language Models (LLMs) have made significant progress in natural language processing, excelling in tasks like understanding, generation, and reasoning. […]

OmniThink: A Cognitive Framework for Enhanced Long-Form Article Generation Through Iterative Reflection and Expansion

LLMs have made significant strides in automated writing, particularly in tasks like open-domain long-form generation and topic-specific reports. Many approaches […]

ChemAgent: Enhancing Large Language Models for Complex Chemical Reasoning with Dynamic Memory Frameworks

Chemical reasoning involves intricate, multi-step processes requiring precise calculations, where small errors can lead to significant issues. LLMs often struggle […]

Meet Tensor Product Attention (TPA): Revolutionizing Memory Efficiency in Language Models

Large language models (LLMs) have become central to natural language processing (NLP), excelling in tasks such as text generation, comprehension, […]

Sakana AI Introduces Transformer²: A Machine Learning System that Dynamically Adjusts Its Weights for Various Tasks

LLMs are essential in industries such as education, healthcare, and customer service, where natural language understanding plays a crucial role. […]

Microsoft AI Research Introduces MVoT: A Multimodal Framework for Integrating Visual and Verbal Reasoning in Complex Tasks

The study of artificial intelligence has seen transformative advances in reasoning about and understanding complex tasks. The most innovative developments are […]

Kyutai Labs Releases Helium-1 Preview: A Lightweight Language Model with 2B Parameters, Targeting Edge and Mobile Devices

The growing reliance on AI models for edge and mobile devices has underscored significant challenges. Balancing computational efficiency, model size, […]