Saturday, May 30, 2026
The TechBriefs

Category: Large Language Model

NVIDIA Introduces X-Token: Projection-Guided Cross-Tokenizer KD That Outperforms GOLD by +3.82 Average Points on Llama-3.2-1B
  • agentic AI
  • AI
  • AI Paper Summary
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Machine Learning
  • Staff
  • Tech News
  • Technology

Knowledge distillation (KD) transfers “dark knowledge” from a large teacher model to a smaller student. The student learns from the […]

StepFun Releases Step 3.7 Flash: A 198B MoE Vision-Language Model for Coding Agents and Search Workflows
  • agentic AI
  • AI
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Machine Learning
  • New Releases
  • Open Source
  • Software Engineering
  • Staff
  • Tech News
  • Technology
  • Vision Language Model

StepFun today released Step 3.7 Flash, a multimodal Mixture-of-Experts model targeting agentic use cases. It adds native vision input and […]

Liquid AI Releases LFM2.5-8B-A1B: An On-Device MoE Model With 8.3B Total and 1.5B Active Parameters
  • AI
  • AI infrastructure
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Machine Learning
  • New Releases
  • Software Engineering
  • Staff
  • Tech News
  • Technology

Liquid AI just shipped LFM2.5-8B-A1B. It is an on-device Mixture-of-Experts (MoE) model built for tool calling. The model holds 8.3B […]

Anthropic Ships Claude Opus 4.8 Alongside Dynamic Workflows and Cheaper Fast Mode, With Workflows Capped at 1,000 Subagents
  • agentic AI
  • AI
  • AI Agents
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Machine Learning
  • New Releases
  • Software Engineering
  • Staff
  • Tech News
  • Technology

Anthropic just launched Claude Opus 4.8. Two Claude Code updates shipped alongside it. Dynamic workflows run many subagents […]

Sakana AI Proposes DiffusionBlocks: a Block-wise Training Framework That Converts Residual Networks into Independently Trainable Denoising Modules
  • AI
  • AI infrastructure
  • AI Paper Summary
  • AI Shorts
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Machine Learning
  • New Releases
  • Staff
  • Tech News
  • Technology

Researchers from Sakana AI and the University of Tokyo propose DiffusionBlocks. It trains transformer-based networks one block at a time. […]

NVIDIA Releases Polar, a Token-Faithful Rollout Framework for GRPO Training Across Codex, Claude Code, and Qwen Code
  • agentic AI
  • AI
  • AI Paper Summary
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Machine Learning
  • New Releases
  • Open Source
  • Staff
  • Tech News
  • Technology

Reinforcement learning for language agents is growing more complex. Agents now manage multi-turn tool use, long-running contexts, and multi-agent orchestration. […]

MEMO: A Modular Framework for Training a Dedicated Memory Model on New Knowledge Without Modifying LLM Parameters
  • AI
  • AI Paper Summary
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Software Engineering
  • Staff
  • Tech News
  • Technology

Large language models become static after pretraining. Their knowledge does not update as the world changes. Retraining a full LLM […]

Build a Complete Langfuse Observability and Evaluation Pipeline for Tracing, Prompt Management, Scoring, and Experiments
  • agentic AI
  • AI
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Staff
  • Tech News
  • Technology
  • Tutorials

In this tutorial, we implement a pipeline with Langfuse, an open-source LLM engineering platform, for tracing, prompt management, scoring, datasets, and […]

Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4’s 33.5%
  • agentic AI
  • AI
  • AI Agents
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • New Releases
  • Open Source
  • Software Engineering
  • Staff
  • Tech News
  • Technology

Most web agents today drive a browser one action at a time. The model receives the current page state — […]

NVIDIA AI Releases Gated DeltaNet-2: A Linear Attention Layer That Decouples Erase and Write in the Delta Rule
  • AI
  • AI infrastructure
  • AI Paper Summary
  • Applications
  • Artificial Intelligence
  • Editors Pick
  • Language Model
  • Large Language Model
  • Machine Learning
  • New Releases
  • Open Source
  • Physical AI
  • Software Engineering
  • Staff
  • Tech News
  • Technology

Linear attention replaces the unbounded KV cache of softmax attention with a fixed-size recurrent state. This cuts sequence mixing to […]
