Building a structured dataset from the web is still a pipeline problem. You identify a data source, write or configure […]
Category: Applications
Alibaba’s Qwen Team Launches Qwen3.7-Plus, Adding Vision, Deep Reasoning, Tool Invocation, and Autonomous Iteration on the Bailian Platform
Alibaba’s Qwen team has released Qwen3.7-Plus. The model is now available through Alibaba Cloud’s Bailian platform. Bailian is the console […]
JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines
JetBrains released Mellum2, open-sourcing the weights under the Apache 2.0 license. The first version of Mellum was a completion-focused 4B […]
How to Speed Up Transformer Training Using NVIDIA Apex (FusedAdam, FusedLayerNorm) and Native torch.amp
In this tutorial, we work through an implementation of NVIDIA Apex, focusing on the components that still matter in modern […]
MiniMax Releases MiniMax M3 with MSA Architecture Supporting 1M-Token Context, Native Multimodality, and Agentic Coding
MiniMax officially released MiniMax M3 on June 1, 2026. The model introduces MSA (MiniMax Sparse Attention), a new sparse attention […]
Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent
Hermes Agent already remembers across sessions. The open-source agent from Nous Research ships with curated memory files and full-text session […]
Parallax: A Parameterized Local Linear Attention That Keeps Softmax and Adds a Learned Covariance Correction Branch
The Transformer’s attention mechanism has barely changed since 2017. Most efficiency work has tried to replace softmax attention outright. A […]
Trajectory Releases a Concurrent Multi-LoRA Training Stack for Continual Learning, Reporting a 2.81× Experiment-Throughput Gain
Trajectory’s concurrent multi-LoRA stack reports a 2.81× experiment-throughput gain over single-tenant RL, with all code in the NovaSky-AI/SkyRL GitHub repository. […]
Genesis AI Releases Nyx, Quadrants, and Genesis World 1.0 Physics Platform for Scalable Robotics Foundation Model Evaluation
Genesis AI released Genesis World 1.0. The platform consists of four components: the Genesis World physics engine, Nyx (a real-time […]
Hermes Agent Ships Tool Search for MCP: Anthropic Evals Show 49% to 74% Accuracy Gain on Opus 4
Nous Research’s open-source Hermes Agent now ships a Tool Search feature. It directly addresses a growing bottleneck in AI agent […]
