Training a modern large language model (LLM) is […]
Category: Large Language Model
NVIDIA and the University of Maryland Researchers Released Audio Flamingo Next (AF-Next): A Super Powerful and Open Large Audio-Language Model
Understanding audio has always been the multimodal frontier that lags behind vision. While image-language models have rapidly scaled toward real-world […]
Meta AI and KAUST Researchers Propose Neural Computers That Fold Computation, Memory, and I/O Into One Learned Model
Researchers from Meta AI and the King Abdullah University of Science and Technology (KAUST) have introduced Neural Computers (NCs) — […]
MiniMax Just Open Sourced MiniMax M2.7: A Self-Evolving Agent Model that Scores 56.22% on SWE-Pro and 57.0% on Terminal Bench 2
MiniMax has officially open-sourced MiniMax M2.7, making the model weights publicly available on Hugging Face. Originally announced on March 18, […]
Researchers from MIT, NVIDIA, and Zhejiang University Propose TriAttention: A KV Cache Compression Method That Matches Full Attention at 2.5× Higher Throughput
Long-chain reasoning is one of the most compute-intensive tasks in modern large language models. When a model like DeepSeek-R1 or […]
Alibaba’s Tongyi Lab Releases VimRAG: a Multimodal RAG Framework that Uses a Memory Graph to Navigate Massive Visual Contexts
Retrieval-Augmented Generation (RAG) has become a standard technique for grounding large language models in external knowledge — but the moment […]
NVIDIA Releases AITune: An Open-Source Inference Toolkit That Automatically Finds the Fastest Inference Backend for Any PyTorch Model
Deploying a deep learning model into production has always involved a painful gap between the model a researcher trains and […]
Five AI Compute Architectures Every Engineer Should Know: CPUs, GPUs, TPUs, NPUs, and LPUs Compared
Modern AI is no longer powered by a single type of processor—it runs on a diverse ecosystem of specialized compute […]
Google AI Research Introduces PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing
Writing a research paper is brutal. Even after the experiments are done, a researcher still faces weeks of translating messy […]
Z.AI Introduces GLM-5.1: An Open-Weight 754B Agentic Model That Achieves SOTA on SWE-Bench Pro and Sustains 8-Hour Autonomous Execution
Z.AI, the AI platform developed by the team behind the GLM model family, has released GLM-5.1 — its next-generation flagship […]
