MiniMax released MSA (MiniMax Sparse Attention), a sparse attention method built directly on Grouped Query Attention (GQA). It targets one […]
Category: Machine Learning
How to Build Memory-Efficient Transformers with xFormers Using Packed Sequences, GQA, ALiBi, SwiGLU, and Causal Attention
In this tutorial, we work with xFormers, a practical toolkit for building fast, memory-efficient Transformer models on GPUs. We begin by […]
Google Cloud Introduces Open Knowledge Format (OKF): A Vendor-Neutral Markdown Spec for Giving AI Agents Curated Context
Foundation models keep getting stronger, yet they still stall on the same thing: context. A model can write code or […]
Z.ai Launches GLM-5.2 With a Usable 1M-Token Context, Two Thinking-Effort Levels, and No Benchmarks at Launch
GLM-5.2 is the latest large language model from Z.ai and the third major release in the GLM-5 line. It follows […]
A Coding Hands-On on FineWeb for Streaming, Filtering, Deduplication, Tokenization, and Large-Scale Web Corpus Analytics
In this tutorial, we explore the FineWeb dataset through an advanced hands-on workflow. We stream a manageable sample of the […]
Moonshot AI Releases Kimi K2.7-Code: a Coding Model Reporting +21.8% on Kimi Code Bench v2 Over K2.6
This week, Moonshot AI released Kimi K2.7-Code. It is a coding-focused, agentic model. The model weights ship on Hugging Face […]
A Coding Implementation on Spatial Graph Neural Networks for Urban Function Inference Using city2graph, OSMnx, and PyTorch Geometric
In this tutorial, we build an end-to-end spatial graph learning pipeline using city2graph. We start by collecting real urban POI […]
Zyphra Releases Zamba2-VL: Hybrid Mamba2–Transformer Vision-Language Models That Cut Time-to-First-Token by About an Order of Magnitude
Zyphra has released Zamba2-VL, a family of open vision-language models. The release covers three sizes: 1.2B, 2.7B, and 7B parameters. […]
Meet ‘North Mini Code’: Cohere’s 30B Open-Weight Mixture-of-Experts Model With 3B Active Parameters for Agentic Coding
This week, the Cohere AI team shipped its first developer-facing coding model, ‘North Mini Code’. ‘North Mini Code’ is open-weight […]
Google AI Releases DiffusionGemma, a 26B MoE Open Model Using Text Diffusion for Up to 4x Faster Generation
The Google AI team, including Google DeepMind researchers, has just released DiffusionGemma, an experimental open model for text generation. It […]
