Most enterprise data still sits inside PDFs, scans, and slide decks. Large language models and agents cannot use that data […]
Category: Staff
Qwen’s Former Lead on What Hybrid Thinking Got Wrong — and Why He Now Backs Agents
Junyang Lin was the technical lead of Alibaba’s Qwen project. He announced he was stepping down on March 3, 2026. […]
Anthropic Launches Claude Science Beta: A Multi-Agent AI Workbench for Reproducible Genomics, Proteomics, and Cheminformatics Pipelines
This week, Anthropic released Claude Science. It is an app for scientists, available in beta. It runs on Anthropic’s existing […]
NVIDIA HORIZON: A Hands-Free Agent that Evolves Git Worktrees and Hits 100% RTL Benchmark Completion
NVIDIA Research introduced HORIZON, a hands-free agent framework for hardware design. It treats hardware design as repository-level code evolution. This […]
NVIDIA AI Introduces ASPIRE: A Self-Improving Robotics Framework Reaching 31% Zero-Shot on LIBERO-Pro Long Tasks
Traditional robot programming is hard to scale. It requires orchestrating multimodal perception, physical contact dynamics, diverse configurations, and execution failures […]
Mistral AI Releases Leanstral 1.5: An Apache-2.0 Lean 4 Code Agent Model Solving 587 of 672 PutnamBench Problems
Today, Mistral AI released Leanstral 1.5. It is a code agent model built for Lean 4. The release targets automated […]
Designing a Schema-Guided Invoice Intelligence Pipeline with lift-pdf for Accounts-Payable Extraction, Validation, and Ledger Generation
In this tutorial, we build an end-to-end accounts-payable extraction pipeline with lift-pdf, using synthetic invoice PDFs as controlled test documents […]
Meet WebBrain: An Open-Source, Local-First AI Browser Agent That Reads Pages and Automates Tasks in Chrome and Firefox
WebBrain is a free, open-source browser agent for Chrome and Firefox. It reads pages, extracts data, and automates multi-step tasks. […]
RAG-Anything Tutorial: Build a Multimodal Retrieval Pipeline for Text, Tables, Equations, and Images in Colab
In this tutorial, we build a RAG-Anything workflow and use it to explore how multimodal retrieval works across text, tables, […]
Meet Alibaba’s Page Agent: A JavaScript In-Page GUI Agent That Controls Web Interfaces With Natural Language Through the DOM
Most browser automation runs from the outside. Playwright, Puppeteer, Selenium, and browser-use all drive a browser from an external process. […]
