Robbyant, the embodied AI unit inside Ant Group, has open sourced LingBot-World, a large scale world model that turns video […]
Category: Large Language Model
AI2 Releases SERA, Soft Verified Coding Agents Built with Supervised Training Only for Practical Repository Level Automation Workflows
Allen Institute for AI (AI2) Researchers introduce SERA, Soft Verified Efficient Repository Agents, as a coding agent family that aims […]
Ant Group Releases LingBot-VLA, A Vision Language Action Foundation Model For Real World Robot Manipulation
How do you build a single vision language action model that can control many different dual arm robots in the […]
Google DeepMind Unveils AlphaGenome: A Unified Sequence-to-Function Model Using Hybrid Transformers and U-Nets to Decode the Human Genome
Google DeepMind is expanding its biological toolkit beyond the world of protein folding. After the success of AlphaFold, the Google’s […]
Alibaba Introduces Qwen3-Max-Thinking, a Test Time Scaled Reasoning Model with Native Tool Use Powering Agentic Workloads
Qwen3-Max-Thinking is Alibaba’s new flagship reasoning model. It does not only scale parameters, it also changes how inference is done, […]
MBZUAI Releases K2 Think V2: A Fully Sovereign 70B Reasoning Model For Math, Code, And Science
Can a fully sovereign open reasoning model match state of the art systems when every part of its training pipeline […]
Moonshot AI Releases Kimi K2.5: An Open Source Visual Agentic Intelligence Model with Native Swarm Execution
Moonshot AI has released Kimi K2.5 as an open source visual agentic intelligence model. It combines a large Mixture of […]
DSGym Offers a Reusable Container Based Substrate for Building and Benchmarking Data Science Agents
Data science agents should inspect datasets, design workflows, run code, and return verifiable answers, not just autocomplete Pandas code. DSGym, […]
StepFun AI Introduce Step-DeepResearch: A Cost-Effective Deep Research Agent Model Built Around Atomic Capabilities
StepFun has introduced Step-DeepResearch, a 32B parameter end to end deep research agent that aims to turn web search into […]
A Coding Implementation to Automating LLM Quality Assurance with DeepEval, Custom Retrievers, and LLM-as-a-Judge Metrics
We initiate this tutorial by configuring a high-performance evaluation environment, specifically focused on integrating the DeepEval framework to bring unit-testing […]
