Build a Nanobot-Style AI Agent in Google Colab with Tool Calling, Session Memory, Skills, and MCP Servers

In this tutorial, we build a lightweight personal AI agent inspired by the core architecture of nanobot, while keeping every part understandable and runnable in Google Colab. We start from the provider abstraction, then move through tool registration, session memory, lifecycle hooks, skills, and an MCP-style tool server. As we progress, we do not just use an external agent framework; we recreate the core building blocks ourselves so we can clearly see how messages, tools, memory, and model responses work together within a practical agent loop.

Building the Provider Abstraction and Mock LLM

import subprocess, sys

def _pip_install(*pkgs):
    try:
        subprocess.run([sys.executable, "-m", "pip", "install", "-q", *pkgs], check=True)
    except Exception as e:
        print(f"(pip install skipped/failed for {pkgs}: {e})")

# The OpenAI SDK is optional: without it the tutorial falls back to the mock provider.
_HAVE_OPENAI = False
try:
    import openai
    _HAVE_OPENAI = True
except Exception:
    _pip_install("openai>=1.0.0")
    try:
        import openai
        _HAVE_OPENAI = True
    except Exception:
        _HAVE_OPENAI = False

# nest_asyncio lets asyncio.run() work inside Colab's already-running event loop.
try:
    import nest_asyncio
    nest_asyncio.apply()
except Exception:
    try:
        _pip_install("nest_asyncio")
        import nest_asyncio
        nest_asyncio.apply()
    except Exception:
        pass

import os
import re
import json
import time
import math
import asyncio
import inspect
import textwrap
import contextlib
import io
from dataclasses import dataclass, field
from typing import Any, Callable, Optional, Awaitable, get_type_hints

def banner(title: str) -> None:
    line = "═" * 78
    print(f"\n{line}\n  {title}\n{line}")

@dataclass
class ToolCall:
    """A normalized request from the model to run one tool."""
    id: str
    name: str
    arguments: dict

@dataclass
class Usage:
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total(self) -> int:
        return self.prompt_tokens + self.completion_tokens

@dataclass
class LLMResponse:
    """The single shape every provider must return."""
    content: Optional[str]
    tool_calls: list[ToolCall] = field(default_factory=list)
    finish_reason: str = "stop"
    usage: Usage = field(default_factory=Usage)

class Provider:
    """Base class. A provider turns (messages, tools) into an LLMResponse."""
    name = "base"

    async def complete(self, messages: list[dict], tools: list[dict]) -> LLMResponse:
        raise NotImplementedError

class OpenAICompatibleProvider(Provider):
    """
    Works with OpenAI and every OpenAI-compatible gateway (OpenRouter, DeepSeek,
    Together, vLLM, LM Studio, Ollama's /v1, ...). This mirrors how nanobot speaks
    to most providers under the hood.
    """
    name = "openai-compatible"

    def __init__(self, api_key: str, model: str, base_url: Optional[str] = None):
        from openai import AsyncOpenAI
        self.model = model
        self.client = AsyncOpenAI(api_key=api_key, base_url=base_url)

    async def complete(self, messages: list[dict], tools: list[dict]) -> LLMResponse:
        kwargs: dict[str, Any] = {"model": self.model, "messages": messages}
        if tools:
            kwargs["tools"] = tools
            kwargs["tool_choice"] = "auto"
        resp = await self.client.chat.completions.create(**kwargs)
        choice = resp.choices[0]
        msg = choice.message
        calls: list[ToolCall] = []
        for tc in (msg.tool_calls or []):
            # Tool arguments arrive as a JSON string; keep the raw text if it does not parse.
            try:
                args = json.loads(tc.function.arguments or "{}")
            except json.JSONDecodeError:
                args = {"_raw": tc.function.arguments}
            calls.append(ToolCall(id=tc.id, name=tc.function.name, arguments=args))
        usage = Usage(
            prompt_tokens=getattr(resp.usage, "prompt_tokens", 0) or 0,
            completion_tokens=getattr(resp.usage, "completion_tokens", 0) or 0,
        )
        return LLMResponse(
            content=msg.content,
            tool_calls=calls,
            finish_reason=choice.finish_reason or "stop",
            usage=usage,
        )

class MockProvider(Provider):
    """
    A deterministic, rule-based "LLM" so this entire tutorial runs with NO API key
    and NO network, letting you watch the agent loop, tool calls, and memory work.
    It imitates the ONE thing that matters for the loop: deciding to emit a tool call
    (in the exact normalized shape a real model would) and then, once tool results
    come back, producing a final natural-language answer. The agent loop cannot tell
    it apart from OpenAI; that's the whole point of the provider contract.
    """
    name = "mock"

    def __init__(self, model: str = "mock-1"):
        self.model = model

    @staticmethod
    def _last_user_text(messages: list[dict]) -> str:
        for m in reversed(messages):
            if m.get("role") == "user":
                c = m.get("content")
                return c if isinstance(c, str) else json.dumps(c)
        return ""

    @staticmethod
    def _already_called(messages: list[dict], tool_name: str) -> bool:
        for m in messages:
            if m.get("role") == "assistant" and m.get("tool_calls"):
                for tc in m["tool_calls"]:
                    if tc["function"]["name"] == tool_name:
                        return True
        return False

    @staticmethod
    def _extract_math(text: str) -> str:
        """Pull the first math-looking chunk out of a sentence (mock-only helper)."""
        t = re.sub(r"square roots? of (\d+(?:\.\d+)?)", r"sqrt(\1)", text)
        t = t.replace("^", "**")
        pattern = (r"(?:sqrt\(\d+(?:\.\d+)?\)|\d+(?:\.\d+)?)"
                   r"(?:\s*(?:\*\*|[+\-*/])\s*(?:sqrt\(\d+(?:\.\d+)?\)|\d+(?:\.\d+)?))*")
        m = re.search(pattern, t)
        return m.group(0).strip() if m else t.strip()

    @staticmethod
    def _scan_memory(messages: list[dict]) -> tuple[Optional[str], Optional[str]]:
        """Read back simple facts from prior USER turns. This proves session memory is
        actually being fed to the model (mock-only convenience)."""
        name = love = None
        for m in messages:
            if m.get("role") == "user" and isinstance(m.get("content"), str):
                tx = m["content"].lower()
                nm = re.search(r"my name is (\w+)", tx)
                if nm:
                    name = nm.group(1).title()
                lv = re.search(r"i (?:love|like) (\w+)", tx)
                if lv:
                    love = lv.group(1).title()
        return name, love

    async def complete(self, messages: list[dict], tools: list[dict]) -> LLMResponse:
        await asyncio.sleep(0)
        user = self._last_user_text(messages).lower()
        tool_names = {t["function"]["name"] for t in tools}
        usage = Usage(prompt_tokens=sum(len(str(m)) for m in messages) // 4, completion_tokens=12)

        def call(name, args):
            return LLMResponse(
                content=None,
                tool_calls=[ToolCall(id=f"call_{name}_{int(time.time()*1000)%100000}",
                                     name=name, arguments=args)],
                finish_reason="tool_calls",
                usage=usage,
            )

        # Rule 1: anything that looks like arithmetic goes to the calculator.
        has_digit = bool(re.search(r"\d", user))
        wants_math = has_digit and (
            bool(re.search(r"[+\-*/^]", user)) or "sqrt" in user
            or "square root" in user
            or any(w in user for w in ["calculate", "compute", "evaluate", "what is", "what's"]))
        if "calculator" in tool_names and wants_math and not self._already_called(messages, "calculator"):
            return call("calculator", {"expression": self._extract_math(user)})

        # Rule 2: time and date questions, with a small city-to-timezone lookup.
        if "get_current_time" in tool_names and not self._already_called(messages, "get_current_time"):
            if any(w in user for w in ["time", "date", "today", "now", "o'clock"]):
                tz = "UTC"
                m = re.search(r"in ([a-zA-Z_/ ]+)", user)
                if m:
                    cand = m.group(1).strip().title().replace(" ", "_")
                    tz = {"Tokyo": "Asia/Tokyo", "Delhi": "Asia/Kolkata",
                          "New_York": "America/New_York", "London": "Europe/London"}.get(cand, cand)
                return call("get_current_time", {"timezone": tz})

        # Rule 3: store or recall a long-term fact.
        if "remember_fact" in tool_names and not self._already_called(messages, "remember_fact"):
            m = re.search(r"my favorite (?:programming )?language is (\w+)", user)
            if m:
                return call("remember_fact", {"key": "favorite_language", "value": m.group(1)})
        if "recall_fact" in tool_names and not self._already_called(messages, "recall_fact"):
            if any(w in user for w in ["my favorite", "do you remember", "recall", "what did i tell"]):
                key = "favorite_language" if "language" in user else "note"
                return call("recall_fact", {"key": key})

        # Rule 4: canned Python snippets for a few recognizable requests.
        if "run_python" in tool_names and not self._already_called(messages, "run_python"):
            py_kw = any(w in user for w in ["fibonacci", "prime", "factorial", "simulate"])
            py_action = "python" in user and any(
                w in user for w in ["run", "write", "code", "print", "execute", "snippet"])
            if py_kw or py_action:
                if "fibonacci" in user:
                    code = ("def fib(n):\n a,b=0,1\n out=[]\n"
                            " for _ in range(n):\n  out.append(a); a,b=b,a+b\n return out\n"
                            "print(fib(12))")
                elif "prime" in user:
                    code = ("primes=[n for n in range(2,50) "
                            "if all(n%d for d in range(2,int(n**0.5)+1))]\nprint(primes)")
                elif "factorial" in user:
                    # run_python pre-injects `math`; an `import` statement would fail
                    # because its restricted builtins omit __import__.
                    code = "print(math.factorial(10))"
                else:
                    code = "print(sum(range(1,101)))"
                return call("run_python", {"code": code})

        # Rule 5: web search.
        if "web_search" in tool_names and not self._already_called(messages, "web_search"):
            if any(w in user for w in ["search", "look up", "latest", "news about", "find information"]):
                return call("web_search", {"query": self._last_user_text(messages)})

        # Rule 6: answer identity questions straight from the session history.
        if any(p in user for p in ["my name", "who am i", "what do i love", "what i love"]):
            name, love = self._scan_memory(messages)
            bits = []
            if name:
                bits.append(f"your name is {name}")
            if love:
                bits.append(f"you love {love}")
            if bits:
                return LLMResponse(content="From our conversation, " + " and ".join(bits) + ".",
                                   tool_calls=[], finish_reason="stop", usage=usage)

        # No rule fired: summarize tool results if there are any, else reply generically.
        tool_outputs = [m["content"] for m in messages if m.get("role") == "tool"]
        if tool_outputs:
            joined = " ".join(tool_outputs)
            answer = f"Based on the tool results, here's what I found: {joined}"
        elif any(w in user for w in ["hello", "hi", "hey"]):
            answer = "Hello! I'm a mock nanobot agent. Ask me to calculate, tell time, run Python, or remember things."
        else:
            answer = ("[mock LLM] I would normally reason about this with a real model. "
                      "Set NANOBOT_API_KEY to use a live LLM. For now, try prompts with math, "
                      "time, Python, or memory so you can see the tool loop fire.")
        return LLMResponse(content=answer, tool_calls=[], finish_reason="stop", usage=usage)

We set up the environment, install optional dependencies, and prepare the imports needed for the full tutorial. We define a provider abstraction that allows the agent to work with either a real OpenAI-compatible model or a deterministic mock provider. We also build the normalized response structures so the rest of the agent loop can work independently of the backend model.
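The provider contract described above can be illustrated in isolation. The sketch below is not part of the tutorial's code: `ScriptedProvider` and `drive` are hypothetical names, the dataclasses are trimmed copies of the tutorial's shapes, and the tool result is hard-coded to "42". It only shows that a loop written against the normalized response shape does not care where the responses come from.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict


@dataclass
class LLMResponse:
    content: Optional[str]
    tool_calls: list = field(default_factory=list)
    finish_reason: str = "stop"


class ScriptedProvider:
    """Hypothetical provider that replays canned responses in order."""

    def __init__(self, script):
        self._script = list(script)

    async def complete(self, messages, tools):
        return self._script.pop(0)


async def drive(provider):
    """Call the provider until it stops requesting tools; record each step."""
    trace = []
    messages = [{"role": "user", "content": "What is 6 * 7?"}]
    while True:
        resp = await provider.complete(messages, tools=[])
        if not resp.tool_calls:
            trace.append(("final", resp.content))
            return trace
        for tc in resp.tool_calls:
            trace.append(("tool", tc.name))
            # A real loop would execute the tool; this sketch fakes the result.
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": "42"})


script = [
    LLMResponse(content=None,
                tool_calls=[ToolCall("call_1", "calculator", {"expression": "6 * 7"})],
                finish_reason="tool_calls"),
    LLMResponse(content="6 * 7 = 42"),
]
trace = asyncio.run(drive(ScriptedProvider(script)))
print(trace)
```

Swapping `ScriptedProvider` for a class that calls a real API would leave `drive` unchanged, which is the property the provider abstraction is meant to guarantee. In a notebook whose event loop is already running, `await drive(...)` can be used in place of `asyncio.run(...)`.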

Creating the Tool Registry and Token-Budgeted Memory

_PYTYPE_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean",
                   list: "array", dict: "object"}

@dataclass
class Tool:
    name: str
    description: str
    parameters: dict
    func: Callable
    is_async: bool

    def spec(self) -> dict:
        """OpenAI-style tool spec the model sees."""
        return {"type": "function",
                "function": {"name": self.name,
                             "description": self.description,
                             "parameters": self.parameters}}

    async def __call__(self, **kwargs) -> str:
        # Tools never raise into the agent loop: failures come back as text the model can read.
        try:
            result = self.func(**kwargs)
            if inspect.isawaitable(result):
                result = await result
            return result if isinstance(result, str) else json.dumps(result, default=str)
        except Exception as e:
            return f"ERROR running tool '{self.name}': {type(e).__name__}: {e}"

def tool(func: Optional[Callable] = None, *, name: Optional[str] = None):
    """
    Decorator that turns a plain function into a Tool, deriving the JSON schema from
    type hints and the first line of the docstring. Param descriptions can be added
    with a simple 'param: description' block in the docstring.

    Example:
        @tool
        def calculator(expression: str) -> str:
            '''Evaluate a math expression and return the result.
            expression: a math expression like "2 + 2 * 3" or "sqrt(16)"'''
            ...
    """
    def make(f: Callable) -> Tool:
        hints = get_type_hints(f)
        sig = inspect.signature(f)
        doc = inspect.getdoc(f) or ""
        summary = doc.split("\n", 1)[0].strip() or f.__name__
        # Lines after the summary of the form "param: description" document parameters.
        param_docs: dict[str, str] = {}
        for line in doc.splitlines()[1:]:
            m = re.match(r"\s*(\w+)\s*:\s*(.+)", line)
            if m and m.group(1) in sig.parameters:
                param_docs[m.group(1)] = m.group(2).strip()
        props, required = {}, []
        for pname, p in sig.parameters.items():
            if pname == "self":
                continue
            jtype = _PYTYPE_TO_JSON.get(hints.get(pname, str), "string")
            schema = {"type": jtype}
            if pname in param_docs:
                schema["description"] = param_docs[pname]
            props[pname] = schema
            # Parameters without a default are required.
            if p.default is inspect.Parameter.empty:
                required.append(pname)
        parameters = {"type": "object", "properties": props, "required": required}
        return Tool(name=name or f.__name__, description=summary,
                    parameters=parameters, func=f, is_async=inspect.iscoroutinefunction(f))
    return make(func) if func else make

class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, Tool] = {}

    def add(self, t: Tool) -> None:
        self._tools[t.name] = t

    def add_function(self, f: Callable) -> None:
        self.add(tool(f))

    def get(self, name: str) -> Optional[Tool]:
        return self._tools.get(name)

    def specs(self) -> list[dict]:
        return [t.spec() for t in self._tools.values()]

    def names(self) -> list[str]:
        return list(self._tools)

@tool
def calculator(expression: str) -> str:
    """Evaluate an arithmetic expression and return the numeric result.
    expression: a math expression, e.g. '2 + 2 * 3', 'sqrt(16)', '2 ** 10'"""
    allowed = {k: getattr(math, k) for k in dir(math) if not k.startswith("_")}
    allowed.update({"abs": abs, "round": round, "min": min, "max": max, "sqrt": math.sqrt})
    expr = expression.replace("^", "**")
    # Emptying __builtins__ is not a real sandbox: fine for a demo, unsafe for untrusted input.
    value = eval(expr, {"__builtins__": {}}, allowed)
    return f"{expression} = {value}"

@tool
def get_current_time(timezone: str = "UTC") -> str:
    """Return the current date and time for an IANA timezone name.
    timezone: IANA tz like 'UTC', 'Asia/Tokyo', 'Asia/Kolkata', 'America/New_York'"""
    from datetime import datetime
    try:
        from zoneinfo import ZoneInfo
        now = datetime.now(ZoneInfo(timezone))
    except Exception:
        from datetime import timezone as _tz
        now = datetime.now(_tz.utc)
        timezone = "UTC (fallback)"
    # The timestamp was missing from the extracted source; this format is an assumption.
    stamp = now.strftime("%Y-%m-%d %H:%M:%S %Z")
    return f"Current time in {timezone}: {stamp}"

@tool
def run_python(code: str) -> str:
    """Execute a short Python snippet in a restricted namespace and return its stdout.
    code: Python source code to run; use print(...) to produce output"""
    safe_builtins = {"print": print, "range": range, "len": len, "sum": sum, "min": min,
                     "max": max, "abs": abs, "sorted": sorted, "enumerate": enumerate,
                     "list": list, "dict": dict, "set": set, "str": str, "int": int,
                     "float": float, "bool": bool, "map": map, "filter": filter,
                     "zip": zip, "all": all, "any": any, "round": round}
    import math as _m
    # No __import__ in the builtins, so snippets cannot import; `math` is injected instead.
    g = {"__builtins__": safe_builtins, "math": _m}
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, g, {})
        out = buf.getvalue().strip()
        return f"stdout:\n{out}" if out else "(ran successfully, no stdout)"
    except Exception as e:
        return f"Python error: {type(e).__name__}: {e}"

@tool
def web_search(query: str) -> str:
    """Search the web for a query and return short result snippets (STUB).
    query: the search query string"""
    return (f"[stub results for '{query}'] (1) Overview article. (2) Official docs. "
            f"(3) Recent discussion. Swap web_search's body for a real API in production.")

def estimate_tokens(messages: list[dict]) -> int:
    """Rough token estimate (~4 chars/token), good enough for budgeting demos."""
    chars = 0
    for m in messages:
        chars += len(str(m.get("content") or ""))
        for tc in (m.get("tool_calls") or []):
            chars += len(json.dumps(tc))
    return max(1, chars // 4)

class Memory:
    def __init__(self, token_budget: int = 3000):
        self.token_budget = token_budget
        self._sessions: dict[str, list[dict]] = {}

    def history(self, session_key: str) -> list[dict]:
        return self._sessions.setdefault(session_key, [])

    def append(self, session_key: str, message: dict) -> None:
        self.history(session_key).append(message)

    def extend(self, session_key: str, messages: list[dict]) -> None:
        self.history(session_key).extend(messages)

    def compact(self, session_key: str) -> int:
        """Drop oldest messages until under the token budget. Returns #dropped.
        Keeps tool-call/tool-result pairs consistent by trimming from the front in
        whole turns. (nanobot also summarizes; we keep it to trimming for clarity.)"""
        hist = self.history(session_key)
        dropped = 0
        while estimate_tokens(hist) > self.token_budget and len(hist) > 2:
            hist.pop(0)
            dropped += 1
        # Never leave an orphaned tool result at the head of the history.
        while hist and hist[0].get("role") == "tool":
            hist.pop(0)
            dropped += 1
        return dropped

We create a tool system that allows ordinary Python functions to become callable agent tools. We use type hints and docstrings to automatically generate JSON-style tool schemas, which makes the framework easier to extend. We also add practical offline tools (a calculator, a time lookup, a restricted Python runner, and a web search stub) along with a token-budgeted memory that trims the oldest messages once a session exceeds its budget.
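To make the schema derivation concrete, here is a reduced sketch of the idea using only `inspect.signature` and `typing.get_type_hints`. The names `schema_for` and `convert` are illustrative and do not appear in the tutorial; docstring parsing is left out to keep the example short.

```python
import inspect
from typing import get_type_hints

# Same Python-to-JSON-Schema type mapping idea as the tutorial's registry.
PY_TO_JSON = {str: "string", int: "integer", float: "number",
              bool: "boolean", list: "array", dict: "object"}


def schema_for(func):
    """Build a JSON-Schema-style 'parameters' object from a function signature."""
    hints = get_type_hints(func)
    props, required = {}, []
    for pname, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string".
        props[pname] = {"type": PY_TO_JSON.get(hints.get(pname, str), "string")}
        # A parameter with no default value must be supplied by the model.
        if param.default is inspect.Parameter.empty:
            required.append(pname)
    return {"type": "object", "properties": props, "required": required}


def convert(amount: float, currency: str = "USD") -> str:
    """Format an amount with a currency code (hypothetical example tool)."""
    return f"{amount:.2f} {currency}"


schema = schema_for(convert)
print(schema)
```

Here `amount` ends up in `required` because it has no default, while `currency` is optional. That is the same rule the `tool` decorator applies when it decides which arguments the model must provide.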

Implementing Lifecycle Hooks, Skills, and the Agent Loop

@dataclass
class AgentHookContext:
    iteration: int = 0
    messages: list[dict] = field(default_factory=list)
    response: Optional[LLMResponse] = None
    usage: Usage = field(default_factory=Usage)
    tool_calls: list[ToolCall] = field(default_factory=list)
    tool_results: list[str] = field(default_factory=list)
    final_content: Optional[str] = None
    stop_reason: Optional[str] = None
    error: Optional[Exception] = None

class AgentHook:
    """Subclass and override what you need. All async methods are best-effort and
    isolated (one failing hook won't crash the agent)."""
    def wants_streaming(self) -> bool:
        return False
    async def before_iteration(self, context: AgentHookContext) -> None: ...
    async def on_stream(self, context: AgentHookContext, delta: str) -> None: ...
    async def on_stream_end(self, context: AgentHookContext, *, resuming: bool) -> None: ...
    async def before_execute_tools(self, context: AgentHookContext) -> None: ...
    async def after_iteration(self, context: AgentHookContext) -> None: ...
    def finalize_content(self, context: AgentHookContext, content: str) -> str:
        return content

async def _fan_out(hooks: list[AgentHook], method: str, *args, **kwargs) -> None:
    # Call the same method on every hook; log failures instead of propagating them.
    for h in hooks:
        try:
            await getattr(h, method)(*args, **kwargs)
        except Exception as e:
            print(f"  (hook {type(h).__name__}.{method} error: {e})")

@dataclass
class Skill:
    name: str
    description: str
    instructions: str = ""
    tools: list[Tool] = field(default_factory=list)

class MCPServer:
    """Minimal stand-in for an MCP server exposing named tools."""
    def __init__(self, name: str):
        self.name = name
        self._impls: dict[str, dict] = {}

    def register(self, name: str, description: str, parameters: dict, handler: Callable):
        self._impls[name] = {"description": description, "parameters": parameters, "handler": handler}

    def list_tools(self) -> list[dict]:
        return [{"name": n, "description": v["description"], "parameters": v["parameters"]}
                for n, v in self._impls.items()]

    async def call_tool(self, name: str, arguments: dict) -> str:
        impl = self._impls[name]
        res = impl["handler"](**arguments)
        if inspect.isawaitable(res):
            res = await res
        return res if isinstance(res, str) else json.dumps(res, default=str)

def mcp_tools(server: MCPServer) -> list[Tool]:
    """Adapt every tool on an MCP server into our native Tool objects."""
    out: list[Tool] = []
    for spec in server.list_tools():
        nm = spec["name"]
        # Bind the tool name as a default argument so each runner keeps its own name.
        async def _runner(_nm=nm, **kwargs):
            return await server.call_tool(_nm, kwargs)
        out.append(Tool(name=f"{server.name}__{nm}",
                        description=f"[MCP:{server.name}] {spec['description']}",
                        parameters=spec["parameters"], func=_runner, is_async=True))
    return out

@dataclass
class RunResult:
    content: str
    tools_used: list[str] = field(default_factory=list)
    iterations: int = 0
    usage: Usage = field(default_factory=Usage)
    messages: list[dict] = field(default_factory=list)

class Agent:
    def __init__(self, provider: Provider, registry: ToolRegistry, memory: Memory,
                 system_prompt: str, max_iterations: int = 6, verbose: bool = True):
        self.provider = provider
        self.registry = registry
        self.memory = memory
        self.system_prompt = system_prompt
        self.max_iterations = max_iterations
        self.verbose = verbose

    def _log(self, *a):
        if self.verbose:
            print(*a)

    async def run(self, user_message: str, *, session_key: str = "default",
                  hooks: Optional[list[AgentHook]] = None,
                  extra_instructions: str = "") -> RunResult:
        hooks = hooks or []
        system = self.system_prompt
        if extra_instructions:
            system += "\n\n" + extra_instructions

        # Record the user turn, then trim the session so it fits the token budget.
        self.memory.append(session_key, {"role": "user", "content": user_message})
        dropped = self.memory.compact(session_key)
        if dropped:
            self._log(f"  · memory compaction dropped {dropped} old message(s)")

        # Working copy for this run: tool traffic is appended here, not to session memory.
        messages = [{"role": "system", "content": system}, *self.memory.history(session_key)]
        ctx = AgentHookContext(messages=messages)
        tools_used: list[str] = []
        total = Usage()
        final_text = ""

        for i in range(1, self.max_iterations + 1):
            ctx.iteration = i
            ctx.messages = messages
            await _fan_out(hooks, "before_iteration", ctx)
            response = await self.provider.complete(messages, self.registry.specs())
            ctx.response = response
            total.prompt_tokens += response.usage.prompt_tokens
            total.completion_tokens += response.usage.completion_tokens
            ctx.usage = total

            if response.tool_calls:
                ctx.tool_calls = response.tool_calls
                self._log(f"  [iter {i}] model requested {len(response.tool_calls)} tool call(s)")
                messages.append({
                    "role": "assistant",
                    "content": response.content,
                    "tool_calls": [{"id": tc.id, "type": "function",
                                    "function": {"name": tc.name,
                                                 "arguments": json.dumps(tc.arguments)}}
                                   for tc in response.tool_calls],
                })
                await _fan_out(hooks, "before_execute_tools", ctx)
                results: list[str] = []
                for tc in response.tool_calls:
                    t = self.registry.get(tc.name)
                    if t is None:
                        result = f"ERROR: unknown tool '{tc.name}'"
                    else:
                        result = await t(**tc.arguments)
                    tools_used.append(tc.name)
                    results.append(result)
                    self._log(f"     ↳ {tc.name}({tc.arguments}) -> {result[:120]}")
                    messages.append({"role": "tool", "tool_call_id": tc.id,
                                     "content": result})
                ctx.tool_results = results
                await _fan_out(hooks, "after_iteration", ctx)
                continue

            # No tool calls: this is the final answer. Let each hook transform it in turn.
            final_text = response.content or ""
            for h in hooks:
                try:
                    final_text = h.finalize_content(ctx, final_text)
                except Exception as e:
                    print(f"  (hook {type(h).__name__}.finalize_content error: {e})")
            ctx.final_content = final_text
            ctx.stop_reason = response.finish_reason
            await _fan_out(hooks, "after_iteration", ctx)
            self.memory.append(session_key, {"role": "assistant", "content": final_text})
            break
        else:
            # The loop ran out of iterations without the model producing a final answer.
            final_text = "(stopped: hit max_iterations without a final answer)"

        return RunResult(content=final_text, tools_used=tools_used,
                         iterations=ctx.iteration, usage=total,
                         messages=list(messages))

We implement the lifecycle hooks, skill structure, MCP-style server adapter, and the main agent loop. We use hooks to observe or modify the agent’s behavior without changing the core runtime. We then run the central loop where the model receives messages, requests tools when needed, consumes tool results, and finally returns a plain-text answer.
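Two properties of the hook system are worth seeing on their own: async hook calls are isolated from each other, and `finalize_content` behaves as a pipeline in which each hook receives the previous hook's output. The sketch below is a stripped-down illustration with hypothetical hook classes (`BrokenHook`, `CountingHook`, `UpperHook`, `ExclaimHook`) and a plain dict as context; unlike the tutorial's helper, `fan_out` here returns the collected errors instead of printing them.

```python
import asyncio


class Hook:
    """Base hook: every method is a no-op unless overridden."""

    async def before_iteration(self, ctx: dict) -> None:
        return None

    def finalize_content(self, ctx: dict, content: str) -> str:
        return content


class BrokenHook(Hook):
    async def before_iteration(self, ctx):
        raise RuntimeError("boom")


class CountingHook(Hook):
    def __init__(self):
        self.seen = 0

    async def before_iteration(self, ctx):
        self.seen += 1


class UpperHook(Hook):
    def finalize_content(self, ctx, content):
        return content.upper()


class ExclaimHook(Hook):
    def finalize_content(self, ctx, content):
        return content + "!"


async def fan_out(hooks, method, *args):
    """Call `method` on every hook; collect errors instead of raising."""
    errors = []
    for h in hooks:
        try:
            await getattr(h, method)(*args)
        except Exception as e:
            errors.append(f"{type(h).__name__}: {e}")
    return errors


def finalize(hooks, ctx, text):
    """Thread the final text through each hook in registration order."""
    for h in hooks:
        text = h.finalize_content(ctx, text)
    return text


counter = CountingHook()
hooks = [BrokenHook(), counter, UpperHook(), ExclaimHook()]
errors = asyncio.run(fan_out(hooks, "before_iteration", {}))
final = finalize(hooks, {}, "done")
print(errors, counter.seen, final)
```

`BrokenHook` fails first, yet `CountingHook` still runs, and the order of hooks in the list determines the order of the text transformations.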

Wrapping the Agent in a Nanobot SDK Interface

DEFAULT_SYSTEM_PROMPT = (
    "You are nanobot, a concise, helpful personal AI agent. You can call tools when "
    "they help. Prefer using a tool over guessing for math, the current time, running "
    "code, web lookups, or recalling stored facts. After tools run, answer the user "
    "directly and clearly."
)

class Nanobot:
    def __init__(self, provider: Provider, *, system_prompt: str = DEFAULT_SYSTEM_PROMPT,
                 token_budget: int = 3000, max_iterations: int = 6, verbose: bool = True):
        self.registry = ToolRegistry()
        self.memory = Memory(token_budget=token_budget)
        self.skills: dict[str, Skill] = {}
        self._loaded_skills: set[str] = set()
        self._base_system = system_prompt
        self.agent = Agent(provider, self.registry, self.memory,
                           system_prompt, max_iterations=max_iterations, verbose=verbose)
        # Built-in tools are available on every bot.
        for t in (calculator, get_current_time, run_python, web_search):
            self.registry.add(t)

    @classmethod
    def auto(cls, **kw) -> "Nanobot":
        """Pick a real provider if an API key is set, else the Mock provider."""
        api_key = os.environ.get("NANOBOT_API_KEY") or os.environ.get("OPENAI_API_KEY")
        model = os.environ.get("NANOBOT_MODEL", "gpt-4o-mini")
        base_url = os.environ.get("NANOBOT_BASE_URL")
        if api_key and _HAVE_OPENAI:
            print(f"→ Using live provider: OpenAI-compatible (model={model}, base_url={base_url or 'api.openai.com'})")
            provider: Provider = OpenAICompatibleProvider(api_key, model, base_url)
        else:
            why = "no API key found" if not api_key else "openai SDK unavailable"
            print(f"→ Using Mock provider ({why}). Set NANOBOT_API_KEY for a live model.")
            provider = MockProvider()
        return cls(provider, **kw)

    def add_tool(self, f: Callable) -> "Nanobot":
        self.registry.add(tool(f) if not isinstance(f, Tool) else f)
        return self

    def register_skill(self, skill: Skill) -> "Nanobot":
        self.skills[skill.name] = skill
        return self

    def load_skill(self, name: str) -> "Nanobot":
        """Activate a skill: append its instructions and register its tools."""
        sk = self.skills[name]
        if name not in self._loaded_skills:
            self.agent.system_prompt += f"\n\n## Skill: {sk.name}\n{sk.instructions}"
            for t in sk.tools:
                self.registry.add(t)
            self._loaded_skills.add(name)
            print(f"  · loaded skill '{name}' (+{len(sk.tools)} tool(s))")
        return self

    def connect_mcp(self, server: MCPServer) -> "Nanobot":
        for t in mcp_tools(server):
            self.registry.add(t)
        print(f"  · connected MCP server '{server.name}' (+{len(server.list_tools())} tool(s))")
        return self

    async def run(self, message: str, *, session_key: str = "sdk:default",
                  hooks: Optional[list[AgentHook]] = None) -> RunResult:
        return await self.agent.run(message, session_key=session_key, hooks=hooks)

class AuditHook(AgentHook):
    """Print every tool the model decides to call."""
    def __init__(self):
        self.calls: list[str] = []

    async def before_execute_tools(self, context: AgentHookContext) -> None:
        for tc in context.tool_calls:
            self.calls.append(tc.name)
            print(f"     [audit] {tc.name}({tc.arguments})")

class TimingHook(AgentHook):
    """Measure how long each LLM iteration takes."""
    def __init__(self):
        self._t = 0.0

    async def before_iteration(self, context: AgentHookContext) -> None:
        self._t = time.perf_counter()

    async def after_iteration(self, context: AgentHookContext) -> None:
        ms = (time.perf_counter() - self._t) * 1000
        print(f"     [timing] iteration {context.iteration} took {ms:.1f} ms")

class CensorHook(AgentHook):
    """finalize_content runs as a pipeline: transform the final text."""
    def finalize_content(self, context: AgentHookContext, content: str) -> str:
        return content.replace("secret", "***") if content else content

async def demo_basic(bot: Nanobot):
    banner("DEMO 1: Basic chat (no tools needed)")
    r = await bot.run("Hello! Who are you?", session_key="demo-basic")
    print("AGENT:", r.content)
    print(f"(iterations={r.iterations}, tools={r.tools_used}, ~tokens={r.usage.total})")

async def demo_tool_calling(bot: Nanobot):
    banner("DEMO 2: Tool calling: math, time, and Python")
    for q in ["What is 2 ** 10 + sqrt(144)?",
              "What time is it in Tokyo?",
              "Write Python to list the first 12 Fibonacci numbers."]:
        print(f"\nUSER: {q}")
        r = await bot.run(q, session_key="demo-tools")
        print("AGENT:", r.content)

async def demo_multistep(bot: Nanobot):
    banner("DEMO 3: Multi-step loop with an audit hook")
    audit = AuditHook()
    q = "Calculate 15 * 23, and also tell me the current time in Asia/Kolkata."
    print(f"USER: {q}")
    r = await bot.run(q, session_key="demo-multistep", hooks=[audit])
    print("AGENT:", r.content)
    print("Tools observed by hook:", audit.calls)

async def demo_memory(bot: Nanobot):
    banner("DEMO 4: Session memory (independent histories per session_key)")
    await bot.run("My name is Ada and I love Python.", session_key="user-ada")
    await bot.run("My name is Alan and I love Haskell.", session_key="user-alan")
    r1 = await bot.run("What's my name and what do I love?", session_key="user-ada")
    r2 = await bot.run("What's my name and what do I love?", session_key="user-alan")
    print("ADA  session →", r1.content)
    print("ALAN session →", r2.content)
    print("(Each session_key kept its own conversation history, like nanobot.)")

async def demo_skills(bot: Nanobot):
    banner("DEMO 5: Skills: load a 'research' capability on demand")
    research = Skill(
        name="research",
        description="Web research workflow",
        instructions=("When researching, first search the web, then synthesize the "
                      "snippets into a short, sourced summary."),
        tools=[web_search],
    )
    bot.register_skill(research).load_skill("research")
    r = await bot.run("Search for the latest on retrieval-augmented generation and summarize.",
                      session_key="demo-skills")
    print("AGENT:", r.content)

async def demo_mcp(bot: Nanobot):
    banner("DEMO 6: MCP-style external tool server")
    server = MCPServer("weather")
    server.register(
        name="forecast",
        description="Get a (stub) weather forecast for a city.",
        parameters={"type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"]},
        handler=lambda city: f"Forecast for {city}: 27°C, partly cloudy (stub MCP data).",
    )
    bot.connect_mcp(server)
    print("Registered tools now include:", [n for n in bot.registry.names() if "weather" in n])
    t = bot.registry.get("weather__forecast")
    print("Direct MCP tool call →", await t(city="Delhi"))

async def demo_streaming_and_finalize(bot: Nanobot):
    banner("DEMO 7: finalize_content pipeline + timing hook")
    q = "Compute sqrt(2) to show the math tool, then reply."
    print(f"USER: {q}")
    r = await bot.run(q, session_key="demo-hooks", hooks=[TimingHook(), CensorHook()])
    print("AGENT:", r.content)

async def demo_capstone(bot: Nanobot):
    banner("DEMO 8: Capstone: a personal agent juggling tools + memory")
    print("A short multi-turn 'personal assistant' conversation:\n")
    turns = [
        "What's 144 / 12, and what's my favorite language?",
        "Run Python to print all primes under 50.",
    ]
    for q in turns:
        print(f"USER: {q}")
        r = await bot.run(q, session_key="capstone", hooks=[AuditHook()])
        print("AGENT:", r.content, "\n")

We wrap the lower-level agent in a Nanobot-style interface that feels more like a real SDK. We add support for registering tools, loading skills, connecting MCP-style servers, and running the bot with session-specific memory. We also define several demo functions that show basic chat, tool calling, multi-step execution, memory, skills, MCP tools, and hooks in action.
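To make the MCP-style connection step more concrete, here is a minimal sketch of how a server's tools might be exposed in a registry under a "server__tool" name, which is the naming the demo relies on when it looks up weather__forecast. The classes SketchMCPServer and SketchRegistry are hypothetical stand-ins written for this illustration; they are not the tutorial's MCPServer or the real nanobot API, and the actual implementation may differ.

```python
import asyncio
import inspect
from typing import Any, Callable


class SketchMCPServer:
    """Hypothetical stand-in for an MCP-style server: a named collection of tools."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: dict[str, dict[str, Any]] = {}

    def register(self, name: str, description: str,
                 parameters: dict, handler: Callable[..., Any]) -> None:
        # Keep the JSON-schema-style metadata next to the callable.
        self.tools[name] = {
            "description": description,
            "parameters": parameters,
            "handler": handler,
        }


class SketchRegistry:
    """Hypothetical registry that namespaces server tools as '<server>__<tool>'."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def connect(self, server: SketchMCPServer) -> None:
        for tool_name, spec in server.tools.items():
            self._tools[f"{server.name}__{tool_name}"] = self._wrap(spec["handler"])

    @staticmethod
    def _wrap(handler: Callable[..., Any]) -> Callable[..., Any]:
        # Present every tool as awaitable, whether the handler is sync or async.
        async def call(**kwargs: Any) -> Any:
            result = handler(**kwargs)
            if inspect.isawaitable(result):
                result = await result
            return result
        return call

    def names(self) -> list[str]:
        return sorted(self._tools)

    def get(self, name: str) -> Callable[..., Any]:
        return self._tools[name]


async def demo() -> tuple[list[str], str]:
    server = SketchMCPServer("weather")
    server.register(
        name="forecast",
        description="Get a (stub) weather forecast for a city.",
        parameters={"type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"]},
        handler=lambda city: f"Forecast for {city}: sunny (stub data).",
    )
    registry = SketchRegistry()
    registry.connect(server)
    forecast = registry.get("weather__forecast")
    return registry.names(), await forecast(city="Delhi")
```

In a plain Python script this can be run with asyncio.run(demo()); in a notebook cell where an event loop is already running, await demo() is the usual alternative. Prefixing each tool with its server name keeps tools from different servers from colliding in one flat registry.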

Adding Long-Term Memory and Running the Demos

_FACTS: dict[str, str] = {}


@tool
def remember_fact(key: str, value: str) -> str:
    """Store a fact in long-term key-value memory.
    key: short identifier
    value: the value to store"""
    _FACTS[key] = value
    return f"Stored {key} = {value}"


@tool
def recall_fact(key: str) -> str:
    """Recall a previously stored fact by key.
    key: the identifier used when storing"""
    return _FACTS.get(key, f"(no fact stored under '{key}')")


async def main():
    banner("🐈  nanobot-from-scratch  —  building & running the core architecture")
    bot = Nanobot.auto(verbose=True)
    bot.add_tool(remember_fact).add_tool(recall_fact)
    print("Registered tools:", bot.registry.names())
    await demo_basic(bot)
    await demo_tool_calling(bot)
    await demo_multistep(bot)
    await demo_memory(bot)
    await demo_skills(bot)
    await demo_mcp(bot)
    await demo_streaming_and_finalize(bot)
    await demo_capstone(bot)
    banner("DONE")
    print(textwrap.dedent("""
        You just built nanobot's core: a provider-agnostic agent loop with tools,
        token-budgeted session memory, lifecycle hooks, skills, and an MCP-style tool
        server — the same architecture HKUDS/nanobot ships, kept deliberately small.

        ── Run the REAL nanobot ─────────────────────────────────────────────────────
          !pip install nanobot-ai
          # configure a provider + model in ~/.nanobot/config.json, then:
          from nanobot import Nanobot as RealNanobot
          bot = RealNanobot.from_config()
          result = await bot.run("What time is it in Tokyo?")
          print(result.content)

        Docs: https://github.com/HKUDS/nanobot  •  Python SDK: docs/python-sdk.md
    """))


def _go():
    try:
        asyncio.run(main())
    except RuntimeError:
        loop = asyncio.get_event_loop()
        loop.run_until_complete(main())


if __name__ == "__main__":
    _go()

We add simple long-term key-value memory tools to store and recall facts. Because the store is a module-level dictionary, facts are shared across sessions but last only for the lifetime of the Python process. We then define the main execution function, which creates the bot, registers the custom tools, and runs every demo from start to finish. We complete the tutorial by showing how the rebuilt nanobot-style architecture connects to the real nanobot package for future extension.
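As a quick check of the key-value memory semantics, the sketch below uses undecorated copies of the two memory functions so it runs without the rest of the tutorial's code. Leaving out the @tool decorator is an assumption made for this illustration, and demo_facts is a hypothetical helper that is not part of the original listing.

```python
# Undecorated copies of the tutorial's memory tools (assumption: the @tool
# decorator is omitted here so the snippet stands alone).
_FACTS: dict[str, str] = {}


def remember_fact(key: str, value: str) -> str:
    """Store a fact under a short key, overwriting any earlier value."""
    _FACTS[key] = value
    return f"Stored {key} = {value}"


def recall_fact(key: str) -> str:
    """Return the stored value, or a readable placeholder when the key is unknown."""
    return _FACTS.get(key, f"(no fact stored under '{key}')")


def demo_facts() -> list[str]:
    """Hypothetical helper: exercise store, recall, overwrite, and a miss."""
    _FACTS.clear()
    return [
        remember_fact("favorite_language", "Python"),
        recall_fact("favorite_language"),
        remember_fact("favorite_language", "Haskell"),  # overwrites the first value
        recall_fact("favorite_language"),
        recall_fact("hometown"),                        # never stored
    ]
```

Returning a placeholder string on a miss, rather than raising, means the model always receives a readable tool result it can react to in the next loop iteration.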

Conclusion

In conclusion, we have a working nanobot-style agent that can call tools, retain session-specific context, load skills, connect to external tool servers, and run a clean, provider-agnostic loop. We have also seen how a small, readable architecture can support powerful agent behavior without relying on a heavy orchestration layer. This foundation lets us extend the agent further with real LLM providers, production tools, persistent memory, and custom skills for more advanced personal AI workflows.


Sana Hassan

Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.
