Building deepcrew-ai: The Missing Abstraction Layer for Multi-Agent Python Apps

The Problem Every Multi-Agent Project Starts With

Every time I started a new multi-agent project, I found myself writing the same infrastructure from scratch. Provider-specific API wrappers. Manual JSON schema definitions for tools. Custom async schedulers to run agents in parallel. Brittle streaming code to get real-time events to a frontend. The same five problems, project after project.

The frameworks I tried either did too much — opinionated memory systems, persona managers, retrieval pipelines baked in — or too little, leaving me back at square one. I wanted something in between: a clean abstraction layer over the raw primitives, nothing more.

That's why I built deepcrew-ai.

What deepcrew-ai Solves

Five recurring problems, addressed directly:

Provider abstraction — One Agent interface works across 100+ LLM providers via litellm. Switching from openai/gpt-4o to anthropic/claude-sonnet-4-6 is a one-string change. No provider-specific client code anywhere.

Tool integration — The @tool decorator inspects your Python function's type hints and docstring, and automatically generates the JSON Schema the LLM needs. No more hand-writing schemas.

Parallel execution — Agents that don't depend on each other run concurrently. You describe the dependency graph; deepcrew handles the scheduling.

Real-time streaming — Typed StreamEvent objects flow throughout execution with a .to_sse() method that drops directly into FastAPI or Flask. Frontend gets live updates without you wiring a custom event bus.

Dependency management — Workflow dependencies are declared, not coded. No custom schedulers.

The Architecture

A Unified Agent Interface

from deepcrew import Agent

researcher = Agent(
    name="researcher",
    model="openai/gpt-4o",
    instructions="You are a research specialist. Be thorough and cite sources.",
    tools=[search_web, read_document],
)

Switch the model string and everything else stays identical. litellm normalises the provider differences underneath so you never touch provider-specific code.

Tools via Decoration

from deepcrew import tool

@tool
def search_web(query: str, max_results: int = 5) -> list[dict]:
    """Search the web and return structured results."""
    ...

The decorator reads the type hints and docstring, builds the JSON Schema, and registers the function. The LLM receives a correctly-typed tool definition automatically.

MCP Protocol Support

deepCrew implements all three MCP transports — stdio (local processes), HTTP (remote services), and SSE (legacy servers). Any MCP-compatible tool server attaches to an agent with a single config block.

Two Orchestration Modes

WorkflowBuilder is for known topologies. You define the DAG explicitly; deepcrew executes parallel branches concurrently.

from deepcrew import WorkflowBuilder

wf = WorkflowBuilder()
wf.add_step("research", researcher, depends_on=[])
wf.add_step("write", writer, depends_on=["research"])
wf.add_step("review", reviewer, depends_on=["write"])

result = await wf.run(task="Write a report on agentic AI trends")

Orchestrator is for dynamic routing. The orchestrator agent reads the task and decides whether to handle it alone or delegate to specialists — no hardcoded routing logic.

First-Class Streaming

async for event in agent.stream("Analyse this dataset"):
    print(event.to_sse())  # ready for FastAPI StreamingResponse

Every meaningful moment in execution emits a typed event. Frontends get live progress without a custom event pipeline.

What It Deliberately Doesn't Do

deepCrew has no opinion on memory, retrieval, personas, or vector storage. Those are application-layer concerns. The library sits at the orchestration layer and stays there.

This is a deliberate contrast with heavier frameworks. You bring your own memory provider. You bring your own retrieval strategy. deepCrew just makes the agent wiring clean.

Getting Started

pip install deepcrew-ai

Async/await throughout. Works with any ASGI or asyncio-based backend.

What's Next

The v0.2.0 roadmap includes pluggable memory providers, retry and fallback policies, OpenTelemetry observability hooks, and a declarative workflow CLI.

MIT licensed. Source on GitHub.

Read the full deep-dive on Medium for implementation details, benchmarks, and more code examples.