The Role
You'll build the intelligence layer between our models and the real world. Vecna's Virtual Workers execute 1,000+ step operations across hundreds of tools without losing coherence, context, or intent — and you'll own the orchestration, memory, tool design, and reasoning architectures that make that possible.
Single-task agents are table stakes. The hard problem is the layer above them: how dozens of agents coordinate, hand off work, share context, recover from failure, and stay aligned with a strategic objective across hours or days of autonomous operation. Most agent systems collapse somewhere between step 50 and step 200 — context gets corrupted, plans drift, tools fail silently, sub-agents work at cross purposes. We're building the system that doesn't.
Every capability an agent has — browsing a site, navigating a terminal, executing code in a sandbox, querying a graph, talking to another agent — runs through what you build. You'll work directly with the founders on the architecture decisions that define what our platform can do, and your work will be the substrate every other engineer and researcher on the team builds against.
What You'll Own
- Multi-agent orchestration — supervisor and worker patterns, role-based delegation, sub-agent spawning, escalation logic, and the coordination primitives that let dozens of agents collaborate without stepping on each other
- Async event bus and message passing — the substrate over which agents publish, subscribe, hand off work, and react to environmental changes, with backpressure, retries, and ordering guarantees that hold up under load
- Context management — long-horizon coherence through summarization, relevance scoring, pruning, and memory externalization across sessions and worker boundaries
- Persistent memory and graph state — knowledge and spatial graphs that model the operational environment, asset relationships, and cross-session state agents reason over
- Protocol integration — agent-to-tool and agent-to-agent protocols that let Virtual Workers discover, invoke, and chain capabilities at runtime
- Self-reflection and recovery — agents that detect failures, evaluate their own reasoning, backtrack, and retry with improved strategies
- Tool design and execution environments — browser automation, computer use, sandboxed terminals, async shells, and code interpreters that agents can invoke safely
- Planning and reasoning architectures — OODA-loop execution, dynamic plan decomposition, and confidence-gated action selection
You Might Be a Fit If You
- Have 3+ years of experience building production agent systems or LLM-powered applications end to end at a startup or research lab
- Have shipped multi-agent orchestration in production — supervisor patterns, role-based delegation, inter-agent communication, and worker coordination at scale
- Have built event-driven systems with async message passing, queues, or pub/sub as the backbone of distributed work
- Have designed tool abstractions for agents — browser automation, computer use, sandboxed code execution, and terminal interaction
- Have worked with graph data models for persistent state, knowledge representation, and relationship traversal across long-running operations
- Understand agent-to-tool and agent-to-agent protocol patterns and have integrated or built them in production
- Have designed self-reflection loops, planning systems, or long-horizon reasoning architectures for autonomous agents
- Are strong in Python, with deep async experience and a track record of building reliable distributed systems