A comprehensive, modular library for building multi-agent software development systems based on advanced context engineering principles.
Version: 1.3.0
This library provides reusable, composable AI agent prompts and infrastructure for teams of AI agents working together on software projects. Instead of manually crafting prompts for each project, you can compose specialized agents from modular components.
✨ Modular Design: Base prompts + platform augmentations + project context
🎓 Skills Integration: Anthropic Skills for specialized capabilities
🚀 Starter Templates: Quick-start templates for existing projects
🤝 Dual-Mode Workflows: Simple Mode (default) + Complex Mode (with infrastructure validation & code review)
⚙️ Specialized Agents: IT Specialist for infrastructure, Senior Engineer for code review
📊 Optional Project Tracking: Scrum Master agent for visibility, sprint metrics, and AppFlowy integration
🧠 Advanced Context Management: Ultra-lean Manager context (15-25%)
🌲 Git-Based Workflow: Branch isolation prevents conflicts
📡 Structured Communication: JSON-based inter-agent messaging
🎯 Platform Agnostic: Web, mobile, desktop, and more
⚡ Advanced Tool Use: Deferred loading, prompt caching, programmatic orchestration (37% token reduction)
💭 Slash Commands: 12 thinking model commands, debugging workflows, task management
🔍 Quality Auditors: Agent-based review for skills, slash commands, and subagents
📐 XML Architecture: Pure XML prompt structure with 25% token efficiency improvement
🔌 Tool Selector: Cross-project tool access via /command style wrappers
Choose the approach that fits your needs:
| Approach | Best For | Setup Time |
|---|---|---|
| Option A: Starter Templates | Quick setup with pre-configured files | ~5 minutes |
| Option B: Git Submodule | Stay synced with library updates | ~10 minutes |
| Option C: Direct Copy | Full control, no external dependencies | ~10 minutes |
Use pre-configured templates for instant setup:
cd your-project
# Interactive mode (recommended)
python3 path/to/AI_agents/starter-templates/generate-template.py --interactive
# Or direct command
python3 path/to/AI_agents/starter-templates/generate-template.py \
--type web-app \
--name "YourProject" \
  --output .

What you get:
- ✅ Complete .ai-agents/ directory structure
- ✅ Pre-configured context files (architecture, API contracts, coding standards)
- ✅ Ready-to-use agent configurations
Available templates: web-app, mobile-app, full-stack, api-service, data-pipeline
See starter-templates/README.md for complete guide.
Add the library as a submodule to receive updates:
cd your-project
# 1. Add library as submodule
git submodule add https://github.com/HelloWorldSungin/AI_agents.git .ai-agents/library
# 2. Create project structure
mkdir -p .ai-agents/{context,state,skills,composed}
# 3. Copy example config
cp .ai-agents/library/examples/web-app-team/config.yml .ai-agents/config.yml
# 4. Create your context files
touch .ai-agents/context/{architecture,coding-standards,api-contracts}.md

Your structure:
your-project/
├── .ai-agents/
│ ├── library/ # Git submodule (this repo)
│ ├── config.yml # Your agent configuration
│ ├── context/ # Your project documentation
│ │ ├── architecture.md
│ │ ├── coding-standards.md
│ │ └── api-contracts.md
│ ├── skills/ # Your custom skills (optional)
│ └── composed/ # Generated agent prompts
└── src/ # Your project code
Updating the library:
cd .ai-agents/library && git pull origin main && cd ../..
git add .ai-agents/library && git commit -m "Update AI Agents library"

See skills/PROJECT_INTEGRATION.md for detailed guide.
Copy the library directly for complete ownership:
cd your-project
# Copy library
cp -r path/to/AI_agents .ai-agents/
# Remove git history to make it part of your repo
rm -rf .ai-agents/.git
# Create your context files
mkdir -p .ai-agents/context
touch .ai-agents/context/{architecture,coding-standards,api-contracts}.md

Trade-offs:
- ✅ Full control over all files
- ✅ No submodule complexity
- ❌ Manual process to sync updates from library
Regardless of which option you choose, populate your context files:
- architecture.md - Your system architecture, tech stack, key components
- coding-standards.md - Your team conventions, style guides
- api-contracts.md - Your API specifications, endpoints
- current-features.md - Feature roadmap, priorities (optional)
See examples/ for reference templates.
cd .ai-agents/library # or .ai-agents if using direct copy
python scripts/compose-agent.py --config ../config.yml --all

This generates complete agent prompts in .ai-agents/composed/.
.ai-agents/ goes next to .git/ - Place it at your repository root, regardless of where your code lives.
my-project/ ← Repository root
├── .git/
├── .ai-agents/ ✅ Place here
├── src/
├── package.json
└── README.md
my-project/ ← Repository root
├── .git/
├── .ai-agents/ ✅ Place here (NOT in app/)
├── app/ ← Code lives here
│ ├── src/
│ └── package.json
├── docs/
└── README.md
Why? The library uses relative paths from repository root. Config references like context/architecture.md assume .ai-agents/ is at the root.
monorepo/ ← Repository root
├── .git/
├── .ai-agents/ ✅ Single shared setup
├── packages/
│ ├── frontend/
│ ├── backend/
│ └── mobile/
└── package.json
Best for: One team working across all packages with shared standards.
monorepo/ ← Repository root
├── .git/
├── packages/
│ ├── frontend/
│ │ └── .ai-agents/ ✅ Per-package setup
│ ├── backend/
│ │ └── .ai-agents/ ✅ Per-package setup
│ └── mobile/
│ └── .ai-agents/ ✅ Per-package setup
└── package.json
Best for: Independent teams with different tech stacks and standards.
| Your Project Structure | Where to Place .ai-agents/ | Why |
|---|---|---|
| Standard repo | Next to .git/ | Standard setup |
| Code in subdirectory | Next to .git/ (NOT in subdirectory) | Relative paths from root |
| Monorepo - single team | Root, next to .git/ | Shared config for all packages |
| Monorepo - multiple teams | Inside each package/ | Independent configs per team |
| Nested repositories | Next to each .git/ | Each repo is independent |
✅ DO:
- Place at repository root (next to .git/)
- Use relative paths in configs (e.g., ../../src/components)
- Add .ai-agents/state/ and .ai-agents/checkpoints/ to .gitignore

❌ DON'T:
- Place inside code directories (e.g., src/.ai-agents/)
- Use absolute paths in configs
- Commit agent runtime state
When using starter templates, the --output parameter determines placement:
# Standard repo - output to current directory (repository root)
cd my-project/
python3 path/to/AI_agents/starter-templates/generate-template.py \
--type web-app \
--name "MyProject" \
--output .
# Code in subdirectory - still output to root
cd my-project/ # Repository root
python3 path/to/AI_agents/starter-templates/generate-template.py \
--type web-app \
--name "MyProject" \
--output . # Creates my-project/.ai-agents/
# Monorepo with per-package setup
cd monorepo/packages/frontend/
python3 path/to/AI_agents/starter-templates/generate-template.py \
--type web-app \
--name "Frontend" \
  --output . # Creates monorepo/packages/frontend/.ai-agents/

The generator always creates .ai-agents/ in the --output directory.
┌─────────────────────────────────────────┐
│ Project Context (Your Requirements) │
│ • Business logic │
│ • API contracts │
│ • Team conventions │
├─────────────────────────────────────────┤
│ Skills (Domain Expertise) ✨ NEW │
│ • Specialized workflows │
│ • Tool integrations │
│ • Domain knowledge │
├─────────────────────────────────────────┤
│ Platform Augmentation (Specialized) │
│ • Web/Mobile/Desktop expertise │
│ • Framework knowledge │
│ • Platform best practices │
├─────────────────────────────────────────┤
│ Base Agent (Universal) │
│ • Core software engineering │
│ • Testing, debugging, git │
│ • Security, performance │
└─────────────────────────────────────────┘
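Conceptually, composition is just layered concatenation: base prompt first, then platform augmentation, then skills, then project context, so project-specific requirements always land last. A minimal sketch of the idea (a hypothetical helper, not the shipped logic in scripts/compose-agent.py):

```python
# Toy sketch of layered agent composition: later layers are appended after
# earlier ones, so project context is closest to the end of the prompt.
def compose_agent(base: str, platforms: list = (), skills: list = (),
                  context: list = ()) -> str:
    sections = [base, *platforms, *skills, *context]
    return "\n\n---\n\n".join(s.strip() for s in sections if s.strip())

prompt = compose_agent(
    base="Core software engineering: testing, debugging, git.",
    platforms=["Web frontend expertise: framework knowledge, best practices."],
    context=["Project API contracts: see api-contracts.md."],
)
```

The real composer also resolves skill paths and deferred-loading manifests; the point here is only the ordering of layers.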
🔹 Simple Mode (90% of projects - Default)
┌──────────────┐
│ Manager │ ← Plans & coordinates
└──────┬───────┘
│
┌───┴───┬──────────┬─────────┐
│ │ │ │
┌──▼───┐ ┌▼──────┐ ┌─▼─────┐ ┌─▼────────┐
│Archi-│ │Frontend│ │Backend│ │Integration│
│tect │ │ Dev │ │ Dev │ │ Agent │
└──────┘ └────────┘ └───────┘ └──────────┘
🔸 Complex Mode (10% of projects - Advanced)
┌──────────────┐
│ Manager │ ← Coordinates only
└──────┬───────┘
│
┌───────┴───────┐
│ │
┌──────▼─────┐ ┌────▼──────────┐
│ IT │ │ Senior │
│ Specialist │ │ Engineer │
│(Infra) │ │(Review+Merge) │
└──────┬─────┘ └────▲──────────┘
│ │
│ ┌────┴────┬──────────┬─────────┐
└─────────►│ │ │ │
┌─▼─────┐ ┌▼──────┐ ┌─▼─────┐ ┌─▼──┐
│Backend│ │Frontend│ │Backend│ │ QA │
│ Dev 1 │ │ Dev │ │ Dev 2 │ │Test│
└───────┘ └────────┘ └───────┘ └────┘
See ARCHITECTURE.md for detailed documentation.
AI_agents/
├── .claude/ # Claude Code extensions
│ ├── commands/ # Slash commands
│ │ ├── ai-tools.md # Tool discovery router 🔌 NEW
│ │ ├── consider/ # 12 thinking model commands
│ │ │ ├── first-principles.md
│ │ │ ├── 5-whys.md
│ │ │ ├── swot.md
│ │ │ ├── cost-benefit.md
│ │ │ ├── premortem.md
│ │ │ ├── second-order.md
│ │ │ ├── eisenhower.md
│ │ │ ├── inversion.md
│ │ │ ├── opportunity-cost.md
│ │ │ ├── stakeholder-mapping.md
│ │ │ ├── devils-advocate.md
│ │ │ └── reversible-irreversible.md
│ │ ├── whats-next.md # Context handoff between sessions
│ │ ├── debug.md # Systematic debugging methodology
│ │ ├── add-to-todos.md # Add tasks to todo list
│ │ └── check-todos.md # Review todo list
│ └── agents/ # Quality auditor agents
│ ├── skill-auditor.md # Reviews skills for best practices
│ ├── slash-command-auditor.md # Reviews slash commands
│ └── subagent-auditor.md # Reviews agent configurations
│
├── prompts/ # All agent prompts
│ ├── roles/ # Base agent prompts
│ │ ├── software-developer.md
│ │ ├── manager.md
│ │ ├── qa-tester.md
│ │ ├── architect.md
│ │ └── scrum-master.md # Optional: Project tracking & visibility
│ ├── manager-task-delegation.md # Comprehensive Manager guide (dual-mode)
│ ├── manager-quick-reference.md # Quick-start Manager template
│ ├── it-specialist-agent.md # Infrastructure validation specialist
│ └── senior-engineer-agent.md # Code review & integration specialist
│
├── platforms/ # Platform specializations
│ ├── web/
│ │ ├── frontend-developer.md
│ │ └── backend-developer.md
│ ├── mobile/
│ │ └── mobile-developer.md
│ └── ...
│
├── skills/ # Anthropic Skills integration ✨
│ ├── README.md # Skills overview
│ ├── CATALOG.md # Available skills directory
│ ├── INTEGRATION.md # Technical guide
│ ├── anthropic/ # Anthropic skills (submodule)
│ ├── custom/ # Project-specific skills
│ │ └── appflowy-integration/ # AppFlowy task tracking (for Scrum Master)
│ └── taches-cc/ # taches-cc skills 🎯 NEW
│ ├── create-agent-skills/ # Skill authoring best practices
│ ├── create-plans/ # Hierarchical project planning
│ └── debug-like-expert/ # Systematic debugging with domain expertise
│
├── starter-templates/ # Project templates 🚀 NEW
│ ├── generate-template.py # Template generator
│ ├── web-app/ # Web application template
│ ├── mobile-app/ # Mobile app template
│ └── README.md # Template documentation
│
├── schemas/ # JSON schemas
│ ├── communication-protocol.json
│ ├── communication-protocol-examples.json # Tool use examples
│ ├── state-management.json
│ ├── agent-schema.json # v2.0 with deferred loading support
│ └── project-config.json
│
├── tools/ # Tool definitions
│ ├── skill-search.md # Deferred skill discovery
│ └── programmatic-tools.md # Programmatic orchestration tools
│
├── workflows/ # Multi-agent patterns
├── examples/ # Example configurations
│
├── scripts/ # Automation tools
│ ├── compose-agent.py # Agent composition with deferred loading
│ ├── setup-commands.py # Tool selector installer 🔌 NEW
│ └── orchestration/ # Advanced orchestration
│ ├── simple_orchestrator.py # Basic multi-agent orchestration
│ ├── prompt_cache.py # Prompt caching for cost reduction
│ ├── sandbox_executor.py # Secure code execution sandbox
│ └── programmatic_orchestrator.py # Programmatic tool calling
│
├── docs/ # Documentation
│ └── PROGRAMMATIC_TOOL_CALLING.md # Programmatic orchestration guide
│
├── tests/ # Test suite
│ └── test_compose_agent.py # Composition tests
│
└── memory/ # RAG and knowledge base
Use pre-configured templates for instant setup:
# For existing web app
cd your-project
python3 path/to/AI_agents/starter-templates/generate-template.py \
--type web-app \
--name "YourProject" \
--output .
# For existing mobile app
python3 path/to/AI_agents/starter-templates/generate-template.py \
--type mobile-app \
--name "YourApp" \
  --output .

See starter-templates/README.md for complete guide and all available templates.
# .ai-agents/config.yml
agents:
team_manager:
base: "prompts/roles/manager.md"
frontend_developer:
base: "prompts/roles/software-developer.md"
platforms:
- "platforms/web/frontend-developer.md"
skills:
- "core/artifacts-builder"
- "design/theme-factory"
project_context:
- ".ai-agents/context/architecture.md"
- ".ai-agents/context/api-contracts.md"
backend_developer:
base: "prompts/roles/software-developer.md"
platforms:
- "platforms/web/backend-developer.md"
skills:
      - "core/mcp-builder"

See examples/web-app-team/ for complete example.
See examples/mobile-app-team/ for React Native example.
The library includes powerful slash commands for enhanced workflows in Claude Code:
12 analytical frameworks for structured problem-solving:
| Command | Description | Use When |
|---|---|---|
| /consider:first-principles | Break down to fundamental truths | Complex problems, novel solutions |
| /consider:5-whys | Root cause analysis | Debugging, understanding failures |
| /consider:swot | Strengths, Weaknesses, Opportunities, Threats | Strategic decisions, architecture choices |
| /consider:cost-benefit | Analyze tradeoffs | Technology selection, refactoring |
| /consider:premortem | Imagine future failures | Risk assessment, deployment planning |
| /consider:second-order | Identify downstream effects | System design, breaking changes |
| /consider:eisenhower | Urgent vs. Important prioritization | Sprint planning, task management |
| /consider:inversion | Think backwards from failure | Avoiding pitfalls, security review |
| /consider:opportunity-cost | What are you NOT doing? | Resource allocation, priorities |
| /consider:stakeholder-mapping | Identify affected parties | API changes, migrations |
| /consider:devils-advocate | Challenge assumptions | Code review, architecture review |
| /consider:reversible-irreversible | Categorize by reversibility | Deployment strategy, data migrations |
| Command | Description | Use When |
|---|---|---|
| /whats-next | Context handoff between sessions | Starting new session, resuming work |
| /debug | Systematic debugging methodology | Troubleshooting, investigating issues |
| /add-to-todos | Add tasks to todo list | Planning work, tracking tasks |
| /check-todos | Review current todo list | Status check, progress review |
Usage Example:
# Before making an architecture decision
/consider:swot
# Before deploying a major change
/consider:premortem
# When debugging a complex issue
/debug

Run quality checks on your agent configurations:
| Agent | Purpose | Usage |
|---|---|---|
| skill-auditor | Reviews skills for best practices | Validate new skills, optimize token usage |
| slash-command-auditor | Reviews slash command quality | Ensure command clarity and effectiveness |
| subagent-auditor | Reviews agent configurations | Validate agent composition, check for issues |
Usage:
# Audit a skill
claude-code agent:.claude/agents/skill-auditor.md --input skills/custom/my-skill/
# Audit a slash command
claude-code agent:.claude/agents/slash-command-auditor.md --input .claude/commands/my-command.md
# Audit an agent configuration
claude-code agent:.claude/agents/subagent-auditor.md --input .ai-agents/composed/backend-developer.md

Use AI_agents tools from any project via /command style wrappers:
# Install tools to your project
cd /path/to/your/project
python /path/to/AI_agents/scripts/setup-commands.py
# Or install globally
python /path/to/AI_agents/scripts/setup-commands.py --global
# List available tools
python /path/to/AI_agents/scripts/setup-commands.py --list

What gets installed:
- 30 wrapper commands (~60 tokens overhead per invocation)
- /ai-tools discovery command
- All /consider:* thinking models
Usage in target project:
/ai-tools # Discover available tools
/create-prompt [description] # Create optimized prompts
/debug [issue] # Apply debugging methodology
/consider:first-principles # Break down to fundamentals

Token impact: Negligible (~50 tokens at-rest, +60 tokens per invocation)
| Document | Description |
|---|---|
| README.md | This file - quick start guide |
| ARCHITECTURE.md | Detailed system architecture with Skills Integration |
| Context_Engineering.md | Foundational principles (the "HOLY BIBLE") |
| SKILLS_GUIDE.md | Comprehensive skills guide - Selection, usage, best practices |
| PRACTICAL_WORKFLOW_GUIDE.md | Human-coordinated workflows - Step-by-step multi-agent coordination |
| PARALLEL_EXECUTION_GUIDE.md | Multi-agent parallelization strategies |
| MIGRATION_GUIDE.md | Adding skills to existing projects |
| starter-templates/README.md | Starter templates guide - Quick setup for existing projects 🚀 |
| prompts/manager-task-delegation.md | Manager guide - Dual-mode workflow (Simple + Complex) |
| prompts/manager-quick-reference.md | Manager quick-start - Copy-paste templates |
| prompts/it-specialist-agent.md | IT Specialist - Infrastructure validation (8 critical checks) |
| prompts/senior-engineer-agent.md | Senior Engineer - Code review & integration |
| skills/README.md | Skills integration overview |
| skills/CATALOG.md | Available skills directory with token estimates |
| skills/INTEGRATION.md | Skills technical implementation guide |
| skills/taches-cc/create-agent-skills/ | Skill authoring best practices - Create high-quality skills 🎯 |
| skills/taches-cc/create-plans/ | Hierarchical planning - Break down complex projects 🎯 |
| skills/taches-cc/debug-like-expert/ | Expert debugging - Systematic issue resolution 🎯 |
| .claude/commands/consider/ | Thinking models - 12 analytical frameworks 💭 |
| .claude/agents/ | Quality auditors - Review skills, commands, and agents 🔍 |
| docs/PROGRAMMATIC_TOOL_CALLING.md | Advanced Tool Use - Programmatic orchestration guide |
| examples/ | Reference implementations with skills |
The library supports two workflow modes depending on project complexity:
🔹 Simple Mode (90% of projects):
User → Manager → Task Agents → Integration Agent
🔸 Complex Mode (10% of projects):
User → Manager → IT Specialist → Task Agents → Senior Engineer
For projects requiring visibility and reporting, add the Scrum Master agent:
With Scrum Master (Simple Mode):
User → Manager → [Scrum Master Setup] → Task Agents → Integration Agent
↓
[AppFlowy Tracking]
With Scrum Master (Complex Mode):
User → Manager → [Scrum Master Setup] → IT Specialist → Task Agents → Senior Engineer
↓
[AppFlowy Tracking + Daily Summaries]
When to Enable Scrum Master:
- ✅ External stakeholders need visibility (clients, executives)
- ✅ Sprint velocity tracking required
- ✅ Daily standup summaries needed
- ✅ You have AppFlowy server (self-hosted or cloud)
What Scrum Master Does:
- Tracks all tasks in AppFlowy workspace
- Generates daily standup summaries
- Calculates sprint velocity metrics
- Reports blockers to Manager
- Creates stakeholder presentations
What Scrum Master Does NOT Do:
- ❌ Create or assign tasks (Manager's job)
- ❌ Make technical decisions
- ❌ Review code or approve merges
See examples/web-app-team/config-with-scrum-master.yml for configuration and
skills/custom/appflowy-integration/README.md for setup guide.
User: "Implement user authentication"
Complex Mode (new project with infrastructure needs):
Manager:
1. Chooses Complex Mode (first feature, needs infrastructure validation)
2. Delegates to IT Specialist for infrastructure setup
3. Creates task breakdown:
├── TASK-001: Implement JWT service (Backend Dev + mcp-builder skill)
├── TASK-002: Create auth API (Backend Dev + mcp-builder skill)
├── TASK-003: Build login form (Frontend Dev + artifacts-builder skill)
└── TASK-004: Write tests (QA Tester + webapp-testing skill)
4. Delegates to Senior Engineer for review & integration
Simple Mode (established project):
Manager:
1. Creates task breakdown (same as above)
2. Delegates directly to Task Agents
3. Delegates to Integration Agent when complete
feature/user-auth/
├── agent/architect/design
├── agent/backend-dev/jwt-service
├── agent/backend-dev/auth-api
├── agent/frontend-dev/login-form
└── agent/mobile-dev/login-screen
{
"type": "status_update",
"agent_id": "frontend-dev-001",
"task_id": "TASK-004",
"status": "in_progress",
"progress": 75
}

- Ensures API contracts are followed
- Resolves conflicts
- Coordinates testing
- Merges when complete
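A status update like the one above is just a record appended to the shared state file. A minimal sketch of posting one (the field names follow the example above; the authoritative message formats live in schemas/communication-protocol.json):

```python
import json
import pathlib

def post_status(state_file: str, agent_id: str, task_id: str,
                status: str, progress: int) -> dict:
    """Append a status_update message to the shared communication file."""
    path = pathlib.Path(state_file)
    messages = json.loads(path.read_text()) if path.exists() else []
    msg = {"type": "status_update", "agent_id": agent_id,
           "task_id": task_id, "status": status, "progress": progress}
    messages.append(msg)
    path.write_text(json.dumps(messages, indent=2))
    return msg

msg = post_status("team-communication.json", "frontend-dev-001",
                  "TASK-004", "in_progress", 75)
```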
IMPORTANT: The workflow described above represents the conceptual model of multi-agent coordination. In practice, there are three ways to implement this:
What it is: You manually run agents in sequence and relay information between them.
How it works:
You → Manager Agent (creates task plan)
You → Backend Agent (works on TASK-001)
You → Frontend Agent (works on TASK-002, you relay backend's progress)
You → Manager Agent (coordinates integration, you provide status from both)
Communication:
- Agents write status to .ai-agents/state/team-communication.json
- You read the file and relay relevant info to other agents
- Agents can see each other's updates by reading the shared file
- You act as coordinator and decision-maker
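When you act as the relay, "reading the file" usually means filtering for the messages the next agent actually needs. A hypothetical helper (the `to` field is an illustrative addressing convention, not part of the shipped schema):

```python
def messages_for(messages: list, agent_id: str) -> list:
    """Return messages an agent should see: anything addressed to it,
    plus blocker reports from teammates it may need to work around."""
    return [m for m in messages
            if m.get("to") == agent_id or m.get("type") == "blocker_report"]

log = [
    {"type": "task_assignment", "to": "frontend-dev-001", "task_id": "TASK-003"},
    {"type": "status_update", "agent_id": "backend-dev-001", "progress": 50},
    {"type": "blocker_report", "agent_id": "backend-dev-001", "task_id": "TASK-001"},
]
relevant = messages_for(log, "frontend-dev-001")  # assignment + blocker
```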
Tools:
- Claude Code, ChatGPT, or any LLM tool
- One agent session at a time
- Manual switching between agents
Best for:
- ✅ 90% of users - Most practical approach today
- ✅ Small to medium teams (1-5 agents)
- ✅ Projects where you want control and visibility
- ✅ Learning multi-agent patterns
- ✅ When using tools like Claude Code that run one session at a time
See: PRACTICAL_WORKFLOW_GUIDE.md for complete tutorial
What it is: Manager spawns agents using Claude Code's Task tool. Each agent gets a fresh context window.
How it works:
Manager (in Claude Code):
├─ Uses Task tool → IT Specialist (fresh context, 0 tokens)
├─ Uses Task tool → Backend Dev (fresh context, 0 tokens)
├─ Uses Task tool → Frontend Dev (fresh context, 0 tokens)
└─ Uses Task tool → Senior Engineer (fresh context, 0 tokens)
Each agent:
- Starts with 0 tokens (no inherited context)
- Reads only what Manager provides in prompt
- Reports back to Manager when done
- Manager context stays lean (~15-25%)
Communication:
- Manager delegates via Task tool
- Agents read team-communication.json for context
- Agents report back with brief summaries
- Manager acknowledges and moves to next task
Tools:
- Claude Code (Task tool feature)
- No API costs (runs within Claude Code session)
- Single user session, multiple sub-agents
- Fresh context per agent (isolation benefits)
Best for:
- ✅ Recommended for most users - Great balance of control and efficiency
- ✅ Complex projects (5+ agents)
- ✅ First-time infrastructure setup
- ✅ Projects needing code review
- ✅ When you want Manager to stay lean (<25% context)
Dual-Mode Workflow:
- Simple Mode: Manager → Task Agents → Integration Agent (3-5 agents)
- Complex Mode: Manager → IT Specialist → Task Agents → Senior Engineer (5+ agents)
See: prompts/manager-task-delegation.md for complete guide
What it is: Programmatic orchestration system that runs multiple agents via LLM APIs.
How it works:
orchestrator = MultiAgentOrchestrator()
orchestrator.assign_task("TASK-001", backend_agent)
orchestrator.assign_task("TASK-002", frontend_agent)
# Agents run in parallel, communicate automatically
# Manager receives updates via callbacks
# System coordinates integration automatically

Communication:
- Direct agent-to-agent messaging via message queue
- Automatic status propagation
- Real-time coordination without human intervention
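In-process, the message-queue pattern reduces to something like the following toy sketch, with `queue.Queue` standing in for Redis/RabbitMQ and threads standing in for parallel API-backed agents:

```python
import queue
import threading

bus = queue.Queue()  # stands in for Redis/RabbitMQ

def agent(agent_id: str, task_id: str):
    # Each agent works independently, then publishes its status to the bus.
    bus.put({"agent_id": agent_id, "task_id": task_id, "status": "complete"})

workers = [threading.Thread(target=agent, args=(f"dev-{i}", f"TASK-00{i}"))
           for i in range(1, 4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

updates = [bus.get() for _ in range(3)]  # manager drains status updates
```

A production orchestrator adds retries, callbacks, and integration coordination; see scripts/orchestration/ for fuller implementations.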
Tools:
- Custom Python scripts using LLM APIs (Claude API, OpenAI API)
- Message queue (Redis, RabbitMQ) or event system
- Orchestration framework (custom or tools like LangGraph, CrewAI)
Best for:
- ⚠️ Advanced users with programming experience
- ⚠️ Large-scale projects (5+ agents)
- ⚠️ CI/CD automation
- ⚠️ When you need true parallel execution
See: scripts/orchestration/ for example implementations
| Aspect | Human-Coordinated | Task Tool Delegation ✨ | Fully Automated |
|---|---|---|---|
| Setup | ✅ Simple (any LLM tool) | ✅ Simple (Claude Code) | ❌ Complex (custom tooling) |
| Control | ✅ Full visibility | ✅ Full visibility | ❌ Limited visibility |
| Speed | ❌ Sequential | ❌ Sequential | ✅ True parallel |
| Context Isolation | ❌ No isolation | ✅ Fresh context per agent | ✅ Fresh context per agent |
| Manager Context | ❌ Grows over session | ✅ Stays lean (15-25%) | N/A |
| Communication | Manual relay | Automatic (Task tool) | Automatic (message queue) |
| Best for | Learning, simple projects | Most users | Advanced automation |
| Learning curve | Low | Low | High (programming required) |
| Cost | Lower (one at a time) | No API costs | Higher (API calls) |
🔧 Task Tool Delegation (Recommended for Most) if you:
- ✅ Use Claude Code
- ✅ Have complex projects (5+ agents)
- ✅ Want Manager to stay lean (no context overflow)
- ✅ Need infrastructure validation and code review
- ✅ Want zero API costs with fresh context isolation
- ✅ Prefer Simple vs Complex mode flexibility
🤝 Human-Coordinated if you:
- Are learning multi-agent systems for the first time
- Are using tools other than Claude Code (ChatGPT, etc.)
- Have 1-3 agents total
- Want to manually control every step
🤖 Fully Automated when you:
- Have 5+ agents needing true parallelization
- Built custom orchestration tooling
- Need CI/CD integration
- Have budget for parallel API calls
- Understand coordination patterns deeply
Recommended Path:
- Start: Human-Coordinated (learn the patterns)
- Move to: Task Tool Delegation (most projects - best balance)
- Advanced: Fully Automated (only if you need true parallel execution)
The library supports all three models equally well.
Based on the Context Engineering Guide, this library implements:
- Multi-tier memory - Never lose critical information
- Progressive compression - Manage context window efficiently
- Checkpointing - Resume from failures
- RAG integration - Long-term project memory
Agents use structured JSON for coordination via three complementary state files:
Within-Session Communication (.ai-agents/state/team-communication.json):
- Task assignments - Manager → Agent
- Status updates - Agent → Manager
- Blocker reports - Agent → Manager
- Integration requests - Agent → Agent (via Manager)
- Code reviews - Manager → Agent
Cross-Session Tracking (.ai-agents/state/session-progress.json):
- Current project phase
- Completed vs. active tasks
- Blockers and priorities
- Git baseline for resumption
Feature Verification (.ai-agents/state/feature-tracking.json):
- Feature ID, description, status
- Test files and pass/fail status
- Verification history
- Progress metrics
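The "6/8 features passing" style of progress line falls out of the feature list directly. A sketch assuming a minimal record shape (the real structure is defined by the state files in .ai-agents/state/):

```python
def progress_summary(features: list) -> str:
    """Summarize feature-tracking records into an at-a-glance progress line."""
    passing = sum(1 for f in features if f.get("status") == "passing")
    return f"{passing}/{len(features)} features passing"

features = [
    {"id": "FEAT-001", "status": "passing"},
    {"id": "FEAT-002", "status": "passing"},
    {"id": "FEAT-003", "status": "failing"},
]
summary = progress_summary(features)  # "2/3 features passing"
```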
See schemas/communication-protocol.json for message formats and docs/guides/LONG_RUNNING_AGENTS.md for workflow examples.
Prevents conflicts through git branch strategy:
feature/<name>/agent/<role>/<task>
✓ feature/auth/agent/frontend-dev/login-form
✓ feature/auth/agent/backend-dev/jwt-service
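The naming convention is easy to enforce programmatically. A minimal sketch (the slugging rules here are an assumption, not part of the library):

```python
import re

def agent_branch(feature: str, role: str, task: str) -> str:
    """Build a feature/<name>/agent/<role>/<task> branch name."""
    def slug(s: str) -> str:
        # lowercase, collapse non-alphanumeric runs to single hyphens
        return re.sub(r"[^a-z0-9]+", "-", s.lower()).strip("-")
    return f"feature/{slug(feature)}/agent/{slug(role)}/{slug(task)}"

branch = agent_branch("auth", "frontend-dev", "Login Form")
# → "feature/auth/agent/frontend-dev/login-form"
```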
Automatically assembles agents from components:
python scripts/compose-agent.py \
--config .ai-agents/config.yml \
--agent frontend_developer \
  --output .ai-agents/composed

Central project state in .ai-agents/state/project-state.json:
{
"active_tasks": [...],
"agent_states": {...},
"shared_resources": {...},
"metrics": {...}
}

Long-term memory for:
- Architectural decisions (ADRs)
- Code patterns
- Troubleshooting solutions
- Requirements
Based on Anthropic's research, AI_agents now supports multi-session projects with a three-file state management system:
1. Real-Time Communication (.ai-agents/state/team-communication.json):
- Live coordination between agents within a single session
- Task assignments (Manager → Agents)
- Status updates (Agents → Manager)
- Integration requests (Agent ↔ Agent)
- Cleared/reset between sessions
2. Session Progress Tracking (.ai-agents/state/session-progress.json):
- Cross-session continuity - resume work without rediscovery
- Tracks completed/active tasks across sessions
- Records blockers and priorities
- Maintains git baseline
- Reduces session startup time by 50%
3. Feature Status Management (.ai-agents/state/feature-tracking.json):
- Structured feature lists with pass/fail status
- Prevents premature "done" declarations
- Mandatory E2E testing for user-facing features
- Clear progress visibility (e.g., "6/8 features passing")
Environment Automation (init.sh):
- IT Specialist generates project-specific setup scripts
- Automates dependency installation
- Ensures consistent environments
- Onboards new team members in minutes
Security Framework (scripts/security_validator.py):
- Three-layer defense-in-depth for autonomous execution
- Command allowlist
- Destructive pattern detection
- Filesystem scope restrictions
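The three layers compose as successive gates: a command must clear the allowlist, the destructive-pattern scan, and the filesystem scope check. A hedged sketch of the idea (the allowlist and patterns below are illustrative, not the ones shipped in scripts/security_validator.py):

```python
import pathlib
import re
import shlex

ALLOWED = {"git", "python3", "pytest", "npm"}            # layer 1: command allowlist
DESTRUCTIVE = re.compile(r"rm\s+-rf|git\s+push\s+--force|DROP\s+TABLE", re.I)
SCOPE = pathlib.Path.cwd().resolve()                     # layer 3: filesystem scope

def is_safe(command: str) -> bool:
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED:
        return False                      # layer 1: unknown command
    if DESTRUCTIVE.search(command):
        return False                      # layer 2: destructive pattern
    for arg in argv[1:]:                  # layer 3: paths must stay in scope
        if not (SCOPE / arg).resolve().is_relative_to(SCOPE):
            return False
    return True
```

Defense-in-depth matters here because any single layer is bypassable; e.g. an allowlisted `git` can still be destructive without layer 2.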
- Within session: Agents use team-communication.json for real-time coordination
- End of session: Manager updates session-progress.json and feature-tracking.json
- Next session: Manager reads progress files first → skips redundant planning → 50% faster startup
Simple Mode: Uses team-communication.json only
Complex Mode: Uses all three state files for full project tracking
See docs/guides/LONG_RUNNING_AGENTS.md for complete guide with workflows and examples.
Based on Anthropic's Advanced Tool Use, this library implements three optimization patterns:
Instead of loading all skills at startup, skills are discovered on-demand:
# config.yml - New format
agents:
orchestrator:
skills:
always_loaded:
- "core/skill-creator" # Always needed
deferred:
- path: "testing/webapp-testing"
triggers: ["test", "QA", "coverage"]
- path: "communication/internal-comms"
          triggers: ["coordinate", "communicate"]

How it works:
- always_loaded skills are included in the agent prompt
- deferred skills are listed in a manifest with triggers
- When a trigger matches, the skill is loaded on-demand
- Result: 85% reduction in initial context tokens
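Trigger matching can be as simple as a substring scan over the incoming task description. A minimal sketch of the deferred-loading decision (the manifest shape mirrors the config above; actual loading is stubbed out):

```python
def skills_to_load(task_text: str, manifest: list) -> list:
    """Return deferred skill paths whose triggers appear in the task text."""
    text = task_text.lower()
    return [entry["path"] for entry in manifest
            if any(t.lower() in text for t in entry["triggers"])]

manifest = [
    {"path": "testing/webapp-testing", "triggers": ["test", "QA", "coverage"]},
    {"path": "communication/internal-comms", "triggers": ["coordinate", "communicate"]},
]
loaded = skills_to_load("Write QA coverage report for the login flow", manifest)
# → ["testing/webapp-testing"]
```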
See tools/skill-search.md for implementation details.
Reduce API costs by caching stable prompt components:
from scripts.orchestration.prompt_cache import CachedAnthropicClient
client = CachedAnthropicClient(api_key)
response, cache_info = client.call_with_cache(
system_prompt=system_prompt, # Cached
messages=messages
)
print(f"Cache hit: {cache_info['cache_read_input_tokens']} tokens saved")

Benefits:
- Stable context (system prompts, tools) cached for 5 minutes
- Dynamic context (messages) always fresh
- Significant cost reduction on repeated calls
See scripts/orchestration/prompt_cache.py.
Instead of N tool calls = N inference passes, Claude writes orchestration code:
Traditional: Programmatic:
┌─────────────────────┐ ┌─────────────────────┐
│ Tool call 1 │ │ Claude generates │
│ → Result in context │ │ Python code │
│ Tool call 2 │ │ ↓ │
│ → Result in context │ │ Sandbox executes: │
│ Tool call 3 │ │ - call tool 1 │
│ → Result in context │ │ - call tool 2 │
│ ...N times... │ │ - process results │
│ │ │ - return summary │
│ Context: 50KB+ │ │ Context: 1KB │
└─────────────────────┘ └─────────────────────┘
Run the demo:
python3 scripts/orchestration/sandbox_executor.py

Key features:
- Secure sandbox (no imports, no file access, no network)
- Tool injection - only registered tools available
- Timeout protection
- Only final result returned to model
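The core sandbox idea can be sketched with a restricted exec: empty builtins, only injected tools in scope, and only a declared result returned. This is a simplified illustration; the shipped scripts/orchestration/sandbox_executor.py adds timeout protection and stricter isolation:

```python
def run_in_sandbox(code: str, tools: dict):
    """Execute model-generated orchestration code with only registered tools."""
    env = {"__builtins__": {}, **tools}  # no imports, no open(), no print()
    exec(code, env)                      # the code must set a `result` variable
    return env.get("result")             # only the final result reaches the model

tools = {"get_users": lambda: ["ana", "bo"], "count": len}
result = run_in_sandbox("result = count(get_users())", tools)  # → 2
```

Because intermediate tool outputs stay inside the sandbox, only the final summary consumes model context — this is where the ~50KB-to-1KB reduction in the diagram comes from.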
See docs/PROGRAMMATIC_TOOL_CALLING.md for complete guide.
| Pattern | Token Savings | Best For |
|---|---|---|
| Deferred Loading | 85% initial | Large skill libraries |
| Prompt Caching | API cost | Repeated operations |
| Programmatic Calls | 37% per workflow | Multi-tool orchestration |
All agent prompts and skills now use pure XML structure for improved parsing and token efficiency.
- 25% Token Reduction: XML is more compact than markdown with nested structures
- Better Parsing: LLMs parse XML structure more reliably
- Consistent Structure: Enforced schema across all prompts
- Easier Composition: XML elements compose cleanly
```xml
<agent>
  <identity>
    <role>Backend Developer</role>
    <expertise>
      <item>REST API design</item>
      <item>Database optimization</item>
    </expertise>
  </identity>
  <capabilities>
    <capability name="api-development">
      <description>Design and implement RESTful APIs</description>
      <best-practices>
        <practice>Use semantic HTTP methods</practice>
        <practice>Implement proper error handling</practice>
      </best-practices>
    </capability>
  </capabilities>
  <workflows>
    <workflow name="feature-implementation">
      <step order="1">Read requirements from task description</step>
      <step order="2">Design API endpoints</step>
      <step order="3">Implement with tests</step>
    </workflow>
  </workflows>
</agent>
```

All prompts in `.claude/` and `skills/taches-cc/` use XML format:
Before (Markdown):

```markdown
# Agent Identity
Role: Backend Developer

## Expertise
- REST API design
- Database optimization
```

After (XML):

```xml
<agent>
  <identity>
    <role>Backend Developer</role>
    <expertise>
      <item>REST API design</item>
      <item>Database optimization</item>
    </expertise>
  </identity>
</agent>
```

Token Comparison: the XML version uses ~25% fewer tokens for the same semantic content.
- Legacy Prompts: Existing markdown prompts in `prompts/roles/`, `platforms/`, and `prompts/` continue to work
- New Features: All new slash commands, auditors, and taches-cc skills use XML
- Migration Tool: Planned - automatic markdown-to-XML converter
- Python 3.8+ (for composition script)
- PyYAML (`pip install pyyaml`)
- Git
- LLM provider (Claude, GPT-4, etc.)
Q: Do I need all the agents? A: No, start with just one developer agent and add others as needed.
Q: Can I use this with GPT-4? A: Yes, the prompts work with any LLM. Just adjust model parameters in config.
Q: How do agents avoid conflicts? A: Through branch isolation and resource locking managed by the project state.
Q: What if an agent loses context? A: Checkpointing and multi-tier memory ensure critical info is preserved.
Q: Can I customize the base prompts? A: Yes, fork the repo and modify. Better: use project context to add requirements.
Q: How do I update to newer library versions? A: Update the submodule and test. Use semantic versioning to manage compatibility.
- Start Simple - Begin with one agent, add more as needed
- Define Interfaces First - API contracts before implementation
- Choose Skills Strategically - Assign 1-3 skills per agent based on role (see SKILLS_GUIDE.md)
- Use Branch Isolation - One branch per agent per task
- Monitor Context - Watch for context usage warnings, especially with skills
- Regular Checkpoints - Every 10 turns or at 75% context
- Quality Gates - Enforce tests, reviews, coverage
- Structured Communication - Use JSON message protocol
- Track Skill Effectiveness - Monitor which skills are actually used and improve outcomes
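For the structured-communication practice above, an inter-agent message might look like the following. The field names are illustrative, not the library's canonical schema:

```python
import json

# Illustrative shape for a manager → developer handoff message;
# field names are hypothetical, not the library's canonical protocol.
message = {
    "from": "manager",
    "to": "backend-developer",
    "type": "task_assignment",
    "task": {
        "id": "TASK-42",
        "description": "Implement POST /users endpoint with validation",
        "branch": "feature/users-endpoint",  # branch isolation per task
    },
    "context_refs": ["docs/api-contract.md"],  # pointers, not full content
}

encoded = json.dumps(message)
```

Passing references (`context_refs`) rather than full file contents keeps each agent's context lean.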
- Skills integration with Anthropic Skills repository (Phase 1-5 complete!)
- Comprehensive skills documentation and guides (45,000+ words)
- Token budget management with skills (Automatic warnings)
- Custom skills framework and templates (Template + 5 examples)
- Example projects showcasing skills usage (3 complete examples)
- Parallel execution guide (Multi-agent optimization strategies)
Based on Anthropic's Advanced Tool Use:
- Deferred Skill Loading - 85% token reduction on initial context
- Prompt Caching - Cost reduction via `cache_control` blocks
- Programmatic Tool Calling - 37% token reduction, single inference pass
- Tool Use Examples - Concrete examples for 72% → 90% parameter accuracy
- Secure Sandbox Executor - Safe code execution with restricted builtins
- Agent Schema v2.0 - `defer_loading`, `allowed_callers`, `input_examples`
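A hypothetical agent definition using the three v2.0 schema fields might look like this (the surrounding layout is illustrative; only the three named fields come from the schema):

```python
# Illustrative only: the exact schema layout is defined by the library;
# defer_loading, allowed_callers, and input_examples are the v2.0 additions.
backend_agent = {
    "name": "backend-developer",
    "defer_loading": True,  # skill body loaded on demand, not in initial context
    "allowed_callers": ["manager", "senior-engineer"],  # who may invoke this agent
    "input_examples": [  # concrete examples raise parameter accuracy
        {"task": "Add GET /health endpoint", "branch": "feature/health-check"},
    ],
}
```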
Integration of taches-cc-resources for enhanced workflows:
- Slash Commands - 12 thinking model commands (`/consider:*`)
- Workflow Commands - `/whats-next`, `/debug`, task management commands
- Quality Auditor Agents - skill-auditor, slash-command-auditor, subagent-auditor
- XML Prompt Format - Pure XML structure with 25% token efficiency improvement
- taches-cc Skills - create-agent-skills, create-plans, debug-like-expert
- Thinking Frameworks - First-principles, 5-whys, SWOT, cost-benefit, and 8 more
Cross-project tool access system:
- setup-commands.py - Auto-generate wrapper commands for target projects
- /ai-tools router - Discovery command for available tools
- Minimal wrappers - ~200-300 bytes each, +60 tokens per invocation
- Token efficiency - Negligible at-rest cost (~50 tokens for 30 commands)
- Global install - Option to install tools globally for all projects
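The wrapper mechanism can be sketched as follows. The wrapper body below is hypothetical; `setup-commands.py` is the real generator and defines the actual file contents:

```python
from pathlib import Path
import tempfile

def write_wrapper(commands_dir: Path, tool: str, library_path: str) -> Path:
    """Write a minimal slash-command wrapper (~200-300 bytes) that points a
    target project at a tool living in the shared library checkout.

    Illustrative sketch only: the real wrappers are produced by
    setup-commands.py.
    """
    wrapper = commands_dir / f"{tool}.md"
    wrapper.write_text(
        f"Delegate to the shared library tool at {library_path}/tools/{tool}.md\n"
    )
    return wrapper

# Usage sketch:
commands_dir = Path(tempfile.mkdtemp())
path = write_wrapper(commands_dir, "skill-search", "~/ai-agents-library")
```

Because each wrapper is just a pointer, the at-rest cost in a target project stays negligible no matter how large the shared library grows.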
- Desktop Platform - Electron, Qt, native desktop development
- Data Platform - Data engineering, ML pipelines, analytics
- DevOps Platform - Infrastructure, CI/CD, cloud operations
- Embedded Platform - IoT, firmware, real-time systems
- Lazy loading for skills - On-demand activation based on task context ✅ (Phase 1.5)
- Skill versioning system - Semantic versioning with compatibility tracking
- Skill composition - Combine multiple skills into meta-skills
- Skills marketplace/registry - Community-contributed skills catalog
- Automated testing for prompts - Unit tests for agent behaviors
- Agent performance analytics - Track skill usage, success rates, token efficiency
- Regression testing suite - Ensure skills work across LLM versions
- Prompt optimization tools - A/B testing for agent prompts
- Advanced context compression - Progressive summarization strategies
- Long-term memory systems - RAG integration with vector databases
- Cross-session persistence - Resume agent work across sessions
- Memory prioritization - Smart retention of critical context
- Interactive skill builder - CLI tool for creating custom skills with prompts
- Visual workflow designer - Drag-and-drop agent workflow creation
- Web dashboard - Visual interface for managing agents and monitoring
- VS Code extension - IDE integration for agent development
- Agent templates library - Pre-configured personas for common roles
- CI/CD integration - GitHub Actions for agent testing and deployment
- Docker containers - Containerized agent deployments
- API gateway - RESTful API for agent orchestration
- Webhook support - Event-driven agent activation
- OpenAI GPT integration - Native support for GPT-4, GPT-4-turbo
- Google Gemini integration - Support for Gemini Pro and Ultra
- Local model support - LLaMA, Mistral, Phi via Ollama
- Multi-LLM orchestration - Heterogeneous agent teams (Claude + GPT + local)
- Cost optimization - Smart routing based on task complexity and cost
- Team collaboration - Multi-user agent management
- Access control - Role-based permissions for agents
- Audit logging - Track all agent actions and decisions
- Compliance tools - GDPR, SOC2 compliance helpers
- SLA monitoring - Response time and availability tracking
- Metrics dashboard - Real-time agent performance metrics
- Cost tracking - Token usage and API cost analysis
- Success rate monitoring - Task completion and quality metrics
- Bottleneck detection - Identify coordination issues
- Skill effectiveness scoring - Which skills improve outcomes
- Self-improving agents - Learn from past interactions
- Dynamic skill discovery - Agents request new skills as needed
- Agent specialization - Automatic role optimization based on performance
- Cross-project learning - Share knowledge between agent teams
- Predictive task routing - ML-based agent assignment
- Agent swarm intelligence - Emergent behaviors from agent collectives
- Autonomous architecture - Agents design their own team structures
- Meta-learning systems - Agents that learn how to learn
- Natural language orchestration - Manage teams through conversation
- Blockchain integration - Decentralized agent coordination
- Prompt evolution algorithms - Genetic algorithms for prompt optimization
- Multi-modal agents - Vision, audio, and text integration
- Formal verification - Prove agent behavior correctness
- Causal reasoning - Agents that understand cause-effect relationships
- Ethical AI frameworks - Built-in fairness and safety guardrails
Q4 2024 - Q1 2025: Platform augmentations (Desktop, Data, DevOps) and migration tooling
What's New in v1.3.0:
- Tool Selector system for cross-project tool access via `/command` style wrappers
- `/ai-tools` discovery command for exploring available tools
- `setup-commands.py` script for installing wrappers to other projects
- Minimal token overhead (~50 tokens at-rest, +60 per invocation)
What's New in v1.2.0:
- 12 thinking model slash commands for structured decision-making
- Quality auditor agents for skills, commands, and agent validation
- XML-based prompt architecture (25% token efficiency)
- taches-cc skills integration (create-agent-skills, create-plans, debug-like-expert)
- Enhanced workflow commands for context handoff and debugging
Community Contributions Welcome! See Contributing section for how to help.
Have ideas for the roadmap? We'd love to hear them!
- Open an issue with the `enhancement` label
- Join discussions in GitHub Discussions
- Submit a PR with a proof-of-concept
Contributions welcome! Please:
- Read ARCHITECTURE.md
- Follow existing conventions
- Test with example projects
- Update documentation
- Submit PR with clear description
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: This repository
Built with principles from the Context Engineering Guide.
Ready to get started? Check out the examples/ directory for complete working configurations!