Open Source · MIT Licensed

Claude Code runs
while you sleep.

Submit work from anywhere — Discord, Slack, REST, GitHub, MCP, or a YAML file. TaskSmith queues it, spawns Claude Code headless, validates output, retries intelligently, tracks costs, and pings you when it's done.

$ npm install -g tasksmith-cli
then: tasksmith setup, tasksmith run
~/.tasksmith — tasksmith v1.0.1
■ TASKSMITH v1.0.1 — 3 tasks queued, concurrency: 2
 
[pool] fix-auth-bug → slot 1 | model: sonnet (auto)
[pool] add-rate-limit → slot 2 | model: sonnet (auto)
[engine] fix-auth-bug iter 1 — PASS | $0.12
[engine] add-rate-limit iter 1 — FAIL [TEST] → retry
[pool] update-docs → slot 1 | model: haiku (auto)
[engine] add-rate-limit iter 2 — PASS | $0.31
[engine] update-docs iter 1 — PASS | $0.08
 
✓ BATCH COMPLETE — 3/3 passed | total: $0.51 | notifications sent
sys.capabilities

The ops layer
Claude Code doesn't have.

Claude Code is brilliant at writing code. TaskSmith handles the rest — queueing, retrying, validating, learning, and notifying.

01 // QUEUE

Fire-and-Forget Tasks

Submit work from Discord, Slack, REST API, GitHub webhooks, MCP clients, or YAML files. TaskSmith queues it, runs Claude Code headless, and handles the full lifecycle. No terminal babysitting.

02 // VALIDATION

Ralph Loop + Circuit Breaker

Define a validation command. TaskSmith retries until tests pass. The circuit breaker detects stuck loops, infrastructure failures, contradictions, cost ceilings, and timeouts — ejecting doomed tasks before they burn your budget.
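
As a rough illustration of the retry-plus-breaker idea, here is a minimal sketch. Every name in it (`runWithBreaker`, `BreakerLimits`, the failure signature string) is hypothetical and not taken from TaskSmith's source. It covers only three of the eject conditions (stuck loops, cost ceiling, iteration cap), and the attempt is a plain synchronous callback to keep the example short.

```typescript
// Hypothetical types for illustration; not TaskSmith's actual API.
type IterationResult = { passed: boolean; failureSignature: string; costUsd: number };

type LoopOutcome =
  | { status: "passed"; iterations: number; totalCostUsd: number }
  | {
      status: "ejected";
      reason: "stuck" | "cost_ceiling" | "max_iterations";
      iterations: number;
      totalCostUsd: number;
    };

interface BreakerLimits {
  maxIterations: number; // hard cap on attempts
  costCeilingUsd: number; // eject once cumulative spend reaches this
  stuckThreshold: number; // identical consecutive failures that count as a stuck loop
}

function runWithBreaker(
  attempt: (iteration: number) => IterationResult,
  limits: BreakerLimits,
): LoopOutcome {
  let totalCostUsd = 0;
  let lastSignature: string | null = null;
  let repeats = 0;

  for (let i = 1; i <= limits.maxIterations; i++) {
    const result = attempt(i);
    totalCostUsd += result.costUsd;

    if (result.passed) {
      return { status: "passed", iterations: i, totalCostUsd };
    }

    // The same failure signature twice in a row counts toward the stuck threshold.
    repeats = result.failureSignature === lastSignature ? repeats + 1 : 1;
    lastSignature = result.failureSignature;

    if (repeats >= limits.stuckThreshold) {
      return { status: "ejected", reason: "stuck", iterations: i, totalCostUsd };
    }
    if (totalCostUsd >= limits.costCeilingUsd) {
      return { status: "ejected", reason: "cost_ceiling", iterations: i, totalCostUsd };
    }
  }
  return {
    status: "ejected",
    reason: "max_iterations",
    iterations: limits.maxIterations,
    totalCostUsd,
  };
}
```

In this sketch the breaker is checked after every failed attempt, so a task that keeps failing the same way is ejected before it reaches the iteration cap.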

03 // CONTEXT

Skills & Context Assembly

SOUL.md, USER.md, conventions, memory, and project context assembled into every invocation. 7 bundled skills (ralph-loop, bug-hunt, code-review, research, doc-gen, heartbeat, project-init) in Claude Code's native SKILL.md format.

04 // MEMORY

Cross-Task Learning

Three-tier memory — hot (MEMORY.md, every prompt), warm (JSONL, searchable), cold (compressed archives). Per-task JSONL event logs. Semantic search via Ollama, OpenAI, or Gemini. tasksmith insights for automated pattern detection.

05 // COST

Cost Dashboard + Budgets

tasksmith costs — per-model/project spend breakdown, time-series rollups, budget alerts, and weighted-moving-average forecasting. Daily, weekly, and monthly limits with configurable warning thresholds.
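
To make the forecasting idea concrete, here is a small sketch of a linearly weighted moving average over daily spend, plus a budget check with a warning threshold. The function names, the linear weighting scheme, and the threshold semantics are assumptions for illustration; TaskSmith's actual weights and config keys may differ.

```typescript
// Linearly weighted moving average: within the window, the most recent day
// gets the largest weight (n) and the oldest day gets weight 1.
// The weighting scheme is an assumption, not TaskSmith's documented formula.
function weightedMovingAverage(dailySpendUsd: number[], window: number): number {
  if (window <= 0) return 0;
  const recent = dailySpendUsd.slice(-window);
  if (recent.length === 0) return 0;

  let weightedSum = 0;
  let weightTotal = 0;
  recent.forEach((spend, index) => {
    const weight = index + 1;
    weightedSum += spend * weight;
    weightTotal += weight;
  });
  return weightedSum / weightTotal;
}

// Projects spend over the next `daysAhead` days at the weighted daily rate.
function forecastSpend(dailySpendUsd: number[], window: number, daysAhead: number): number {
  return weightedMovingAverage(dailySpendUsd, window) * daysAhead;
}

// Hypothetical budget check: `warnAt` is a fraction of the limit, e.g. 0.8.
function budgetStatus(
  spentUsd: number,
  limitUsd: number,
  warnAt: number,
): "ok" | "warning" | "exceeded" {
  if (spentUsd >= limitUsd) return "exceeded";
  if (spentUsd / limitUsd >= warnAt) return "warning";
  return "ok";
}
```

Weighting recent days more heavily means a sudden burst of expensive tasks moves the forecast faster than a plain average would.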

06 // COMMS

Multi-Channel I/O

7 inbound sources (file drop, CLI, REST API, Discord bot, GitHub webhooks, Slack Events, MCP). 5 outbound providers (Discord, Slack, ntfy.sh, email, webhooks). Tasks flow in from anywhere. Results land where you need them.

07 // ORCHESTRATION

Task DAGs

Chain tasks with depends_on in a directed acyclic graph. Downstream tasks wait. Failure propagates. Cycle detection built in. tasksmith dag shows status.
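
One common way to get ordering and cycle detection from a depends_on graph is Kahn's algorithm. The sketch below assumes a simplified task shape (`id` plus `depends_on`) and hypothetical function names; it is not TaskSmith's implementation.

```typescript
// Simplified task shape for illustration; real task files carry more fields.
interface TaskSpec {
  id: string;
  depends_on?: string[];
}

// Kahn's algorithm: returns a valid execution order, or throws on a cycle.
function topoOrder(tasks: TaskSpec[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const t of tasks) {
    indegree.set(t.id, 0);
    dependents.set(t.id, []);
  }
  for (const t of tasks) {
    for (const dep of t.depends_on ?? []) {
      const list = dependents.get(dep);
      if (list === undefined) {
        throw new Error(`Unknown dependency "${dep}" in task "${t.id}"`);
      }
      list.push(t.id);
      indegree.set(t.id, (indegree.get(t.id) ?? 0) + 1);
    }
  }

  const ready = tasks.filter((t) => indegree.get(t.id) === 0).map((t) => t.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  // Any task left unordered is part of (or downstream of) a cycle.
  if (order.length !== tasks.length) {
    throw new Error("Cycle detected in depends_on graph");
  }
  return order;
}

// Failure propagation: every task that transitively depends on `failedId`.
function downstreamOf(tasks: TaskSpec[], failedId: string): string[] {
  const blocked = new Set<string>();
  let changed = true;
  while (changed) {
    changed = false;
    for (const t of tasks) {
      if (blocked.has(t.id)) continue;
      if ((t.depends_on ?? []).some((d) => d === failedId || blocked.has(d))) {
        blocked.add(t.id);
        changed = true;
      }
    }
  }
  return Array.from(blocked);
}
```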

08 // PROTOCOL

MCP Server + CC Integration

13 MCP tools and 4 resources. Any MCP client can submit tasks and search memory. tasksmith cc-install registers as a Claude Code MCP server in one command. Agents helping agents.

09 // SAFETY

Approval Gates + Security

Human-in-the-loop for high-risk tasks — rule-based matching, timeout auto-reject. REST API auth with bearer tokens and rate limiting. Two-tier input sanitization. Discord guild/channel scoping.
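
A rule-based gate with timeout auto-reject could look roughly like the sketch below. The rule shape, the function names, and the use of regular expressions are assumptions for illustration only, not TaskSmith's configuration format.

```typescript
// Hypothetical rule shape: a pattern to match against the prompt, plus a reason.
interface ApprovalRule {
  pattern: RegExp; // use non-global patterns so test() stays stateless
  reason: string;
}

type Decision = "approved" | "rejected" | "pending";

// Returns the first rule the prompt trips, or undefined if no approval is needed.
function matchApprovalRule(prompt: string, rules: ApprovalRule[]): ApprovalRule | undefined {
  return rules.find((rule) => rule.pattern.test(prompt));
}

// A human decision always wins; otherwise the request is rejected once
// the timeout elapses, and stays pending until then.
function resolveApproval(
  requestedAtMs: number,
  nowMs: number,
  timeoutMs: number,
  humanDecision?: "approved" | "rejected",
): Decision {
  if (humanDecision !== undefined) return humanDecision;
  return nowMs - requestedAtMs >= timeoutMs ? "rejected" : "pending";
}
```

Rejecting on timeout is the conservative default: a high-risk task that nobody looked at never runs.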

10 // INTELLIGENCE

Smart Model Routing

Set model: auto and let TaskSmith pick. Template-based defaults, prompt complexity heuristics, automatic escalation to a stronger model on failure. tasksmith insights compares actual spend against an all-opus baseline to show the savings.
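
The three routing signals can be sketched as follows. The template-to-model mapping, keyword list, and score thresholds are invented for illustration; only the model names (haiku, sonnet, opus) come from this page.

```typescript
type Model = "haiku" | "sonnet" | "opus";

// Ordered weakest to strongest; escalation walks one step up this ladder.
const LADDER: Model[] = ["haiku", "sonnet", "opus"];

// Hypothetical template defaults; the real mapping is not documented here.
const TEMPLATE_DEFAULTS = new Map<string, Model>([
  ["doc-gen", "haiku"],
  ["bug-hunt", "sonnet"],
]);

// Hypothetical keywords that hint at a harder task.
const COMPLEX_KEYWORDS = ["refactor", "migrate", "architecture", "concurrency"];

function pickModel(task: { template?: string; prompt: string }): Model {
  // 1. A template default wins if one exists.
  const fromTemplate =
    task.template !== undefined ? TEMPLATE_DEFAULTS.get(task.template) : undefined;
  if (fromTemplate !== undefined) return fromTemplate;

  // 2. Otherwise score the prompt: word count plus a bonus per complexity keyword.
  const lower = task.prompt.toLowerCase();
  const words = lower.split(/\s+/).filter((w) => w.length > 0).length;
  const keywordHits = COMPLEX_KEYWORDS.filter((k) => lower.includes(k)).length;
  const score = words + 100 * keywordHits;

  if (score < 40) return "haiku";
  if (score < 300) return "sonnet";
  return "opus";
}

// 3. On failure, move one step up the ladder, capped at the strongest model.
function escalate(model: Model): Model {
  const next = LADDER.indexOf(model) + 1;
  return LADDER[Math.min(next, LADDER.length - 1)];
}
```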

11 // RESILIENCE

Crash Recovery

Per-task JSONL event logs survive engine restarts. Iteration checkpointing resumes from the last completed iteration, so finished iterations are never re-run. Orphaned tasks are auto-recovered on startup.
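
Resuming from an append-only JSONL log can be sketched like this. The event names and fields are hypothetical, not TaskSmith's log schema; the point is that a truncated final line from a crash is skipped and the next iteration is derived from the last completed one.

```typescript
// Hypothetical event shape; the real log schema is not documented on this page.
interface TaskEvent {
  task: string;
  type: "iteration_started" | "iteration_completed" | "task_finished";
  iteration?: number;
}

// Parses one JSON object per line, tolerating blank and truncated lines.
function parseEventLog(jsonl: string): TaskEvent[] {
  const events: TaskEvent[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    try {
      events.push(JSON.parse(line) as TaskEvent);
    } catch {
      // A crash mid-write can leave a partial line; skip it.
    }
  }
  return events;
}

// Finds where to pick up: one past the highest completed iteration.
function resumePoint(events: TaskEvent[]): { finished: boolean; nextIteration: number } {
  let lastCompleted = 0;
  let finished = false;
  for (const e of events) {
    if (e.type === "iteration_completed" && typeof e.iteration === "number") {
      lastCompleted = Math.max(lastCompleted, e.iteration);
    }
    if (e.type === "task_finished") finished = true;
  }
  return { finished, nextIteration: lastCompleted + 1 };
}
```

An iteration that was started but never completed is simply run again, which is why only completed iterations act as checkpoints in this sketch.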

task.pipeline

How a task flows.

From submission to passing tests. Unattended.

01

Submit

YAML file, Discord, Slack, GitHub webhook, REST API, MCP, or CLI

02

Sanitize & Route

Trust-level validation, smart model selection, approval gate check

03

Execute

Claude Code runs headless with skills injected and full project context

04

Validate & Retry

Run tests. Classify failures. Circuit breaker ejects stuck loops. Retry until green.

05

Learn + Notify

Archive to memory. Log costs. Push results to Discord, Slack, or your phone.

<10k
Lines of core TypeScript
13
MCP tools exposed
8
Official plugins
12
Communication providers
plugins.official

8 plugins.
Zero installs.

Official plugins ship with the CLI. Enable any with one line in config. Lazy-loaded — disabled plugins cost nothing.

# tasksmith.yaml
plugins:
  - github           # Issues/PRs on fail/success
  - metrics          # Execution analytics + insights
  - docker           # Container isolation
  - jira             # Ticket integration
  - postgres         # SQL task history
  - proxmox          # VM provisioning
  - cloudflare       # Pages deploy/rollback
  - semantic-memory  # Vector search

GitHub

Auto-create issues on failure. Submit tasks from GitHub issues with tasksmith submit --from-github-issue. Webhook intake with HMAC-SHA256 verification.
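
For reference, HMAC-SHA256 webhook verification in Node generally looks like the sketch below. GitHub sends the signature in the X-Hub-Signature-256 header as `sha256=` followed by a hex digest of the raw request body. The function name is hypothetical, and this is not TaskSmith's own code.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies a GitHub-style signature header against the raw request body.
// `verifyGithubSignature` is an illustrative name, not a TaskSmith export.
function verifyGithubSignature(
  secret: string,
  rawBody: string,
  signatureHeader: string,
): boolean {
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");

  const expectedBytes = Buffer.from(expected, "utf8");
  const receivedBytes = Buffer.from(signatureHeader, "utf8");

  // timingSafeEqual throws if lengths differ, so compare lengths first.
  return (
    expectedBytes.length === receivedBytes.length &&
    timingSafeEqual(expectedBytes, receivedBytes)
  );
}
```

The digest must be computed over the raw body bytes as received, before any JSON parsing, and compared in constant time to avoid leaking how many characters matched.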

Metrics & Insights

Success rates, model comparison, failure patterns, cost outliers, trends. tasksmith metrics and tasksmith insights for full analytics in your terminal.

Semantic Memory

Vector-based search over task history via Ollama (local), OpenAI, or Gemini embeddings. tasksmith semantic --query "auth refactor"

Cloudflare

Deploy to Cloudflare Pages on task success. Rollback, cache purge, deployment history. tasksmith plugin run cloudflare

Build Your Own

tasksmith plugin create my-thing
npm is the plugin manager. Publish to @tasksmith-dev/* or tasksmith-plugin-*.

proof.dogfood
16
Dogfood tasks processed
100%
First-iteration pass rate
$3.29
Total cost (16 tasks)
$0.21
Average cost per task

TaskSmith dogfoods its own development. These numbers are real.

SysOp

Matt

Software architect from Birmingham, AL. Built TaskSmith because Claude Code is powerful but stateless: every session starts from scratch with no queue, no retry, no notifications. OpenClaw's 430k lines were too many. TaskSmith is under 10,000 lines of core TypeScript with 8 bundled plugins, and it has been dogfooding its own development since v0.5.0.