Thrum Development Guide

This guide explains how to set up and work with the Thrum codebase.

Prerequisites

Go 1.26+: Install Go
Node.js 22+ and pnpm 10+: Required for building the UI monorepo
Make: Build automation
Playwright: E2E testing (npx playwright install chromium)
golangci-lint: Code linting (auto-installed by make lint)
markdownlint-cli: Markdown linting (auto-installed by make lint-md)

Quick Start

# Clone repository
git clone <repo-url>
cd thrum

# Install dependencies
go mod download
cd ui && pnpm install && cd ..

# Run Go tests
make test

# Full build (UI + Go binary) and install to ~/.local/bin
make install

# Start daemon and verify
thrum daemon start
thrum daemon status

Project Structure

thrum/
├── cmd/
│   └── thrum/               # CLI entry point
│       ├── main.go          # Cobra command tree (all CLI commands)
│       └── mcp.go           # MCP server command
├── internal/                # Private packages
│   ├── cli/                 # CLI business logic (one file per command)
│   │   └── hints/           # CLI hint and suggestion output
│   ├── config/              # Configuration loading and identity files
│   ├── daemon/              # Daemon core
│   │   ├── cleanup/         # Agent work context cleanup
│   │   ├── permission/      # Permission prompt detection and keystroke delivery
│   │   ├── reconcile/       # Agent state reconciliation
│   │   ├── rpc/             # JSON-RPC 2.0 method handlers
│   │   ├── state/           # Persistent state (JSONL + SQLite)
│   │   ├── server.go        # Unix socket server
│   │   ├── lifecycle.go     # Signal handling, defer cleanup, flock
│   │   ├── pidfile.go       # JSON PID file with repo-affinity metadata
│   │   ├── client.go        # Client library
│   │   ├── flock.go         # FileLock struct definition
│   │   ├── flock_unix.go    # flock() implementation (Unix)
│   │   ├── flock_other.go   # No-op stubs (non-Unix)
│   │   └── testutil_test.go # StartTestDaemon() helper
│   ├── gitctx/              # Git-derived work context (branch, uncommitted files)
│   ├── identity/            # ID generation (ULID-based: repo, agent, session, message, event)
│   │   └── guard/           # Cross-worktree identity guards, WritePID enforcement
│   ├── jsonl/               # JSONL reader/writer with file locking
│   ├── mcp/                 # MCP stdio server (4 tools, WebSocket waiter)
│   ├── paths/               # Path resolution, .thrum/redirect, sync worktree path
│   ├── projection/          # JSONL to SQLite event replay (projector)
│   ├── schema/              # SQLite schema, DDL, and migrations (v24)
│   ├── sync/                # Sync engine (loop, merge, push, dedup, branch management)
│   ├── tmux/                # Tmux operations, nudge delivery, session management (v0.7.1)
│   ├── restart/             # JSONL conversation extraction, restart snapshots (v0.7.1)
│   ├── bridge/              # Shared bridge components (v0.7.0)
│   │   ├── bridge.go        # TransportBridge interface, Notification type
│   │   ├── msgmap.go        # Local↔remote message ID mapping
│   │   ├── relay.go         # Common inbound/outbound relay
│   │   ├── wsclient.go      # Shared WebSocket client
│   │   ├── peer/            # PeerBridge, PeerTransport, address validation
│   │   └── telegram/        # Telegram bridge (thin wrappers around shared components)
│   ├── transport/           # Transport abstraction layer
│   ├── types/               # Shared event type definitions
│   ├── web/                 # Embedded SPA (//go:embed React build)
│   └── websocket/           # WebSocket server, connections, registry
├── ui/                      # UI monorepo (Turborepo + pnpm workspaces)
│   ├── packages/
│   │   ├── shared-logic/    # Framework-agnostic business logic (TanStack, Zod)
│   │   ├── web-app/         # React web application (Vite, shadcn/ui)
│   │   └── tui-app/         # Terminal UI (Ink, placeholder)
│   ├── turbo.json           # Turborepo configuration
│   ├── pnpm-workspace.yaml  # pnpm workspace configuration
│   └── package.json         # Root monorepo scripts
├── tests/
│   └── e2e/                 # Playwright E2E tests (13 spec files)
│       ├── helpers/         # Test helpers (CLI wrapper, fixtures)
│       ├── global-setup.ts  # Daemon start + agent registration
│       └── global-teardown.ts
├── scripts/                 # Setup scripts
│   ├── setup-worktree-thrum.sh  # Configure .thrum/redirect for worktrees
│   └── setup-worktree-beads.sh  # Configure .beads/redirect for worktrees
├── docs/                    # User documentation
├── dev-docs/                # Design documents and prompts
├── .agents/                 # Agent workflow instructions (hidden directory)
├── .beads/                  # Issue tracking (beads)
├── Makefile                 # Build targets
├── go.mod                   # Go module (github.com/leonletto/thrum)
├── playwright.config.ts     # Playwright E2E configuration
└── llms.txt / llms-full.txt # Agent reference files

Development Workflow

1. Running Tests

Go Tests

# Run all Go tests
make test

# Run unit tests only (fast, skips integration)
make test-unit

# Run integration tests
make test-integration

# Run tests with verbose output
make test-verbose

# Run specific package tests
go test ./internal/config/... -v

# Run with race detector
go test -race ./...

# Run resilience tests (requires build tag)
go test -tags=resilience ./internal/daemon/...

Resilience Test Suite (v0.4.3):

The resilience test suite includes 39 tests covering crash recovery, concurrent access, and timeout enforcement. These tests require the -tags=resilience build flag:

Crash recovery scenarios (daemon restart, state restoration)
Concurrent access patterns (multiple goroutines, race conditions)
Timeout enforcement (I/O timeouts, RPC timeouts, WebSocket timeouts)

Run the full resilience suite:

go test -tags=resilience -v ./internal/daemon/...

UI Tests

# Run all UI tests (from monorepo root)
cd ui && pnpm test

# Run web-app tests only
cd ui/packages/web-app && pnpm test

# Run shared-logic tests only
cd ui/packages/shared-logic && pnpm test

# Watch mode
cd ui/packages/web-app && pnpm test:watch

# Coverage report
cd ui/packages/web-app && pnpm test:coverage

E2E Tests (Playwright)

E2E tests require the daemon to be running and the binary to be built:

# Build and install
make install

# Start daemon
thrum daemon start

# Run all E2E tests (serial execution required)
npx playwright test --workers=1

# Run a specific spec file
npx playwright test tests/e2e/messaging.spec.ts --workers=1

# View HTML report
npx playwright show-report

The E2E test suite uses global-setup.ts to start the daemon and register a test agent, and global-teardown.ts to stop the daemon after all tests complete.

2. Code Coverage

# Generate Go coverage report
make test-coverage
# Report output: output/coverage.html

3. Linting

# Run Go linter (auto-installs golangci-lint if missing)
make lint

# Auto-fix Go lint issues
make lint-fix

# Run Markdown linter
make lint-md

# Auto-fix Markdown issues
make lint-md-fix

# Run all linters (Go + Markdown)
make lint-all

4. Formatting

# Format Go code
make fmt

# Format Markdown files (requires prettier)
make fmt-md

# Format all files (Go + Markdown)
make fmt-all

5. Building

# Full build: UI + Go binary
make build
# Output: ./bin/thrum

# Build Go binary only (skip UI rebuild, uses existing internal/web/dist/)
make build-go

# Build UI only (pnpm install + build, copies to internal/web/dist/)
make build-ui

# Full build + install to ~/.local/bin
make install

# Run built binary
./bin/thrum

The build embeds the React SPA into the Go binary via //go:embed in internal/web/embed.go. The make build-ui step copies the Vite build output into internal/web/dist/ so the Go embed directive can include it.

A .gitkeep file in internal/web/dist/ ensures go build and go vet work even when the UI has not been built.

Makefile Targets

Target	Description
`make help`	Show all available targets (default)
`make dev`	Build + sign, restart only the worktree-scoped daemon. Does NOT touch `~/.local/bin/thrum`. Safe for multi-agent machines — other agents on the host keep running their existing binary.
`make build`	Full build: UI + Go binary
`make build-ui`	Build UI and copy to embed location
`make build-go`	Build Go binary only (skip UI rebuild)
`make install`	Full build and install to `~/.local/bin`
`make test`	Run all Go tests
`make test-unit`	Run unit tests only (fast)
`make test-integration`	Run integration tests
`make test-coverage`	Generate coverage report to `output/`
`make test-verbose`	Run tests with verbose output
`make fmt`	Format Go code
`make fmt-md`	Format Markdown files with prettier
`make fmt-all`	Format all files (Go + Markdown)
`make lint`	Run golangci-lint
`make lint-fix`	Run golangci-lint with auto-fix
`make lint-md`	Run markdownlint
`make lint-md-fix`	Run markdownlint with auto-fix
`make lint-all`	Run all linters (Go + Markdown)
`make vet`	Run `go vet`
`make tidy`	Tidy Go dependencies
`make clean`	Remove build artifacts (`output/`, `bin/`, `dist/`)
`make install-tools`	Install dev tools (golangci-lint, markdownlint-cli)
`make quick-check`	Fast pre-commit checks: format, vet, test, build
`make ci`	Full CI checks: format-all, lint-all, vet, test, build
`make pre-commit`	Alias for `quick-check`
`make pre-push`	Alias for `ci`

Common Tasks

Adding a New Event Type

Define event struct in internal/types/events.go
Add handler in internal/projection/projector.go
Add case in the Apply() switch statement
Write tests in internal/projection/projector_test.go

Current event types handled by the projector:

message.create, message.edit, message.delete
agent.register
agent.session.start, agent.session.end
agent.update
thread.create

Example:

// 1. Define event type in internal/types/events.go
type MyNewEvent struct {
    BaseEvent
    MyField string `json:"my_field"`
}

// 2. Add handler in internal/projection/projector.go
func (p *Projector) applyMyNew(data json.RawMessage) error {
    var event types.MyNewEvent
    if err := json.Unmarshal(data, &event); err != nil {
        return fmt.Errorf("unmarshal my.new: %w", err)
    }

    // Insert/update database
    _, err := p.db.Exec(`...`)
    return err
}

// 3. Update switch in Apply()
case "my.new":
    return p.applyMyNew(event)

Modifying Database Schema

Update table definitions in internal/schema/schema.go
Increment CurrentVersion constant (currently v24)
Add migration logic in the Migrate() function
Write tests for the new schema
Update docs/architecture.md
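
The version-bump steps above can be illustrated without a database. This is a minimal sketch, assuming migrations are an ordered list of steps keyed by the version they produce; the names migration and migrate are hypothetical and are not taken from internal/schema, whose Migrate() function may be organized differently.

```go
package main

import "fmt"

// migration is a hypothetical step that upgrades the schema to Version.
type migration struct {
	Version int
	Apply   func() error
}

// migrate runs every step whose Version is greater than current, in order,
// and returns the version reached. It refuses to skip a version and stops
// at the first failing step.
func migrate(current int, steps []migration) (int, error) {
	for _, m := range steps {
		if m.Version <= current {
			continue // already applied
		}
		if m.Version != current+1 {
			return current, fmt.Errorf("missing migration to v%d", current+1)
		}
		if err := m.Apply(); err != nil {
			return current, fmt.Errorf("migrate to v%d: %w", m.Version, err)
		}
		current = m.Version
	}
	return current, nil
}

func main() {
	var applied []int
	step := func(v int) migration {
		return migration{Version: v, Apply: func() error {
			applied = append(applied, v) // a real step would run DDL here
			return nil
		}}
	}
	// A database at v23 only needs the v24 and v25 steps.
	v, err := migrate(23, []migration{step(23), step(24), step(25)})
	fmt.Println(v, applied, err)
}
```

Rejecting gaps in the version sequence makes a forgotten migration fail loudly instead of leaving the schema half-upgraded.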

Testing with Temporary Databases

func TestMyFeature(t *testing.T) {
    // Create temp database
    tmpDir := t.TempDir()
    dbPath := filepath.Join(tmpDir, "test.db")

    db, err := schema.OpenDB(dbPath)
    if err != nil {
        t.Fatalf("open test database: %v", err)
    }
    defer db.Close()

    schema.InitDB(db)

    // Test your feature
    // ...
}

Adding a New RPC Method

Create handler file in internal/daemon/rpc/:

// internal/daemon/rpc/mymethod.go
package rpc

import (
    "context"
    "encoding/json"
    "fmt"
)

// MyMethodArgs and MyMethodResponse are placeholder request/response types.
type MyMethodArgs struct {
    // request fields
}

type MyMethodResponse struct {
    // response fields
}

type MyMethodHandler struct {
    // dependencies
}

// NewMyMethodHandler takes whatever dependencies the handler needs.
func NewMyMethodHandler() *MyMethodHandler {
    return &MyMethodHandler{}
}

func (h *MyMethodHandler) Handle(ctx context.Context, params json.RawMessage) (any, error) {
    // Parse params
    var args MyMethodArgs
    if err := json.Unmarshal(params, &args); err != nil {
        return nil, fmt.Errorf("invalid params: %w", err)
    }

    // Implementation
    result := MyMethodResponse{
        // ...
    }

    return result, nil
}

Add tests in internal/daemon/rpc/mymethod_test.go
Register in daemon startup (in cmd/thrum/main.go):

myMethodHandler := rpc.NewMyMethodHandler()
server.RegisterHandler("mymethod", myMethodHandler.Handle)

Update documentation in docs/rpc-api.md

Environment Variables

Configuration is resolved in priority order:

THRUM_NAME env var selects which identity file to load (highest priority)
Environment variables: THRUM_ROLE, THRUM_MODULE, THRUM_DISPLAY
CLI flags (--role, --module, --name)
Identity file in .thrum/identities/{name}.json
Error if required fields are missing

# Select a named agent identity
export THRUM_NAME=furiosa

# Or set agent properties directly
export THRUM_ROLE=implementer
export THRUM_MODULE=auth
export THRUM_DISPLAY="Auth Agent"

Identity files are stored per-agent at .thrum/identities/{name}.json and contain repo ID, agent config, worktree name, and metadata.
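
The per-field priority order can be sketched as a small pure function. This is an illustration under stated assumptions, not Thrum's actual loader: the identity struct, firstNonEmpty, and resolve are hypothetical names, and treating role and module as the required fields is a guess.

```go
package main

import (
	"errors"
	"fmt"
)

// identity holds two of the fields named in the priority list above.
type identity struct {
	Role, Module string
}

// firstNonEmpty returns the first non-empty candidate, or "".
func firstNonEmpty(candidates ...string) string {
	for _, c := range candidates {
		if c != "" {
			return c
		}
	}
	return ""
}

// resolve applies the documented order per field: environment variable,
// then CLI flag, then identity file. env is a lookup such as os.Getenv.
func resolve(env func(string) string, flags, file identity) (identity, error) {
	id := identity{
		Role:   firstNonEmpty(env("THRUM_ROLE"), flags.Role, file.Role),
		Module: firstNonEmpty(env("THRUM_MODULE"), flags.Module, file.Module),
	}
	// Assumption: role and module are the required fields.
	if id.Role == "" || id.Module == "" {
		return identity{}, errors.New("identity: role and module are required")
	}
	return id, nil
}

func main() {
	env := map[string]string{"THRUM_ROLE": "implementer"}
	lookup := func(k string) string { return env[k] }
	fromFile := identity{Role: "reviewer", Module: "auth"}

	// The env var overrides the file's role; the module falls through to the file.
	id, err := resolve(lookup, identity{}, fromFile)
	fmt.Println(id, err)
}
```

Passing the environment lookup as a function keeps the sketch testable without mutating the process environment.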

Storage Layout

Thrum uses a split storage model:

.git/thrum-sync/a-sync/              # Sync worktree (a-sync orphan branch)
├── events.jsonl                     # Agent lifecycle events (register, session, update)
└── messages/                        # Per-agent message files (sharded)
    ├── furiosa.jsonl                # Messages authored by agent "furiosa"
    └── coordinator.jsonl            # Messages authored by agent "coordinator"

.thrum/                              # Runtime directory (gitignored)
├── var/
│   ├── messages.db                  # SQLite projection cache (rebuilt from JSONL)
│   ├── thrum.sock                   # Unix socket for daemon RPC
│   ├── thrum.pid                    # JSON PID file with repo-affinity metadata
│   └── ws.port                      # WebSocket port file (default 9999)
├── identities/                      # Per-agent identity files
│   └── {name}.json                  # Agent identity (repo_id, role, module, etc.)
└── redirect                         # Points to main worktree .thrum/ (feature worktrees only)

Inspecting JSONL Files

# View all events (agent lifecycle)
cat .git/thrum-sync/a-sync/events.jsonl | jq .

# View messages for a specific agent
cat .git/thrum-sync/a-sync/messages/furiosa.jsonl | jq .

# Filter by event type
cat .git/thrum-sync/a-sync/events.jsonl | jq 'select(.type == "agent.register")'

# Count events
wc -l .git/thrum-sync/a-sync/events.jsonl

Inspecting the SQLite Database

# Open database
sqlite3 .thrum/var/messages.db

# List tables
.tables

# Query messages
SELECT * FROM messages LIMIT 10;

# Check schema version
SELECT * FROM schema_version;

Daemon Development

Daemon Architecture

The daemon runs as a background service. It handles CLI client connections over a Unix socket and serves the WebSocket endpoint and the embedded SPA together on a single TCP port (default 9999).

Key components:

Server (internal/daemon/server.go): JSON-RPC 2.0 over Unix socket
Lifecycle (internal/daemon/lifecycle.go): Signal handling, defer cleanup safety net, flock-based process detection
PID file (internal/daemon/pidfile.go): JSON format with PIDInfo struct (PID, repo path, socket path, started at). Backward-compatible reader falls back to plain integer format. All agent_pid writes into identity files must go through internal/identity/guard.WritePID. All legacy direct-write callsites have been removed; CI guards against re-introduction.
File lock (internal/daemon/flock.go, flock_unix.go): OS-level flock() on socket file. Auto-released on process death (even SIGKILL). No-op stubs for non-Unix platforms.
State (internal/daemon/state/): Manages JSONL writes (sharded per-agent) and SQLite projection. NewState(thrumDir, syncDir, repoID) separates runtime state from sync data.
RPC handlers (internal/daemon/rpc/): Method implementations for agent, session, message, thread, health, sync, queue, and user operations
Queue (internal/daemon/rpc/queue.go, queue_rpc.go): Command queue dispatch for tmux sessions — per-session FIFO, silence-based completion detection, @system notifications, restart recovery
Logging (internal/daemon/logging.go): Slog-based structured logging with lumberjack rotation (10 MB, 4 backups, 28 days, gzip)
safecmd (internal/daemon/safecmd/): Git command wrapper with 5s/10s timeouts and automatic -c user.name injection for sync worktree commits
Client (internal/daemon/client.go): Connection library for CLI-to-daemon communication
WebSocket (internal/websocket/): Server, connection registry, event streaming
Web (internal/web/embed.go): Embedded SPA served at / on the same port as WebSocket (/ws)

See docs/daemon.md for detailed architecture.
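
The backward-compatible PID file reader described above can be sketched as follows. This is an illustration, not the code in internal/daemon/pidfile.go: the pidInfo struct, its JSON key names, and parsePIDFile are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// pidInfo mirrors the fields listed above; the JSON keys are assumed.
type pidInfo struct {
	PID        int    `json:"pid"`
	RepoPath   string `json:"repo_path"`
	SocketPath string `json:"socket_path"`
	StartedAt  string `json:"started_at"`
}

// parsePIDFile accepts the JSON format first and falls back to the
// legacy format, where the file holds only the PID as a plain integer.
func parsePIDFile(data []byte) (pidInfo, error) {
	var info pidInfo
	if err := json.Unmarshal(data, &info); err == nil && info.PID > 0 {
		return info, nil
	}
	text := strings.TrimSpace(string(data))
	pid, err := strconv.Atoi(text)
	if err != nil || pid <= 0 {
		return pidInfo{}, fmt.Errorf("unrecognized PID file contents: %q", text)
	}
	return pidInfo{PID: pid}, nil
}

func main() {
	modern, err := parsePIDFile([]byte(`{"pid":4242,"repo_path":"/repo"}`))
	fmt.Println(modern.PID, modern.RepoPath, err)

	legacy, err := parsePIDFile([]byte("1234\n"))
	fmt.Println(legacy.PID, err)
}
```

A bare integer is valid JSON but does not decode into a struct, so the JSON attempt fails cleanly and the integer fallback takes over.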

Running the Daemon

# Start daemon (background, auto-creates sync worktree)
thrum daemon start

# Check status (shows PID, repo path, WebSocket port)
thrum daemon status

# Stop daemon
thrum daemon stop

# Auto-start (happens automatically via any CLI command)
thrum send "Hello" --to @coordinator

Testing Daemon Code

# Run daemon tests
go test ./internal/daemon/...

# With coverage
go test -cover ./internal/daemon/...

# RPC handler tests
go test ./internal/daemon/rpc/... -v

# State tests
go test ./internal/daemon/state/... -v

Use the StartTestDaemon() helper in internal/daemon/testutil_test.go for integration tests. It registers a t.Cleanup() that force-kills the daemon, which prevents orphaned test processes after a timeout or panic.

Debugging Daemon

Check if daemon is running:

# Check PID file (JSON format)
cat .thrum/var/thrum.pid | jq .

# Check process
ps aux | grep thrum

# Check socket
ls -l .thrum/var/thrum.sock

Test RPC calls manually:

# Using netcat
echo '{"jsonrpc":"2.0","method":"health","id":1}' | nc -U .thrum/var/thrum.sock
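
The netcat call can also be made from Go. This is a minimal sketch, assuming the daemon accepts one newline-terminated JSON-RPC 2.0 request per connection and replies with a single JSON value, which is what the echo-and-nc example suggests; rpcRequest, newRequest, and call are hypothetical helpers, not the client in internal/daemon/client.go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"os"
	"time"
)

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
	ID      int    `json:"id"`
}

// newRequest encodes a request followed by a newline (assumed framing).
func newRequest(id int, method string, params any) ([]byte, error) {
	b, err := json.Marshal(rpcRequest{JSONRPC: "2.0", Method: method, Params: params, ID: id})
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

// call sends one request over the Unix socket and returns the raw response.
func call(socketPath, method string) (json.RawMessage, error) {
	conn, err := net.DialTimeout("unix", socketPath, 2*time.Second)
	if err != nil {
		return nil, fmt.Errorf("dial %s: %w", socketPath, err)
	}
	defer conn.Close()
	_ = conn.SetDeadline(time.Now().Add(5 * time.Second))

	req, err := newRequest(1, method, nil)
	if err != nil {
		return nil, fmt.Errorf("encode request: %w", err)
	}
	if _, err := conn.Write(req); err != nil {
		return nil, fmt.Errorf("write request: %w", err)
	}
	var resp json.RawMessage
	if err := json.NewDecoder(conn).Decode(&resp); err != nil {
		return nil, fmt.Errorf("read response: %w", err)
	}
	return resp, nil
}

func main() {
	resp, err := call(".thrum/var/thrum.sock", "health")
	if err != nil {
		fmt.Fprintln(os.Stderr, err) // expected when no daemon is running
		return
	}
	fmt.Println(string(resp))
}
```

The deadlines keep a debugging session from hanging if the daemon accepts the connection but never answers.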

View daemon logs:

# View last 50 lines (default)
thrum daemon logs

# Stream live
thrum daemon logs -f

# Last hour
thrum daemon logs --since 1h

# Raw file access (for development introspection)
cat .thrum/var/daemon.log

# Check daemon status (shows PID, repo path, WebSocket port)
thrum daemon status

Clean restart:

# Stop daemon
thrum daemon stop

# Remove stale files if needed
rm .thrum/var/thrum.sock
rm .thrum/var/thrum.pid

# Restart
thrum daemon start

Common Daemon Issues

Socket path too long:

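Unix socket paths are limited to about 104 bytes (the address buffer is 104 bytes on macOS and the BSDs, 108 on Linux, including the terminating NUL)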
Use shorter temp directory paths in tests
Example: filepath.Join(tmpDir, "d.sock") not filepath.Join(tmpDir, ".thrum", "var", "thrum.sock")
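
A test helper can reject over-long socket paths before trying to bind. The following is a minimal sketch with hypothetical names (maxSocketPath, checkSocketPath); the limits are the conservative usable lengths for the 108-byte Linux and 104-byte macOS/BSD address buffers, leaving room for the terminating NUL.

```go
package main

import (
	"fmt"
	"runtime"
)

// maxSocketPath returns the longest usable socket path for the given GOOS.
// Linux allows 108 bytes including the NUL terminator; macOS and the BSDs
// allow 104. Other platforms fall back to the smaller value.
func maxSocketPath(goos string) int {
	if goos == "linux" {
		return 107
	}
	return 103
}

// checkSocketPath reports an error if path would not fit in a socket address.
func checkSocketPath(path, goos string) error {
	if limit := maxSocketPath(goos); len(path) > limit {
		return fmt.Errorf("socket path is %d bytes, limit is %d: %s", len(path), limit, path)
	}
	return nil
}

func main() {
	short := "/tmp/t/d.sock"
	fmt.Println(checkSocketPath(short, runtime.GOOS))
}
```

Calling the check in test setup turns an obscure bind failure into a message that names the offending path.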

Permission denied:

Socket should be 0600 (owner only)
Check .thrum/var/ directory permissions

Bind: address already in use:

Another daemon already running
Pre-startup duplicate detection validates no existing daemon serves this repo
Check PID file and kill process
Remove stale socket file

Connection refused:

Daemon not running
Check PID file exists
Verify socket file exists

MCP Server Development

The MCP server (thrum mcp serve) provides native MCP tools for Claude Code agents instead of shelling out to the CLI. It uses stdio transport (JSON-RPC over stdin/stdout).

Key files:

internal/mcp/server.go: Server skeleton and tool registration
internal/mcp/tools.go: Tool handler implementations
internal/mcp/types.go: Request/response type definitions
internal/mcp/waiter.go: WebSocket-based blocking message waiter
cmd/thrum/mcp.go: thrum mcp serve Cobra command

MCP Tools (5 total: 4 active + 1 deprecated):

Tool	Description
`send_message`	Send a message to another agent via @role addressing
`check_messages`	Poll for unread messages mentioning this agent
`wait_for_message`	Block until a message arrives (WebSocket push) or timeout
`list_agents`	List registered agents with active/offline status
`broadcast_message`	(Deprecated) Broadcast to all agents — use `send_message(to="@everyone")` instead

Architecture:

Per-call cli.Client creation (thread-safe; Unix socket connections are cheap)
WebSocket waiter with atomic incrementing JSON-RPC IDs
Identity resolved at startup from .thrum/identities/{name}.json
THRUM_NAME env var or --agent-id flag for multi-agent worktrees

# Start MCP server
thrum mcp serve

# Override agent identity
thrum mcp serve --agent-id furiosa
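
The "atomic incrementing JSON-RPC IDs" item in the architecture list can be illustrated with sync/atomic. This is a sketch of the general technique, not the code in internal/mcp/waiter.go; the idGen type is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// idGen hands out unique, increasing request IDs without a mutex.
type idGen struct{ last atomic.Int64 }

// Next returns the next ID, starting at 1.
func (g *idGen) Next() int64 { return g.last.Add(1) }

func main() {
	var gen idGen
	var wg sync.WaitGroup
	ids := make(chan int64, 100)

	// 100 concurrent callers each take one ID.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ids <- gen.Next()
		}()
	}
	wg.Wait()
	close(ids)

	unique := make(map[int64]bool)
	for id := range ids {
		unique[id] = true
	}
	fmt.Println(len(unique), "unique IDs issued")
}
```

Because each call is a single atomic add, concurrent requests never share an ID, so responses can be matched to their requests by ID alone.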

Sync Engine

The sync engine runs in the daemon, performing fetch/merge/push every 60 seconds (configurable via --sync-interval).

Key files:

internal/sync/loop.go: SyncLoop with periodic and manual sync triggers
internal/sync/merge.go: JSONL merge with deduplication (ULID event_id)
internal/sync/push.go: Git push to remote
internal/sync/branch.go: Safe orphan branch creation via git commit-tree + git update-ref, sync worktree with sparse checkout, 4-level health checks
internal/sync/dedup.go: Event deduplication by event_id

Sync worktree location: .git/thrum-sync/a-sync/ (uses git-common-dir for nested worktree support).

Sparse checkout patterns: /events.jsonl, /messages/, /messages.jsonl (migration compat).
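
Deduplication by event_id can be sketched as a set-based merge. This is an illustration only: the event struct and mergeEvents are hypothetical, the short IDs stand in for real ULIDs, and sorting the merged result by ID is an assumption about ordering rather than a description of internal/sync/merge.go.

```go
package main

import (
	"fmt"
	"sort"
)

// event carries only the fields the merge needs.
type event struct {
	EventID string `json:"event_id"`
	Type    string `json:"type"`
}

// mergeEvents returns the union of local and remote, keeping the first
// occurrence of each event_id. ULIDs sort lexicographically by creation
// time, so ordering by ID yields chronological order.
func mergeEvents(local, remote []event) []event {
	seen := make(map[string]bool, len(local)+len(remote))
	merged := make([]event, 0, len(local)+len(remote))
	for _, batch := range [][]event{local, remote} {
		for _, ev := range batch {
			if seen[ev.EventID] {
				continue // duplicate already taken from an earlier batch
			}
			seen[ev.EventID] = true
			merged = append(merged, ev)
		}
	}
	sort.SliceStable(merged, func(i, j int) bool {
		return merged[i].EventID < merged[j].EventID
	})
	return merged
}

func main() {
	local := []event{{"01A", "agent.register"}, {"01C", "agent.update"}}
	remote := []event{{"01A", "agent.register"}, {"01B", "agent.session.start"}}
	for _, ev := range mergeEvents(local, remote) {
		fmt.Println(ev.EventID, ev.Type)
	}
}
```

Merging this way is idempotent: running the same fetch twice adds nothing, which is what makes a periodic sync loop safe to repeat.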

Worktree Setup

Thrum supports multiple git worktrees sharing a single daemon and data store via the .thrum/redirect mechanism. Feature worktrees point to the main worktree's .thrum/ directory so all worktrees share one daemon, one SQLite database, and one set of JSONL files.

Setting Up a Worktree

# Option 1: Use the thrum setup command
thrum setup /path/to/worktree

# Option 2: Use the setup script
./scripts/setup-worktree-thrum.sh /path/to/worktree

# Option 3: Manual setup
mkdir -p /path/to/worktree/.thrum/identities
echo "/path/to/main/repo/.thrum" > /path/to/worktree/.thrum/redirect

Beads Issue Tracking for Worktrees

All worktrees should share the same beads issue database:

# Use the setup script
./scripts/setup-worktree-beads.sh /path/to/worktree

# Or manual setup
mkdir -p /path/to/worktree/.beads
echo "/path/to/main/repo/.beads" > /path/to/worktree/.beads/redirect

# Verify
cd /path/to/worktree && bd where

Testing Best Practices

1. Use Table-Driven Tests

tests := []struct {
    name string
    input string
    want string
}{
    {"case 1", "input1", "expected1"},
    {"case 2", "input2", "expected2"},
}

for _, tt := range tests {
    t.Run(tt.name, func(t *testing.T) {
        got := MyFunction(tt.input)
        if got != tt.want {
            t.Errorf("got %v, want %v", got, tt.want)
        }
    })
}

2. Clean Up Resources

// Use t.TempDir() for automatic cleanup
tmpDir := t.TempDir()

// Or defer cleanup
db, err := schema.OpenDB(dbPath)
if err != nil {
    t.Fatalf("open test database: %v", err)
}
defer db.Close()

3. Test Error Cases

// Test both happy path and error cases
_, err := MyFunction(invalidInput)
if err == nil {
    t.Error("expected error, got nil")
}

4. Use StartTestDaemon for Integration Tests

// Automatically cleans up on test completion (even on panic/timeout)
daemon := StartTestDaemon(t, tmpDir)
defer daemon.Stop()

Code Style

Formatting: Use go fmt (or gofmt -s)
Imports: Group stdlib, external, internal
Comments: Document exported functions and types
Error messages: Lowercase, no punctuation, wrap with fmt.Errorf
Variable names: Short, descriptive (e.g., db, cfg, msg)

Git Workflow

# Create feature branch
git checkout -b feature/my-feature

# Make changes and test
go test ./...

# Commit
git add .
git commit -m "Add my feature"

# Push
git push origin feature/my-feature

# Create PR
gh pr create

Troubleshooting

"no such table" error

The SQLite projection database is a rebuildable cache. Stop the daemon, delete the database, and start the daemon again to rebuild it from JSONL:

thrum daemon stop
rm .thrum/var/messages.db
thrum daemon start

"cannot open file" error

Check file permissions and directory existence:

ls -la .thrum/
ls -la .thrum/var/

Tests fail with "database is locked"

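Close any open SQLite connections first. If the lock persists, stop the daemon and delete the leftover WAL files (safe here because the database is a cache rebuilt from JSONL):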

rm .thrum/var/*.db-wal
rm .thrum/var/*.db-shm

Daemon won't start (duplicate detection)

The daemon validates no existing instance serves the same repository before starting. Check for a stale PID file:

cat .thrum/var/thrum.pid | jq .
# If the process is dead, remove the PID file
rm .thrum/var/thrum.pid
thrum daemon start

`go build` fails with embed error

If the UI has not been built, internal/web/dist/ needs at least a .gitkeep file:

touch internal/web/dist/.gitkeep

Or build the UI first: make build-ui

Key Dependencies

Dependency	Purpose
cobra	CLI command framework
modernc.org/sqlite	Pure Go SQLite driver
oklog/ulid	ULID generation for event IDs
gorilla/websocket	WebSocket server
go-sdk (MCP)	MCP server SDK

Next Steps

Architecture — the full system design before diving into specific packages
Daemon Architecture — the daemon's component layout for adding new RPC handlers or sync features
RPC API Reference — all existing RPC methods as a reference point when adding new ones
Security & CI/CD — CI workflows, branch protection, and how documentation is deployed

Resources

Architecture: docs/architecture.md
Daemon Architecture: docs/daemon.md
RPC API Reference: docs/rpc-api.md
Sync Design: docs/sync.md
Quickstart Guide: docs/quickstart.md
CLI Reference: docs/cli.md
Identity System: docs/identity.md
Workflow Templates: docs/workflow-templates.md (structured feature development with AI agents)
Agent Reference: llms.txt (concise) and llms-full.txt (detailed)