2026 · TypeScript · AI Agents · Dev Tools

VibeHub

A spec-first development platform where features are defined in plain English and compiled into working software by an agentic loop.

Version control, at its core, is about protecting the integrity of shared work. From RCS's file-level locks in 1982, to CVS's concurrent branches in 1986, to git's distributed model in 2005, the primitive we version-control has always been code. Lines of code are diffed, merged, and resolved. But AI has fundamentally changed how we interact with code. Our primary modality is increasingly human language, and the artifact we're version-controlling hasn't kept up.

In my day job, I might have multiple Claude Code sessions running simultaneously, handling separate tickets. By the time I put up a PR, an AI reviewer is going to catch things I missed. The code is increasingly a derived artifact. What actually captures the engineering decision is the intent behind it. VibeHub is built around this thesis: version-control the intent, compile the code.

VibeHub landing page — spec-first version control for the AI era

Vibes as a Primitive

A vibe file is a markdown specification for a single feature, stored in .vibe/features/. It's the source of truth. If the spec and the code diverge, the spec wins. Each vibe file has YAML frontmatter declaring its dependencies, constraints, and metadata, followed by prose describing what the feature should do in plain English. Here's what that looks like:

yaml

---
Name: Payments
Uses: [Auth, Database]
Data: [User, Transaction, PaymentMethod]
Never:
  - Store raw credit card numbers
  - Process payments without authentication
Connects: [Stripe]
---

The Uses: field declares dependencies: the compiler resolves these into a directed graph and compiles features in topological order. Data: describes the entities the feature touches, helping the agent understand the data model. Never: rules are hard constraints injected into every LLM call across both compilation phases. Connects: references external integrations defined in .vibe/integrations/.
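As a rough illustration of how this frontmatter could be read into a typed structure, here is a minimal sketch. The `VibeFrontmatter` interface and the `parseFrontmatter` function are hypothetical names, not VibeHub's actual API, and the parser handles only the YAML subset shown in the example above (scalars, inline lists, and indented `-` lists). A real implementation would more likely rely on a full YAML library.

```typescript
// Hypothetical shape for the frontmatter fields described above.
interface VibeFrontmatter {
  name: string;
  uses: string[];
  data: string[];
  never: string[];
  connects: string[];
}

// Parses only the small YAML subset used in the example:
// "Key: value", "Key: [A, B]", and "Key:" followed by indented "- item" lines.
function parseFrontmatter(source: string): VibeFrontmatter {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(source);
  if (!match) throw new Error("missing frontmatter block");

  const fields = new Map<string, string[]>();
  let currentKey = "";
  (match[1] ?? "").split(/\r?\n/).forEach((line) => {
    const item = /^\s+-\s+(.*)$/.exec(line);
    const bucket = fields.get(currentKey);
    if (item && bucket) {
      bucket.push((item[1] ?? "").trim());
      return;
    }
    const pair = /^(\w+):\s*(.*)$/.exec(line);
    if (!pair) return;
    currentKey = (pair[1] ?? "").toLowerCase();
    const value = (pair[2] ?? "").trim();
    const inline = /^\[(.*)\]$/.exec(value);
    if (inline) {
      const items = (inline[1] ?? "")
        .split(",")
        .map((entry) => entry.trim())
        .filter((entry) => entry.length > 0);
      fields.set(currentKey, items);
    } else {
      fields.set(currentKey, value ? [value] : []);
    }
  });

  const list = (key: string): string[] => fields.get(key) ?? [];
  const name = list("name")[0];
  if (!name) throw new Error("frontmatter needs a Name field");
  return {
    name,
    uses: list("uses"),
    data: list("data"),
    never: list("never"),
    connects: list("connects"),
  };
}
```

Lower-casing the keys is a choice made for this sketch so that the result reads like ordinary TypeScript; the source does not say how the real compiler normalizes field names.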

The broader .vibe/ directory houses the full project context:

project-root/
├── .vibe/
│   ├── meta.json          # project name, version, creation date
│   ├── project.json        # { dev, build, test, install, framework, language }
│   ├── features/
│   │   ├── auth.md         # "OAuth2 via Google and GitHub..."
│   │   ├── payments.md     # "Users can pay via Stripe..."
│   │   └── dashboard/
│   │       └── analytics.md
│   ├── requirements/       # tech stack constraints (YAML)
│   ├── mapping.json        # feature path → source file globs
│   └── remote.json         # { owner, repo, webUrl }

Keeping specs in plain English is a deliberate design choice. The most accessible feature language is human language. Non-technical stakeholders can read and propose changes to a vibe file without needing to understand the codebase. Design decisions are reviewable, diffable, and portable. When Linus Torvalds created git, he started with four tenets: reliability, performance, distribution, and content management. Those same tenets apply to version-controlled intention, although the mechanisms change significantly.

Creating a new VibeHub project with feature specs

The Compilation Engine

VibeHub's compiler doesn't use templates or code generation in the traditional sense. It's an agentic loop: a two-phase pipeline that processes features in topological dependency order, where each phase is an autonomous agent with a restricted toolset.

Phase 1: Agentic Code Generation

The first phase uses a strong model (Claude Opus, GPT-4o, or Gemini Pro, depending on the user's choice) with a deliberately restricted set of tools:

Tool	Purpose
`write_file`	Create complete files from scratch (no partial edits)
`read_file`	Inspect existing workspace code
`list_files`	Directory enumeration with glob patterns
`search_files`	Regex-based pattern matching across files
`finish`	Signal completion

No shell access. No partial edits. The agent must write complete files from scratch. This constraint is intentional; it forces the agent to reason holistically about each file rather than patching incrementally.

The agent receives the full feature specification, any dependency specs, hard Never: constraints, and code headers from upstream features. These headers are extracted automatically (imports, type definitions, exports, function signatures) and capped at 80 lines per file, which keeps the prompt compact while giving the agent enough context to import correctly. The principle is "import, don't redefine", which prevents the duplication that is common in AI-generated code.
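To make the header extraction concrete, here is a minimal sketch of one way it might work. It is a line-based heuristic, and the names `extractHeaders`, `HEADER_PATTERN`, and `MAX_HEADER_LINES` are assumptions made for illustration; only the 80-line cap comes from the description above. A production version would more plausibly walk the TypeScript AST.

```typescript
// Cap taken from the text above; the constant name is an assumption.
const MAX_HEADER_LINES = 80;

// Heuristic: a line is a "header" if it starts with an import, an export,
// or a top-level declaration keyword.
const HEADER_PATTERN = /^\s*(import|export|type|interface|declare|function|class|enum)\b/;

function extractHeaders(source: string, maxLines: number = MAX_HEADER_LINES): string[] {
  return source
    .split(/\r?\n/)
    .filter((line) => HEADER_PATTERN.test(line))
    // Drop the opening brace so only the signature remains.
    .map((line) => line.replace(/\s*\{\s*$/, "").trim())
    .slice(0, maxLines);
}
```

Given a file that imports a helper, declares an interface, and exports a function, this returns just the import line and the two signatures, without any bodies.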

This phase typically completes in 3-5 tool-use rounds, with a budget of 10.

Phase 2: Agentic Validation & Fixing

The second phase is where things get interesting. A faster model takes over with an expanded toolset: edit_file for surgical string replacements, and critically, run_command for sandboxed shell execution. The allowlisted commands are intentionally narrow:

tsc, npx tsc, npx eslint, node, npm test, npm run,
npx jest, npx vitest, npx prettier, cat, ls, find, head, tail, wc

The agent reads project.json, runs npm install, executes the build/typecheck (npx tsc --noEmit), and iteratively fixes errors. When tests fail, it fixes the implementation, never the test files, ensuring tests remain an honest signal.
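A prefix allowlist like the one above might be enforced along these lines. This is a sketch under stated assumptions: `isCommandAllowed` is a hypothetical name, and rejecting shell metacharacters outright is a design choice added here so that an allowed prefix cannot be chained with another command; the source does not describe how VibeHub handles that case.

```typescript
// Prefixes copied from the allowlist above.
const ALLOWED_PREFIXES: readonly string[] = [
  "tsc", "npx tsc", "npx eslint", "node", "npm test", "npm run",
  "npx jest", "npx vitest", "npx prettier",
  "cat", "ls", "find", "head", "tail", "wc",
];

// Assumption: any chaining, piping, substitution, or redirection is refused.
const SHELL_METACHARACTERS = /[;&|`$<>\n]/;

function isCommandAllowed(command: string): boolean {
  if (SHELL_METACHARACTERS.test(command)) return false;
  const normalized = command.trim().replace(/\s+/g, " ");
  if (normalized.length === 0) return false;
  // Require a word boundary after the prefix so "node" does not admit "nodemon".
  return ALLOWED_PREFIXES.some(
    (prefix) => normalized === prefix || normalized.startsWith(prefix + " "),
  );
}
```

Matching on a prefix plus a following space, rather than a bare `startsWith`, is what keeps a short entry such as `node` from accidentally permitting unrelated binaries that share its first letters.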

The key design insight is the split itself. Phase 1 optimizes for code quality with a strong model that reasons deeply. Phase 2 optimizes for correctness and cost with a fast model that iterates quickly on mechanical fixes: type errors, missing imports, test assertions. Phase 2 has a budget of 25 iterations (typical: 5-10) and a 12-minute wall-clock deadline.

Dependency Resolution and Topological Compilation

Features declare dependencies via Uses: frontmatter. The compiler builds a directed dependency graph, detects cycles (reported as errors), and topologically sorts for compilation order. Generated code accumulates, so later features can import types and functions from earlier ones. This is what makes the system compositional: features build on each other semantically, not just file-by-file.
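The ordering step can be sketched with Kahn's algorithm, which yields a topological order and exposes cycles as the features that never become ready. The `DependencyGraph` type and `compilationOrder` function are illustrative names, and the choice of Kahn's algorithm is an assumption; the source only states that the graph is sorted topologically and that cycles are reported as errors.

```typescript
// Feature name -> names listed in its Uses: field.
type DependencyGraph = Map<string, string[]>;

function compilationOrder(graph: DependencyGraph): string[] {
  const remaining = new Map<string, number>(); // unmet dependency count
  const dependents = new Map<string, string[]>(); // reverse edges
  graph.forEach((_uses, name) => {
    remaining.set(name, 0);
    dependents.set(name, []);
  });
  graph.forEach((uses, name) => {
    uses.forEach((dep) => {
      const list = dependents.get(dep);
      if (!list) throw new Error(`${name} uses unknown feature ${dep}`);
      list.push(name);
      remaining.set(name, (remaining.get(name) ?? 0) + 1);
    });
  });

  // Start with features that depend on nothing.
  const ready = Array.from(graph.keys())
    .filter((name) => remaining.get(name) === 0)
    .sort();
  const order: string[] = [];
  for (let next = ready.shift(); next !== undefined; next = ready.shift()) {
    order.push(next);
    (dependents.get(next) ?? []).forEach((dependent) => {
      const left = (remaining.get(dependent) ?? 0) - 1;
      remaining.set(dependent, left);
      if (left === 0) ready.push(dependent);
    });
  }

  // Anything never emitted is part of, or downstream of, a cycle.
  if (order.length !== graph.size) {
    const stuck = Array.from(graph.keys()).filter((name) => order.indexOf(name) === -1);
    throw new Error(`dependency cycle among: ${stuck.join(", ")}`);
  }
  return order;
}
```

For the Payments example, with Auth itself depending on Database, the order comes out as Database, Auth, Payments, so each feature is compiled only after everything it imports from already exists.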

Merging Intent, Not Syntax

In traditional version control, merging happens at the line level. If two people change the same line, you get a conflict marker and a human figures it out. In VibeHub, merging happens at the spec level. The merge algorithm operates on vibe files directly:

typescript

const conflicts = detectConflicts(baseFeatures, headFeatures, mainFeatures);
// conflicts: files changed differently on both sides since the branch point

const merged = computeMergedVibes(base, head, main, resolutions);
// resolutions: accept-head | accept-main | AI-feathered content

If both sides changed the same feature spec, you get a conflict — but it's a conflict of intent, not syntax. And because these are human-readable descriptions, resolving them is tractable in a way that AST-level code merges are not.

We also built an AI conflict resolution mode ("feathering") that calls Claude to produce a merged spec when both sides changed the same file. It reads both intents and synthesizes a coherent specification. This works because intent is composable in ways that implementation details often aren't.
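To show what spec-level conflict detection means in practice, here is a minimal three-way comparison. It is not VibeHub's `detectConflicts`; the function name `findSpecConflicts` and the `FeatureMap` and `SpecConflict` types are hypothetical, and the rule used (a conflict exists only when both sides changed a file since the branch point and ended up with different content) is inferred from the comment in the snippet above.

```typescript
// Feature path -> full spec content. A missing key means the file does not exist.
type FeatureMap = Map<string, string>;

interface SpecConflict {
  path: string;
  base: string | undefined;
  head: string | undefined;
  main: string | undefined;
}

function findSpecConflicts(base: FeatureMap, head: FeatureMap, main: FeatureMap): SpecConflict[] {
  const paths = new Set<string>();
  [base, head, main].forEach((features) => {
    features.forEach((_content, path) => paths.add(path));
  });

  const conflicts: SpecConflict[] = [];
  Array.from(paths).sort().forEach((path) => {
    const b = base.get(path);
    const h = head.get(path);
    const m = main.get(path);
    const headChanged = h !== b;
    const mainChanged = m !== b;
    // One-sided changes merge cleanly; identical changes on both sides do too.
    if (headChanged && mainChanged && h !== m) {
      conflicts.push({ path, base: b, head: h, main: m });
    }
  });
  return conflicts;
}
```

Because deletions are represented as `undefined`, the same comparison covers the case where one side removes a spec while the other edits it.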

Sandboxing and Security

Every compile job runs in an isolated temporary directory, automatically destroyed on completion (even on agent crash). Security constraints include:

Path traversal protection: path.resolve validates all file paths; ../../ escapes are blocked
120-second timeout per command, 50KB output truncation, 50-match cap on search results
Command allowlist enforcement: only allowlisted prefixes are permitted
BYOK encryption: user API keys encrypted at rest with AES-256-CBC, decrypted only at compile time by the agent service
Per-user concurrency limits: 1 active job (free tier), 3 (BYOK); returns 429 when at capacity
Zombie reaper: agent workers periodically call POST /api/agent/jobs/reap to mark stale running jobs as failed after 15 minutes
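The path traversal check in the first item could look roughly like this. It uses Node's `path.resolve`, as the list mentions, plus `path.relative` to test containment; `resolveInsideWorkspace` is a hypothetical helper name. The sketch does not follow symlinks, which a hardened implementation would also need to consider.

```typescript
import * as path from "node:path";

// Resolves a requested path against the job's workspace and refuses
// anything that lands outside it.
function resolveInsideWorkspace(workspaceRoot: string, requested: string): string {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`path escapes workspace: ${requested}`);
  }
  return target;
}
```

Comparing the relative path, rather than checking whether the resolved string starts with the root, avoids the false positive where a sibling directory such as `/tmp/job-10` shares a string prefix with `/tmp/job-1`.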

Architecture

The platform is a monorepo of four packages:

vibehub/
├── packages/
│   ├── web/       # Next.js 14 (App Router) — collaborative review platform
│   ├── desktop/   # Tauri 2 + React — VibeStudio, the local editing environment
│   ├── agent/     # Node.js polling worker — the agentic compile loop
│   └── cli/       # Go binary — vibe init, vibe clone, vibe import, vibe updates

The agent worker polls GET /api/agent/jobs/next every 5 seconds, runs the compile job, and patches results back. Multiple instances can be deployed for horizontal scaling since the job queue handles atomic claiming. Real-time progress is streamed via structured CompileEvents buffered every 2 seconds, with the frontend polling every 3.
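The claim-and-reap lifecycle can be illustrated with an in-memory stand-in for the job queue. Everything here is a sketch: `CompileJob`, `claimNext`, and `reapStale` are hypothetical names, and the single-threaded array only mimics what a real queue would do with one conditional database update. The 15-minute staleness threshold is the figure given for the zombie reaper earlier.

```typescript
type JobStatus = "queued" | "running" | "done" | "failed";

interface CompileJob {
  id: string;
  status: JobStatus;
  workerId?: string;
  startedAt?: number; // epoch milliseconds
}

const STALE_AFTER_MS = 15 * 60 * 1000;

// Claims the oldest queued job for one worker. In a real queue this would be
// a single atomic "update where status = queued" so two workers cannot win.
function claimNext(jobs: CompileJob[], workerId: string, now: number): CompileJob | undefined {
  const job = jobs.find((candidate) => candidate.status === "queued");
  if (!job) return undefined;
  job.status = "running";
  job.workerId = workerId;
  job.startedAt = now;
  return job;
}

// Marks jobs that have been running longer than the threshold as failed
// and returns their ids.
function reapStale(jobs: CompileJob[], now: number): string[] {
  const reaped: string[] = [];
  jobs.forEach((job) => {
    if (job.status !== "running" || job.startedAt === undefined) return;
    if (now - job.startedAt > STALE_AFTER_MS) {
      job.status = "failed";
      reaped.push(job.id);
    }
  });
  return reaped;
}
```

Passing `now` in explicitly, instead of calling `Date.now()` inside the functions, keeps the sketch deterministic and easy to test.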

VibeHub project overview showing feature specs and compilation status

The provider abstraction supports Anthropic, Google, and OpenAI through a unified interface:

typescript

interface LLMProvider {
  readonly model: string;
  createMessage(
    system: string,
    messages: MessageParam[],
    tools: Tool[]
  ): Promise<{ content: ContentBlock[]; stop_reason: string }>;
}

Internally, the canonical message format is Anthropic's tool-use protocol. The GeminiProvider translates Anthropic's call-ID-based tool-use format to Gemini's name-based function calling at the SDK boundary.
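The ID-to-name translation can be sketched as follows. The types below are simplified local stand-ins for the two wire formats, not the SDK types, and `toGeminiContents` is a hypothetical name rather than the actual GeminiProvider code. The sketch assumes tool results are plain strings and wraps them under a `result` key, which is an illustrative choice.

```typescript
// Simplified stand-ins for Anthropic-style content blocks.
type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface AnthropicMessage {
  role: "user" | "assistant";
  content: AnthropicBlock[];
}

// Simplified stand-ins for Gemini-style parts.
type GeminiPart =
  | { text: string }
  | { functionCall: { name: string; args: Record<string, unknown> } }
  | { functionResponse: { name: string; response: { result: string } } };

interface GeminiContent {
  role: "user" | "model";
  parts: GeminiPart[];
}

function toGeminiContents(messages: AnthropicMessage[]): GeminiContent[] {
  // Tool results refer to calls by ID, so remember which name each ID had.
  const nameByCallId = new Map<string, string>();

  return messages.map((message): GeminiContent => ({
    role: message.role === "assistant" ? "model" : "user",
    parts: message.content.map((block): GeminiPart => {
      switch (block.type) {
        case "text":
          return { text: block.text };
        case "tool_use":
          nameByCallId.set(block.id, block.name);
          return { functionCall: { name: block.name, args: block.input } };
        case "tool_result": {
          const name = nameByCallId.get(block.tool_use_id);
          if (!name) throw new Error(`no tool call recorded for ${block.tool_use_id}`);
          return { functionResponse: { name, response: { result: block.content } } };
        }
      }
    }),
  }));
}
```

The lookup map is the whole trick: a result block carries only the call ID, so the translator has to have seen the matching call earlier in the conversation to recover the function name.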

The Vibe CLI

The CLI is a Go binary that deliberately avoids git terminology. Projects have updates, not pull requests or branches. Statuses are in review, applied, and closed — not open, merged, and closed. This isn't just cosmetic; the vocabulary signals that you're working with intent, not code, and that the workflow is fundamentally different from what git models.

bash

vibe updates              # list all updates (default)
vibe updates list         # explicit list, with --status open|merged|closed
vibe updates close <id>   # close without merging
vibe updates reopen <id>  # reopen a closed update
vibe updates retry <id>   # retry a failed compilation

Update IDs support prefix matching, just like git commit SHAs. vibe updates close c2d4 resolves to the full UUID; if the prefix is ambiguous, it errors with "use a longer prefix." Auth is handled via a VIBEHUB_TOKEN environment variable, and the client reads .vibe/remote.json for project identity — cascading from a per-project webUrl, to a VIBEHUB_WEB_URL env var, to the default https://getvibehub.com.
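Prefix resolution of this kind is simple to sketch. The real CLI is written in Go; the version below is in TypeScript only to stay consistent with the other examples on this page, and `resolveUpdateId` is a hypothetical name. Treating the match as case-insensitive and rejecting an empty prefix are assumptions, not documented behavior.

```typescript
function resolveUpdateId(prefix: string, knownIds: readonly string[]): string {
  const needle = prefix.trim().toLowerCase();
  if (needle.length === 0) throw new Error("empty prefix");

  const [first, ...rest] = knownIds.filter((id) => id.toLowerCase().startsWith(needle));
  if (first === undefined) {
    throw new Error(`no update matches "${prefix}"`);
  }
  if (rest.length > 0) {
    throw new Error(`ambiguous prefix "${prefix}": use a longer prefix`);
  }
  return first;
}
```

With two updates whose IDs start with `c2d4…` and `c2aa…`, the prefix `c2d4` resolves to the first, while `c2` is rejected as ambiguous.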

Snapshots and Recompilation

Every meaningful state change (merging an update, editing a feature, forking a project) creates an immutable snapshot of the specs. Because code is a derived artifact, any snapshot can be recompiled independently. This means you can recompile an old project with a newer, better model and get an upgraded implementation without changing a single line of spec. It also enables cross-model comparisons: compile the same specs with Claude and Gemini, diff the outputs, pick the better one.

Intent Diffing

Once you accept that specs are the source of truth, code review has to change too. A traditional PR diff shows you line-level additions and deletions — which lines of text moved where. But when the artifact under review is a human-language specification, a line-level diff is actively misleading. Someone rephrases a paragraph for clarity, and the diff lights up red and green even though nothing behavioral changed. Meanwhile, a single added sentence that introduces a hard new constraint gets buried in the noise.

We built intent diffing to solve this. When an update is opened on VibeHub, the server fires an async computation that sends all changed spec files through a single batched Gemini Flash call. The model is prompted to synthesize the minimum number of bullet points needed to completely explain each file's intent — grouped by theme, not extracted granularly. The output for each file is a set of highlights: concise, color-coded summaries of what the feature adds, removes, or modifies behaviorally. Rewording, formatting, and clarifications that don't alter meaning are filtered out entirely.

typescript

interface IntentHighlight {
  kind: 'added' | 'removed' | 'modified';
  text: string;       // concise bullet describing the behavioral change
}

An earlier iteration included a confidence score on each delta, with the frontend filtering below 0.7. In practice, the confidence values were noisy and the filtering created more confusion than it resolved — reviewers would wonder why certain changes weren't showing up. We dropped it in favor of a simpler model: the LLM is prompted to only emit highlights that represent genuine behavioral changes in the first place, rather than emitting everything and filtering after the fact. The result is cached on the update record so every reviewer sees the same output. If the call fails, the next page load retries transparently.

The reviewer sees a toggle between two views: Intent changes (the default) and Content diff (the raw line-level fallback). The intent view renders each highlight as a colored bullet — green for additions, yellow for modifications, red for removals — against a tinted background. Instead of parsing a wall of green and red line diffs to figure out what actually changed, you get a clean, scannable list: "Adds constraint: never process refunds without manager approval." "Removes dependency on Notifications feature." "Adds expiresAt field to Session entity." You're reviewing decisions, not prose.
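On the receiving side, the model's output still has to be validated before it is cached and rendered. The sketch below assumes the model returns a JSON array of objects shaped like `IntentHighlight`; that response format, along with the names `parseHighlights` and `HIGHLIGHT_COLORS`, is an assumption for illustration. The color mapping follows the description above.

```typescript
interface IntentHighlight {
  kind: "added" | "removed" | "modified";
  text: string; // concise bullet describing the behavioral change
}

// Colors as described for the intent view.
const HIGHLIGHT_COLORS: Record<IntentHighlight["kind"], string> = {
  added: "green",
  modified: "yellow",
  removed: "red",
};

// Keeps only well-formed highlights and silently drops anything else,
// such as an unknown kind or an empty bullet.
function parseHighlights(raw: string): IntentHighlight[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) throw new Error("expected a JSON array of highlights");

  const highlights: IntentHighlight[] = [];
  parsed.forEach((entry: unknown) => {
    if (typeof entry !== "object" || entry === null) return;
    const { kind, text } = entry as { kind?: unknown; text?: unknown };
    if (
      (kind === "added" || kind === "removed" || kind === "modified") &&
      typeof text === "string" &&
      text.trim() !== ""
    ) {
      highlights.push({ kind, text: text.trim() });
    }
  });
  return highlights;
}
```

Dropping malformed entries instead of failing the whole response is one possible policy; failing loudly and letting the next page load retry would be equally defensible.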

Intent diff view — behavioral changes extracted from spec modifications, showing added and modified deltas

Content diff view — traditional line-level diff of the same spec changes for comparison

The contrast between those two views is the whole argument for intent diffing in one screenshot. The content diff is honest but noisy. The intent diff tells you what the author actually meant to change. For a non-technical stakeholder reviewing a feature spec — a PM, a designer, a legal reviewer — the intent view is the difference between being able to participate in code review and being locked out of it.

Under the hood, this pairs with the merge system. Intent diffing is read-path: it helps you understand what changed. The merge system (detectConflicts, computeMergedVibes) is the write-path: it resolves conflicting changes at the spec level. Together, they close the loop. You can review intent, merge intent, and recompile — without ever touching a line of generated code.

Why This Matters

The central dogma of biology describes a flow from DNA to RNA to proteins. Engineering has gained a step upstream of its own DNA: prompting now comes before code, yet it is missing from the primitives we generate and version-control. VibeHub is an attempt to fix that. The most important thing in a software project isn't the code; it's the decision behind the code. VibeHub makes those decisions first-class: versioned, reviewable, and portable.
