← Back to writing

vibecheck: auditing ai code against the plan

·6 min read

part of my ai coding workflow. this is the final checkpoint before committing. did we actually build what we planned?

cat typing aggressively

the problem

ai assistants are eager to help. too eager. you ask for a feature, they build the feature, then add "helpful" extras:

  • "i also refactored the adjacent module for consistency"
  • "added comprehensive error handling throughout"
  • "created a utility function we might need later"

none of that was in the plan. it's gold-plating. and it creates:

  • untested code paths
  • unexpected behavior changes
  • scope creep that compounds
  • commits that don't match their message

the test-first enforcement and plan critique happen BEFORE implementation. but what about AFTER? how do you verify the implementation matches the plan?

the solution

a slash command called /vibe-check that audits what was built against what was planned.

it's not a hook. it's a manual checkpoint i run before committing. the name is deliberately casual because it's a sanity check, not a gate.

how it works

the command lives in .claude/commands/vibe-check.md and instructs claude to:

Vibecheck workflow - gather sources, cross-reference against plan, check conventions, generate report

step 1: find the plan sources

the command tells claude where to look:

**A. Claude Plan File (primary)**
# get the most recent plan file
ls -t ~/.claude/plans/*.md | head -1

**B. Story File (if referenced)**
- story files live at your docs folder
- pattern: docs/stories/3.8-*.md

**C. ADR (if architectural work)**
- check docs/decisions/*.md for relevant ADRs

step 2: read and compare

# most recent claude plan
cat "$(ls -t ~/.claude/plans/*.md | head -1)"

# what changed this session
git diff --name-only HEAD~5  # recent commits
git diff --name-only         # uncommitted

step 3: cross-reference

for each file in git diff:

  • was it in the plan's "files to create" or "files to modify"?
  • if not, is it a justified discovery or gold-plating?

step 4: check conventions

the command includes a checklist of project conventions from CLAUDE.md:

| Convention | Check |
|------------|-------|
| **Test-first** | tests written before implementation? |
| **Protocol-based** | new services implement protocols? |
| **Dependency injection** | services via deps.py? |
| **Idempotent** | check state before doing work? |
| **Type hints** | all functions typed? |
| **Graceful degradation** | optional services fail gracefully? |
| **No over-engineering** | only requested changes? |

step 5: generate report

the output is a structured audit:

## Vibe Check: 2026-01-27

### Sources
- **Plan:** happy-paper-salesman.md
- **Story:** 3.17-client-query-service.md
- **ADR:** DATABASE_SCHEMA.md

### Plan Adherence
| Plan Item | Status | Notes |
|-----------|--------|-------|
| Create src/app/models/client.py | ✓ | |
| Modify src/app/services/sales.py | ✓ | |
| Modify src/app/protocols.py | ✓ | |
| Add 13 tests | ✓ | |

### Unplanned Work
| Change | Justified? | Notes |
|--------|------------|-------|
| Updated models/__init__.py | ✓ | required for exports |

### Convention Compliance
| Convention | Status | Notes |
|------------|--------|-------|
| Test-first | ✓ | tests written first |
| Protocols | ✓ | added to SalesProvider |
| Types | ✓ | ClientType Literal |
| Graceful degradation | ✓ | returns [] on error |

### Verdict
**CLEAN**

real example

say i'm building a client lookup feature for dunder mifflin infinity. the plan was to implement get_regional_clients() with 13 specific tests.

after implementation, /vibe-check found:

plan adherence: all 4 files modified as planned, all 13 tests implemented.

unplanned work: one file (models/__init__.py) was modified but not in the plan. verdict: justified. the plan said "export ClientType for reuse" which implies updating the init file.

conventions: all passed. test-first followed, protocol updated, types complete, graceful degradation implemented.

verdict: CLEAN

no drift. no gold-plating. ready to commit.

the command file

here's the structure of the command (.claude/commands/vibe-check.md):

# Check

Audit whether we stayed the course. did we stick to the plan
and follow CLAUDE.md conventions?

## Instructions

### 1. Find the Plan Sources

Check these locations:

**A. Claude Plan File (primary)**
\`\`\`bash
ls -t ~/.claude/plans/*.md | head -1
\`\`\`
Read this file. it contains the implementation plan.

**B. Story File (if referenced)**
- scan conversation for story references like "Story 3.8"
- story files live at your docs/stories/ folder

**C. ADR (if architectural work)**
- check docs/decisions/*.md for relevant ADRs

### 2. Read the Sources

\`\`\`bash
cat "$(ls -t ~/.claude/plans/*.md | head -1)"
git diff --name-only HEAD~5
git diff --name-only
\`\`\`

### 3. Cross-Reference

From the plan file, check each section:
- "Files to Create". were they all created?
- "Files to Modify". were they all modified?
- implementation steps. completed in order?

For each change we made:
- was it in the plan? (Y/N)
- if not, justified discovery or gold-plating?

### 4. Check CLAUDE.md Conventions

| Convention | Check |
|------------|-------|
| **Test-first** | tests written before implementation? |
| **Protocol-based** | new services implement protocols? |
| **Dependency injection** | services via deps.py? |
| **Type hints** | all functions typed? |
| **Graceful degradation** | optional services fail gracefully? |
| **No over-engineering** | only requested changes? |

### 5. Report

[structured output template]

### Verdict
**[CLEAN / MINOR DRIFT / OFF TRACK]**

be honest. catching drift early saves debugging later.

when drift happens

not every vibecheck is clean. when there's drift:

MINOR DRIFT. small additions that weren't planned but are reasonable:

  • added a log statement
  • fixed an unrelated typo
  • updated a comment

OFF TRACK. significant unplanned work:

  • refactored the SalesRep module "while i was in there"
  • added regional filtering not in the story
  • changed behavior of existing Client queries

for minor drift, i usually let it slide. for off track, i either:

  1. revert the unplanned changes
  2. create a separate commit/PR for them
  3. update the plan retroactively (if it was genuinely needed)

why manual, not automated

unlike the plan critique hook, vibecheck is manual. reasons:

  1. context matters. some drift is fine, some isn't. judgment required.
  2. not every session needs it. quick fixes don't need audits.
  3. it's a conversation. sometimes i want to discuss the drift with claude.

the goal isn't bureaucracy. it's catching problems before they compound.

the workflow

Plan → Critique Hook → Implement → /vibe-check → Commit

the critique hook blocks bad plans. the vibecheck catches drift from good plans.

together they create a closed loop: what you planned is what you built.


staying on course is harder than starting on course. the vibecheck is how i verify we got there.