Saad Khalid

Codebase Knowledge

A set of Claude Code skills I wrote to manage persistent knowledge about a codebase. The idea is simple: agents forget between sessions, and git log alone can't explain why a decision was made. The knowledge skills fill that gap — a thin semantic layer on top of git, written and maintained by the agent, never by me.

Get the files: download knowledge-skills.zip or browse them on GitHub Gist. The bundle contains the four SKILL.md files, the scripts, and the Claude Code and git hooks.

How it works

There's a ./knowledge directory at the root of every project I work in. It has an INDEX.md, an ENVIRONMENT.md, topic files grouped by domain, a timeline of decisions, and a scripts/ folder that the agent can run. At session start, Claude reads INDEX.md + ENVIRONMENT.md — about 400 tokens — and uses the trigger table there to decide what else to load.

The core principle: git is ground truth. If git log or grep can answer it, the knowledge directory doesn't restate it. Topics capture the things git can't: verbal decisions, gotchas discovered through pain, constraints from stakeholders, postmortems. Timeline entries are semantic indexes into git — they say why, link a SHA for what.

The skills

Four sub-skills, each triggered by a top-level knowledge dispatcher that reads the state of ./knowledge and routes based on what the session actually needs. I never pick manually. The bundle ships only the four sub-skills; the dispatcher is a thin personal wrapper I keep out of the distribution.

knowledge-init — bootstraps the directory from scratch. Scans the repo, pulls contributors, CI/CD, infra, monitoring config, and runs a dual-agent consensus pass on ENVIRONMENT.md claims.
knowledge-prune — detects drift. Uses a git-based staleness pre-filter (only topics where referenced paths changed since the topic's Updated: date), then runs two read-only agents independently to verify before applying corrections.
knowledge-rollup — compresses the base when it gets bloated. Archives old timeline files, merges oversized categories, strips entries that just restate commit messages.
knowledge-insights — targeted analysis: dependencies, dead code, security surface, architecture diagrams, git hotspots, CI health. Some of these run dual-agent consensus; the agent acts as judge.
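To make the staleness pre-filter in knowledge-prune concrete, here is a minimal Python sketch. It is not the shipped script. The names git_log_command, run_git, and stale_topics are hypothetical, and it assumes each topic's Updated: date and referenced paths have already been parsed out of the topic file. Skipping topics with no referenced paths is also an assumption of this sketch.

```python
import subprocess
from datetime import date


def git_log_command(updated: date, paths: list[str]) -> list[str]:
    """Build a git command that lists commits touching `paths` since `updated`."""
    since = f"--since={updated.isoformat()}T00:00:00"
    return ["git", "log", since, "--format=%H", "--", *paths]


def run_git(cmd: list[str]) -> str:
    """Run the command in the current repo and return its stdout."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def stale_topics(topics: dict[str, tuple[date, list[str]]], run=run_git) -> list[str]:
    """Return names of topics whose referenced paths changed since Updated:.

    `topics` maps a topic name to (updated_date, referenced_paths).
    Topics without referenced paths are skipped (an assumption of this sketch).
    """
    stale = []
    for name, (updated, paths) in topics.items():
        if not paths:
            continue
        # Any commit hash in the output means the topic may have drifted.
        if run(git_log_command(updated, paths)).strip():
            stale.append(name)
    return stale
```

Only the topics this returns would go on to the slower two-agent verification step. Like the real scripts, the sketch only reads and reports.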

Scripts are sensors, not writers

Every script is read-only. They output to stdout and never touch the knowledge files. Some of them spawn claude -p subprocesses with a restricted tool allowlist to do AI-powered drift detection or bootstrap planning — still read-only. The calling agent reads the output and decides what to persist. This is the rule I care most about: only the agent with full context writes.

The consensus pattern is the other thing I'd flag. For high-stakes writes (drift corrections, security findings, postmortem root causes), the scripts run two independent agents and output both sets of findings side by side. The caller merges them: both agree → apply; only one reports it → tag [single-agent, verify]; they disagree → mark [?]. A single-agent hallucination corrupts the knowledge base, and the error cascades into every future session. Two agents disagreeing is a cheap signal to slow down.
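The merge rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual caller logic. It assumes each agent's findings have been reduced to a dict mapping a claim identifier to a conclusion string, and the function name merge_consensus is hypothetical.

```python
def merge_consensus(agent_a: dict[str, str], agent_b: dict[str, str]) -> dict[str, str]:
    """Merge two independent sets of findings using the three-way rule.

    - both agents report the same conclusion  -> keep it as-is
    - only one agent reports the claim        -> tag [single-agent, verify]
    - both report it but conclusions differ   -> mark [?]
    """
    merged: dict[str, str] = {}
    for claim in sorted(agent_a.keys() | agent_b.keys()):
        in_a, in_b = claim in agent_a, claim in agent_b
        if in_a and in_b:
            if agent_a[claim] == agent_b[claim]:
                merged[claim] = agent_a[claim]
            else:
                merged[claim] = f"[?] {agent_a[claim]} | {agent_b[claim]}"
        else:
            value = agent_a[claim] if in_a else agent_b[claim]
            merged[claim] = f"[single-agent, verify] {value}"
    return merged
```

Exact string equality is a crude stand-in for "agree". In practice the calling agent judges whether two differently worded findings say the same thing.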

Hooks: capture on autopilot

What I kept forgetting to do was write the timeline entry after a session, so I added hooks. There are two Claude Code hooks (Stop and PostToolUse) and a git post-commit hook. All three run async, skip trivial commits (wip, chore, fmt, merges), and spawn a claude -p subprocess with only the Read and Write tools allowed, which appends a two-line entry to ./knowledge/timeline/YYYY-MM-DD.md.
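As a rough illustration of the "skip trivial commits" check, here is a Python sketch. The matching rule is an assumption: it treats a commit as trivial when its subject starts with wip, chore, or fmt, or when it has more than one parent. The shipped hooks may detect these differently, and the names is_trivial_commit and timeline_path are hypothetical.

```python
import re
from datetime import date

# Assumed convention: trivial commits start with one of these words.
TRIVIAL_PREFIX = re.compile(r"^(wip|chore|fmt)\b", re.IGNORECASE)


def is_trivial_commit(subject: str, parent_count: int) -> bool:
    """Return True for commits that should not produce a timeline entry."""
    if parent_count > 1:  # merge commits have two or more parents
        return True
    return bool(TRIVIAL_PREFIX.match(subject.strip()))


def timeline_path(day: date) -> str:
    """Path of the timeline file for a given day (YYYY-MM-DD.md)."""
    return f"./knowledge/timeline/{day.isoformat()}.md"
```

A hook built this way would exit before spawning any subprocess when is_trivial_commit returns True, so trivial commits cost nothing.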

The Stop hook checks the transcript for Write/Edit/MultiEdit tool uses before spending a token — if the session didn't actually modify files, it exits silently. Installer is in knowledge-hooks/install.sh: bash install.sh --all merges the hook config into ~/.claude/settings.json via jq (idempotent — re-runs are safe).
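The transcript check in the Stop hook could look roughly like the Python sketch below. It assumes the transcript is a JSONL file in which each line is a JSON object, with tool calls appearing as blocks of type "tool_use" inside message.content. That layout and the function name session_modified_files are assumptions of this sketch, not a description of the shipped hook.

```python
import json
from typing import Iterable

FILE_WRITING_TOOLS = {"Write", "Edit", "MultiEdit"}


def session_modified_files(transcript_lines: Iterable[str]) -> bool:
    """Return True if any transcript entry used a file-writing tool."""
    for line in transcript_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial or malformed lines
        if not isinstance(entry, dict):
            continue
        message = entry.get("message")
        if not isinstance(message, dict):
            continue
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-text messages carry no tool calls
        for block in content:
            if (
                isinstance(block, dict)
                and block.get("type") == "tool_use"
                and block.get("name") in FILE_WRITING_TOOLS
            ):
                return True
    return False
```

The scan is plain JSON parsing with no model call, which is what lets the hook exit silently on read-only sessions without spending a token.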

How I actually use it

When I start a session on a repo I haven't touched in a while, the first thing the agent does is read INDEX.md. That's usually enough to orient. If I'm about to change an area I don't know, the trigger table tells the agent exactly which topic file to load — not all of them. When something breaks and takes me more than fifteen minutes to diagnose, that becomes a postmortem entry with a SHA. When I make a non-obvious decision ("we're using argon2 not bcrypt because the client said so"), that becomes a ## Decisions line. And every few weeks I run knowledge-prune to catch the drift I didn't notice.

It's not perfect. The consensus step is slow on large repos. The size budgets are aggressive, and I sometimes hit them mid-week and have to roll up. But the alternative (starting every session from zero, or worse, from half-remembered code comments) was worse.

Installing

Unzip the download into ~/.claude/skills/. Each directory with a SKILL.md becomes an invocable skill. Copy the knowledge-scripts/ contents into any project's ./knowledge/scripts/ directory. The first time you run knowledge-init, the agent will scaffold the rest. For auto-capture, run bash knowledge-hooks/install.sh --all.