Gap Analysis: PRD Vision vs What's Built

Date: 2026-02-24 Source PRD: Bitcoin Agent Brainstorm (Feb 23) Method: Full codebase inventory (30+ API routes, 20 tables, 18 RPCs, 14 agent files) compared against PRD vision

Executive Summary

The web app has strong foundations — search, playbook, threads, community, and ETL are all working. But the core PRD thesis ("The agent IS the product") remains unrealized. The 6 critical gaps all relate to making the product genuinely agentic: Oracle routing, hybrid search, citation gating, BYOK, tiers, and the terminal-native interface.

What's Built

Inventory Scorecard

Category	PRD Items	Built	Partial	Missing
Search & Retrieval	4	1	1	2
Agent Orchestration	4	1	1	2
Workflows (8 total)	8	3	2	3
Auth & Identity	3	2	0	1
Monetization	3	0	0	3
Data & ETL	3	3	0	0
Community	5	2	1	2
Infrastructure	4	3	0	1
Total	34	15	5	14

Completed Features

Completed Features Detail

Feature	Implementation	Key Files	Data
Vector search	pgvector HNSW, halfvec(1536), m=16, ef=64	`src/lib/db.ts`, migration 007-008	11,546 chunks, 345 episodes
Playbook analysis	4 parallel Haiku lenses + Sonnet synthesis	`src/lib/agents/playbook.ts`, `lenses/`	Beliefs, timeline, sentiment, context
Speaker graph	force-graph-2d, similarity + coappearance edges	`/api/graph/data`, 3 RPCs	All speakers with 2+ appearances
Thread system	CRUD + beliefs + share/fork/vote	6 API routes, 4 tables	Max 5 active per user
Community cards	Claims + citations + voting + feed	3 tables, 2 RPCs, `/api/feed`	Cursor-paginated
ETL pipeline	Trigger.dev weekly, HF SHA detection	`src/trigger/etl-pipeline.ts`	~$0.001/incremental run
Auth	Magic Link + Google OAuth	Supabase Auth, auto-profile trigger	RLS on all tables
Alerts	Fear/Greed + speaker/topic mention	Cron check, notifications table	Realtime on notifications
Beta invites	Code generation + atomic claim	`beta_invites` table, `/invite/[code]`	Status: pending/sent/claimed
Belief analysis	Cross-episode agreement, confidence	`src/lib/agents/analyze-belief.ts`	15s timeout, 10 req/min limit

The Gaps

Architecture Gap Map

This diagram shows the PRD architecture with gaps highlighted in red:

Legend: Green = built, Red = gap

Critical Gaps (G1-G6)

These gaps block the core PRD identity: "The agent IS the product."

G1: Oracle Agent — Query Decomposition & Orchestration

Aspect	PRD	Reality
Query handling	Decompose into 2-3 sub-queries	Single-shot search
Evidence grading	LLM grades if sufficient, retries up to 2x	No grading, no retry
Orchestration	Routes to Jackal, Playbook, or Oracle based on complexity	All queries route to `direct-search`
Multi-step reasoning	Parallel search → grade → analyze → verify	Search → synthesize (1 step)

Impact: Can't answer complex questions like "How has Lightning Network narrative evolved?" — requires decomposition into early adoption + scaling debate + current state sub-queries.

G2: Hybrid Search (Jackal)

Search Type	Strengths	Weaknesses	Status
Vector (semantic)	Finds related concepts, handles paraphrasing	Misses exact terms, acronyms, proper nouns	Built
Keyword (full-text)	Exact match for "UTXO", "block size war", "HODL"	No semantic understanding	Missing
Hybrid (RRF)	Best of both — semantic + exact	Requires both indexes	Missing

Implementation effort: Pure SQL migration — add tsvector column, GIN index, new RPC. No new infrastructure.

Impact: Bitcoin terminology is heavy on exact terms (UTXO, HODL, halving, mempool). Vector-only search returns semantically related but not necessarily relevant results for these queries.

G3: Citation Gate

Aspect	PRD	Reality
Claim extraction	LLM splits response into individual claims	Not implemented
Evidence matching	Each claim checked against cited sources	Playbook cites sources but doesn't verify claims
Ungrounded handling	Flagged or removed before user sees response	Hallucinations pass through unchecked
Trust signal	"Every claim grounded" is a product differentiator	No verification guarantee

Impact: The PRD's non-negotiable invariant ("Every claim cites a source") is not enforced systematically.

G4: BYOK (Bring Your Own Key)

Component	PRD	Reality
Key storage	Supabase Vault (encrypted at rest)	Not implemented
Key submission UI	One-time setup flow	Not implemented
Runtime fetch	Task fetches user's key before LLM call	All calls use app's API key
Billing	User pays their LLM provider directly	All costs borne by us
Key rotation	User can update/delete their key	Not implemented

Impact: No path to monetization. All LLM costs ($0.02-0.05/query) come from our budget.

G5: User Tiers

Legend: Green = built, Yellow = partial, Red = missing

Current state: No tier differentiation. All users get the same features with no usage limits.

G6: Ink TUI (Terminal-Native Agent)

Aspect	PRD	Reality
Primary interface	Terminal (Ink)	Web (Next.js)
Build order	"Agent first, web second"	Web first, no agent CLI
Protocol	Typed JSON messages (renderer-agnostic)	AI SDK streams (web-coupled)
Renderer swap	Same agent, two frontends	Single web frontend

Impact: The PRD's foundational thesis is inverted. This may be a deliberate pivot (web-first is pragmatic) but should be acknowledged.

Major Gaps: Workflow Coverage

PRD Workflows vs Implementation Status

Workflow Gap Detail

#	Workflow	PRD Description	Current State	Gap
1	Search	Embed → vector search → summarize. $0.002/query	Built. Direct search agent + vector search + synthesis	Needs hybrid search (G2)
2	Speaker Lookup	Speaker RPC → summarize profile. $0.001/query	Partial. RPCs exist, no unified profile synthesis	Missing: LLM profile summary
3	Episode Browse	Episode RPC → summarize. $0.001/query	Partial. API exists, returns raw metadata	Missing: LLM episode summary
4	Deep Research	Decompose → parallel search → grade → 4-lens → gate. $0.05/query	Not built. Requires Oracle (G1) + Jackal (G2) + Gate (G3)	Largest gap — the flagship workflow
5	Speaker Deep Dive	4 parallel analyses: beliefs-over-time, sentiment, network, quotes. $0.04/query	Not built. Components exist (playbook lenses, speaker RPCs) but no unified workflow	Wire existing pieces together
6	Compare Speakers	Parallel filtered searches → comparison table → gate. $0.035/query	Not built. No speaker comparison logic	New workflow from scratch
7	Claim Verify	Embed claim → vector+keyword search → verify → CONFIRMED/NOT FOUND. $0.02/query	Not built. `analyze-belief` is similar but not exposed as standalone workflow	Refactor existing code
8	Topic Timeline	Embed → timeline RPC + vector → narrative arc. $0.035/query	Not built. Timeline lens + RPC exist but no quarterly narrative arc synthesis	Wire existing pieces together

Deep Research Pipeline: The Biggest Gap

The flagship Pro workflow requires 5 components, only 1 is built:

Step	Component	Status	Effort
1	Query decomposition	Missing	Medium — `generateObject` with sub-query schema
2	Hybrid search	Missing	Medium — SQL migration + new RPC
3	Evidence grading	Missing	Small — LLM call to assess chunk relevance
4	4-lens analysis	Built	-
5	Citation verification	Missing	Medium — claim extraction + evidence matching

Minor Gaps

#	Gap	PRD Reference	Effort	Impact
G7	Contextual embeddings	"35-49% retrieval improvement"	High ($50-100, batch re-embed 11.5K chunks)	Better retrieval quality
G8	Thread export (MD/JSON)	"markdown, JSON, or shareable links"	Small (serialize + download)	User convenience
G9	Audio deep links	"Click to hear the exact moment"	Medium (need audio URLs + player)	Killer feature per PRD
G10	X/Twitter OAuth	"X OAuth for launch"	Small (Supabase config)	Speaker identity linking
G11	Ontology evolution	Tags → Entities → GraphRAG	Large (3-phase roadmap)	Long-term differentiator
G12	Global rate limiting	Per-user daily quota	Small (Redis/KV counter)	Cost protection
G13	Search observability	Query analytics, trace data	Medium (telemetry pipeline)	Product intelligence

Cost Model: Current vs PRD

Current State (All Costs Are Ours)

PRD Model (BYOK Shifts Cost to Pro Users)

Per-Query Cost Comparison

Workflow	Model(s)	Cost/Query	Who Pays (PRD)	Who Pays (Now)
#1 Search	Haiku + embedding	~$0.002	Us (free tier)	Us
#2 Speaker Lookup	Haiku	~$0.001	Us (free tier)	Us
#3 Episode Browse	Haiku	~$0.001	Us (free tier)	Us
#4 Deep Research	Sonnet + 4x Haiku + embedding	~$0.05	User (BYOK)	Us (no BYOK)
#5 Speaker Deep Dive	4x Haiku + Sonnet	~$0.04	User (BYOK)	Us
#6 Compare Speakers	Sonnet + Haiku	~$0.035	User (BYOK)	Not built
#7 Claim Verify	Sonnet + embedding	~$0.02	User (BYOK)	Not built
#8 Topic Timeline	Sonnet + Haiku	~$0.035	User (BYOK)	Not built

Bottom line: Without BYOK, every Pro-tier workflow we ship increases our cost per user with no revenue offset.

Database Schema: What Exists vs What's Needed

Existing Tables (20)

Missing Tables (for PRD features)

Table	Purpose	Needed For
`user_api_keys`	Encrypted API key storage (or Supabase Vault)	G4: BYOK
`user_tiers`	Tier assignment (free/pro/community)	G5: Tiers
`daily_usage`	Per-user daily query counter	G5: Rate limiting
`search_analytics`	Query text, latency, result count, user satisfaction	G13: Observability

Missing Indexes / Columns

Change	Table	Purpose	Needed For
Add `tsvector` column	`transcript_chunks`	Full-text search	G2: Hybrid search
Add GIN index	`transcript_chunks.tsv`	Fast keyword lookup	G2: Hybrid search
Add `updated_at`	`transcript_chunks`	Track when rows change	ETL observability
Add `embedding_model`	`transcript_chunks`	Detect embedding drift	Data integrity

Priority Matrix

Recommended Shipping Sequence

Phase A: Unblock Monetization

Gap	Deliverable	Effort
G2	SQL migration: `tsvector` + GIN + `hybrid_search` RPC	2-3 days
G4	Supabase Vault integration + key submission UI	3-4 days
G5	Tier column on users + middleware check + daily counter	1-2 days
G12	Redis/KV counter or simple Supabase RPC	1 day

Phase B: Make It Agentic

Gap	Deliverable	Effort
G1	Oracle agent: `generateObject` decomposition + evidence grading loop	4-5 days
G3	Gate agent: claim extraction + evidence matching + flagging	2-3 days
Deep Research	Wire Oracle→Jackal→Playbook→Gate end-to-end	2-3 days
Remaining workflows	Speaker deep dive, compare, claim verify, topic timeline	1-2 days each

Phase C: Polish & Differentiation

Appendix: Full Feature Matrix

Feature	PRD Section	Status	Gap #	Phase
Vector search (pgvector)	Search	Done	-	-
Hybrid search (vector + keyword + RRF)	Search	Missing	G2	A
Contextual embeddings	Search	Missing	G7	C
Embedding cache (LRU)	Search	Done	-	-
Direct search agent	Workflows #1	Done	-	-
Speaker lookup	Workflows #2	Partial	-	B
Episode browse	Workflows #3	Partial	-	B
Deep research pipeline	Workflows #4	Missing	G1+G2+G3	B
Speaker deep dive	Workflows #5	Missing	-	B
Compare speakers	Workflows #6	Missing	-	B
Claim verification	Workflows #7	Missing	-	B
Topic timeline	Workflows #8	Missing	-	B
Oracle (orchestrator)	Architecture	Missing	G1	B
Jackal (hybrid retrieval)	Architecture	Missing	G2	A
Playbook (4-lens)	Architecture	Done	-	-
Gate (citation verification)	Architecture	Missing	G3	B
BYOK via Supabase Vault	Key Management	Missing	G4	A
Free tier (10/day)	User Tiers	Missing	G5	A
Pro tier (BYOK unlimited)	User Tiers	Missing	G5	A
Community tier	User Tiers	Missing	-	C+
Magic link auth	Auth	Done	-	-
Google OAuth	Auth	Done	-	-
X/Twitter OAuth	Auth	Missing	G10	C
Ink TUI	Architecture	Missing	G6	C
Typed JSON protocol	Architecture	Missing	G6	C
Thread CRUD	Community	Done	-	-
Thread sharing + forking	Community	Done	-	-
Thread voting	Community	Done	-	-
Thread export (MD/JSON)	Community	Missing	G8	C
Community cards + citations	Community	Done	-	-
Shared spaces	Community	Missing	-	C+
Ontology builder	Community	Missing	G11	C+
Audio deep links	Features	Missing	G9	C
Speaker graph	Features	Done	-	-
Alerts + notifications	Features	Done	-	-
Beta invite system	Platform	Done	-	-
ETL pipeline (Trigger.dev)	Infrastructure	Done	-	-
Weekly cron schedule	Infrastructure	Done	-	-
Rate limiting (global)	Infrastructure	Missing	G12	A
Search observability	Infrastructure	Missing	G13	C

Generated 2026-02-24. Source: full codebase audit of be-bitcoinology-v1 against Feb 23 agent brainstorm PRD.

Executive Summary​

What's Built​

Inventory Scorecard​

Completed Features​

Completed Features Detail​

The Gaps​

Architecture Gap Map​

Critical Gaps (G1-G6)​

G1: Oracle Agent — Query Decomposition & Orchestration​

G2: Hybrid Search (Jackal)​

G3: Citation Gate​

G4: BYOK (Bring Your Own Key)​

G5: User Tiers​

G6: Ink TUI (Terminal-Native Agent)​

Major Gaps: Workflow Coverage​

PRD Workflows vs Implementation Status​

Workflow Gap Detail​

Deep Research Pipeline: The Biggest Gap​

Minor Gaps​

Cost Model: Current vs PRD​

Current State (All Costs Are Ours)​

PRD Model (BYOK Shifts Cost to Pro Users)​

Per-Query Cost Comparison​

Database Schema: What Exists vs What's Needed​

Existing Tables (20)​

Missing Tables (for PRD features)​

Missing Indexes / Columns​

Priority Matrix​

Recommended Shipping Sequence​

Phase A: Unblock Monetization​

Phase B: Make It Agentic​

Phase C: Polish & Differentiation​

Appendix: Full Feature Matrix​