Belief Schema
Every belief extracted from a podcast transcript passes through an 8-layer abstraction hierarchy (L0–L7) that progressively distills a raw quote into a structured, searchable, and comparable data point.
Abstraction Hierarchy
Layer Details
L0: Raw Quote + Context
The verbatim text from the transcript, plus surrounding context for disambiguation.
| Field | Type | Description |
|---|---|---|
| quote_text | string | Exact speaker quote |
| context_before | string | ~100 tokens before the quote |
| context_after | string | ~100 tokens after the quote |
| timestamp_start | float | Audio start time (seconds) |
| timestamp_end | float | Audio end time (seconds) |
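As a rough illustration of the L0 fields, the sketch below slices a context window around a quote. It assumes whitespace-split words stand in for tokens (the actual tokenizer is not specified here); `RawQuote`, `slice_context`, and the timestamp values are hypothetical names and data used only for this example.

```python
from dataclasses import dataclass


@dataclass
class RawQuote:
    """L0 fields, named as in the table above."""
    quote_text: str
    context_before: str
    context_after: str
    timestamp_start: float  # seconds
    timestamp_end: float    # seconds


def slice_context(tokens, start, end, window=100):
    """Return (before, quote, after) for the quote at tokens[start:end].

    Assumption: tokens are whitespace-split words, not model tokens.
    """
    before = " ".join(tokens[max(0, start - window):start])
    quote = " ".join(tokens[start:end])
    after = " ".join(tokens[end:end + window])
    return before, quote, after


transcript = ("so um I think the most important skill is learning "
              "how to learn and that compounds")
tokens = transcript.split()

# A small window keeps the example readable; the schema uses ~100 tokens.
before, quote, after = slice_context(tokens, 2, 13, window=3)
raw = RawQuote(quote, before, after, timestamp_start=12.4, timestamp_end=16.9)
```

Clamping the start index with `max(0, ...)` keeps quotes near the beginning of a transcript from wrapping around to the end of the token list.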
L1: Surface Statement
Cleaned, grammar-corrected version of the quote. Maximum 50 words. Removes filler words and false starts while preserving meaning.
| Field | Type | Description |
|---|---|---|
| surface_statement | string | Cleaned quote (≤50 words) |
L2: Atomic Belief
The core claim distilled to 25 words or fewer. This is the primary unit of the belief system: every downstream layer derives from it.
| Field | Type | Description |
|---|---|---|
| atomic_belief | string | Core claim (≤25 words); required |
| topic | string | Short topic label (2–5 words) |
| polarity | enum | for \| against \| neutral |
| polarity_confidence | float | 0.0–1.0 confidence in polarity |
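The L2 constraints can be checked mechanically. The following is a minimal sketch, assuming words are counted by whitespace splitting; the `AtomicBelief` and `Polarity` classes are hypothetical and not part of any published API. The topic length is deliberately not enforced here.

```python
from dataclasses import dataclass
from enum import Enum


class Polarity(str, Enum):
    FOR = "for"
    AGAINST = "against"
    NEUTRAL = "neutral"


@dataclass
class AtomicBelief:
    atomic_belief: str
    topic: str
    polarity: Polarity
    polarity_confidence: float

    def __post_init__(self):
        # Accept plain strings such as "for"; unknown values raise ValueError.
        self.polarity = Polarity(self.polarity)
        if not self.atomic_belief.strip():
            raise ValueError("atomic_belief is required")
        # Assumption: the 25-word limit counts whitespace-separated words.
        if len(self.atomic_belief.split()) > 25:
            raise ValueError("atomic_belief exceeds 25 words")
        if not 0.0 <= self.polarity_confidence <= 1.0:
            raise ValueError("polarity_confidence must be within 0.0-1.0")


belief = AtomicBelief(
    atomic_belief="Meta-learning is the most valuable skill",
    topic="meta-learning",
    polarity="for",
    polarity_confidence=0.9,
)
```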
L3: Worldview
The underlying principle that makes the belief compelling to the speaker.
| Field | Type | Description |
|---|---|---|
| worldview | string | Underlying principle or framework |
L4: Core Axiom
The foundational assumption the speaker takes for granted.
| Field | Type | Description |
|---|---|---|
| core_axiom | string | Foundational assumption |
| tier | int | 1 (peripheral) → 5 (foundational) |
L5: Polar Analysis
What someone who disagrees would say. Enables ideological mapping.
| Field | Type | Description |
|---|---|---|
| polar_opposite | string | Counter-position to the belief |
L6: Tabloid Headline
Sensationalized headline version for engagement and discovery.
| Field | Type | Description |
|---|---|---|
| tabloid_headline | string | Clickbait-style headline |
L7: Positioning Vector
A 10-dimensional vector (the weights field in the example record) that positions the belief on ideological axes.
| Dim | Name | Range | Description |
|---|---|---|---|
| 0 | Philosophical/Spiritual | 0.0–1.0 | Metaphysics affinity |
| 1 | Moral/Ethical | 0.0–1.0 | Morality affinity |
| 2 | Political | 0.0–1.0 | Power/governance affinity |
| 3 | Economic | 0.0–1.0 | Value/markets affinity |
| 4 | Scientific/Technical | 0.0–1.0 | Empirical/technical affinity |
| 5 | Academic ↔ Practical | -1.0–+1.0 | Theory vs application |
| 6 | Mainstream ↔ Contrarian | -1.0–+1.0 | Consensus vs fringe |
| 7 | Institutional ↔ Individual | -1.0–+1.0 | System vs sovereignty |
| 8 | Epistemic Certainty | -1.0–+1.0 | Skeptical vs dogmatic |
| 9 | Overton Position | -1.0–+1.0 | Acceptable vs taboo |
Dimensions 0–4 are domain affinity (how much this belief belongs to each domain). Dimensions 5–9 are positioning (where the belief sits on ideological spectrums).
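The per-dimension ranges in the table lend themselves to a simple range check. This is a sketch only: `DIMENSIONS` and `validate_weights` are hypothetical names, and the input is the `weights` array from the example record.

```python
# (name, lower bound, upper bound) for each dimension, in index order.
DIMENSIONS = [
    ("Philosophical/Spiritual", 0.0, 1.0),
    ("Moral/Ethical", 0.0, 1.0),
    ("Political", 0.0, 1.0),
    ("Economic", 0.0, 1.0),
    ("Scientific/Technical", 0.0, 1.0),
    ("Academic ↔ Practical", -1.0, 1.0),
    ("Mainstream ↔ Contrarian", -1.0, 1.0),
    ("Institutional ↔ Individual", -1.0, 1.0),
    ("Epistemic Certainty", -1.0, 1.0),
    ("Overton Position", -1.0, 1.0),
]


def validate_weights(weights):
    """Check length and per-dimension range; return a name -> value mapping."""
    if len(weights) != len(DIMENSIONS):
        raise ValueError(f"expected {len(DIMENSIONS)} values, got {len(weights)}")
    named = {}
    for i, (value, (name, low, high)) in enumerate(zip(weights, DIMENSIONS)):
        if not low <= value <= high:
            raise ValueError(f"dim {i} ({name}) = {value} is outside [{low}, {high}]")
        named[name] = value
    return named


named = validate_weights([0.7, 0.3, 0.1, 0.2, 0.4, -0.6, -0.3, 0.5, 0.4, -0.2])
```

Returning the values keyed by dimension name makes it harder to confuse the unsigned affinity dimensions (0–4) with the signed positioning dimensions (5–9).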
Embedding
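Each belief also carries a 1,536-dimensional semantic embedding for vector search. Because text-embedding-3-large returns 3,072 dimensions by default, a 1,536-dimensional vector implies the output is shortened, for example with the API's dimensions parameter.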
| Field | Type | Description |
|---|---|---|
| embedding | float[1536] | OpenAI text-embedding-3-large |
| bundle_text | string | Text used to generate the embedding |
The embedding is generated from bundle_text, a bundle of the atomic belief, worldview, polar opposite, and topic, not just the raw quote.
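One way the bundle might be assembled is sketched below. The labels, field order, and newline separator are assumptions made for illustration, and `build_bundle_text` is a hypothetical helper; the call to the embedding model itself is left out.

```python
def build_bundle_text(belief):
    """Join the four bundled fields into one string for embedding.

    Assumption: labels, ordering, and the newline separator are illustrative;
    the actual bundle format is not specified.
    """
    parts = [
        ("Belief", belief["atomic_belief"]),
        ("Worldview", belief["worldview"]),
        ("Opposite", belief["polar_opposite"]),
        ("Topic", belief["topic"]),
    ]
    return "\n".join(f"{label}: {text}" for label, text in parts)


record = {
    "quote_text": "I think the most important skill is learning how to learn",
    "atomic_belief": "Meta-learning is the most valuable skill",
    "topic": "meta-learning",
    "worldview": "Self-improvement through deliberate practice",
    "polar_opposite": "Specialized expertise matters more than learning ability",
}
bundle_text = build_bundle_text(record)
```

Bundling the derived layers means two quotes with different wording but the same underlying claim and worldview should land close together in vector space.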
Identity
| Field | Type | Description |
|---|---|---|
| id | string | Content-hash ID: b_xxxxxxxx |
| podcast_slug | string | Source podcast |
| episode_slug | string | Source episode |
| speaker_slug | string | Speaker identifier |
| speaker_name | string | Denormalized display name |
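A content-hash ID in the b_xxxxxxxx shape could be derived as in the sketch below. The choice of SHA-256, the fields that feed the hash, and the truncation to 8 hex characters are all assumptions; `belief_id` is a hypothetical helper.

```python
import hashlib


def belief_id(podcast_slug, episode_slug, speaker_slug, atomic_belief):
    """Derive a deterministic ID of the form b_ + 8 hex characters.

    Assumption: which fields are hashed, and with which algorithm, is not
    specified; this choice is illustrative only.
    """
    # A unit separator avoids collisions between ("ab", "c") and ("a", "bc").
    payload = "\x1f".join(
        [podcast_slug, episode_slug, speaker_slug, atomic_belief]
    ).encode("utf-8")
    return "b_" + hashlib.sha256(payload).hexdigest()[:8]


first = belief_id("lex-fridman", "naval-ravikant-2023", "naval-ravikant",
                  "Meta-learning is the most valuable skill")
second = belief_id("lex-fridman", "naval-ravikant-2023", "naval-ravikant",
                   "Meta-learning is the most valuable skill")
```

Because the ID depends only on content, re-running the pipeline on the same transcript yields the same ID, which makes upserts idempotent. Eight hex characters give about 4.3 billion possible values, so large collections may need collision handling.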
Example
{
  "id": "b_a7f3c2e1",
  "speaker_name": "Naval Ravikant",
  "speaker_slug": "naval-ravikant",
  "podcast_slug": "lex-fridman",
  "episode_slug": "naval-ravikant-2023",
  "quote_text": "I think the most important skill is learning how to learn",
  "surface_statement": "The most important skill is learning how to learn.",
  "atomic_belief": "Meta-learning is the most valuable skill",
  "topic": "meta-learning",
  "polarity": "for",
  "worldview": "Self-improvement through deliberate practice",
  "core_axiom": "Adaptability trumps specialization",
  "tier": 4,
  "polar_opposite": "Specialized expertise matters more than learning ability",
  "tabloid_headline": "NAVAL: Forget Everything You Know — Learning to LEARN Is All That Matters",
  "weights": [0.7, 0.3, 0.1, 0.2, 0.4, -0.6, -0.3, 0.5, 0.4, -0.2],
  "embedding": [0.012, -0.034, ...]
}
Pipeline Stages That Produce Belief Data
| Layer | Pipeline Stage | Description |
|---|---|---|
| L0 | extract | Identifies quotes and extracts raw context |
| L1 | extract | Cleans and normalizes the quote |
| L2 | extract | Distills to atomic belief |
| L3 | abstract | Derives worldview from atomic belief |
| L4 | abstract | Infers core axiom and assigns tier |
| L5 | abstract | Generates polar opposite |
| L6 | headlines | Creates tabloid headline |
| L7 | weights | Computes 10-dim positioning vector |
| Embedding | embed | Generates 1,536-dim search vector |
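The table above can be expressed as a mapping from stage to the fields it fills in, which is handy for spotting records that stopped partway through. This is a sketch under assumptions: `STAGE_FIELDS` and `missing_stages` are hypothetical names, the stage order is taken from the table rows, and the `weights` field name comes from the example record.

```python
# Stage name -> fields that stage is expected to populate.
STAGE_FIELDS = {
    "extract": [
        "quote_text", "context_before", "context_after",
        "timestamp_start", "timestamp_end",                          # L0
        "surface_statement",                                         # L1
        "atomic_belief", "topic", "polarity", "polarity_confidence", # L2
    ],
    "abstract": ["worldview", "core_axiom", "tier", "polar_opposite"],  # L3-L5
    "headlines": ["tabloid_headline"],                                  # L6
    "weights": ["weights"],                                             # L7
    "embed": ["bundle_text", "embedding"],
}


def missing_stages(belief):
    """List stages (in table order) with at least one field absent from the record."""
    return [
        stage
        for stage, fields in STAGE_FIELDS.items()
        if any(field not in belief for field in fields)
    ]


# A record that has only been through the extract stage.
extracted_only = dict.fromkeys(STAGE_FIELDS["extract"], "placeholder")
pending = missing_stages(extracted_only)
```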