Belief Schema

Every belief extracted from a podcast transcript follows an 8-layer abstraction hierarchy, progressively distilling a raw quote into a structured, searchable, and comparable data point.

Abstraction Hierarchy

Layer Details

L0: Raw Quote + Context

The verbatim text from the transcript, plus surrounding context for disambiguation.

| Field | Type | Description |
| --- | --- | --- |
| quote_text | string | Exact speaker quote |
| context_before | string | ~100 tokens before the quote |
| context_after | string | ~100 tokens after the quote |
| timestamp_start | float | Audio start time (seconds) |
| timestamp_end | float | Audio end time (seconds) |

L1: Surface Statement

A cleaned, grammar-corrected version of the quote, at most 50 words. Filler words and false starts are removed while the meaning is preserved.

| Field | Type | Description |
| --- | --- | --- |
| surface_statement | string | Cleaned quote (≤50 words) |

L2: Atomic Belief

The core claim, distilled to 25 words or fewer. This is the primary unit of the belief system: every downstream layer derives from it.

| Field | Type | Description |
| --- | --- | --- |
| atomic_belief | string | Core claim (≤25 words); required |
| topic | string | Short topic label (2–5 words) |
| polarity | enum | for \| against \| neutral |
| polarity_confidence | float | 0.0–1.0 confidence in polarity |
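
The L2 constraints in the table above can be checked mechanically. The following is a minimal sketch, not the project's own validator: the function name `validate_l2` is hypothetical, words are counted by splitting on whitespace (an assumption, since the page does not say how words are counted), and only `atomic_belief` is treated as required because it is the only field marked that way.

```python
POLARITIES = {"for", "against", "neutral"}  # enum values listed in the L2 table


def validate_l2(belief: dict) -> list[str]:
    """Return a list of L2 constraint violations (empty when the record passes)."""
    errors = []

    # atomic_belief is the only field the table marks as required.
    atomic = belief.get("atomic_belief", "")
    if not atomic.strip():
        errors.append("atomic_belief is required")
    elif len(atomic.split()) > 25:  # assumption: whitespace-separated word count
        errors.append("atomic_belief exceeds 25 words")

    # polarity and polarity_confidence are checked only when present.
    polarity = belief.get("polarity")
    if polarity is not None and polarity not in POLARITIES:
        errors.append(f"polarity must be one of {sorted(POLARITIES)}")

    confidence = belief.get("polarity_confidence")
    if confidence is not None:
        if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
            errors.append("polarity_confidence must be in [0.0, 1.0]")

    return errors
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in a record at once.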

L3: Worldview

The underlying principle that makes the belief compelling to the speaker.

| Field | Type | Description |
| --- | --- | --- |
| worldview | string | Underlying principle or framework |

L4: Core Axiom

The foundational assumption the speaker takes for granted.

| Field | Type | Description |
| --- | --- | --- |
| core_axiom | string | Foundational assumption |
| tier | int | 1 (peripheral) → 5 (foundational) |

L5: Polar Analysis

What someone who disagrees would say. Enables ideological mapping.

| Field | Type | Description |
| --- | --- | --- |
| polar_opposite | string | Counter-position to the belief |

L6: Tabloid Headline

Sensationalized headline version for engagement and discovery.

| Field | Type | Description |
| --- | --- | --- |
| tabloid_headline | string | Clickbait-style headline |

L7: Positioning Vector

A 10-dimensional vector that positions the belief on ideological axes.

| Dim | Name | Range | Description |
| --- | --- | --- | --- |
| 0 | Philosophical/Spiritual | 0.0 to 1.0 | Metaphysics affinity |
| 1 | Moral/Ethical | 0.0 to 1.0 | Morality affinity |
| 2 | Political | 0.0 to 1.0 | Power/governance affinity |
| 3 | Economic | 0.0 to 1.0 | Value/markets affinity |
| 4 | Scientific/Technical | 0.0 to 1.0 | Empirical/technical affinity |
| 5 | Academic ↔ Practical | -1.0 to +1.0 | Theory vs application |
| 6 | Mainstream ↔ Contrarian | -1.0 to +1.0 | Consensus vs fringe |
| 7 | Institutional ↔ Individual | -1.0 to +1.0 | System vs sovereignty |
| 8 | Epistemic Certainty | -1.0 to +1.0 | Skeptical vs dogmatic |
| 9 | Overton Position | -1.0 to +1.0 | Acceptable vs taboo |

Dimensions 0–4 are domain affinity (how much this belief belongs to each domain). Dimensions 5–9 are positioning (where the belief sits on ideological spectrums).
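
To make the two halves of the vector concrete, here is a small sketch that splits a 10-element vector into named domain affinities (dimensions 0–4) and named positioning scores (dimensions 5–9) and checks each against its documented range. The function and constant names are hypothetical; only the axis names and ranges come from the table above.

```python
DOMAIN_AXES = [
    "Philosophical/Spiritual", "Moral/Ethical", "Political",
    "Economic", "Scientific/Technical",
]
POSITION_AXES = [
    "Academic ↔ Practical", "Mainstream ↔ Contrarian",
    "Institutional ↔ Individual", "Epistemic Certainty", "Overton Position",
]


def split_vector(weights: list[float]) -> tuple[dict[str, float], dict[str, float]]:
    """Split a 10-dim positioning vector into (domain affinity, positioning) dicts."""
    if len(weights) != 10:
        raise ValueError(f"expected 10 dimensions, got {len(weights)}")

    domain = dict(zip(DOMAIN_AXES, weights[:5]))
    position = dict(zip(POSITION_AXES, weights[5:]))

    # Dimensions 0-4 are affinities in [0.0, 1.0].
    for name, value in domain.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} out of range [0.0, 1.0]: {value}")
    # Dimensions 5-9 are bipolar positions in [-1.0, +1.0].
    for name, value in position.items():
        if not -1.0 <= value <= 1.0:
            raise ValueError(f"{name} out of range [-1.0, +1.0]: {value}")

    return domain, position


# Values taken from the example record on this page.
domain, position = split_vector([0.7, 0.3, 0.1, 0.2, 0.4, -0.6, -0.3, 0.5, 0.4, -0.2])
strongest_domain = max(domain, key=domain.get)
```

For the example values, `strongest_domain` is `"Philosophical/Spiritual"`, the axis with the highest affinity (0.7).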

Embedding

Each belief also carries a 1,536-dimensional semantic embedding for vector search.

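| Field | Type | Description |
| --- | --- | --- |
| embedding | float[1536] | OpenAI text-embedding-3-large vector. The model returns 3,072 dimensions by default, so 1,536 implies a reduced `dimensions` setting. |
| bundle_text | string | Text used to generate the embedding |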

The embedding is generated from a bundle of the atomic belief, worldview, polar opposite, and topic, not from the raw quote alone.
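
As an illustration of how `bundle_text` might be assembled, the sketch below joins the four named fields into one string. The field order, the newline separator, and the skipping of empty fields are all assumptions, and `build_bundle_text` is a hypothetical name; the page states only which fields go into the bundle.

```python
BUNDLE_FIELDS = ("atomic_belief", "worldview", "polar_opposite", "topic")


def build_bundle_text(belief: dict) -> str:
    """Join the bundle fields into the text that would be sent to the embedding model."""
    parts = []
    for field in BUNDLE_FIELDS:
        value = (belief.get(field) or "").strip()
        if value:  # assumption: missing or empty fields are skipped
            parts.append(value)
    return "\n".join(parts)  # assumption: newline-separated


bundle_text = build_bundle_text({
    "quote_text": "I think the most important skill is learning how to learn",
    "atomic_belief": "Meta-learning is the most valuable skill",
    "worldview": "Self-improvement through deliberate practice",
    "polar_opposite": "Specialized expertise matters more than learning ability",
    "topic": "meta-learning",
})
```

Note that `quote_text` is passed in but never reaches the bundle, which matches the statement that the embedding is not built from the raw quote alone.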

Identity

| Field | Type | Description |
| --- | --- | --- |
| id | string | Content-hash ID: b_xxxxxxxx |
| podcast_slug | string | Source podcast |
| episode_slug | string | Source episode |
| speaker_slug | string | Speaker identifier |
| speaker_name | string | Denormalized display name |
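
A content-hash ID of the form `b_xxxxxxxx` could be derived along the following lines. This is only a sketch: the page does not say which hash function is used or which fields are hashed, so SHA-256, the choice of `speaker_slug`, `episode_slug`, and `atomic_belief` as input, the `|` separator, and the lowercasing are all assumptions, and `belief_id` is a hypothetical name.

```python
import hashlib


def belief_id(speaker_slug: str, episode_slug: str, atomic_belief: str) -> str:
    """Derive a deterministic 'b_' + 8-hex-character ID from belief content."""
    # Assumption: these three fields identify a belief; the real inputs may differ.
    payload = "|".join([speaker_slug, episode_slug, atomic_belief.strip().lower()])
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return "b_" + digest[:8]  # first 8 hex characters match the b_xxxxxxxx shape


example_id = belief_id(
    "naval-ravikant",
    "naval-ravikant-2023",
    "Meta-learning is the most valuable skill",
)
```

Because the ID depends only on content, re-running extraction on the same episode would yield the same ID for the same belief, which is the usual reason to prefer a content hash over a random or sequential identifier. An 8-character prefix keeps IDs short at the cost of a higher collision chance as the corpus grows.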

Example

{
  "id": "b_a7f3c2e1",
  "speaker_name": "Naval Ravikant",
  "speaker_slug": "naval-ravikant",
  "podcast_slug": "lex-fridman",
  "episode_slug": "naval-ravikant-2023",
  "quote_text": "I think the most important skill is learning how to learn",
  "surface_statement": "The most important skill is learning how to learn.",
  "atomic_belief": "Meta-learning is the most valuable skill",
  "topic": "meta-learning",
  "polarity": "for",
  "worldview": "Self-improvement through deliberate practice",
  "core_axiom": "Adaptability trumps specialization",
  "tier": 4,
  "polar_opposite": "Specialized expertise matters more than learning ability",
  "tabloid_headline": "NAVAL: Forget Everything You Know — Learning to LEARN Is All That Matters",
  "weights": [0.7, 0.3, 0.1, 0.2, 0.4, -0.6, -0.3, 0.5, 0.4, -0.2],
  "embedding": [0.012, -0.034, ...]
}

Pipeline Stages That Produce Belief Data

| Layer | Pipeline Stage | Description |
| --- | --- | --- |
| L0 | extract | Identifies quotes and extracts raw context |
| L1 | extract | Cleans and normalizes the quote |
| L2 | extract | Distills to atomic belief |
| L3 | abstract | Derives worldview from atomic belief |
| L4 | abstract | Infers core axiom and assigns tier |
| L5 | abstract | Generates polar opposite |
| L6 | headlines | Creates tabloid headline |
| L7 | weights | Computes 10-dim positioning vector |
| Embedding | embed | Generates 1,536-dim search vector |
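
The layer-to-stage mapping in the table can be inverted to see what each stage is responsible for. The sketch below encodes the table as a dictionary and groups layers by stage; the names `LAYER_STAGE` and `layers_by_stage` are hypothetical, and the order in which stages appear simply follows the table rows, which may or may not be the order in which the pipeline runs them.

```python
# Layer -> pipeline stage, copied from the table above.
LAYER_STAGE = {
    "L0": "extract",
    "L1": "extract",
    "L2": "extract",
    "L3": "abstract",
    "L4": "abstract",
    "L5": "abstract",
    "L6": "headlines",
    "L7": "weights",
    "Embedding": "embed",
}


def layers_by_stage(layer_stage: dict[str, str]) -> dict[str, list[str]]:
    """Group layers under the pipeline stage that produces them."""
    grouped: dict[str, list[str]] = {}
    for layer, stage in layer_stage.items():
        grouped.setdefault(stage, []).append(layer)
    return grouped


stages = layers_by_stage(LAYER_STAGE)
```

Here `stages["extract"]` is `["L0", "L1", "L2"]` and `stages["abstract"]` is `["L3", "L4", "L5"]`, so five stages together cover the eight layers plus the embedding.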