Belief Schema

Every belief extracted from a podcast transcript follows an 8-layer abstraction hierarchy (L0–L7) that progressively distills a raw quote into a structured, searchable, and comparable data point.

Abstraction Hierarchy

Layer Details

L0: Raw Quote + Context

The verbatim text from the transcript, plus surrounding context for disambiguation.

Field	Type	Description
`quote_text`	string	Exact speaker quote
`context_before`	string	~100 tokens before the quote
`context_after`	string	~100 tokens after the quote
`timestamp_start`	float	Audio start time (seconds)
`timestamp_end`	float	Audio end time (seconds)
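
As an illustration of how `context_before` and `context_after` could be cut from a transcript, here is a minimal sketch. It assumes whitespace-separated words stand in for tokens; the source does not say which tokenizer the pipeline uses, and the function name `context_window` is hypothetical.

```python
def context_window(tokens, start, end, width=100):
    """Return (before, quote, after) for the quote spanning tokens[start:end].

    Assumption: `tokens` is a list of whitespace-split words, used here as a
    stand-in for whatever tokenizer the real pipeline applies.
    """
    if not 0 <= start < end <= len(tokens):
        raise ValueError("quote span must lie inside the transcript")
    before = tokens[max(0, start - width):start]  # clipped at transcript start
    after = tokens[end:end + width]               # clipped at transcript end
    return " ".join(before), " ".join(tokens[start:end]), " ".join(after)


transcript = (
    "so um I think the most important skill is learning how to learn you know"
).split()
before, quote, after = context_window(transcript, 2, 13, width=2)
print(before)  # so um
print(quote)   # I think the most important skill is learning how to learn
print(after)   # you know
```

The window is clipped rather than padded, so quotes near the start or end of a transcript simply carry less context.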

L1: Surface Statement

Cleaned, grammar-corrected version of the quote. Maximum 50 words. Removes filler words and false starts while preserving meaning.

Field	Type	Description
`surface_statement`	string	Cleaned quote (≤50 words)

L2: Atomic Belief

The core claim distilled to 25 words or fewer. This is the primary unit of the belief system; every downstream layer derives from it.

Field	Type	Description
`atomic_belief`	string	Core claim (≤25 words) — required
`topic`	string	Short topic label (2–5 words)
`polarity`	enum	`for` | `against` | `neutral`
`polarity_confidence`	float	0.0–1.0 confidence in polarity
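
The constraints in this table can be checked mechanically. The following is a minimal sketch, not the project's actual validation code: the class name `AtomicBelief` is hypothetical, and words are counted by splitting on whitespace, which is an assumption.

```python
from dataclasses import dataclass

POLARITIES = {"for", "against", "neutral"}
MAX_WORDS = 25  # L2 limit stated in the table above


@dataclass
class AtomicBelief:
    atomic_belief: str
    topic: str
    polarity: str
    polarity_confidence: float

    def __post_init__(self):
        words = self.atomic_belief.split()  # assumption: whitespace word count
        if not words:
            raise ValueError("atomic_belief is required")
        if len(words) > MAX_WORDS:
            raise ValueError(f"atomic_belief has {len(words)} words, max {MAX_WORDS}")
        if self.polarity not in POLARITIES:
            raise ValueError(f"polarity must be one of {sorted(POLARITIES)}")
        if not 0.0 <= self.polarity_confidence <= 1.0:
            raise ValueError("polarity_confidence must be within 0.0-1.0")


# The confidence value 0.9 is illustrative only.
belief = AtomicBelief(
    atomic_belief="Meta-learning is the most valuable skill",
    topic="meta-learning",
    polarity="for",
    polarity_confidence=0.9,
)
```

Validating at construction time means a malformed L2 record fails before any downstream layer is derived from it.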

L3: Worldview

The underlying principle that makes the belief compelling to the speaker.

Field	Type	Description
`worldview`	string	Underlying principle or framework

L4: Core Axiom

The foundational assumption the speaker takes for granted.

Field	Type	Description
`core_axiom`	string	Foundational assumption
`tier`	int	1 (peripheral) → 5 (foundational)

L5: Polar Analysis

What someone who disagrees would say. Enables ideological mapping.

Field	Type	Description
`polar_opposite`	string	Counter-position to the belief

L6: Tabloid Headline

Sensationalized headline version for engagement and discovery.

Field	Type	Description
`tabloid_headline`	string	Clickbait-style headline

L7: Positioning Vector

A 10-dimensional vector that positions the belief on ideological axes.

Dim	Name	Range	Description
0	Philosophical/Spiritual	0.0–1.0	Metaphysics affinity
1	Moral/Ethical	0.0–1.0	Morality affinity
2	Political	0.0–1.0	Power/governance affinity
3	Economic	0.0–1.0	Value/markets affinity
4	Scientific/Technical	0.0–1.0	Empirical/technical affinity
5	Academic ↔ Practical	-1.0–+1.0	Theory vs application
6	Mainstream ↔ Contrarian	-1.0–+1.0	Consensus vs fringe
7	Institutional ↔ Individual	-1.0–+1.0	System vs sovereignty
8	Epistemic Certainty	-1.0–+1.0	Skeptical vs dogmatic
9	Overton Position	-1.0–+1.0	Acceptable vs taboo

Dimensions 0–4 measure domain affinity (how strongly the belief belongs to each domain). Dimensions 5–9 measure positioning (where the belief sits on each ideological spectrum). In the example record below, the vector is stored in the `weights` field.
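
Because the two halves of the vector use different ranges, it can help to validate and label them separately. A minimal sketch follows; the axis key names and the function `split_positioning_vector` are hypothetical, chosen only to mirror the table above.

```python
DOMAIN_AXES = [
    "philosophical_spiritual", "moral_ethical", "political",
    "economic", "scientific_technical",
]
POSITION_AXES = [
    "academic_practical", "mainstream_contrarian", "institutional_individual",
    "epistemic_certainty", "overton_position",
]


def split_positioning_vector(weights):
    """Validate a 10-dim vector and return (domain, position) dicts."""
    if len(weights) != 10:
        raise ValueError(f"expected 10 dimensions, got {len(weights)}")
    domain, position = weights[:5], weights[5:]
    if any(not 0.0 <= w <= 1.0 for w in domain):
        raise ValueError("dimensions 0-4 must be within 0.0 to 1.0")
    if any(not -1.0 <= w <= 1.0 for w in position):
        raise ValueError("dimensions 5-9 must be within -1.0 to +1.0")
    return dict(zip(DOMAIN_AXES, domain)), dict(zip(POSITION_AXES, position))


# Values taken from the example record in this document.
weights = [0.7, 0.3, 0.1, 0.2, 0.4, -0.6, -0.3, 0.5, 0.4, -0.2]
domain, position = split_positioning_vector(weights)
print(domain["philosophical_spiritual"])  # 0.7
print(position["overton_position"])       # -0.2
```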

Embedding

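Each belief also carries a 1,536-dimensional semantic embedding for vector search. OpenAI's `text-embedding-3-large` returns 3,072 dimensions by default, so a 1,536-dimensional vector implies a shortened output, for example via the embeddings API's `dimensions` parameter.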

Field	Type	Description
`embedding`	float[1536]	OpenAI `text-embedding-3-large`
`bundle_text`	string	Text used to generate the embedding

The embedding is generated from `bundle_text`, which combines the atomic belief, worldview, polar opposite, and topic, rather than from the raw quote alone.
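
One way to assemble that bundle is sketched below. The field order, the newline separator, and the function name `build_bundle_text` are assumptions; the source lists which fields are bundled but not how they are joined.

```python
BUNDLE_FIELDS = ("atomic_belief", "worldview", "polar_opposite", "topic")


def build_bundle_text(belief):
    """Join the bundled fields into one string, skipping empty or missing ones.

    Assumption: fields are newline-separated in the order listed above.
    """
    parts = [belief.get(name) for name in BUNDLE_FIELDS]
    return "\n".join(p.strip() for p in parts if p and p.strip())


belief = {
    "atomic_belief": "Meta-learning is the most valuable skill",
    "worldview": "Self-improvement through deliberate practice",
    "polar_opposite": "Specialized expertise matters more than learning ability",
    "topic": "meta-learning",
}
bundle_text = build_bundle_text(belief)
print(bundle_text)
```

Including the polar opposite in the bundle means a search can surface a belief from either side of the disagreement it describes.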

Identity

Field	Type	Description
`id`	string	Content-hash ID: `b_xxxxxxxx`
`podcast_slug`	string	Source podcast
`episode_slug`	string	Source episode
`speaker_slug`	string	Speaker identifier
`speaker_name`	string	Denormalized display name
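
The source describes `id` as a content hash with the shape `b_xxxxxxxx` but does not say what is hashed or with which algorithm. The sketch below shows one plausible scheme under loudly stated assumptions: SHA-256 over the three slugs plus the quote text, truncated to 8 hex characters. It is not expected to reproduce the IDs in the real dataset.

```python
import hashlib


def belief_id(podcast_slug, episode_slug, speaker_slug, quote_text):
    """Derive a deterministic `b_xxxxxxxx` ID from belief content.

    Assumptions (not confirmed by the source): the hashed fields, the
    unit-separator delimiter, SHA-256, and the 8-character truncation.
    """
    payload = "\x1f".join(
        [podcast_slug, episode_slug, speaker_slug, quote_text.strip()]
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return "b_" + digest[:8]


example_id = belief_id(
    "lex-fridman",
    "naval-ravikant-2023",
    "naval-ravikant",
    "I think the most important skill is learning how to learn",
)
print(example_id)
```

A content-derived ID stays stable when the same quote is re-extracted, which makes pipeline reruns idempotent. With only 8 hex characters, collisions become likely in large collections, so a real system would need a uniqueness check.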

Example

{
  "id": "b_a7f3c2e1",
  "speaker_name": "Naval Ravikant",
  "speaker_slug": "naval-ravikant",
  "podcast_slug": "lex-fridman",
  "episode_slug": "naval-ravikant-2023",
  "quote_text": "I think the most important skill is learning how to learn",
  "surface_statement": "The most important skill is learning how to learn.",
  "atomic_belief": "Meta-learning is the most valuable skill",
  "topic": "meta-learning",
  "polarity": "for",
  "worldview": "Self-improvement through deliberate practice",
  "core_axiom": "Adaptability trumps specialization",
  "tier": 4,
  "polar_opposite": "Specialized expertise matters more than learning ability",
  "tabloid_headline": "NAVAL: Forget Everything You Know — Learning to LEARN Is All That Matters",
  "weights": [0.7, 0.3, 0.1, 0.2, 0.4, -0.6, -0.3, 0.5, 0.4, -0.2],
  "embedding": [0.012, -0.034, ...]
}

Pipeline Stages That Produce Belief Data

Layer	Pipeline Stage	Description
L0	`extract`	Identifies quotes and extracts raw context
L1	`extract`	Cleans and normalizes the quote
L2	`extract`	Distills to atomic belief
L3	`abstract`	Derives worldview from atomic belief
L4	`abstract`	Infers core axiom and assigns tier
L5	`abstract`	Generates polar opposite
L6	`headlines`	Creates tabloid headline
L7	`weights`	Computes 10-dim positioning vector
Embedding	`embed`	Generates 1,536-dim search vector
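
The table above can be expressed as a lookup from stage to the fields it fills, which makes it easy to see which stages a partial record still needs. This is a sketch only: the helper `pending_stages` is hypothetical, the `weights` field name is taken from the example record, and running the stages in table order is an assumption.

```python
# Stage -> fields it produces, following the layer tables in this document.
STAGE_FIELDS = {
    "extract": [
        "quote_text", "context_before", "context_after",
        "timestamp_start", "timestamp_end",
        "surface_statement",
        "atomic_belief", "topic", "polarity", "polarity_confidence",
    ],
    "abstract": ["worldview", "core_axiom", "tier", "polar_opposite"],
    "headlines": ["tabloid_headline"],
    "weights": ["weights"],
    "embed": ["bundle_text", "embedding"],
}


def pending_stages(record):
    """List stages (in table order) that still have unfilled fields."""
    return [
        stage
        for stage, fields in STAGE_FIELDS.items()
        if any(record.get(name) is None for name in fields)
    ]


# Placeholder values stand in for real extract output.
partial = {name: "x" for name in STAGE_FIELDS["extract"]}
partial.update(
    worldview="Self-improvement through deliberate practice",
    core_axiom="Adaptability trumps specialization",
    tier=4,
    polar_opposite="Specialized expertise matters more than learning ability",
)
print(pending_stages(partial))  # ['headlines', 'weights', 'embed']
```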
