Product | Architect & Full-stack | 2026 | in-progress
Lore Engine
A RAG engine that turns any fan wiki into a queryable, spoiler-aware knowledge base.
Stack
Next.js, FastAPI, PostgreSQL, pgvector, Celery, Redis, S3 / R2, OpenAI, Anthropic
Outcomes
6,000+ pages indexed per series
Hybrid BM25 + vector retrieval
Spoiler-aware retrieval filter
What it is
A full-stack RAG application that ingests a fan wiki or other MediaWiki site (Fandom, Wikipedia), structures it into a queryable knowledge base, and surfaces it through both a chat interface and a generated "lore book" document. A spoiler cutoff limits both surfaces to whatever point the user has reached in the series.
Key points
- Hybrid retrieval: combines BM25 lexical search with pgvector embeddings via reciprocal rank fusion, so exact character/place names work alongside semantic queries.
- Spoiler-aware filter: every chunk is tagged with a `first_appearance_order` and filtered at query time, so a user reading Book 3 never sees plot points from Book 7.
- Section-aware chunking: splits markdown at H2/H3 boundaries instead of fixed token windows, then enriches each chunk with `[Series][Page][Section]` metadata before embedding for sharper retrieval.
- Async ingestion pipeline: Celery + Redis pull pages from the MediaWiki API, convert wikitext to markdown, store raw `.md` per page in S3/R2, and write structured infobox metadata to Postgres.
- Two surfaces, one corpus: same retriever powers (a) a streaming chat with inline citations, and (b) a hierarchically generated lore book that's cached aggressively per series.
- Evaluation harness: 50-question ground-truth QA set per series gates every deploy on retrieval MRR and answer accuracy.
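The hybrid retrieval and spoiler filter from the key points can be sketched in a few lines of Python. This is an illustrative sketch, not the project's code: the `Chunk` class, the function names, the sample rankings, and the constant `k=60` (a commonly used default for reciprocal rank fusion) are all assumptions. A production version would more likely apply the cutoff in the SQL `WHERE` clause rather than in application code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    first_appearance_order: int  # e.g. 3 for content first introduced in Book 3


def spoiler_filter(chunks, cutoff):
    """Keep only chunks introduced at or before the reader's cutoff."""
    return [c for c in chunks if c.first_appearance_order <= cutoff]


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of chunk IDs: score(id) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical corpus: chunk "c" first appears in Book 7.
corpus = [Chunk("a", 1), Chunk("b", 3), Chunk("c", 7), Chunk("d", 2)]
allowed = {c.chunk_id for c in spoiler_filter(corpus, cutoff=3)}

# Hypothetical ranked results from the lexical and vector retrievers.
bm25_ranking = ["a", "b", "c", "d"]
vector_ranking = ["b", "d", "c", "a"]

fused = reciprocal_rank_fusion([
    [cid for cid in bm25_ranking if cid in allowed],
    [cid for cid in vector_ranking if cid in allowed],
])
print(fused)  # ['b', 'a', 'd']
```

Because fusion uses only ranks, the BM25 and cosine-distance scores never need to be normalized against each other, which is the usual reason to pick this method for hybrid search.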
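Section-aware chunking might look roughly like the following. The sketch assumes ATX-style markdown headings (`##`, `###`), treats text before the first heading as an "Introduction" section, and puts the `[Series][Page][Section]` prefix on its own line; those choices, along with every name in the code, are assumptions for illustration. It also ignores edge cases such as `##` appearing inside fenced code.

```python
import re
from dataclasses import dataclass

# Matches H2 and H3 headings only; H1 and H4+ lines stay in the body text.
HEADING = re.compile(r"^(#{2,3})\s+(.+?)\s*$")


@dataclass
class SectionChunk:
    section: str
    body: str


def split_by_headings(markdown, intro_name="Introduction"):
    """Split markdown at H2/H3 boundaries, one chunk per non-empty section."""
    chunks = []
    section, lines = intro_name, []

    def flush():
        body = "\n".join(lines).strip()
        if body:
            chunks.append(SectionChunk(section, body))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the section that just ended
            section, lines = match.group(2), []
        else:
            lines.append(line)
    flush()  # close the final section
    return chunks


def enrich(chunk, series, page):
    """Prefix the chunk with [Series][Page][Section] metadata before embedding."""
    return f"[{series}][{page}][{chunk.section}]\n{chunk.body}"


page_md = """Opening summary of the page.

## Biography
General background.

### Early life
Details about childhood.

## Abilities
Notable skills.
"""

chunks = split_by_headings(page_md)
enriched = enrich(chunks[1], "Example Series", "Example Page")
print([c.section for c in chunks])
print(enriched)
```

Embedding the enriched string rather than the bare body means a short section such as "Abilities" still carries the page and series it belongs to.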
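For the evaluation harness, retrieval MRR (mean reciprocal rank) is the average of 1/rank of the first relevant chunk per question, with 0 when nothing relevant is retrieved. The sketch below shows one way a deploy gate could be computed; the function names, the sample data, and the `min_mrr` threshold are hypothetical, since the actual thresholds are not stated here.

```python
def reciprocal_rank(retrieved_ids, relevant_ids):
    """Return 1/rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved_ids, start=1):
        if chunk_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(results):
    """results: list of (retrieved_ids, relevant_ids) pairs, one per question."""
    if not results:
        raise ValueError("no evaluation results")
    return sum(reciprocal_rank(r, rel) for r, rel in results) / len(results)


def passes_gate(results, min_mrr):
    """True when retrieval quality meets the (hypothetical) deploy threshold."""
    return mean_reciprocal_rank(results) >= min_mrr


# Three hypothetical questions: hit at rank 1, hit at rank 2, and a miss.
results = [
    (["a", "b"], {"a"}),
    (["x", "y", "z"], {"y"}),
    (["p", "q"], {"r"}),
]
mrr = mean_reciprocal_rank(results)  # (1 + 0.5 + 0) / 3 = 0.5
print(mrr, passes_gate(results, min_mrr=0.4))
```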
Result
Designed to scale to ~5M vectors per series on a single Postgres instance with no extra vector infra. Architecture validated against the Harry Potter wiki (~6,000 pages) as the reference workload.
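A rough sizing check for the 5M-vector target, assuming 1536-dimensional embeddings (the dimension is an assumption; it is not stated above). pgvector documents the storage of a `vector` value as 4 bytes per dimension plus 8 bytes, so the estimate below covers raw vector data only and excludes row overhead, indexes, and the text columns.

```python
def raw_vector_bytes(num_vectors, dimensions):
    """Raw pgvector storage: 4 bytes per dimension plus 8 bytes per vector."""
    return num_vectors * (4 * dimensions + 8)


size = raw_vector_bytes(5_000_000, 1536)  # assumed embedding dimension
gib = size / 2**30
print(f"{size:,} bytes = {gib:.1f} GiB")  # 30,760,000,000 bytes = 28.6 GiB
```

Under these assumptions the raw vectors for one series come to roughly 29 GiB before indexing, which is the figure to compare against the instance's disk and memory.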