# How This Wiki Works
This page explains the pattern behind the GiveCare Wiki — what it is, how it stays current, and why it is different from a typical docs site or a RAG system. If you are an LLM agent reading this to understand how to maintain the wiki, see apps/web-wiki/CLAUDE.md for the operational schema. This page is the conceptual companion to that schema.
## The basic move
Stop thinking of the LLM as a thing that answers questions from sources. Start thinking of it as a thing that maintains a living document about what has been learned. That document is the wiki. Sources feed it. Questions draw from it. Every session makes it richer.
The difference from RAG in one sentence:
> RAG re-derives understanding at query time; the LLM Wiki persists understanding at ingest time and updates it as new evidence arrives.
RAG treats documents as a haystack to retrieve from. The LLM Wiki treats documents as inputs to a synthesis that persists. The expensive cognitive work — "how does this new paper change my picture?" — is done once, when the source is first read, and the result is kept.
## The three layers
- `raw/` — Immutable source documents. PDFs, article clips, transcripts, screenshots, data dumps. The LLM reads but never edits them. This is the ground truth.
- `docs/` — LLM-maintained synthesis. Entity pages, concept pages, summaries, comparisons, an index, a log. The LLM owns this layer entirely. You read it; the LLM writes it.
- `CLAUDE.md` — The schema. Tells the LLM how this particular wiki is organized — page types, naming, citation format, workflows, contradiction conventions. You and the LLM co-evolve this as you learn what works for the domain.
The schema file is the critical piece. Without it, an LLM is a generic chatbot that might happen to edit some markdown. With it, every session picks up the same conventions — the wiki stays coherent across weeks and months and across different LLM sessions that never see each other's chat history.
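To make that concrete, here is a hypothetical fragment of what a schema file might contain. The page types and conventions below are illustrative, not GiveCare's actual CLAUDE.md:

```markdown
## Page types

- `sources/<slug>.md` — one page per ingested document; frontmatter
  carries `title`, `year`, and a `cited_by:` list of synthesis pages
- `topics/<slug>.md` — concept pages; every claim cites a source page
- `log.md` — append-only; entries shaped `## [YYYY-MM-DD] ingest|query|lint | summary`

## Contradictions

File disagreements on the relevant topic page with a `!!! contradiction`
admonition, including a proposed resolution. Never silently overwrite
the older claim.
```

Because this file is loaded into context at the start of every session, conventions like these survive across sessions that share no chat history.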
In the GiveCare Wiki, Lane B pages (product/, domain/, evidence/, bench/, architecture/, sources/) are the part that follows this pattern. Lane A pages (benefits, conditions, guides, organizations) are structured caregiver-facing reference data with a different lifecycle — they are generated from upstream data in benefits-cli-tools, not LLM-synthesized from raw sources. See CLAUDE.md for the full two-lane split.
## The ingest loop — how the wiki grows
Concrete example from outside the caregiving domain, to show the shape of the pattern. Imagine you are researching whether GLP-1 drugs might help with Alzheimer's disease.
### Day 1
You find a Nature paper on liraglutide and cognitive decline. You drop it into raw/nature-2023-liraglutide.pdf. You tell the LLM "ingest this."
The LLM reads it. It already knows the schema (CLAUDE.md is in its context), so it:
- Creates `sources/nature-2023-liraglutide.md` — a bibliography entry with the key findings
- Creates `topics/glp1-alzheimers.md` — a new concept page summarizing the state of evidence, citing this paper as the only current source
- Maybe creates `drugs/liraglutide.md` — an entity page for the drug itself, linked to the topic
- Updates `_catalog.md` — adds the new pages with one-line summaries so future sessions can locate them
- Appends to `log.md`: `## [2026-04-11] ingest | Nature 2023 — liraglutide and AD`
You skim the diff. Maybe you say "emphasize that this was animal-only evidence so far." The LLM edits the topic page to qualify the claim. You commit. Ten minutes.
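A hedged sketch of what that source page might look like. The field names follow the `cited_by:` frontmatter convention this wiki uses, but the exact shape is illustrative:

```markdown
---
title: "Liraglutide and cognitive decline"
venue: Nature, 2023
cited_by:
  - topics/glp1-alzheimers.md
  - drugs/liraglutide.md
---

# Nature 2023 — liraglutide and AD

Key findings: ... (animal-model evidence only; see the topic page
for the synthesized state of evidence)
```

The `cited_by:` list is the reverse index — it is what lets a later session ask "which synthesis pages does this source feed?" without grepping the whole tree.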
### Day 5
You find a JAMA paper showing semaglutide human trial results. You drop it in, say "ingest."
Now the interesting part. The LLM does not create a brand-new topic — topics/glp1-alzheimers.md already exists. So instead it:
- Creates `sources/jama-2024-semaglutide.md`
- Opens and revises `topics/glp1-alzheimers.md`: adds a "Human trial evidence" section; the old "strongest evidence is animal models" claim gets qualified ("was true until JAMA 2024")
- Notices that the Nature paper proposed mechanism A and JAMA implies mechanism B. Files a contradiction block on the topic page using the `!!! contradiction` admonition — with a proposed resolution like "may be dose-dependent, needs Phase III"
- Creates `topics/glp1-mechanisms.md` for the mechanism debate specifically, linked from the main topic
- Updates the Nature source's `cited_by:` frontmatter to include the new mechanism page
- Appends to `log.md`: `## [2026-04-15] ingest | JAMA 2024 — semaglutide cognitive trial`
The wiki now contains synthesis that reflects both papers simultaneously. It was not re-derived. It was compiled once at ingest, updated once at ingest, and persists.
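The contradiction block is ordinary MkDocs Material admonition syntax with a custom `contradiction` type. A hypothetical example for the case above:

```markdown
!!! contradiction "Proposed mechanism: Nature 2023 vs. JAMA 2024"
    [Nature 2023](../sources/nature-2023-liraglutide.md) proposes
    mechanism A; [JAMA 2024](../sources/jama-2024-semaglutide.md)
    implies mechanism B.

    Proposed resolution: may be dose-dependent; revisit after
    Phase III data lands.
```

Because the block lives on the topic page itself, every future reader — human or LLM — hits the disagreement in context rather than citing one paper at random.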
### Day 30
You have ingested six more papers over the month. Each one touched three to ten existing wiki pages — adding citations, qualifying claims, creating new concept pages for ideas that recur across sources. The topic page is no longer a summary of one paper. It is a synthesis of everything you have read, with contradictions flagged and mechanisms traced.
That is compounding. The wiki gets richer every time, not just longer.
## The query loop — how you use it
On day 45 you ask: *what is the current state of evidence for GLP-1 in Alzheimer's?*
In a RAG system: the LLM searches a vector store, retrieves chunks from several papers, tries to stitch an answer. It has no memory of previous questions. If you had asked the same thing on day 10 it would have re-derived from scratch — possibly giving a different answer depending on which chunks the retrieval surfaced.
In the LLM Wiki pattern: the LLM reads `topics/glp1-alzheimers.md`. That page is the answer, because you built it incrementally. The LLM does not hallucinate — every claim is already cited. It does not miss evidence — the page already integrates everything you have ingested. Producing the answer is cheap because the expensive work was done during ingest.
Then something subtler happens. You ask: does GLP-1 show cognitive benefit in any other condition besides AD?
The LLM reads the topic page, sees it cross-references a Parkinson's page you filed two weeks ago for a different project, opens `topics/parkinson-review.md`, synthesizes across both, and — crucially — files the comparison as a new page (`topics/glp1-cognitive-spectrum.md`). The log gets a `query | ...` entry pointing to it.
Now your exploration is also compounding. The next question on day 60 draws from a wiki that is richer because of the day-45 question. Good answers do not disappear into chat history — they become persistent pages that future queries can draw on.
This is the "good answers can be filed back" move. Queries are not just consumption; they add to the substrate.
## The lint loop — how it stays healthy
The wiki decays if nobody watches it. Once a week (enforced by `.github/workflows/wiki-lint.yml`) you run `pnpm lint`. The LLM — or the lint script as a proxy — walks the tree and reports:
- Contradictions: two pages citing different numbers for the same thing, or a claim on one page that a newer source on another page disagrees with
- Stale claims: pages last updated before newer sources arrived that should have changed them
- Orphans: pages with no inbound links — either link them or delete them
- Missing concepts: terms mentioned across multiple pages without their own entity page
- Drift: sources claim X but synthesis pages imply Y
- Gaps: things the wiki references but does not explain — added to `gaps.md` as a next-source queue
Some of this is automated (file-level checks, orphan detection, broken cross-references, staleness windows). Some of it needs judgment (semantic contradictions, missing concepts). You fix the easy stuff inline and add the hard questions to `gaps.md` for next ingest.
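The automated part of the lint pass can be sketched in a few lines. This is an illustrative TypeScript sketch of orphan detection — the function names and exemption list are assumptions, not the actual `apps/web-wiki/scripts/lint.ts` implementation:

```typescript
// A wiki page as the lint script sees it: a path and its markdown body.
type Page = { path: string; body: string };

// Extract [text](target.md) style markdown links from a page body.
function outboundLinks(body: string): string[] {
  const links: string[] = [];
  const re = /\]\(([^)]+\.md)\)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(body)) !== null) links.push(m[1]);
  return links;
}

// Orphans: pages no other page links to. Index-like pages
// (_catalog.md, index.md) are exempt by convention in this sketch.
function findOrphans(
  pages: Page[],
  exempt: string[] = ["_catalog.md", "index.md"],
): string[] {
  const inbound = new Set<string>();
  for (const p of pages) {
    for (const link of outboundLinks(p.body)) inbound.add(link);
  }
  return pages
    .map((p) => p.path)
    .filter((path) => !inbound.has(path) && !exempt.includes(path));
}
```

A page flagged here gets either linked from a relevant topic page or deleted — an unreachable page is one no future session will ever draw from.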
## Why it compounds and RAG doesn't
Three distinct mechanisms:
1. Synthesis persists. In RAG, every answer is rebuilt from raw fragments. In the wiki, the LLM writes its understanding once and updates it as new evidence arrives. The expensive cognitive work — integrating a new source into a picture of the whole — is done at ingest, not at query.
2. Cross-references persist. The wiki is interlinked. When you follow liraglutide → mechanism A → GIP receptor → tirzepatide you are walking a path the LLM already traced during an earlier session. RAG cannot do this — retrieved chunks have no trails between them; each retrieval is a fresh stab.
3. Contradictions persist. When two sources disagree, a RAG system notices the tension for one answer and forgets. The wiki files the contradiction on the relevant page, with a resolution, and every future reader — human or LLM — sees it. Two papers that disagree go from being a liability ("the LLM will cite one or the other at random") to being an asset ("the disagreement itself is knowledge").
## Your role vs the LLM's role
| You | The LLM |
|---|---|
| Find sources worth reading | Read sources |
| Decide what is worth ingesting | Extract what matters |
| Ask questions | Synthesize across pages |
| Make judgment calls | Maintain cross-references |
| Curate the shape over time | File contradictions |
| Review the diffs | Append to the log |
| Name what the wiki is for | Regenerate the index / catalog |
| | Flag orphans |
| | Suggest new questions |
The human's job is curation and direction — which sources to read, which threads to pull, which questions matter, what the wiki is ultimately for. The LLM's job is bookkeeping and synthesis — the tedious maintenance that makes wikis die when humans try to do it themselves.
This split is the whole thing. Humans abandon wikis because updating fifteen cross-references every time a new source lands is soul-crushing. LLMs do not get bored. So the wiki stays alive.
## Where this wiki lives in practice
The pattern is format-agnostic but the idiomatic setup in this repo is:
- A git repo of markdown files — version history, diffs, branches for free
- MkDocs Material renders the wiki at `wiki.givecareapp.com`
- Obsidian (or any markdown editor) works for local browsing and graph view
- Claude Code is the primary LLM agent — reads `CLAUDE.md` on every session
- CI (`.github/workflows/wiki-lint.yml`) runs lint + audit + catalog drift check weekly
- Scripts under `apps/web-wiki/scripts/` (`lint.ts`, `audit.ts`, `catalog.ts`) provide deterministic checks so the LLM does not have to re-verify everything by reading
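The catalog drift check is the simplest of these to sketch: every page on disk should have a `_catalog.md` entry, and every entry should point at a real page. A minimal TypeScript sketch — the function name and return shape are illustrative, not the actual `catalog.ts` API:

```typescript
// Compare pages that exist on disk against pages listed in _catalog.md.
// Drift in either direction means a past session forgot a bookkeeping step.
function catalogDrift(
  pagePaths: string[],      // markdown files actually present under docs/
  catalogEntries: string[], // paths listed in _catalog.md
): { missingFromCatalog: string[]; staleInCatalog: string[] } {
  const onDisk = new Set(pagePaths);
  const listed = new Set(catalogEntries);
  return {
    // pages a future LLM session could not locate via the catalog
    missingFromCatalog: pagePaths.filter((p) => !listed.has(p)),
    // catalog entries pointing at pages that were renamed or deleted
    staleInCatalog: catalogEntries.filter((e) => !onDisk.has(e)),
  };
}
```

Checks like this are deterministic, so they belong in scripts and CI rather than in the LLM's judgment budget; the weekly workflow fails loudly and the next session fixes the drift.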
In typical use you have two windows open: the LLM agent on one side, the rendered wiki (or Obsidian) on the other. You talk to the agent, watch the wiki change in real time, follow links, check the graph, correct anything that drifted. The wiki is the codebase; the LLM is the programmer; the editor is the IDE.
## The key mental shift
Stop asking *how do I query my documents?* Start asking *how do I build an artifact that answers most questions without needing to be re-queried?*
The wiki is the answer to most questions, most of the time. Queries are requests to read the wiki. Ingest is the expensive operation; queries are cheap. That is backwards from RAG, where ingest is cheap (embed and store) and queries are expensive (retrieve, re-reason, hope the retrieval surfaced the right chunks).
Compounding over time beats retrieval-at-query-time — but only if you stay disciplined about ingest and lint. The schema file is what makes that discipline reproducible across sessions that never share chat history. That is why CLAUDE.md is the most important file in the wiki.
## Further reading
- `apps/web-wiki/CLAUDE.md` — the operational schema: page types, frontmatter shapes, ingest/query/lint workflows, two-lane model
- `apps/web-wiki/docs/about/methodology.md` — how curated Lane A resources are selected, verified, and kept fresh (a different discipline from Lane B synthesis)
- `apps/web-wiki/docs/log.md` — chronological record of every ingest, query, lint pass, and contradiction filed
- `apps/web-wiki/docs/gaps.md` — the next-source queue: what the wiki does not yet know
- `apps/web-wiki/docs/_catalog.md` — machine-readable index that LLM agents read first when locating pages to draw from
- Vannevar Bush, *As We May Think* (1945) — the Memex vision. Bush wanted a personal, curated knowledge store with trails between documents; the part he could not solve was who does the maintenance. The LLM handles that.