Memory Vault¶
Slice 12 (vault plumbing) + Slice 13 (semantic memory). Design constraints trace back to
spec_v3.md §1.3 / §3.2.4 / §4.3 / §7.3 / §14 / §17 / §30.
Why a vault¶
Task state lives in SQLite, conversation context dies with the session, and there's no file-based workspace agents can hand back to the user. An Obsidian-compatible markdown vault gives Donna a durable, human-editable, version-controlled surface for meeting notes, people profiles, daily logs, and research artefacts. Slice 12 establishes the plumbing; slices 13–15 layer semantic retrieval, episodic ingestion, and template-driven writes on top.
Architecture at a glance¶
| Piece | File | Responsibility |
|---|---|---|
| Config | config/memory.yaml + donna.config.MemoryConfig | Vault root, git author, safety envelope, ignore globs. |
| Read client | donna.integrations.vault.VaultClient | read, list, stat, extract_links. Async, read-only. |
| Write client | donna.integrations.vault.VaultWriter | write, delete, move, undo_last. Sole mutation path. |
| Git wrapper | donna.integrations.git_repo.GitRepo | subprocess-based init_if_missing, commit, revert, log. |
| Tools | donna.skills.tools.vault_{read,write,list,link,undo_last} | LLM-facing skill tools. |
| WebDAV | docker/donna-vault.yml + docker/caddy/vault.Caddyfile.example | Sync channel for Obsidian desktop / mobile clients. |
| Memory store | donna.memory.store.MemoryStore | Upsert / search over memory_documents + memory_chunks + vec_memory_chunks (sqlite-vec). |
| Embeddings | donna.memory.embeddings.MiniLMProvider | Wraps the shared MiniLM-L6-v2 loader in capabilities.embeddings. |
| Chunker | donna.memory.chunking.MarkdownHeadingChunker | 256-token chunks with heading-path provenance. |
| Ingest queue | donna.memory.queue.MemoryIngestQueue | Batches upserts so embed_batch runs once per flush. |
| Vault source | donna.memory.sources_vault.VaultSource | watchfiles watcher + boot-time backfill, keeps the store in sync with disk. |
| memory_search tool | donna.skills.tools.memory_search | Agent-facing retrieval entry point. |
The client and writer mirror the Gmail integration line-for-line: single module per integration, async methods over asyncio.to_thread, non-fatal startup via _try_build_vault_client / _try_build_vault_writer in donna.cli_wiring.
Safety envelope¶
VaultWriter rejects any write that violates the invariants in spec_v3.md §7.3:
- Path must resolve under the configured vault root (no .., no absolute, no symlink escape).
- Extension must be .md.
- Top-level folder must be in safety.path_allowlist (Inbox, Meetings, People, Projects, Daily, Reviews by default).
- Payload size ≤ safety.max_note_bytes (200 KB default).
- If expected_mtime is supplied and differs from on-disk, the write fails with VaultWriteError(reason="conflict") before any disk change.
- If the target exists with frontmatter and the new content omits it, the existing frontmatter is preserved on keys the new content does not supply.
- Every successful mutation produces exactly one git commit with author Donna <donna@homelab.local> (from config) and a structured message.
- undo_last always uses git revert, never git reset, so the audit trail is preserved.
Failures raise VaultWriteError(reason=...) with reason codes: path_escape, not_markdown, outside_allowlist, too_large, conflict, sensitive, missing.
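The path, extension, allowlist, and size checks above can be sketched in a few lines. This is a minimal illustration, not the VaultWriter implementation: check_write, ALLOWLIST, and MAX_NOTE_BYTES are hypothetical names, the check order is an assumption, and reading "200 KB" as 200 * 1024 bytes is also an assumption.

```python
from pathlib import Path

# Hypothetical stand-ins for safety.path_allowlist and safety.max_note_bytes.
ALLOWLIST = {"Inbox", "Meetings", "People", "Projects", "Daily", "Reviews"}
MAX_NOTE_BYTES = 200 * 1024  # assumption: "200 KB" read as 200 * 1024 bytes


class VaultWriteError(Exception):
    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


def check_write(root: Path, rel_path: str, content: str) -> Path:
    """Return the resolved target path, or raise VaultWriteError."""
    rel = Path(rel_path)
    if rel.is_absolute() or ".." in rel.parts:
        raise VaultWriteError("path_escape")
    root = root.resolve()
    target = (root / rel).resolve()  # resolve() follows symlinks
    if not target.is_relative_to(root):  # catches symlink escapes
        raise VaultWriteError("path_escape")
    if target.suffix != ".md":
        raise VaultWriteError("not_markdown")
    if not rel.parts or rel.parts[0] not in ALLOWLIST:
        raise VaultWriteError("outside_allowlist")
    if len(content.encode("utf-8")) > MAX_NOTE_BYTES:
        raise VaultWriteError("too_large")
    return target
```

Resolving both the root and the target before comparing them is what makes the symlink check meaningful: a link inside the vault that points outside it resolves to a path that is no longer under the root.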
Agent surface¶
Agents declared in config/agents.yaml gain the vault tools once the writer is built at boot, and memory_search once the store is built:
| Agent | Tools granted |
|---|---|
| pm, scheduler, research, challenger | vault_read, vault_write, vault_list, vault_link, vault_undo_last, memory_search |
If config/memory.yaml is missing or the vault root is unreachable, the vault tools simply aren't registered — boot still succeeds, and the rest of the skill system keeps running. Likewise, if sqlite-vec fails to load (Database.vec_available == False) the memory store and memory_search stay offline without taking the orchestrator down.
Semantic memory (slice 13)¶
The memory layer lives inside the existing donna_tasks.db file (spec_v3.md §16.1). Three tables are added:
- memory_documents: one row per ingested source (a vault note today; chat turns, tasks, and corrections land in slice 14). (user_id, source_type, source_id) is unique. Soft-deleted via deleted_at so search joins can filter without pruning the ANN index on every tombstone.
- memory_chunks: one row per chunk emitted by the chunker. Carries content, token_count, and a JSON-encoded heading_path stack (e.g. ["ProjectPlan", "Design", "Schema"]) so retrieval answers can cite a note's section header, not just the file path.
- vec_memory_chunks: the sqlite-vec vec0 virtual table. Declared as (chunk_id TEXT PRIMARY KEY, embedding FLOAT[384]). Loaded on the shared aiosqlite connection in Database.connect(); if the extension wheel is missing the connection still opens and vec_available flips to False.
Ingestion path¶
- VaultSource.watch(): a watchfiles.awatch loop (500 ms coalesce) fires vault_watch_event for every .md change under the vault root, honoring sources.vault.ignore_globs plus the vault-wide vault.ignore_globs. Deletes translate to soft-delete; adds / modifies route to _ingest_path.
- VaultSource.backfill(user_id): walks the vault via VaultClient.list(recursive=True) on boot, compares each file's mtime against the stored memory_documents.updated_at, and enqueues anything newer-on-disk. A typical 20-note vault backfills in well under 30 s.
- _ingest_path builds a Document carrying user_id, source_type="vault", the relative path as source_id, the frontmatter title (or filename stem), the vault:<rel> URI, and the note body. donna: local-only (or donna_sensitive: true) in frontmatter flips sensitive=True, which propagates to every RetrievedChunk.metadata["sensitive"] for downstream prompt-building decisions.
- MemoryIngestQueue.run_forever() drains up to 16 docs per 500 ms window into a single MemoryStore.upsert_many call, so embed_batch fires once per flush, amortising the SentenceTransformer warm-up over the batch.
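The "up to 16 docs per 500 ms window" drain in the last step can be sketched with plain asyncio. This is an illustrative sketch only: drain_batch is a hypothetical helper, and the real MemoryIngestQueue may structure its loop differently.

```python
import asyncio


async def drain_batch(queue: asyncio.Queue, max_docs: int = 16, window_s: float = 0.5) -> list:
    """Wait for one item, then collect more until the batch is full or the window closes."""
    batch = [await queue.get()]
    loop = asyncio.get_running_loop()
    deadline = loop.time() + window_s
    while len(batch) < max_docs:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            batch.append(await asyncio.wait_for(queue.get(), timeout=remaining))
        except asyncio.TimeoutError:
            break  # window closed with a partial batch
    return batch
```

A run_forever-style loop would call drain_batch repeatedly and hand each returned list to a single upsert_many call, which is what lets the embedder run once per flush.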
Re-ingest short-circuit¶
MemoryStore.upsert(doc) hashes doc.content to content_hash. If the existing row matches, we bump updated_at, clear deleted_at, refresh title / metadata / sensitive, and return without re-embedding. The invocation_log row count is the dedup signal: unchanged notes do not add rows for task_type=embed_vault_chunk.
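A minimal sketch of the short-circuit, using an in-memory dict in place of the memory_documents table. Row, content_hash, and upsert here are hypothetical stand-ins, and SHA-256 is an assumption; the text above only says the content is hashed.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Row:
    content_hash: str
    embed_calls: int = 0


def content_hash(text: str) -> str:
    # Assumption: SHA-256 over the UTF-8 bytes of the content.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def upsert(store: dict, source_id: str, content: str) -> bool:
    """Return True if the document was (re-)embedded, False if short-circuited."""
    digest = content_hash(content)
    row = store.get(source_id)
    if row is not None and row.content_hash == digest:
        # Unchanged content: real code would only refresh timestamps and metadata.
        return False
    calls = row.embed_calls if row is not None else 0
    store[source_id] = Row(content_hash=digest, embed_calls=calls + 1)
    return True
```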
Retrieval¶
MemoryStore.search(query, user_id, k, sources, filters) embeds the query (one invocation with task_type=embed_memory_query) and runs a single three-table join — vec_memory_chunks (ANN window of k*4), memory_chunks (content + heading path), memory_documents (provenance, sensitivity, soft-delete filter). Scores use MiniLM's unit-normalised outputs: score = 1 - distance² / 2 (sqlite-vec's vec0 returns L2 distance). Results below retrieval.min_score are dropped; k is clamped to retrieval.max_k. A structlog memory_retrieval event records k, hits, sources, and latency_ms per call.
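The score formula follows from the identity |a - b|² = 2 - 2·cos(a, b) for unit vectors, so 1 - distance² / 2 is the cosine similarity. A small numerical check, with hypothetical helper names:

```python
import math


def normalise(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def score_from_distance(distance: float) -> float:
    # For unit vectors: |a - b|^2 = 2 - 2*cos(a, b), so cos = 1 - d^2 / 2.
    return 1.0 - (distance ** 2) / 2.0


a = normalise([1.0, 2.0, 3.0])
b = normalise([3.0, 1.0, 0.5])
cosine = sum(x * y for x, y in zip(a, b))
# The converted L2 distance agrees with the directly computed cosine similarity.
assert abs(score_from_distance(l2_distance(a, b)) - cosine) < 1e-9
```

The identity only holds when both vectors have unit length, which is why the formula leans on the provider's normalised outputs.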
Embedding contract¶
The default provider is MiniLMProvider (384-dim, 256-token window, BERT WordPiece tokenizer). Every embed / embed_batch emits one invocation_log row per input text — model_alias="minilm-l6-v2", tokens_in=0, cost_usd=0.0 — so the Grafana Memory Vault dashboard (docker/grafana/dashboards/memory.json) tracks embed volume alongside the normal LLM cost panels. Swapping to another provider (for example bge-small-en-v1.5 or a cloud embedding) is a config-only change in embedding.provider plus a build_embedding_provider factory branch.
Token counting uses tiktoken cl100k_base when the encoding file is available and falls back to a deterministic word+punct heuristic when it isn't (offline CI). The fallback is within ~10% of WordPiece on English prose and typically over-counts, so we err on smaller chunks rather than silent truncation inside the encoder.
Config¶
config/memory.yaml carries the tunables (embedding.{provider,version_tag,dim,max_tokens,chunk_overlap}, retrieval.{default_k,min_score,max_k}, sources.vault.{enabled,chunker,ignore_globs}). Pydantic aliases keep the slice-12 field names parseable so old configs still boot.
Fixtures¶
tests/fixtures/vault/ carries ~18 sample notes spanning the allowlisted folders plus deliberate Templates/** + .obsidian/** entries that exercise ignore_globs. Inbox/sensitive-credentials.md carries donna: local-only so the sensitivity-propagation tests have real content to bite on.
Sync channel¶
A Caddy container (donna-vault compose service) exposes the vault root over WebDAV with HTTP basic auth. Obsidian desktop (Remote Sync plugin), Obsidian mobile (WebDAV plugin), and any WebDAV-aware editor can mount the endpoint. Writes made by humans over WebDAV and writes made by agents via VaultWriter share the same on-disk repo, so git history reflects both.
See docs/operations/vault-sync.md for bring-up steps and client configuration.
What slices 12 + 13 do not do¶
- No chat / task / correction ingestion; slice 14 adds ChatSource, TaskSource, CorrectionSource on top of the same MemoryStore.
- No Jinja templates under prompts/vault/ and no memory-informed writers (meeting notes, weekly reviews); that is slice 15.
- No Supabase sync for memory_documents / memory_chunks; that is slice 17.
- No rename / move reconciliation beyond delete + upsert; that is slice 16 (shipped).
- No BM25 / hybrid retrieval or eval harness; that is slice 17.
- No off-server backup push; the vault is on local NVMe, captured by the existing backup rotation (docs/operations/backup-recovery.md).
Handoff contract for slice 14¶
Slice 14 inherits:
- A stable MemoryStore.upsert / upsert_many contract.
- The MemorySource-shaped pattern modelled by VaultSource (watcher + backfill + _ingest_path) to copy for chat / task / correction sources.
- invocation_log task_type values already present in the Grafana dashboard (embed_vault_chunk, embed_memory_query).
- A fixture set under tests/fixtures/vault/ that the end-to-end memory test can extend.
No schema changes are expected in slice 14 — memory_documents / memory_chunks already accommodate every source_type.
Episodic sources (slice 14)¶
Slice 14 adds three new MemorySource-shaped modules on top of the same MemoryStore. All three observe the relevant source-of-truth write path, upsert a document, and expose a backfill entry point for the donna memory backfill CLI.
Observer wiring¶
- Database: constructor injection (Option A). Database.__init__ takes an optional memory_observer, and add_chat_message / create_task / update_task each await self._fire_memory_observer(method, event). Exceptions are logged (memory_ingest_failed) and swallowed; the source-of-truth write has already committed by the time the observer fires, and a memory-layer failure must never unwind the caller. cli_wiring._build_episodic_sources() builds a _CombinedDbObserver that fans events out to ChatSource / TaskSource and attaches it via Database.set_memory_observer(...).
- correction_logger: module-level registry (Option B). log_correction calls donna.memory.observers.dispatch("correction", event). CorrectionSource.__init__ is wired up via register_observer("correction", source.observe) during startup. Using the registry here keeps log_correction's signature stable (widening it would churn every existing call site).
The asymmetry is deliberate — the Database already takes a handful of collaborators via its constructor, so one more is cheap and keeps call sites explicit; the correction_logger is a single loose function and staying out of its signature is worth the small pattern split.
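The module-level registry (Option B) can be sketched as below. The names register_observer and dispatch come from the text above; everything else is assumed for illustration, including the async signature, the event being a plain dict, and the use of the stdlib logging module in place of structlog.

```python
import logging
from collections import defaultdict
from typing import Awaitable, Callable

Observer = Callable[[dict], Awaitable[None]]

log = logging.getLogger("memory")
_observers = defaultdict(list)  # kind -> list of registered observers


def register_observer(kind: str, observer: Observer) -> None:
    _observers[kind].append(observer)


async def dispatch(kind: str, event: dict) -> None:
    """Fan an event out to observers; a failing observer never unwinds the caller."""
    for observer in list(_observers[kind]):
        try:
            await observer(event)
        except Exception:
            # Log and swallow: the source-of-truth write has already committed.
            log.exception("memory_ingest_failed")
```

The same log-and-swallow shape applies to the constructor-injected observer on the Database side; only the way the observer is attached differs.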
Source summaries¶
- ChatSource (src/donna/memory/sources_chat.py). Maintains a per-session rolling buffer keyed by session_id; flushes a turn document when the role flips, the buffer exceeds max_tokens, or the session transitions to closed / expired. source_id is "{session_id}:{first_msg_id}-{last_msg_id}"; re-running backfill upserts the same row, so row counts stay stable. Respects sources.chat.index_roles (default [user, assistant]), min_chars, and the configured task_verbs list.
- TaskSource (src/donna/memory/sources_task.py). Source-of-truth is the tasks table. Content hash is driven by title + description + notes_json + status + domain + deadline via TaskChunker; non-semantic fields (priority, scheduling times) deliberately don't bump the hash so retrieval stays cheap. A status transition into a terminal state listed in sources.task.reindex_on_status (default done, cancelled) busts the content hash so the final-state context always lands in the index. A "delete" event calls MemoryStore.delete(source_type="task", source_id=task_id, user_id=...). Note, however, that as of slice 14 there is no soft-delete path on the tasks table or Database API, so this branch is dormant in production; it's kept ready for the day a soft-delete lands and exercised directly by unit tests.
- CorrectionSource (src/donna/memory/sources_correction.py). One chunk per correction event; template is "Field {field} changed from {original!r} to {corrected!r} on input: {input!r} (task_type={task_type})". source_id is the correction rowid, so the second call to log_correction for the same row is a no-op upsert.
Why episodic sources skip the ingest queue¶
VaultSource enqueues into MemoryIngestQueue because the boot-time backfill replays dozens of files in one burst — batching embed_batch over the burst is a real win. Chat / task / correction events arrive at human-typing rate (one at a time), so the batching window almost never fires with more than one event in it. The chat source also keeps a per-session in-memory buffer that depends on synchronous ordering (a queue would let two messages from the same session be processed out of order). And TaskSource's "force re-embed on terminal status" path needs to bust the stored content_hash immediately before the upsert, which doesn't fit the queue's batched upsert_many contract. We accept the per-event cost (one embed_batch per upsert) and revisit if a bulk-import workload ever bursts chat ingest.
Backfill CLI¶
donna memory backfill [--source vault|chat|task|correction|all] [--user-id UID] boots a minimal orchestrator (Database + MemoryStore + sources) and calls each selected source's backfill(user_id) in sequence. Idempotent — a second invocation leaves memory_documents / memory_chunks row counts unchanged (the UNIQUE(user_id, source_type, source_id) index is the enforcer). One source failing doesn't stop the rest; the command exits non-zero if any raised so CI can notice.
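The "one source failing doesn't stop the rest, but the exit code is non-zero" behaviour can be sketched like this. run_backfills is a hypothetical helper and the sources argument is assumed to be an ordered mapping of name to source object; the actual CLI wiring is not shown in this document.

```python
async def run_backfills(sources: dict, user_id: str) -> int:
    """Run each source's backfill in order; return a process exit code."""
    failed = []
    for name, source in sources.items():
        try:
            await source.backfill(user_id)
        except Exception as exc:
            # Record the failure and keep going with the remaining sources.
            print(f"backfill failed for {name}: {exc}")
            failed.append(name)
    return 1 if failed else 0
```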
Observability¶
- Invocation log: task_type in {embed_chat_turn, embed_task, embed_correction} (in addition to slice-13's embed_vault_chunk / embed_memory_query). model_alias="minilm-l6-v2", tokens_in=0, tokens_out=0, cost_usd=0.0.
- Structlog events: memory_ingest_chat_turn, memory_ingest_task, memory_ingest_correction on success (each carries latency_ms for the full upsert round-trip); memory_ingest_failed on observer failure (with source_type + reason); memory_backfill_{chat,task,correction}_done on backfill completion.
- Grafana: slice-13's memory dashboard renders per-source gauges because it groups by source_type. Slice 14's follow-up commit added a per-source ingest-latency histogram panel driven by the latency_ms field above, so chat/task/correction counts and p50/p95 latencies are visible out of the box.
Task-verb morphology¶
ChatTurnChunker._keep rescues short messages that would otherwise be dropped when they contain a configured task_verbs token. The match is tokenized and covers the bare verb plus -s / -ed / -ing inflections and the e-drop variants (schedule → scheduling / scheduled). The check is token-level, so superset words like callous or callable intentionally slip through without rescuing an otherwise-short noisy message.
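A token-level match of this kind might look as follows. This is a sketch under stated assumptions, not the ChatTurnChunker code: inflections and mentions_task_verb are hypothetical names, and tokenising on runs of ASCII letters is an assumption.

```python
import re


def inflections(verb: str) -> set:
    """Bare verb plus -s / -ed / -ing, with the e-drop variants for verbs ending in 'e'."""
    if verb.endswith("e"):
        stem = verb[:-1]
        return {verb, verb + "s", stem + "ed", stem + "ing"}  # schedule -> scheduled / scheduling
    # Consonant doubling (plan -> planning) is not covered, matching the rules listed above.
    return {verb, verb + "s", verb + "ed", verb + "ing"}


def mentions_task_verb(message: str, task_verbs: list) -> bool:
    # Whole-token comparison, so "callous" never matches the verb "call".
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    return any(tokens & inflections(verb) for verb in task_verbs)
```

Comparing whole tokens rather than substrings is what keeps superset words from rescuing a short message.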
Slice 15 — template writes¶
Slice 15 introduces the first outbound path: Donna writes vault notes autonomously in response to triggers (today: post-meeting; Slice 16 adds four more templates under the same pattern).
Components¶
- VaultTemplateRenderer (src/donna/memory/templates.py): a thin FileSystemLoader + StrictUndefined Jinja environment. Templates are self-contained: each template emits its own frontmatter as a first-line --- YAML block; the renderer parses and returns it separately via python-frontmatter. Missing context keys raise jinja2.UndefinedError.
- MemoryInformedWriter (src/donna/memory/writer.py): the shared orchestrator every template-write skill delegates to. Owns autonomy-based path redirection, frontmatter-keyed idempotency, prompt-template rendering, routed LLM completion, vault-template rendering, and commit. Any failure logs vault_autowrite_failed and returns a skipped WriteResult, never a partial write.
- resolve_person_link (src/donna/memory/linking.py): looks up People/{name}.md in the vault; returns [[People/{name}]] when present, [[{name}]] otherwise. Never auto-creates stubs.
- MeetingNoteSkill + MeetingEndPoller (src/donna/capabilities/): the reference trigger. The poller scans calendar_mirror once per config.memory.skills.meeting_note.poll_interval_seconds for events that ended within the lookback window and don't already have a meeting note indexed. The skill composes memory-search context (prior meetings, recent chats, open tasks), resolves attendee wikilinks, and delegates to MemoryInformedWriter.
Idempotency contract¶
Every autowritten note carries an idempotency_key frontmatter field
(the calendar event id for meeting notes). Before any LLM spend, the
writer reads the target path; if the existing note's
idempotency_key matches, it emits
vault_autowrite_skipped_idempotent and returns without work. This
makes re-polling safe and cheap.
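The pre-spend check can be sketched with a hand-rolled frontmatter scan. This is illustrative only: read_idempotency_key and should_skip are hypothetical names, and the naive line parser is used here just to keep the sketch stdlib-only; it is not a YAML parser.

```python
from typing import Optional


def read_idempotency_key(note_text: str) -> Optional[str]:
    """Pull idempotency_key out of a leading --- frontmatter block, if present."""
    lines = note_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "idempotency_key":
            return value.strip().strip("'\"")
    return None


def should_skip(existing_note: Optional[str], idempotency_key: str) -> bool:
    """True when the target already holds a note written for the same key."""
    if existing_note is None:
        return False
    return read_idempotency_key(existing_note) == idempotency_key
```

Running this check before any prompt rendering or LLM call is what keeps a repeated poll free of model spend.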
Autonomy-level → path redirection¶
config/memory.yaml:skills.meeting_note.autonomy_level is the
skill-local control. At low, every write is redirected to
Inbox/{basename} regardless of the caller-computed target_path.
At medium / high, the caller's path is honoured. This is
distinct from config/agents.yaml:research.autonomy, which governs
the research agent's overall tool budget and timeout. The level is
set per template rather than per agent so that Slice 16 templates
can each choose their own.
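As a sketch, the redirection rule reduces to a single branch. resolve_target is a hypothetical name; the level names and the Inbox/{basename} rule are taken from the paragraph above.

```python
from pathlib import PurePosixPath


def resolve_target(target_path: str, autonomy_level: str) -> str:
    """Redirect low-autonomy writes to Inbox/{basename}; honour the caller's path otherwise."""
    if autonomy_level == "low":
        return str(PurePosixPath("Inbox") / PurePosixPath(target_path).name)
    return target_path
```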
CalendarMirror.attendees¶
CalendarMirror gained a nullable attendees TEXT column (migration
c9d1e3f5a7b2). calendar.py::_parse_event reads
items[i].attendees from the Google API, normalising each entry to
{name, email} (name = displayName or email local-part);
calendar_sync.py::_update_mirror JSON-encodes the list on write.
The meeting-note skill parses the JSON and passes it through to the
template + wikilink resolver.
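The normalisation step can be sketched as below. normalise_attendees is a hypothetical name rather than the _parse_event code; the displayName and email fields are those of Google Calendar attendee entries.

```python
import json


def normalise_attendees(raw) -> list:
    """Map Google Calendar attendee entries to {name, email} dicts."""
    out = []
    for entry in raw or []:
        email = entry.get("email", "")
        # Fall back to the email local-part when displayName is absent or empty.
        name = entry.get("displayName") or email.split("@", 1)[0]
        out.append({"name": name, "email": email})
    return out


# Round trip through the TEXT column: JSON-encode on write, parse on read.
stored = json.dumps(normalise_attendees([
    {"displayName": "Ada Lovelace", "email": "ada@example.com"},
    {"email": "grace@example.com"},
]))
attendees = json.loads(stored)
```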
Observability¶
- Invocation log: new task_type=draft_meeting_note, model_alias=reasoner, standard token/cost fields (this is a paid cloud call, unlike the local embedding calls).
- Structlog events: meeting_end_detected (poller found an eligible event), vault_autowrite_skipped_idempotent (writer found a matching key), vault_autowrite_written (happy path), vault_autowrite_failed (any step raised). Slice 16 renamed the two writer-owned events from meeting_note_* to the generic vault_autowrite_* form and added a template field so Grafana breaks counts down per template.
- Grafana memory dashboard gains a "Template writes" row (writes by template, skip rate, LLM cost, failures).
Slice 16 — cadence writes, person stubs, rename reconciliation¶
Slice 16 fills in the four template writes slice 15 deferred, adds a
central People/{name}.md stub auto-creator, and replaces
delete-plus-upsert rename handling with content-hash reconciliation.
No infrastructure changes to VaultTemplateRenderer,
MemoryInformedWriter, or resolve_person_link beyond two optional
constructor kwargs on the writer (safety_allowlist,
person_stub_helper).
Cadence-driven skills¶
Four new skills, all sharing one MemoryInformedWriter instance:
- daily_reflection (src/donna/capabilities/daily_reflection_skill.py): nightly. Target Reflections/{YYYY-MM-DD}.md, idempotency key the ISO date. Context: today's meeting notes, terminal task mutations, chat highlights.
- commitment_log (src/donna/capabilities/commitment_log_skill.py): nightly. Target Commitments/{YYYY-MM-DD}.md, idempotency key the ISO date. LLM extracts explicit speech-act commitments; one file per day so idempotency is trivial and git log gives the running view.
- weekly_review (src/donna/capabilities/weekly_review_skill.py): Sunday evening. Target WeeklyReview/{iso_year}-W{iso_week:02d}.md, idempotency key the ISO week label. Also loads the prior week's review (if any) for carry-over commitments.
- person_profile (src/donna/capabilities/person_profile_skill.py + person_mention_counter.py): Sunday evening. Two triggers: mention_threshold (PersonMentionCounter sweep of memory_chunks.content LIKE '%[[Name]]%' over lookback_days) and stub_fill (weekly scan of People/*.md for notes shorter than min_body_chars). Overwrite guard: refuses to touch notes that are non-empty and lack autowritten_by: donna in frontmatter; Donna never overwrites a user-edited profile. Idempotency key {name}@{iso_week}.
All four route to the reasoner alias via new task_types
(draft_daily_reflection, extract_commitments,
draft_weekly_review, draft_person_profile) in
config/task_types.yaml + config/donna_models.yaml.
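The date-keyed target paths above can be derived with the standard library. The function names are hypothetical; the path patterns are the ones listed for daily_reflection and weekly_review. Note that the ISO year can differ from the calendar year around New Year, which is why the weekly pattern uses iso_year.

```python
from datetime import date


def daily_reflection_path(day: date) -> str:
    return f"Reflections/{day.isoformat()}.md"


def weekly_review_path(day: date) -> str:
    iso_year, iso_week, _ = day.isocalendar()
    return f"WeeklyReview/{iso_year}-W{iso_week:02d}.md"


# Sunday 2021-01-03 still belongs to ISO week 53 of 2020.
assert weekly_review_path(date(2021, 1, 3)) == "WeeklyReview/2020-W53.md"
assert daily_reflection_path(date(2024, 5, 1)) == "Reflections/2024-05-01.md"
```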
Time triggers¶
AsyncCronScheduler (src/donna/skills/crons/scheduler.py) gained
optional day_of_week: int | None (Mon=0..Sun=6) and
minute_utc: int = 0 kwargs — enough to cover daily +
sub-hour-granular weekly triggers without introducing APScheduler.
The existing positional AsyncCronScheduler(hour_utc, task)
signature is preserved for back-compat with the other cron users in
the codebase.
Person-stub auto-creation¶
donna.memory.person_stub.ensure_person_stubs scans a rendered body
for bare [[Name]] wikilinks (namespaced, aliased, and heading
variants are excluded) and writes a People/{name}.md stub when
missing. Wired into MemoryInformedWriter.run after a successful
vault_writer.write; failures never propagate (logged as
person_stub_failed). People must be in
safety.path_allowlist — the helper is a no-op otherwise.
Stubs carry type: person, name, stub: true,
autowritten_by: donna frontmatter, which the person_profile
skill later detects and rewrites with full context.
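Scanning for bare wikilinks while excluding the namespaced, aliased, and heading variants might look like this. It is a sketch, not the ensure_person_stubs implementation: bare_person_links is a hypothetical name, and treating "/", "|", and "#" as the markers of the excluded variants is an assumption based on Obsidian link syntax.

```python
import re

_WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def bare_person_links(body: str) -> list:
    """Return unique bare [[Name]] targets in order of first appearance."""
    names = []
    for match in _WIKILINK.finditer(body):
        target = match.group(1).strip()
        # Skip namespaced ([[People/Bob]]), aliased ([[Carol|C]]) and heading ([[Notes#Intro]]) links.
        if any(ch in target for ch in "/|#"):
            continue
        if target and target not in names:
            names.append(target)
    return names
```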
Rename reconciliation¶
VaultSource.watch() now buffers Change.deleted events for
sources.vault.rename_window_seconds (default 2 s) keyed by the
row's content_hash. If a matching Change.added arrives within the
window, the pending delete is cancelled and MemoryStore.rename
updates source_id in place — no chunk or embedding churn. On
miss, the delete flushes normally; on target collision, the caller
falls back to delete+upsert.
Structlog events: vault_rename_buffered, vault_rename_matched,
vault_rename_flushed_as_delete.
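The delete-buffering logic can be sketched as a small state holder keyed by content hash. RenameBuffer and its methods are hypothetical, timestamps are passed in explicitly to keep the sketch deterministic, and the target-collision fallback is left out.

```python
from typing import Optional


class RenameBuffer:
    """Hold deletes briefly so a matching add can be treated as a rename."""

    def __init__(self, window_s: float = 2.0) -> None:
        self.window_s = window_s
        self._pending = {}  # content_hash -> (old_path, deleted_at)

    def on_deleted(self, path: str, content_hash: str, now: float) -> None:
        self._pending[content_hash] = (path, now)

    def on_added(self, path: str, content_hash: str, now: float) -> Optional[str]:
        """Return the old path if this add completes a rename, else None."""
        pending = self._pending.get(content_hash)
        if pending is not None and now - pending[1] <= self.window_s:
            del self._pending[content_hash]  # cancel the pending delete
            return pending[0]
        return None

    def flush_expired(self, now: float) -> list:
        """Return paths whose window elapsed; the caller soft-deletes them."""
        expired = [h for h, (_, t) in self._pending.items() if now - t > self.window_s]
        return [self._pending.pop(h)[0] for h in expired]
```

When on_added returns an old path, the caller would update source_id in place rather than re-chunking; when flush_expired returns paths, the deletes proceed as usual.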
See slices/slice_16_autowrite_cadences_and_rename.md and
spec_v3.md §30.7 for the full scope + deferrals handed to slice 17.
See slices/slice_15_template_writes_meeting_notes.md and
spec_v3.md §1.3 / §4 / §4.3 / §7.3 / §14.