Semantic Memory (Slice 13)
The memory layer lives inside the existing donna_tasks.db file (spec_v3.md §16.1). Three tables are added:
- memory_documents: one row per ingested source (a vault note today; chat turns, tasks, and corrections land in slice 14). (user_id, source_type, source_id) is unique. Rows are soft-deleted via deleted_at so search joins can filter without pruning the ANN index on every tombstone.
- memory_chunks: one row per chunk emitted by the chunker. Carries content, token_count, and a JSON-encoded heading_path stack (e.g. ["ProjectPlan", "Design", "Schema"]) so retrieval answers can cite a note's section header, not just the file path.
- vec_memory_chunks: the sqlite-vec vec0 virtual table, declared as (chunk_id TEXT PRIMARY KEY, embedding FLOAT[384]). The extension is loaded on the shared aiosqlite connection in Database.connect(); if the extension wheel is missing, the connection still opens and vec_available flips to False.
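The table layout above can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the project's actual DDL: only the columns named in the text come from the source, while id, document_id, and the column types are assumptions. The vec0 statement is shown as a string only, because it needs the sqlite-vec extension to execute.

```python
import json
import sqlite3

# Hypothetical DDL. Columns not named in the text (id, document_id, types)
# are assumptions made for illustration.
SCHEMA = """
CREATE TABLE memory_documents (
    id          TEXT PRIMARY KEY,
    user_id     TEXT NOT NULL,
    source_type TEXT NOT NULL,
    source_id   TEXT NOT NULL,
    updated_at  TEXT,
    deleted_at  TEXT,               -- NULL means live, a value means tombstoned
    UNIQUE (user_id, source_type, source_id)
);
CREATE TABLE memory_chunks (
    chunk_id     TEXT PRIMARY KEY,
    document_id  TEXT NOT NULL REFERENCES memory_documents(id),
    content      TEXT NOT NULL,
    token_count  INTEGER NOT NULL,
    heading_path TEXT NOT NULL      -- JSON-encoded list of headings
);
"""

# Requires the sqlite-vec extension, so it is not executed in this sketch.
VEC_DDL = (
    "CREATE VIRTUAL TABLE vec_memory_chunks "
    "USING vec0(chunk_id TEXT PRIMARY KEY, embedding FLOAT[384])"
)


def live_chunks(conn, user_id):
    """Return (chunk_id, heading_path) for chunks whose document is not soft-deleted."""
    rows = conn.execute(
        """
        SELECT c.chunk_id, c.heading_path
        FROM memory_chunks AS c
        JOIN memory_documents AS d ON d.id = c.document_id
        WHERE d.user_id = ? AND d.deleted_at IS NULL
        ORDER BY c.chunk_id
        """,
        (user_id,),
    ).fetchall()
    return [(chunk_id, json.loads(path)) for chunk_id, path in rows]
```

Filtering on deleted_at in the join is what lets a tombstoned document disappear from results while its vectors stay in the index.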
Ingestion path
- VaultSource.watch(): a watchfiles.awatch loop (500 ms coalesce) fires vault_watch_event for every .md change under the vault root, honoring sources.vault.ignore_globs plus the vault-wide vault.ignore_globs. Deletes translate to soft-delete; adds / modifies route to _ingest_path.
- VaultSource.backfill(user_id): walks the vault via VaultClient.list(recursive=True) on boot, compares each file's mtime against the stored memory_documents.updated_at, and enqueues anything newer on disk. A typical 20-note vault backfills in well under 30 s.
- _ingest_path builds a Document carrying user_id, source_type="vault", the relative path as source_id, the frontmatter title (or filename stem), the vault:<rel> URI, and the note body. donna: local-only (or donna_sensitive: true) in frontmatter flips sensitive=True, which propagates to every RetrievedChunk.metadata["sensitive"] for downstream prompt-building decisions.
- MemoryIngestQueue.run_forever() drains up to 16 docs per 500 ms window into a single MemoryStore.upsert_many call, so embed_batch fires once per flush, amortising the SentenceTransformer warm-up over the batch.
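The batching behaviour of the ingest queue can be sketched with asyncio alone. This is a minimal sketch under stated assumptions: drain_batch is a hypothetical helper, not the project's code, and the real MemoryIngestQueue may structure its window differently. Only the 16-document cap and the 500 ms window come from the text.

```python
import asyncio

MAX_BATCH = 16   # docs per flush, from the text
WINDOW_S = 0.5   # 500 ms window, from the text


async def drain_batch(queue, max_batch=MAX_BATCH, window_s=WINDOW_S):
    """Wait for one doc, then collect more until the window closes or the batch is full."""
    batch = [await queue.get()]
    loop = asyncio.get_running_loop()
    deadline = loop.time() + window_s
    while len(batch) < max_batch:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            batch.append(await asyncio.wait_for(queue.get(), timeout=remaining))
        except asyncio.TimeoutError:
            break  # window closed with a partial batch
    return batch
```

A run_forever-style loop would call drain_batch repeatedly and hand each returned list to one upsert_many call, so the embedder sees one batch per flush rather than one call per document.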
Re-ingest short-circuit
MemoryStore.upsert(doc) hashes doc.content to content_hash. If the existing row matches, we bump updated_at, clear deleted_at, refresh title / metadata / sensitive, and return without re-embedding. The invocation_log row count is the dedup signal: unchanged notes do not add rows for task_type=embed_vault_chunk.
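The short-circuit can be illustrated with an in-memory stand-in. This is a sketch only: TinyStore is hypothetical, the hash algorithm (SHA-256 here) is an assumption because the text does not name one, and the real MemoryStore persists to SQLite and also refreshes metadata and sensitive.

```python
import hashlib


def content_hash(text):
    # SHA-256 is an assumption; the text does not say which hash is used.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class TinyStore:
    """Hypothetical in-memory stand-in for the upsert short-circuit."""

    def __init__(self, embed):
        self.rows = {}
        self.embed = embed  # callable that turns text into a vector

    def upsert(self, key, content, title, now):
        """Return True if the document was (re-)embedded, False if skipped."""
        digest = content_hash(content)
        row = self.rows.get(key)
        if row is not None and row["content_hash"] == digest:
            # Unchanged content: refresh bookkeeping, skip the embedder.
            row.update(updated_at=now, deleted_at=None, title=title)
            return False
        self.rows[key] = {
            "content_hash": digest,
            "title": title,
            "updated_at": now,
            "deleted_at": None,
            "embedding": self.embed(content),
        }
        return True
```

Counting embedder calls in a test plays the same role as counting invocation_log rows: a second upsert of identical content should add none.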
Retrieval
MemoryStore.search(query, user_id, k, sources, filters) embeds the query (one invocation with task_type=embed_memory_query) and runs a single three-table join: vec_memory_chunks (ANN window of k*4), memory_chunks (content + heading path), memory_documents (provenance, sensitivity, soft-delete filter). Scores rely on MiniLM's unit-normalised outputs: sqlite-vec's vec0 returns L2 distance, and for unit vectors distance² = 2 - 2·cos, so score = 1 - distance² / 2 is the cosine similarity. Results below retrieval.min_score are dropped; k is clamped to retrieval.max_k. A structlog memory_retrieval event records k, hits, sources, and latency_ms per call.
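The scoring step can be checked numerically. The sketch below converts an L2 distance between unit vectors into the score used above, then applies the min_score cut and the max_k clamp. The function names and the (chunk_id, distance) input shape are assumptions for illustration, not the store's actual interface.

```python
import math


def l2_to_score(distance):
    """For unit vectors, d^2 = 2 - 2*cos, so 1 - d^2/2 equals the cosine similarity."""
    return 1.0 - (distance * distance) / 2.0


def rank(hits, k, min_score, max_k):
    """hits is a list of (chunk_id, l2_distance) pairs from the ANN window."""
    k = min(k, max_k)  # clamp the requested k
    scored = [(chunk_id, l2_to_score(d)) for chunk_id, d in hits]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]
```

The identity only holds when both vectors have unit length, which is why the formula depends on the provider normalising its outputs.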
Embedding contract
The default provider is MiniLMProvider (384-dim, 256-token window, BERT WordPiece tokenizer). Every embed / embed_batch call emits one invocation_log row per input text (model_alias="minilm-l6-v2", tokens_in=0, cost_usd=0.0), so the Grafana Memory Vault dashboard (docker/grafana/dashboards/memory.json) tracks embed volume alongside the normal LLM cost panels. Swapping to another provider (for example bge-small-en-v1.5 or a cloud embedding) takes a config change in embedding.provider plus a new branch in the build_embedding_provider factory.
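The logging contract and the factory branch might look roughly like the sketch below. Everything here is a stand-in: StubProvider returns zero vectors instead of running a model, the factory keys are assumed to match the model aliases (the real embedding.provider values are not given in the text), and the log is a plain list rather than the invocation_log table.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubProvider:
    """Hypothetical provider that records one log row per input text."""

    model_alias: str
    dim: int
    invocation_log: List[Dict] = field(default_factory=list)

    def embed_batch(self, texts, task_type):
        for _ in texts:
            self.invocation_log.append(
                {
                    "task_type": task_type,
                    "model_alias": self.model_alias,
                    "tokens_in": 0,
                    "cost_usd": 0.0,
                }
            )
        return [[0.0] * self.dim for _ in texts]  # placeholder vectors


def build_embedding_provider(name):
    # Provider keys are assumptions; each new provider adds one branch.
    if name == "minilm-l6-v2":
        return StubProvider("minilm-l6-v2", 384)
    if name == "bge-small-en-v1.5":
        return StubProvider("bge-small-en-v1.5", 384)
    raise ValueError(f"unknown embedding provider: {name!r}")
```

Raising on an unknown name keeps a typo in the config from silently falling back to the default model.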
Token counting uses tiktoken cl100k_base when the encoding file is available and falls back to a deterministic word+punct heuristic when it isn't (offline CI). The fallback is within ~10% of WordPiece on English prose and typically over-counts, so we err on smaller chunks rather than silent truncation inside the encoder.
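The fallback pattern can be sketched as follows. The exact heuristic is not given in the text, so the regex below is a stand-in that counts word runs and single punctuation marks; it makes no claim about the ~10% figure or the over-counting behaviour of the real fallback. tiktoken.get_encoding is the real library call; it is wrapped in a broad except because it can fail both when the package is absent and when the encoding file cannot be fetched.

```python
import re

# Stand-in heuristic: one token per word run, one per punctuation mark.
_WORD_OR_PUNCT = re.compile(r"\w+|[^\w\s]")


def heuristic_token_count(text):
    """Deterministic count that needs no external files."""
    return len(_WORD_OR_PUNCT.findall(text))


def count_tokens(text):
    """Use tiktoken cl100k_base when it loads, else the heuristic."""
    try:
        import tiktoken

        encoding = tiktoken.get_encoding("cl100k_base")
    except Exception:
        # Missing package or unavailable encoding file (e.g. offline CI).
        return heuristic_token_count(text)
    return len(encoding.encode(text))
```

Keeping the heuristic as its own function makes it easy to test directly, independent of whether tiktoken happens to be installed.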
Config
config/memory.yaml carries the tunables (embedding.{provider,version_tag,dim,max_tokens,chunk_overlap}, retrieval.{default_k,min_score,max_k}, sources.vault.{enabled,chunker,ignore_globs}). Pydantic aliases keep the slice-12 field names parseable so old configs still boot.
Fixtures
tests/fixtures/vault/ carries ~18 sample notes spanning the allowlisted folders plus deliberate Templates/** + .obsidian/** entries that exercise ignore_globs. Inbox/sensitive-credentials.md carries donna: local-only so the sensitivity-propagation tests have real content to bite on.
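A test over these fixtures might check the ignore rules along the lines below. This is a sketch that assumes fnmatch-style matching, where * also matches path separators; the project's actual glob semantics are not stated, and the file names under Templates/ and .obsidian/ are hypothetical.

```python
from fnmatch import fnmatchcase

# The two patterns named in the text.
IGNORE_GLOBS = ("Templates/**", ".obsidian/**")


def is_ignored(rel_path, globs=IGNORE_GLOBS):
    """True if the vault-relative path matches any ignore pattern."""
    return any(fnmatchcase(rel_path, pattern) for pattern in globs)


# Hypothetical fixture paths, except the Inbox note named in the text.
candidates = [
    "Templates/daily.md",
    ".obsidian/app.json",
    "Inbox/sensitive-credentials.md",
]
ingestable = [path for path in candidates if not is_ignored(path)]
```

fnmatchcase is used instead of fnmatch so the result does not depend on the platform's case rules.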