The Architect: Building a Weekly Review Agent on Top of an Obsidian Vault
A structured weekly review agent with MCP read-write access to an Obsidian vault. From capability to application, and why the specificity of the workflow turned out to be the point.
How a weekly review agent inside a personal dashboard changed the relationship between my notes and my decisions.
I run my life out of an Obsidian vault: projects, weekly reviews, decisions, health logs, all in markdown. Around it I built a local stack: a Hono.js REST API that exposes the vault as structured data, a React dashboard for daily orientation, and an MCP server that gives an AI agent read-write access to everything in it. The first article covered the full system: thirty tools, bidirectional sync with Apple Reminders, a force-directed graph of every note and every link.
That article ended with a direction: connect an AI model to the vault for analysis, automated document generation, and weekly planning. What I built is more specific than that, and the specificity turned out to be the point.
MCP gave the agent access to my vault. That was a capability. What I built on top of it is an application of that capability, and the difference between “agent that knows your notes” and “agent that conducts a review with you” turned out to be the entire question.
I built the Architect.
What the Architect is
The Architect is a tab in my dashboard, the same React SPA that handles the rest of my life. It looks like a chat interface, but it isn’t one. It’s a structured weekly review agent that already knows everything about my week before the conversation starts.
When I open the Architect tab and press “Full Review,” this is what happens on the backend: the server reads the previous week’s markdown file from Areas/Home/Weeks/, pulls the last two weeks of daily notes (the most recent seven of which actually reach the prompt), pulls the most recent letters from my personal project, reads career roadmap progress, learning dashboard completion rates, and health entries. It loads domain-specific AGENT.md instruction files from four vault areas, each one containing the rules, context, and constraints for that part of my life. All of this is assembled into a single system prompt and sent to the AI model along with one instruction: analyze.
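Concretely, the assembly step reduces to collecting those sources and concatenating them into one prompt. A minimal sketch; the shape and names below are mine, not the production code:

```ts
// Sketch of the context assembly. All field names are illustrative,
// inferred from the description above rather than copied from the codebase.
interface ReviewData {
  previousWeek: string;                   // full markdown of last week's file
  dailyNotes: string[];                   // last two weeks pulled from the vault
  letters: string;                        // most recent letters
  career: string;                         // roadmap progress bullets
  learning: string;                       // completion stats
  health: string;                         // health entries
  domainContext: Record<string, string>;  // the four AGENT.md extracts
}

function buildReviewPrompt(data: ReviewData): string {
  return [
    '## Previous week', data.previousWeek,
    '## Daily notes', ...data.dailyNotes.slice(-7),  // only the most recent seven reach the prompt
    '## Letters', data.letters,
    '## Career progress', data.career,
    '## Learning stats', data.learning,
    '## Health entries', data.health,
    '## Domain rules', ...Object.values(data.domainContext),
  ].join('\n\n');
}
```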
The response is not a summary. It’s an analysis with specific questions, one or two per life domain, targeted at what the data doesn’t show. “Your capacity readings dropped Thursday and Friday but there are no overload entries. Was that cumulative fatigue or something specific?” “Career roadmap shows three tasks completed in LLM/RAG but the study log has no entries this week, were these done outside the vault?” The Architect doesn’t ask what happened. It already knows what happened. It asks about the gaps.
Why a conversation, not a generated document
The first version I built leaned toward the document-generation side of that direction: collect the data, run it through the model, produce a review I’d then edit. It worked. It was static.
The most valuable part of a weekly review is not what’s in the data; it’s what’s not in the data. Why Wednesday’s capacity was high even though the calendar was packed. Why no learning sessions happened despite being on the minimum checklist. Whether a pattern I documented three weeks ago repeated or broke.
A generated document can surface metrics and summaries. A conversation can ask follow-up questions. The Architect is designed around a specific loop: it reads everything, forms an initial analysis, asks targeted questions, and uses the answers to inform the next week’s plan. The loop requires my input at the point where data meets interpretation.
What it actually replaces
It’s worth being honest about what the Architect isn’t. It doesn’t tell me anything I couldn’t have noticed myself if I’d had two hours and the right prompts. It doesn’t replace the thinking. What it replaces is the orientation cost: the part of review that isn’t thinking, just getting to the point where thinking is possible. For a brain that bottlenecks hard on unstructured starts, that’s the entire bottleneck.
This is the thesis of the whole thing, and I want to put it up front rather than bury it. A weekly review needs you to read seven daily notes, scan four projects, remember what you flagged last week, decide what’s worth carrying forward, and generate three to five focus areas for next week. Each step is small. The wall is the sequence: knowing where to start, what’s worth attending to, what to skip. The Architect handles the orienting layer so I can spend my actual cognitive effort on the parts that need it: interpreting patterns, making decisions, choosing what matters next.
The three-phase flow
The Architect operates in three phases within a single session.
Phase one is review. The system prompt contains the previous week’s full markdown, the last seven daily notes, career progress bullets, learning completion stats, health entries, and the content of every domain’s AGENT.md. The Architect reads all of this, produces observations for each life domain (home, work, career, learning, health), and asks clarifying questions. The conversation continues until all domains are covered.
Phase two is planning. Once I’ve answered the review questions, I press “Plan next week.” The Architect takes the entire conversation history, my answers included, and generates a structured JSON plan: week theme, intention, focus areas split into minimum and optimum tiers, and nervous system support notes. Daily word-task-quote entries are filled in separately by the letters-generation step against a markdown template. The plan is displayed as a preview before anything is saved.
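The plan’s rough shape, as a sketch; the field names below are inferred from the description above, not copied from the codebase:

```ts
// Approximate shape of the generated plan; names are illustrative.
interface WeekPlan {
  theme: string;                  // week theme
  intention: string;
  focusAreas: {
    minimum: string[];            // the must-happen tier
    optimum: string[];            // the if-capacity-allows tier
  };
  nervousSystemSupport: string[];
  // Daily word-task-quote entries are deliberately absent: they're filled
  // in later by the letters-generation step against a markdown template.
}
```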
Phase three is writing. One button saves everything to the vault: the review section of last week’s file gets filled in with what-helped, too-much, pattern-noticed, and next-week items. A new week file is created with the full plan, tracker table, and daily entries. Seven daily letters are generated and saved to Areas/Home/Letters/. If the week file already exists, because I started filling it in manually, the system merges at the section level: each section in the generated content is checked against the corresponding section in the existing file, and only sections that are empty, contain only a single dash (-), or match a placeholder pattern like *(fill in during the week)* get overwritten. Anything I’ve actually written stays.
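The overwrite rule amounts to a small predicate per section, something like this sketch (which assumes the placeholder text appears literally as quoted above):

```ts
// A section is safe to overwrite only if nothing real was ever written into it.
function isOverwritable(sectionBody: string): boolean {
  const body = sectionBody.trim();
  if (body === '') return true;                  // empty section
  if (body === '-') return true;                 // single-dash placeholder
  if (/^\*\(fill in during the week\)\*$/.test(body)) return true;  // template placeholder
  return false;                                  // anything actually written stays
}
```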
The entire flow, from pressing “Full Review” to having next week’s files in Obsidian, takes about fifteen minutes. Ten of those minutes are me thinking about the answers.
What this looks like in practice
A real exchange from a Phase 1 review, lightly anonymized:
Architect: Home, capacity stable Mon-Wed (4, 4, 3), then dropped Thursday and Friday (2, 2). No overload entries logged for either day.
Career, roadmap shows three tasks completed in the LLM/RAG track, but the study log has no entries this week. Two questions before I plan next week:
- Thursday-Friday capacity drop, cumulative fatigue, or something specific?
- Were the LLM/RAG tasks done outside the vault?
You: Cumulative, three social events Tue and Wed evening. LLM/RAG tasks done in the terminal, not logged.
Architect (suggested edit): Areas/Home/Weeks/Week-03.md
- Overload: Thu-Fri, social cumulative (Tue + Wed evening events) [Apply] [Reject]
Architect (suggested edit): Areas/Career/Roadmap.md
- LLM/RAG: Module 3 (in progress) → (completed)
- LLM/RAG: Module 4 (started)
- Notes: terminal session, not vault-logged [Apply] [Reject]
The pattern visible in that exchange is the loop in miniature: the Architect surfaces a gap (capacity drop without overload entries), I supply the interpretation it can’t infer (cumulative social load), and it produces concrete, applicable edits to the underlying files. I press Apply, the markdown changes, and the next time the dashboard loads, those changes are visible everywhere else they appear: the Home tab, the Career tab, the graph view.
Domain context: what makes it not generic
The most technically interesting part of the Architect is how it builds its context. It’s not a general-purpose chatbot with access to my notes. It’s a domain-aware agent that loads different instruction sets depending on what it’s reviewing.
Each major vault area has its own AGENT.md file, a set of rules for how that area should be handled. The career AGENT.md describes the roadmap structure, certification priorities, and the dependency chain between career phases. The learning AGENT.md knows the English course has four phases and fifty lessons and that progress is tracked per-lesson. The family AGENT.md contains the “do not interpret, do not advise” rule that applies to personal notes. The Home AGENT.md explains the project’s thematic layers and the day-counting system.
When the Architect runs a weekly review, it loads all four domain contexts. When I switch to a single-domain chat (the Career tab, say, or the Learning tab), it loads only the relevant context. This means the Architect’s behavior adapts to the domain. In the career review, it tracks metrics and suggests specific next actions. In the home domain, it’s careful, quiet, and doesn’t offer unsolicited advice. Same model, same temperature, different instructions.
```js
import { readFile } from 'node:fs/promises';
import { join } from 'node:path';

// VAULT, stripFrontmatter, and result are defined in the surrounding module.
// One AGENT.md per major vault area, each holding that domain's rules.
const agentFiles = {
  home: 'Areas/Home/AGENT.md',
  career: 'Areas/Career/AGENT.md',
  learning: 'Areas/Learning/AGENT.md',
  family: 'Areas/Family/AGENT.md',
};

// Load each domain's rules, strip the YAML frontmatter, cap the context size.
for (const [key, relPath] of Object.entries(agentFiles)) {
  const file = await readFile(join(VAULT, relPath), 'utf-8');
  const raw = stripFrontmatter(file);
  result.domainContext[key] = raw.trim().slice(0, 1500);
}
```
The snippet is simplified for readability; the production version wraps each read in try/catch with an empty-string fallback, since a missing or unreadable AGENT.md shouldn’t crash the whole review.
Each context is capped at 1500 characters: enough for the rules and structure of an area, small enough that all four contexts plus the weekly data fit comfortably under the 8K-token budget for the system prompt. The 1500 figure was an ad hoc choice rather than the result of a tuning sweep; I haven’t gone back to compare it against 800 or 2500 because the reviews have been working. What gets cut at the boundary is detailed examples and historical notes, anything that’s reference rather than rule. The total system prompt for a full review runs about 8000 tokens. That is the practical ceiling of this single-shot context-window approach: no RAG, no vector search, no retrieval pipeline. Just structured markdown, loaded in full.
The AGENT.md files are the load-bearing decision in this whole architecture. Everything else, the conversation flow, the file edits, the merge logic, depends on those files being good. Early versions of mine were thin, and the Architect’s reviews were correspondingly generic. Once I rewrote them to actually describe what each area is for, what success looks like in that area, what I’d want flagged, the quality of review changed sharply. If I built this again I’d start there and treat the UI as secondary.
File edits: the agent writes back
The Architect doesn’t just generate plans. It can suggest edits to existing vault files, and the user decides whether to apply them.
During any conversation, the Architect can produce a structured <file_edit> block:
```xml
<file_edit path="Areas/Home/Weeks/Week-03.md"
           description="Add focus item to week plan">
<old>## 3. Focus areas</old>
<new>## 3. Focus areas
**1. Complete LLM/RAG module 3**</new>
</file_edit>
```
The <old> block in this example is narrow for brevity. In practice the Architect is prompted to include enough surrounding context in <old> to uniquely identify the edit location, and the backend enforces the guarantee: it rejects with a clear error if <old> matches in multiple places, and treats the “already applied” case as a no-op rather than a silent duplicate.
The frontend parses these blocks from the response, renders them as a “Suggested edits” panel with path, description, and an expandable diff, and provides an “Apply” button for each one. The backend receives the edit, finds the exact old text in the file, replaces it with the new text, and writes atomically: temp file, then rename. If the old text doesn’t match (the file has changed since the Architect read it), the edit fails safely.
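Put together, the apply path is short. A sketch under stated assumptions (the VAULT_PATH variable name and the error wording are mine), with the uniqueness check and the temp-file-then-rename write:

```ts
import { readFile, writeFile, rename } from 'node:fs/promises';
import { join } from 'node:path';

const VAULT = process.env.VAULT_PATH ?? '';    // vault root; name assumed

async function applyEdit(relPath: string, oldText: string, newText: string) {
  const path = join(VAULT, relPath);
  const content = await readFile(path, 'utf-8');

  // Count non-overlapping occurrences of the old text.
  const matches = content.split(oldText).length - 1;
  if (matches === 0) {
    if (content.includes(newText)) return;     // already applied: treat as a no-op
    throw new Error(`old text not found in ${relPath}; file changed since read`);
  }
  if (matches > 1) {
    throw new Error(`old text is ambiguous in ${relPath}; refusing to guess`);
  }

  const updated = content.replace(oldText, newText);
  const tmp = `${path}.tmp`;
  await writeFile(tmp, updated, 'utf-8');      // write the temp file first...
  await rename(tmp, path);                     // ...then rename over the original
}
```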
This is the feature that most clearly separates the Architect from a chatbot. A chatbot answers questions. The Architect can propose changes to the system it’s reviewing, and those changes are applied to the same markdown files that every other part of the dashboard reads from. When the Architect fills in a week review, that review shows up in the Home tab. When it creates a week file, the tracker table appears in the Main tab. The loop is closed: the data the Architect reads is the data it writes to, through the same atomic write path that every other API route uses.
Letters: the quiet part
The Architect generates seven daily letters as part of each week plan. These are short: a title and three to four sentences. They’re written in the voice established by the project’s current thematic layer, which is defined in the Home AGENT.md and loaded as part of every Home-domain context. The letters are also calibrated to the week’s theme and focus areas from the conversation that produced the plan.
The letters serve a function that’s hard to explain without the broader project context. “Home” is the foundational project, the base layer that everything else rests on: career decisions, family, daily stability, future. Each day has a number counting up. Each week has a theme, a set of focus items, and a daily ritual structure. The letters are the part of that structure that isn’t about tasks; they’re about continuity. A sentence at the start of the day that connects to the week’s intention.
The Architect checks which day numbers already have letters before generating, and skips existing ones. The agent augments what exists rather than replacing it.
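A sketch of that check, assuming a Day-&lt;n&gt; filename convention (the real naming scheme may differ):

```ts
import { readdir } from 'node:fs/promises';
import { join } from 'node:path';

// Return only the day numbers that don't already have a letter file.
// The trailing dot prevents `Day-1.` from matching `Day-12.md`.
async function daysNeedingLetters(vault: string, days: number[]): Promise<number[]> {
  const existing = await readdir(join(vault, 'Areas/Home/Letters'));
  return days.filter(
    (day) => !existing.some((name) => name.startsWith(`Day-${day}.`))
  );
}
```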
What changed
The first article described a system where an AI agent had read-write access to my vault through thirty MCP tools. That was a capability. The Architect is an application of that capability, a specific workflow built for a specific purpose, with domain context, structured phases, and file persistence.
The difference in practice is that I actually use it. The MCP tools existed for weeks before the Architect, and I used them occasionally, asking the agent to find patterns, surface connections, toggle a few checkboxes. Useful, but ad hoc. The Architect turned a weekly review from something I should do into something I do, because the friction went from “sit down, read through seven daily notes, review four projects, write a new week file, create seven letter files” to “press Full Review, answer five questions, press Save.”
The data quality improved too. When a review is generated from actual conversation rather than template-filling, the “what helped” and “pattern noticed” sections contain things I wouldn’t have written on my own. The Architect noticed that my lowest-energy weeks follow social overload events, not workload spikes. I knew that. I had the data to prove it. I’d never written it down in a review.
Stack and architecture
For the technically curious: the Architect is about 1000 lines of React (one tab component with five UI sub-components) and 350 lines of backend (vault reading, session persistence, merge logic). Sessions are persisted in two formats side by side: a JSON file the app reads to resume state, and a readable markdown file with frontmatter that lives in the vault and shows up in Obsidian’s graph. Both are keyed by project week number, counted from Day 1 of the project rather than the ISO calendar, so opening the Architect in week 4 reloads the week 4 session if one exists and otherwise starts fresh. No database, no separate app-state directory; the vault is the single source of truth for both content and session state. The AI calls go through a single /api/oracle/chat endpoint that proxies to the model API; the frontend never talks to the AI directly. Secrets are managed through 1Password CLI injection at startup.
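The week keying itself is one small function. A sketch with a hypothetical start date:

```ts
// Weeks counted from Day 1 of the project, not the ISO calendar.
// The start date here is hypothetical, for illustration only.
const PROJECT_DAY_ONE = new Date('2025-01-06T00:00:00');
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function projectWeek(date: Date = new Date()): number {
  const dayNumber =
    Math.floor((date.getTime() - PROJECT_DAY_ONE.getTime()) / MS_PER_DAY) + 1;
  return Math.ceil(dayNumber / 7);  // days 1-7 are week 1, days 8-14 are week 2, ...
}
```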
The frontend was recently migrated from a monolithic 1600-line index.html to a proper esbuild pipeline with modular components. The Architect tab was one of the first components built in the new architecture: a self-contained module with its own state management, its own API hooks, and its own sub-routing (review, planning, domain chats, free chat).
The whole system runs locally. No cloud database. No user accounts. No deployment pipeline. The dashboard reads from the vault and writes to it. The Architect sits on top of both.
What’s next for the Architect
The immediate next step is automated scheduling: a weekly trigger that collects stats, generates preliminary observations, and prepares a review session for when I open the tab. Not a full batch review, but a warm start: “Here’s what I noticed this week. Ready to review?”
The distinction matters. A warm start is context preparation: the agent reads the data and forms initial observations before I sit down, so the session opens already oriented. A full batch review would be the agent running the entire review on a schedule and posting the result to a daily note. I’m deliberately not building that. The point of the Architect was never to generate review artifacts; it was to lower the activation cost of doing the review myself. Full automation would re-introduce the problem the tool was built to solve.
After the warm start: cross-week pattern detection. The Architect currently sees one week of data plus seven days of daily notes. It doesn’t yet have access to the full history of past reviews. Connecting it to the archive of completed week files and past Architect conversations would let it surface longer-term patterns: “This is the third week in a row where learning sessions dropped. Last time this happened was February, and it resolved after you adjusted the daily minimum.”
The longer-term direction is for the Architect to become the primary interface for project-level decisions. Not replacing the dashboard tabs, those are for quick reads and daily operations, but handling the weekly and monthly cadence of reflection, planning, and adjustment.
That’s where I think personal tooling with LLMs is actually useful. Not “chat with your notes.” Collaborative maintenance of the system you’re already trying to run.
Stack: Hono.js, React, esbuild, 1Password CLI, Obsidian
Previous: From Scattered Notes to Living System: Obsidian, Claude, and a Personal OS