PKM · LLM · Obsidian

Identity API: Writing Your Operating Manual

A typed identity layer for AI agents. Schemas, evidence-backed claims, and bidirectional links across 173 vault files turn freeform user context into a structured API for LLMs.

Also on Medium.

A personal user manual readable by me, by whichever LLM is configured, and by other people.

For many years, one of my long-running interests has been the study of consciousness. I understand perfectly well this is something like chasing a chimera. The definitions slip, the methods don’t agree, the question barely holds still long enough to be asked. But the arrival of LLMs added an unexpected layer to the chase: a new object to study, and through that object, a new way to look at myself.

This article isn’t really about writing a manual for myself. It’s about the logical continuation of an Obsidian dashboard that already runs my projects and my life, extending it toward two harder things: who I am inside the system, and what AI is inside it. And how these layers operate together.

Before the technical body of this article, one thing out loud. I treat this work as an experiment, exactly that, weighed in advance against what I think it can and can’t do, and run with reasonable knowledge of where it might fail. It runs as three experiments at once. One is technical: a chance to test new tooling as an engineer. One is structural: a chance to look at myself the way I look at the rest of my life, as a system I can describe well enough to operate. One is study: a chance to learn how LLMs behave under different kinds of context, from as many angles as I can find. None of the three reduce to the others. The work is interesting precisely because all three run in parallel.

The article describes the third architectural layer of a Personal OS I’ve been building over the past month: an operational vault that runs my dashboard and weekly reviews; a historical corpus (Diary-Corpus/) of digital notes and twenty-plus years of OCR’d handwritten diaries, indexed for retrieval and used for fine-tuning; and now Self-Model/, a typed knowledge graph of how I operate. Prior posts cover the earlier layers. This one focuses on Self-Model.

A genre this isn’t

At work a week ago, I had the idea to write a short manual about myself, eight points on how to structure collaboration with me: direct feedback is fine; ask why before assuming; default to public channels for anything searchable, DMs for personal feedback; open a ticket for any request larger than five minutes; writing usually beats calls; flag urgency explicitly because async is the default. It took an afternoon. Once it existed, I noticed something similar belonged in the Personal OS project I’d been building separately.

A little searching showed the genre has been around for a while. Cassie Robinson’s “A User Manual for Me” (2017) was the foundational template. Michael Lopp’s “How to Rands” became the most-cited engineering-manager exemplar. Claire Hughes Johnson’s “Working with Claire” circulated widely from inside Stripe. By 2020 there were enough variations to fill a curated list. A separate, newer genre formed more recently around personal context for LLMs: Naveen Selvadurai’s “personal API” (2013, mostly metrics), NLW’s Personal Context Portfolio (2025, ten markdown files served via MCP), parrik/know-thyself (a typed provenance graph), Buster Benson’s fourteen-year Codex Vitae.

What I didn’t find is a single artifact that combines a typed schema, evidence-as-first-class, the operating-instructions register of the manager READMEs, and a model-agnostic AI consumer. Self-Model is an attempt at that combination.

Soul came first

For the first version I wrote prose. Nine markdown files inside the main vault, in a folder called Soul/: 5,988 lines, roughly 589 KB. soul-tanya.md (948 lines) is a long first-person description of how I work. soul-agent.md (811 lines) is the symmetric document, what I want from the assistant, what register I expect. partnership.md, shared-vocabulary.md, open-questions.md, key-facts-for-handover.md, and a few smaller files fill out the rest.

These files worked. Loaded as initial context, they gave the assistant a grounded sense of who it was talking to. The prose was honest and dense.

It stopped scaling for a specific reason: prose has no surface for an LLM to ask how confident is this, when was it last verified, what evidence supports it, where does it not apply. Those questions matter when the assistant has to act on a claim, not just refer to it. A 948-line document is a single block of trust. A schema with frontmatter, sources, and a verification date is something an AI can navigate at the granularity of a single trait. Self-Model emerged four days after Soul stabilized. Not as a replacement, as an extraction.

The schema in five lines

Self-Model/ has 173 files. 169 carry full frontmatter; the four that don’t are service files. 152 carry full provenance: type, status, domain, confidence, sources, last-verified, tags. There are 15 distinct type values, from evidence-quote (55 files) and stance (19) down to single-instance types like cross-reference. Confidence is high for 127 files, medium for 23, and low only for two templates. 17 files carry last-verified: null; they’re staged, awaiting human verification, with the absence of a date acting as an explicit honesty marker rather than a bug.
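Those counts are cheap to regenerate. A minimal sketch of the kind of script that produces them, assuming flat `key: value` frontmatter like the examples below (the function names and the YAML-free parser are mine, not the vault's actual tooling):

```python
# Sketch: tally `type` and `confidence` fields across a Self-Model vault.
# Field names (type, confidence, last-verified) mirror the schema described
# in the text; the parser handles only flat "key: value" lines, not full YAML.
from pathlib import Path
from collections import Counter

def parse_frontmatter(text: str) -> dict:
    """Extract flat key: value pairs from a leading --- frontmatter block."""
    if not text.startswith("---"):
        return {}
    try:
        _, block, _ = text.split("---", 2)
    except ValueError:
        return {}
    fields = {}
    for line in block.splitlines():
        # Skip indented continuation lines and list items (e.g. sources: entries).
        if ":" in line and not line.startswith((" ", "\t", "-")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip() or None
    return fields

def vault_stats(root: Path) -> tuple[Counter, Counter]:
    """Count type and confidence values across every markdown file under root."""
    types, confidence = Counter(), Counter()
    for md in root.rglob("*.md"):
        fm = parse_frontmatter(md.read_text(encoding="utf-8"))
        if fm:
            types[fm.get("type")] += 1
            confidence[fm.get("confidence")] += 1
    return types, confidence
```

Running `vault_stats(Path("Self-Model"))` against the vault is what yields distributions like 55 evidence-quote files or 127 high-confidence claims.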

```
Self-Model/
├── core/         27 files in identity/, cognition/, values/, biology/
├── patterns/     47 files in triggers/, recovery/, responses/, communication/
├── stances/      19 files in self-care/, existential/, work/, relationships/
├── voice/         4 files: lexicon, syntax, tone-shifts, examples/
└── evidence/     55 files, flat
```

That much you could infer from any well-organized vault. The interesting part is one level deeper.

The user manual is in the schema

When I inspected section headings across the vault, the pattern was unambiguous. Every trait, pattern, and stance is treated as operating documentation.

In core/: 14 files have an H2 section titled ## What the Owner Needs. Seven have paired sections ## Where She Doesn't Compromise and ## Where There Is Flexibility.

In patterns/: 15 files have paired H3 sections ### Do and ### Don't. Ten have ### What a Close Person Can Do. Ten have ### What She Does Herself. Two-sided instructions for the owner and for collaborators, on the same page.

In stances/: every one of the 19 files has ## What the Owner Needs and ## What She Rejects. The structure of a stance is: position, what it means in practice, what it rejects, what the owner needs, the boundary.

This isn’t a writing convention added on top of the schema. The headings are part of the document type: when I create a new pattern file, the template ships with ### Do and ### Don't already in place. The user-manual register is a structural constraint enforced across 173 files. The schema is saying: every fact about how this human operates must be expressed in a form a stranger could act on.

The third-person register in the section titles is also deliberate. The schema documents the owner; it isn’t written by her in the moment. That distance is what keeps the artifact a specification rather than a journal entry.
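A structural constraint like this can be expressed as a lint rule. A hedged sketch, where the required headings per type are taken from the counts above but the type keys (`stance`, `pattern`) and the idea of hard-failing every file of a type are my assumptions, not the vault's actual validator:

```python
# Sketch: enforce the user-manual register per document type.
# REQUIRED_HEADINGS maps assumed type values to the H2/H3 sections the
# article says every file of that type carries.
REQUIRED_HEADINGS = {
    "stance": ["## What the Owner Needs", "## What She Rejects"],
    "pattern": ["### Do", "### Don't"],
}

def missing_headings(doc_type: str, body: str) -> list[str]:
    """Return required headings absent from a file body, given its type."""
    lines = {line.strip() for line in body.splitlines()}
    return [h for h in REQUIRED_HEADINGS.get(doc_type, []) if h not in lines]
```

A template that ships with the headings pre-filled makes `missing_headings` return an empty list for new files by construction; the check only catches later deletions.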

One full document

Take roughly the same content I wrote out as the work manual, the practical rules for how I prefer collaboration to flow, and express it as a single schema-bound file. patterns/communication/async-default.md. Frontmatter and headings; body text trimmed for length:

```markdown
---
type: communication-preference
domain: communication
confidence: high
sources:
  - "[[../../evidence/pattern-communication-01]]"
  - "[[../../evidence/voice-syntax-hedging-01]]"
last-verified: null
status: active
tags: [communication, async, default-mode]
---

# Asynchronous by default; urgency must be explicit

## TL;DR

Default channel is written and asynchronous. Synchronous calls are the exception, scheduled with explicit purpose, not the rule. Urgency is signaled in words, not in channel choice.

## What it means in practice

[Five specific protocols]

## Why it works

[Linking to attention-pattern and predictive-coding evidence]

## Special cases

[Emergencies, time-zone friction, first contact]

### Do

- Send the question in writing first; meet later if needed
- Flag urgency in the message, "urgent", "blocking me", or a stated deadline
- For things under 5 minutes, I try to handle them immediately; otherwise I batch a reply

### Don't

- Don't escalate channels (chat to call to text) without warning
- Don't treat fast response as proxy for engagement
- Don't schedule a call to communicate something a paragraph would carry

## Related patterns

- [[full-context-before-commitment]]
- [[written-over-verbal]]
```
A new collaborator could read this in two minutes and adjust how they work with me. The assistant reads the same file as part of an assembled context and uses it to compose the right next move when I ask it to draft a message. The file is the primary artifact; whether the consumer is human or model is a property of the runtime, not the document.

The difference between the work manual and the schema is small but operational. The work manual collapses a behavior into one sentence: I respond async by default to protect deep work. The schema surfaces the same content at a different granularity: when I sense asynchronous context is missing, I defer response by 24 hours and ask for written prep; this is not procrastination, it’s the operating condition. Both can be true; the schema makes the second one queryable at the granularity of a single trait, with confidence, source, and a verification date attached.
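"Queryable" here can be made concrete. A minimal sketch of assembling context for the model from already-parsed frontmatter, where the field names mirror the example file above but the function, the confidence ranking, and the verified-first ordering are my assumptions:

```python
# Sketch: select traits for an assembled LLM context by domain and confidence.
# Input dicts carry the frontmatter fields shown in the article's examples.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}

def assemble_context(files: list[dict], domain: str,
                     min_confidence: str = "medium") -> list[dict]:
    """Filter active claims in a domain at or above a confidence floor."""
    floor = CONFIDENCE_RANK[min_confidence]
    picked = [f for f in files
              if f.get("domain") == domain
              and CONFIDENCE_RANK.get(f.get("confidence"), -1) >= floor
              and f.get("status") == "active"]
    # Verified claims first; staged ones (last-verified: null) trail behind.
    return sorted(picked, key=lambda f: f.get("last-verified") is None)
```

This is the operation prose cannot support: a 948-line document has no `confidence` field to filter on.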

The evidence layer

This is the part that distinguishes Self-Model from the personal-README genre. Every claim in core/, patterns/, stances/, and voice/ is grounded in evidence, and every evidence file lists what consumes it.

Forward edges live in frontmatter: the sources: array of any non-evidence file points to one or more wikilinks under evidence/. Back edges live in a body section that every evidence file ends with: ## Used In, followed by a list of files that cite this evidence. Both directions are maintained.

```markdown
---
type: evidence-quote
domain: evidence
confidence: high
sources:
  - external: "Diary-Corpus/digital/notes/"
  - external: "Diary-Corpus/digital/jsonl/"
last-verified: 2026-05-06
status: active
tags: [cognition, attention-pattern, anchoring]
---

# Cognition Evidence. Attention Pattern: Anchoring

## What this supports

[One paragraph]

## Quotes

[Verbatim fragments from the diary corpus, each with file path]

## Used in

- [[../core/cognition/attention-pattern]]
```

55 evidence files, zero orphans. The cost is real: adding a claim means producing or pointing to an evidence quote, and the evidence file needs to back-reference. The benefit is that the entire model is auditable. I can ask where the source for a stance is, when I last verified it, what corpus snapshot the extraction ran against. The schema makes those questions answerable in seconds. Prose can’t.
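"Zero orphans" is itself a checkable property. A sketch of the audit, assuming links are compared by wikilink basename (the real vault uses relative paths and the function names here are hypothetical):

```python
# Sketch: verify that forward `sources:` edges and evidence `## Used in`
# back-edges agree. Wikilinks are reduced to their basenames; aliased links
# ([[path|alias]]) are ignored in this simplified pattern.
import re

WIKILINK = re.compile(r"\[\[([^\]|]+)\]\]")

def basenames(text: str) -> set[str]:
    """Basenames of every wikilink target in a chunk of text."""
    return {m.split("/")[-1] for m in WIKILINK.findall(text)}

def forward_edges(claim_file: str, frontmatter: str) -> set[tuple[str, str]]:
    """(claim, evidence) pairs declared in a claim's sources: list."""
    return {(claim_file, ev) for ev in basenames(frontmatter)}

def back_edges(evidence_file: str, used_in_section: str) -> set[tuple[str, str]]:
    """(claim, evidence) pairs declared in an evidence file's Used in section."""
    return {(claim, evidence_file) for claim in basenames(used_in_section)}

def unmatched(forward: set, back: set) -> dict:
    """Edges present in one direction but missing in the other."""
    return {"missing_back_ref": forward - back,
            "orphan_back_ref": back - forward}
```

Run over all 173 files, a clean report from `unmatched` is what "both directions are maintained" cashes out to.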

There’s a particular debate I find quietly funny: philosophers arguing whether LLMs have consciousness, or whether they could. Not because the answer is obvious, and not because I’m dismissive of the question on the grounds that it’s all just bits, context windows, and math over already-existing information. That’s the easier objection. The harder one is that consciousness itself doesn’t have a working definition. We don’t agree on what it is or how to recognize it. How can we search for something we can’t yet specify?

For me mathematics isn’t just a tool we use, it’s the language the universe runs on. From inside that system, I don’t think we’ll find clean answers to some of these questions, not because the questions are wrong, but because of where we’re standing when we ask them. That uncertainty is part of why I treat this entire project as an experiment. The schema isn’t a claim about who I am. It’s an artifact I’m running to see what it does.

Stack: Obsidian, plain markdown, YAML frontmatter, wikilinks, MCP-served context.