Google's Open Knowledge Format (OKF)
Google recently published the Open Knowledge Format (OKF), a proposed standard for packaging organizational knowledge so AI agents can read, share, and act on it. Announced on June 12, 2026 by Google Cloud's Sam McVeety and Amir Hormati, OKF is deliberately minimal: a directory of markdown files, each with a small YAML frontmatter block. The spec fits on a single page, yet it tries to solve a problem that has absorbed the AI industry for years: how to give a model reliable, reusable memory without exotic infrastructure.
What it is
OKF is a file-based knowledge representation. A bundle is just a folder of markdown documents; each file is one concept, such as a table, metric, API, playbook, or dataset. The file's path inside the folder becomes its identity, and ordinary markdown links between files form a knowledge graph.
The format has exactly one hard rule: every concept file must include a type field in its frontmatter. Everything else is optional but recommended:
- `title`: a human-readable display name
- `description`: a one-line summary
- `resource`: a canonical URI for the underlying asset
- `tags`: cross-cutting categories
- `timestamp`: last-modified time in ISO 8601 format
Two filenames are reserved: index.md for directory listings and log.md for change history. The body of each file is standard markdown, so it renders on GitHub, diffs in pull requests, and can be read by any agent that can open a text file.
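As a minimal sketch of the one hard rule, a consumer might split a concept file into frontmatter and body and reject it only when `type` is missing. The `parse_concept` helper below is hypothetical and not part of the spec. It assumes the frontmatter is delimited by `---` lines and holds only flat `key: value` pairs, so a real reader would more likely use a full YAML parser.

```python
from typing import Dict, Tuple


def parse_concept(text: str) -> Tuple[Dict[str, str], str]:
    """Split a concept file into (frontmatter, body).

    Hypothetical helper: handles flat `key: value` lines only, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing frontmatter block")
    try:
        # Index of the closing `---` delimiter.
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("unterminated frontmatter block") from None

    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue  # skip lines this toy parser does not understand
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()

    # The only hard requirement: a non-empty `type` field.
    if not meta.get("type"):
        raise ValueError("concept file must declare a type")

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


# `owner` is not a field the article lists; it is kept rather than rejected.
sample = """---
type: Metric
title: Weekly Active Users
owner: growth-team
---
WAU counts distinct users over the trailing 7 days.
"""

meta, body = parse_concept(sample)
print(meta["type"], "|", meta.get("title"))
```

Unknown keys such as the made-up `owner` field pass through untouched, which matches the tolerant-reader behavior the format asks of consumers.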
The idea has roots in Andrej Karpathy's "LLM Wiki" gist from earlier this year: instead of repeatedly searching the same raw documents, you build a living encyclopedia once and let the agent read it like a codebase. Google has taken that community pattern and formalized the interoperability layer.
Why it matters
The dominant alternative today is RAG: chop documents into chunks, embed them, and retrieve the closest snippets for every query. It works, but it repeats the same reasoning every time. OKF shifts the work forward. Concepts are summarized, cross-linked, and organized once — when the bundle is built — so each question starts from a curated answer rather than a pile of fragments.
The folder model also scales without a database. A root index.md lets an agent peek at the table of contents, pick the one file it needs, and ignore the other thousands. Because everything is plain text, bundles live in git, ship as tarballs, and run offline.
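The progressive navigation described above could look roughly like the sketch below: read only the index, list its links, and choose a single file to open. The index layout (a bullet list of markdown links), the helper names, and the keyword match are illustrative assumptions. The format does not prescribe how an agent chooses a file.

```python
import re
from typing import List, Optional, Tuple

# Matches inline markdown links such as [events table](/tables/events.md).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def list_index_entries(index_markdown: str) -> List[Tuple[str, str]]:
    """Return (label, target) pairs for every .md link in an index file."""
    return [
        (label, target)
        for label, target in LINK_RE.findall(index_markdown)
        if target.endswith(".md")
    ]


def pick_entry(entries: List[Tuple[str, str]], keyword: str) -> Optional[str]:
    """Return the first target whose label mentions the keyword, else None."""
    keyword = keyword.lower()
    for label, target in entries:
        if keyword in label.lower():
            return target
    return None


# Hypothetical root index.md; the paths are made up for illustration.
index_md = """# Bundle contents
- [Events table](/tables/events.md)
- [Weekly Active Users](/metrics/wau.md)
- [Incident playbook](/playbooks/incident.md)
"""

entries = list_index_entries(index_md)
print(pick_entry(entries, "weekly"))
```

An agent would typically replace the keyword match with its own judgment. The point is that only the index and one concept file need to be read.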
Finally, OKF is explicitly not competing with MCP. MCP is a live pipe between agents and tools; OKF is the cargo that can travel through that pipe. One moves data in real time, the other preserves knowledge across time and organizations.
Pros and cons
Pros
- Extremely simple. One required field, no SDK, no runtime, no API key. If you can write markdown, you can produce OKF.
- Portable and version-controlled. Git diffs, PR reviews, and offline access come for free because the bundle is just files.
- Agent- and human-readable. The same document works for both; there is no separate "machine view."
- Forgiving by design. Consumers must tolerate unknown fields, broken links, and unfamiliar concept types, which makes partial or generated bundles practical.
Cons
- No built-in freshness. A `timestamp` field records when something changed, but nothing in the format updates itself. Shared team folders can go stale quickly without a dedicated owner or automation.
- The messy-librarian problem. LLMs are not always tidy markdown authors; they can mangle headers, invent links, or drift in style. Google's answer is to make readers tolerant, which helps but does not fix the source.
- Semantics are still up to you. The only required field is a free-form `type` label. One team may write `BigQuery Table`, another may write `table`, and both are valid. Interoperability at the file level is easy; agreeing on meaning is still hard.
- Ecosystem is young. At launch, tooling and adoption outside Google are thin. A standard with few producers and consumers risks becoming a suggestion rather than a standard.
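The freshness gap is the kind of thing a small script can partly cover. The sketch below flags concepts whose `timestamp` is older than a threshold, or missing or unparseable. The 90-day default, the function names, and the input shape (a mapping from concept path to timestamp string) are assumptions for illustration, not anything the format defines.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def stale_concepts(
    timestamps: Dict[str, Optional[str]],
    now: datetime,
    max_age_days: int = 90,
) -> List[str]:
    """Return sorted concept paths that are too old or lack a usable timestamp."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for path, raw in timestamps.items():
        if not raw:
            stale.append(path)  # no timestamp at all: flag for review
            continue
        try:
            if parse_timestamp(raw) < cutoff:
                stale.append(path)
        except ValueError:
            stale.append(path)  # unparseable timestamp: flag for review
    return sorted(stale)


# Hypothetical bundle state; paths and dates are made up.
now = datetime(2026, 7, 1, tzinfo=timezone.utc)
timestamps = {
    "/metrics/wau.md": "2026-06-28T10:00:00Z",
    "/tables/events.md": "2025-01-15T00:00:00Z",
    "/playbooks/incident.md": None,
}
stale = stale_concepts(timestamps, now)
print(stale)
```

A check like this only says that a file has not been touched recently. Whether its content is still correct remains a job for an owner.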
How to use it
Getting started is straightforward:
- Create a bundle folder. Any directory will do. A git repository is recommended for history and review.
- Write concept files. Each file gets a YAML frontmatter block with at least a `type`, followed by markdown content. For example:

      ---
      type: Metric
      title: Weekly Active Users
      description: Number of distinct users who performed any event in the last 7 days.
      tags: [growth, product]
      timestamp: 2026-06-28T10:00:00Z
      ---

      WAU is computed from the [events table](/tables/events.md) by counting unique `user_id` values with an `event_time` in the trailing 7-day window.

- Link concepts together. Use bundle-relative markdown links like `[events table](/tables/events.md)` to build the graph.
- Add `index.md` files. Optional but useful: list the contents of each directory so an agent can navigate progressively without swallowing the whole bundle.
- Add a `log.md`. Optional: keep a running history of updates grouped by date.
- Generate or enrich with an agent. Google's reference tools include a BigQuery enrichment agent that drafts concept documents from a dataset, and a static HTML visualizer that renders any bundle as an interactive graph. Neither is required by the format.
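To make the linking step concrete, here is a rough sketch that walks a bundle directory, treats each file's bundle-relative path as its identity, and collects the links between files, reporting links whose target does not exist. This is not Google's reference tooling. The `build_graph` function is hypothetical, and it assumes only links that start with `/` and end in `.md` count as graph edges.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List, Set, Tuple

# Bundle-relative links only, e.g. [events table](/tables/events.md).
LINK_RE = re.compile(r"\[[^\]]+\]\((/[^)\s#]+\.md)\)")


def build_graph(bundle: Path) -> Tuple[Dict[str, Set[str]], List[Tuple[str, str]]]:
    """Return (edges, broken_links) for a bundle directory.

    A concept's identity is its bundle-relative path, e.g. "/tables/events.md".
    """
    concepts = {
        "/" + p.relative_to(bundle).as_posix(): p for p in bundle.rglob("*.md")
    }
    known = set(concepts)
    edges: Dict[str, Set[str]] = {}
    broken: List[Tuple[str, str]] = []
    for ident, path in sorted(concepts.items()):
        targets = set(LINK_RE.findall(path.read_text(encoding="utf-8")))
        edges[ident] = targets & known
        # Report broken links instead of failing, as a tolerant reader would.
        broken.extend((ident, t) for t in sorted(targets - known))
    return edges, broken


# Build a tiny throwaway bundle; the file names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "tables").mkdir()
    (root / "metrics").mkdir()
    (root / "tables" / "events.md").write_text(
        "---\ntype: Table\n---\nRaw event log.\n", encoding="utf-8"
    )
    (root / "metrics" / "wau.md").write_text(
        "---\ntype: Metric\n---\n"
        "See [events table](/tables/events.md) and [users](/tables/users.md).\n",
        encoding="utf-8",
    )
    edges, broken = build_graph(root)

print(edges)
print(broken)
```

Run in CI on a git-hosted bundle, a check like this could catch invented or stale links before they reach consumers.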
For a complete reference, the official spec lives in the GoogleCloudPlatform/knowledge-catalog repository, along with sample bundles for GA4 e-commerce, Stack Overflow, and Bitcoin public datasets.
Final note
OKF is less a product than a bet: that the best way to give AI agents memory is not a vector database or a proprietary knowledge platform, but a well-organized folder of text files. That idea already had momentum before Google weighed in. Whether OKF itself becomes the standard depends on whether enough teams produce and consume bundles, and on how well they solve the boring but critical problems of maintenance, taxonomy, and quality control. The container is simple; keeping it useful over time is where the real work lies.