What are Repositories?
Repositories are signed, authenticated data structures that store all of a user’s records in AT Protocol. Each user has one repository containing their posts, likes, follows, profile, and other data.
Why Repositories Matter
Self-Authenticating
Every repository is cryptographically signed, making it impossible to tamper with data without detection.
Portable
Users can export their entire repository and import it on a different server, maintaining their complete history.
Efficient Sync
Repositories use Merkle trees, allowing efficient synchronization by only transferring changed data.
Verifiable
Anyone can verify the integrity and authorship of repository data using cryptographic proofs.
Repository Structure
A repository consists of:
Commit - Signed pointer to the current state
MST (Merkle Search Tree) - Ordered tree of records
Records - Individual data items (posts, likes, etc.)
Blocks - CBOR-encoded data blocks
Merkle Search Tree (MST)
The MST is the core data structure that organizes records in a repository. It combines properties of:
Merkle Trees - Cryptographic verification
B-Trees - Efficient searching and insertion
Deterministic ordering - Same data always produces same tree
Key Properties
/**
 * MST characteristics:
 * - Keys are stored in lexicographic (byte-wise) order
 * - Insert-order independent (deterministic)
 * - Each key is hashed; the number of leading zero bits determines its layer
 * - Fanout of ~4 (2 zero bits per layer)
 * - Uses SHA-256 for key hashing
 */
How It Works
The MST assigns each key to a layer as follows:
Hash each key with SHA-256
Count the leading zero bits in the hash
Divide that count by 2, rounding down, to get the tree layer
More zeros = higher in the tree
// Example key hashing:
// Key: "app.bsky.feed.post/abc123"
// Hash: 0x00F8A3... (8 leading zero bits)
// Layer: 4 (8 bits / 2 = layer 4)
This ensures:
Deterministic structure - Same records always produce same tree
Balanced tree - Probabilistically balanced by hash distribution
Efficient operations - O(log n) search, insert, delete
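The layer computation described above can be sketched in a few lines. This is a minimal illustration built on Node's crypto module, not the @atproto/repo implementation; the function names leadingZeroBits, layerFromHash, and layerForKey are hypothetical.

```typescript
import { createHash } from 'node:crypto'

// Count the leading zero bits of a byte array, most significant bit first.
function leadingZeroBits(bytes: Uint8Array): number {
  let count = 0
  for (let i = 0; i < bytes.length; i++) {
    const byte = bytes[i]
    if (byte === 0) {
      count += 8
      continue
    }
    // The byte is non-zero, so this loop stops at its highest set bit.
    for (let mask = 0x80; (byte & mask) === 0; mask >>= 1) count++
    break
  }
  return count
}

// Two zero bits per layer, rounded down.
function layerFromHash(hash: Uint8Array): number {
  return Math.floor(leadingZeroBits(hash) / 2)
}

// Hash the key with SHA-256, then derive its layer.
function layerForKey(key: string): number {
  const hash = createHash('sha256').update(key, 'utf8').digest()
  return layerFromHash(hash)
}

// A hash starting with bytes 0x00 0xF8 has 8 leading zero bits: layer 4.
console.log(layerFromHash(new Uint8Array([0x00, 0xf8, 0xa3]))) // 4
console.log(layerForKey('app.bsky.feed.post/abc123'))
```

Because the layer depends only on the key, two repositories holding the same keys place them on the same layers regardless of insertion order.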
Using Repositories
Creating a Repository
import { MemoryBlockstore, Repo } from '@atproto/repo'
import { Secp256k1Keypair } from '@atproto/crypto'
// Create storage and a signing keypair
// (Keypair is an interface; Secp256k1Keypair is a concrete implementation)
const storage = new MemoryBlockstore()
const keypair = await Secp256k1Keypair.create({ exportable: true })
const did = 'did:plc:user123'
// Create repository with initial records
const repo = await Repo.create(
  storage,
  did,
  keypair,
  [
    {
      action: 'create',
      collection: 'app.bsky.feed.post',
      rkey: '3jzxvpqr2bc2a',
      record: {
        $type: 'app.bsky.feed.post',
        text: 'Hello AT Protocol!',
        createdAt: new Date().toISOString()
      }
    },
    {
      action: 'create',
      collection: 'app.bsky.actor.profile',
      rkey: 'self',
      record: {
        $type: 'app.bsky.actor.profile',
        displayName: 'Alice',
        description: 'AT Protocol enthusiast'
      }
    }
  ]
)
console.log('Repository CID:', repo.cid)
console.log('DID:', repo.did)
Reading Records
// Get a specific record by collection and rkey
// (AT URI: at://did:plc:user123/app.bsky.feed.post/3jzxvpqr2bc2a)
const record = await repo.getRecord('app.bsky.feed.post', '3jzxvpqr2bc2a')
console.log('Post:', record)
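// List up to 50 records in a collection by walking keys from the collection prefix
let count = 0
for await (const entry of repo.walkRecords('app.bsky.feed.post/')) {
  if (entry.collection !== 'app.bsky.feed.post' || count >= 50) break
  count++
  console.log(`at://${repo.did}/${entry.collection}/${entry.rkey}: ${entry.record.text}`)
}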
Writing Records
// Create a new record
const updated = await repo.applyWrites(
  {
    action: 'create',
    collection: 'app.bsky.feed.post',
    rkey: '3jzxvpqr2bc2b',
    record: {
      $type: 'app.bsky.feed.post',
      text: 'Another post!',
      createdAt: new Date().toISOString()
    }
  },
  keypair
)
console.log('New repository state:', updated.cid)
Updating Records
// Update an existing record
const updated = await repo.applyWrites(
  {
    action: 'update',
    collection: 'app.bsky.actor.profile',
    rkey: 'self',
    record: {
      $type: 'app.bsky.actor.profile',
      displayName: 'Alice Smith',
      description: 'Updated bio'
    }
  },
  keypair
)
Deleting Records
// Delete a record
const updated = await repo.applyWrites(
  {
    action: 'delete',
    collection: 'app.bsky.feed.post',
    rkey: '3jzxvpqr2bc2a'
  },
  keypair
)
Batch Operations
// Apply multiple writes in one commit
const updated = await repo.applyWrites(
  [
    {
      action: 'create',
      collection: 'app.bsky.feed.post',
      rkey: 'post1',
      record: { $type: 'app.bsky.feed.post', text: 'Post 1', createdAt: new Date().toISOString() }
    },
    {
      action: 'create',
      collection: 'app.bsky.feed.post',
      rkey: 'post2',
      record: { $type: 'app.bsky.feed.post', text: 'Post 2', createdAt: new Date().toISOString() }
    },
    {
      action: 'update',
      collection: 'app.bsky.actor.profile',
      rkey: 'self',
      record: { $type: 'app.bsky.actor.profile', displayName: 'Alice' }
    }
  ],
  keypair
)
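To illustrate why batching gives all-or-nothing behavior, here is a minimal sketch that applies a batch to a copy of a flat key-to-CID index. It is not how @atproto/repo is implemented: the WriteOp shape, the applyBatch helper, and the use of plain strings for CIDs are assumptions made for the example.

```typescript
// Hypothetical, simplified write operations; CIDs are plain strings here.
type WriteOp =
  | { action: 'create'; collection: string; rkey: string; cid: string }
  | { action: 'update'; collection: string; rkey: string; cid: string }
  | { action: 'delete'; collection: string; rkey: string }

// Apply every op to a copy of the index. If any op is invalid, an error is
// thrown before the copy is returned, so the caller's index is never changed.
function applyBatch(index: Map<string, string>, ops: WriteOp[]): Map<string, string> {
  const next = new Map(index)
  for (const op of ops) {
    const key = `${op.collection}/${op.rkey}`
    const exists = next.has(key)
    if (op.action === 'create') {
      if (exists) throw new Error(`Record already exists: ${key}`)
      next.set(key, op.cid)
    } else if (op.action === 'update') {
      if (!exists) throw new Error(`Record not found: ${key}`)
      next.set(key, op.cid)
    } else {
      if (!exists) throw new Error(`Record not found: ${key}`)
      next.delete(key)
    }
  }
  return next
}

const before = new Map<string, string>([['app.bsky.actor.profile/self', 'cid-profile-v1']])
const after = applyBatch(before, [
  { action: 'create', collection: 'app.bsky.feed.post', rkey: 'post1', cid: 'cid-post1' },
  { action: 'update', collection: 'app.bsky.actor.profile', rkey: 'self', cid: 'cid-profile-v2' },
])
console.log(before.size, after.size) // 1 2
```

Returning a new map mirrors the pattern in the examples above, where applyWrites returns an updated repository instead of mutating the original.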
Record Keys (rkeys)
Records are identified by their collection and rkey (record key):
// AT URI format:
// at://{did}/{collection}/{rkey}
// Example:
// at://did:plc:abc123/app.bsky.feed.post/3jzxvpqr2bc2a
//      └─────┬──────┘ └───────┬────────┘ └─────┬─────┘
//           DID          collection           rkey
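As a small illustration of the format above, the following sketch splits a record AT URI into its three parts and builds the collection/rkey path used as the key inside the MST. The helper names parseAtUri and toDataKey are hypothetical, and the regular expression only handles the simple three-segment record form.

```typescript
interface AtUriParts {
  did: string
  collection: string
  rkey: string
}

// Split at://{did}/{collection}/{rkey} into its parts.
function parseAtUri(uri: string): AtUriParts {
  const match = /^at:\/\/([^/]+)\/([^/]+)\/([^/]+)$/.exec(uri)
  if (!match) throw new Error(`Not a record AT URI: ${uri}`)
  return { did: match[1], collection: match[2], rkey: match[3] }
}

// Records are keyed inside the repository by "collection/rkey".
function toDataKey(parts: AtUriParts): string {
  return `${parts.collection}/${parts.rkey}`
}

const parts = parseAtUri('at://did:plc:abc123/app.bsky.feed.post/3jzxvpqr2bc2a')
console.log(parts.did) // did:plc:abc123
console.log(toDataKey(parts)) // app.bsky.feed.post/3jzxvpqr2bc2a
```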
TID-based Keys
Most records use TIDs (Timestamp Identifiers) as rkeys:
import { TID } from '@atproto/common'
// Generate a TID (time-based, k-sortable)
const rkey = TID.nextStr()
// Example: '3jzxvpqr2bc2a'
// TIDs are:
// - Timestamp-based (roughly sortable by creation time)
// - Collision-resistant
// - URL-safe base32 encoded
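To show why TIDs sort by creation time, here is a minimal sketch of the encoding, assuming the TID layout of a microsecond timestamp followed by a 10-bit clock identifier, written as 13 characters of sortable base32. The helper names encodeBase32Sortable and encodeTid are hypothetical; in real code, prefer TID.nextStr(), which this sketch does not replace.

```typescript
// Sortable base32 alphabet: character order matches numeric order.
const B32_SORT = '234567abcdefghijklmnopqrstuvwxyz'

// Encode a non-negative integer as fixed-width sortable base32.
function encodeBase32Sortable(value: number, width: number): string {
  let out = ''
  let rest = value
  for (let i = 0; i < width; i++) {
    out = B32_SORT.charAt(rest % 32) + out
    rest = Math.floor(rest / 32)
  }
  return out
}

// 11 characters of timestamp (microseconds) followed by 2 characters of clock id.
function encodeTid(timestampMicros: number, clockId: number): string {
  return encodeBase32Sortable(timestampMicros, 11) + encodeBase32Sortable(clockId % 1024, 2)
}

const tid = encodeTid(Date.now() * 1000, 7)
console.log(tid, tid.length)
```

Because the timestamp occupies the leading characters and the alphabet is in ascending order, comparing two TIDs as strings compares their timestamps first.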
Literal Keys
Some records use fixed keys:
// Profile always uses 'self':
// at://did:plc:abc123/app.bsky.actor.profile/self
// Defined in Lexicon:
{
  "type": "record",
  "key": "literal:self"
}
Commits
Each repository state is represented by a signed commit:
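interface Commit {
  did: string        // Repository owner
  version: 3         // Commit format version
  rev: string        // Revision (TID)
  prev: CID | null   // Previous commit (usually null in version 3)
  data: CID          // Pointer to MST root
  sig: Uint8Array    // Signature made with the owner's signing key
}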
Commit Lifecycle:
// Format a commit (doesn't apply it)
const commitData = await repo.formatCommit(
  writeOps,
  keypair
)
console.log('Commit CID:', commitData.cid)
console.log('Revision:', commitData.rev)
console.log('New blocks:', commitData.newBlocks.size)
console.log('Removed CIDs:', commitData.removedCids.size)
// Apply the commit
const updated = await repo.applyCommit(commitData)
CAR Files
Repositories are distributed as CAR (Content Addressed aRchive) files:
// Export repository to CAR format
const car = await repo.exportCar()
// CAR contains:
// - All MST nodes
// - All record blocks
// - Commit object
// - Organized by CID
CAR files enable:
Repository export - Users can download their complete data
Efficient sync - Only transfer changed blocks
Backup and migration - Portable repository format
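A CAR (v1) file is a sequence of length-prefixed sections, where each length is an unsigned varint (LEB128). The sketch below shows only that framing, not CID or DAG-CBOR handling, and the helper names encodeVarint, decodeVarint, and frameSection are hypothetical.

```typescript
// Encode a non-negative integer as an unsigned LEB128 varint.
function encodeVarint(value: number): Uint8Array {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError(`Cannot encode ${value} as a varint`)
  }
  const bytes: number[] = []
  let rest = value
  while (rest >= 0x80) {
    bytes.push((rest % 0x80) | 0x80) // low 7 bits, continuation bit set
    rest = Math.floor(rest / 0x80)
  }
  bytes.push(rest)
  return new Uint8Array(bytes)
}

// Decode a varint starting at `offset`; returns the value and bytes consumed.
function decodeVarint(bytes: Uint8Array, offset = 0): { value: number; length: number } {
  let value = 0
  let scale = 1
  for (let i = offset; i < bytes.length; i++) {
    const byte = bytes[i]
    value += (byte & 0x7f) * scale
    if ((byte & 0x80) === 0) return { value, length: i - offset + 1 }
    scale *= 0x80
  }
  throw new Error('Truncated varint')
}

// Prefix a payload with its length, as each CAR section is framed.
function frameSection(payload: Uint8Array): Uint8Array {
  const prefix = encodeVarint(payload.length)
  const out = new Uint8Array(prefix.length + payload.length)
  out.set(prefix, 0)
  out.set(payload, prefix.length)
  return out
}

console.log(Array.from(encodeVarint(300))) // [ 172, 2 ]
```

Length-prefixed framing is what lets a reader stream through a CAR file block by block without loading the whole archive.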
MST Operations
Direct MST usage (lower-level API):
import { MemoryBlockstore, MST } from '@atproto/repo'
const storage = new MemoryBlockstore()
// recordCid, recordCid2, recordCid3, and newCid are CIDs of record blocks (defined elsewhere)
// Create an MST
let mst = await MST.create(storage)
// Add entries
mst = await mst.add('app.bsky.feed.post/abc', recordCid)
mst = await mst.add('app.bsky.feed.post/def', recordCid2)
mst = await mst.add('app.bsky.feed.post/xyz', recordCid3)
// Get entry
const cid = await mst.get('app.bsky.feed.post/abc')
// Update entry
mst = await mst.update('app.bsky.feed.post/abc', newCid)
// Delete entry
mst = await mst.delete('app.bsky.feed.post/abc')
// List entries
const entries = await mst.list(50, 'app.bsky.feed.post/')
// Get MST root CID
const rootCid = await mst.getPointer()
Walking the Tree
// Walk all entries
for await (const entry of mst.walk()) {
  if (entry.isLeaf()) {
    console.log(`Key: ${entry.key}, Value: ${entry.value}`)
  }
}
// Walk from a specific key
for await (const leaf of mst.walkLeavesFrom('app.bsky.feed.post/abc')) {
  console.log(`${leaf.key}: ${leaf.value}`)
}
// List with prefix
const posts = await mst.listWithPrefix('app.bsky.feed.post/', 100)
Data Diff
Compute differences between repository states:
import { DataDiff } from '@atproto/repo'
// Compare two MST states
const diff = await DataDiff.of(newMst, oldMst)
console.log('New MST blocks:', diff.newMstBlocks)
console.log('New leaf CIDs:', diff.newLeafCids)
console.log('Removed CIDs:', diff.removedCids)
Useful for:
Computing repository updates
Generating sync payloads
Tracking changes
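To make the output of a diff concrete, the sketch below compares two flat key-to-CID maps and lists the resulting creates, updates, and deletes. It is a simplification: it does not walk MST nodes as DataDiff does, CIDs are plain strings, and the DiffOp type and diffIndexes helper are hypothetical.

```typescript
// Hypothetical, simplified diff entry; CIDs are plain strings here.
interface DiffOp {
  action: 'create' | 'update' | 'delete'
  path: string // collection/rkey
  cid: string | null
}

// Compare the previous and current key -> CID maps.
function diffIndexes(prev: Map<string, string>, curr: Map<string, string>): DiffOp[] {
  const ops: DiffOp[] = []
  curr.forEach((cid, path) => {
    const old = prev.get(path)
    if (old === undefined) ops.push({ action: 'create', path, cid })
    else if (old !== cid) ops.push({ action: 'update', path, cid })
  })
  prev.forEach((_cid, path) => {
    if (!curr.has(path)) ops.push({ action: 'delete', path, cid: null })
  })
  // Sort by path so the result does not depend on map insertion order.
  return ops.sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
}

const oldIndex = new Map<string, string>([
  ['app.bsky.actor.profile/self', 'cid-profile-v1'],
  ['app.bsky.feed.post/abc', 'cid-abc'],
])
const newIndex = new Map<string, string>([
  ['app.bsky.actor.profile/self', 'cid-profile-v2'],
  ['app.bsky.feed.post/def', 'cid-def'],
])
console.log(diffIndexes(oldIndex, newIndex).map((op) => `${op.action} ${op.path}`))
```

Since records are content-addressed, an unchanged CID means unchanged content, which is why comparing CIDs alone is enough to detect updates.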
Proofs and Verification
MSTs support cryptographic proofs:
// Get proof for a specific record
const proof = await mst.getCoveringProof('app.bsky.feed.post/abc')
// Proof includes:
// - MST nodes on path to record
// - Sibling records (left and right)
// - Enough data to verify record existence
// Verify a record against proof
// (Typically done by receiving party)
Repository Sync Protocol
Repositories sync over an event stream: a server emits one event for each new commit, and subscribers apply the operations it lists.
Event Format:
interface RepoCommit {
  seq: number            // Sequence number
  rebase: boolean        // If true, full repo resync needed
  tooBig: boolean        // If true, the commit was too large to include inline
  repo: string           // DID
  commit: CID            // Commit CID
  rev: string            // Revision (TID)
  since: string | null   // Previous revision
  blocks: Uint8Array     // CAR file of new blocks
  ops: RepoOp[]          // List of operations
  blobs: CID[]           // New blobs
  time: string           // Timestamp
}
interface RepoOp {
  action: 'create' | 'update' | 'delete'
  path: string           // collection/rkey
  cid: CID | null        // Record CID (null for deletes)
}
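As an illustration of how a subscriber might consume these events, the sketch below keeps a local index of record CIDs, tracks the last sequence number as a cursor, and flags a resync when the since field does not match the last revision it saw. The class and type names are hypothetical, CIDs are plain strings, and treating a since mismatch as a missed commit is an assumption of this sketch.

```typescript
// Simplified event shapes with only the fields this sketch needs.
interface RepoOpLite {
  action: 'create' | 'update' | 'delete'
  path: string // collection/rkey
  cid: string | null
}

interface CommitEventLite {
  seq: number
  repo: string
  rev: string
  since: string | null
  ops: RepoOpLite[]
}

type ApplyResult = 'applied' | 'skipped' | 'resync'

class RecordIndex {
  cursor = 0 // highest seq processed so far
  private records = new Map<string, string>() // AT URI -> record CID
  private revs = new Map<string, string>() // DID -> last seen rev

  apply(evt: CommitEventLite): ApplyResult {
    // Replayed or duplicate event: already reflected in the index.
    if (evt.seq <= this.cursor) return 'skipped'
    this.cursor = evt.seq

    // Assumption: if `since` differs from our last rev, we missed a commit.
    const lastRev = this.revs.get(evt.repo)
    if (lastRev !== undefined && evt.since !== lastRev) return 'resync'

    for (const op of evt.ops) {
      const uri = `at://${evt.repo}/${op.path}`
      if (op.action === 'delete' || op.cid === null) this.records.delete(uri)
      else this.records.set(uri, op.cid)
    }
    this.revs.set(evt.repo, evt.rev)
    return 'applied'
  }

  get(uri: string): string | undefined {
    return this.records.get(uri)
  }
}
```

Persisting the cursor lets a consumer reconnect and resume from the last event it handled instead of replaying the whole stream.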
Best Practices
Use TIDs for time-based records
For posts, likes, and other time-series data, use TID-based rkeys for chronological ordering.
Batch writes when possible
Combine multiple operations into a single commit to reduce overhead and improve atomicity.
Validate records before writing
Use Lexicon validation to ensure records conform to schemas before adding to repository.
Handle repository migrations
Design your system to support repository export/import for user portability.
Large repositories can be expensive to sync. Consider archiving or pagination strategies.
Storage Backends
Repositories can use different storage implementations:
import {
  MemoryBlockstore, // In-memory (testing)
  SqliteBlockstore, // SQLite (production)
  // Custom implementations possible
} from '@atproto/repo'
// Memory storage (ephemeral)
const memStorage = new MemoryBlockstore()
// Persistent storage
const sqlStorage = new SqliteBlockstore('path/to/db.sqlite')
Storage Interface:
interface Blockstore {
  has(cid: CID): Promise<boolean>
  get(cid: CID): Promise<Uint8Array>
  put(cid: CID, bytes: Uint8Array): Promise<void>
  delete(cid: CID): Promise<void>
  // ... additional methods
}
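A custom backend only needs to satisfy this interface. As a rough sketch, here is a Map-backed implementation of the four methods shown above. The class name MapBlockstore and its size getter are hypothetical, and CID is modeled as a string, whereas real code would use the CID class and implement the additional methods the library expects.

```typescript
type CID = string // stand-in for the real CID class

interface Blockstore {
  has(cid: CID): Promise<boolean>
  get(cid: CID): Promise<Uint8Array>
  put(cid: CID, bytes: Uint8Array): Promise<void>
  delete(cid: CID): Promise<void>
}

class MapBlockstore implements Blockstore {
  private blocks = new Map<CID, Uint8Array>()

  // Number of stored blocks (a convenience for this sketch only).
  get size(): number {
    return this.blocks.size
  }

  async has(cid: CID): Promise<boolean> {
    return this.blocks.has(cid)
  }

  async get(cid: CID): Promise<Uint8Array> {
    const bytes = this.blocks.get(cid)
    if (!bytes) throw new Error(`Block not found: ${cid}`)
    return bytes
  }

  async put(cid: CID, bytes: Uint8Array): Promise<void> {
    this.blocks.set(cid, bytes)
  }

  async delete(cid: CID): Promise<void> {
    this.blocks.delete(cid)
  }
}

async function demo(): Promise<void> {
  const store = new MapBlockstore()
  await store.put('cid-a', new Uint8Array([1, 2, 3]))
  console.log(await store.has('cid-a'), (await store.get('cid-a')).length)
}
demo().catch(console.error)
```

Keeping every method asynchronous, even over an in-memory map, means callers do not need to change when the backend is swapped for a database.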
Error Handling
import {
  MissingBlockError,
  MissingBlocksError
} from '@atproto/repo'
try {
  const record = await repo.getRecord('app.bsky.feed.post', '3jzxvpqr2bc2a')
} catch (error) {
  if (error instanceof MissingBlockError) {
    console.error('Block not found:', error.cid)
  } else if (error instanceof MissingBlocksError) {
    console.error('Multiple blocks missing:', error.cids)
  } else {
    console.error('Unexpected error:', error)
  }
}
Additional Resources
@atproto/repo Package - NPM package documentation
Repository Spec - Official repository specification
MST Paper - Academic paper on Merkle Search Trees
CAR Format - Content Addressed aRchive specification