Overview

The com.atproto.sync namespace provides lexicons for synchronizing repository data across the AT Protocol network. These are primarily used by crawlers, feed generators, and other services that need to replicate and index repository data.

Key Concepts

  • Repository Synchronization: Replicating complete repository state
  • CAR Files: Content-Addressable aRchive format for data transfer
  • Commit History: Versioned sequence of repository changes
  • Blocks: Individual data blocks in a repository
  • Blobs: Binary large objects (images, videos, etc.)

Repository Queries

getRepo

Get a repository export as a CAR file.
Endpoint: com.atproto.sync.getRepo
Parameters:
  • did (string, required): DID of the repository
  • since (string): Only include commits since this revision
Response: Binary CAR file containing repository data
Example:
const response = await agent.com.atproto.sync.getRepo({
  did: 'did:plc:z72i7hdynmk6r22z27h6tvur'
})

// Response is a CAR file (binary)
const carBytes = response.data

curl "https://bsky.social/xrpc/com.atproto.sync.getRepo?did=did:plc:z72i7hdynmk6r22z27h6tvur" \
  -o repo.car

getRecord

Get a specific record from a repository.
Endpoint: com.atproto.sync.getRecord
Parameters:
  • did (string, required): DID of the repository
  • collection (string, required): NSID of the record collection
  • rkey (string, required): Record key
  • commit (string): Specific commit CID to retrieve record from
Response: Binary CAR file containing the record and the blocks needed to prove its existence

listRepos

List repositories on a host.
Endpoint: com.atproto.sync.listRepos
Parameters:
  • limit (integer): Maximum repositories to return (1-1000, default 500)
  • cursor (string): Pagination cursor
Response:
  • cursor (string): Next page cursor
  • repos (array, required): Array of repository information
Example:
const response = await agent.com.atproto.sync.listRepos({
  limit: 100
})

for (const repo of response.data.repos) {
  console.log(repo.did, repo.head)
}
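
Because listRepos returns one page per call, a crawler typically follows the cursor until none is returned. The sketch below shows one way to do that. The names listAllRepos and fetchPage are hypothetical: fetchPage stands in for a call such as agent.com.atproto.sync.listRepos({ limit, cursor }) that resolves to response.data, and the sketch assumes a missing cursor marks the last page.

```typescript
// Shape of one page, following the listRepos response fields described above.
interface RepoInfo {
  did: string
  head: string
}

interface RepoPage {
  cursor?: string
  repos: RepoInfo[]
}

// Hypothetical helper: follows the cursor until the host stops returning one.
async function listAllRepos(
  fetchPage: (cursor?: string) => Promise<RepoPage>,
): Promise<RepoInfo[]> {
  const all: RepoInfo[] = []
  let cursor: string | undefined = undefined
  do {
    const page: RepoPage = await fetchPage(cursor)
    all.push(...page.repos)
    // Stop on an empty page too, so a repeated cursor cannot loop forever.
    cursor = page.repos.length > 0 ? page.cursor : undefined
  } while (cursor !== undefined)
  return all
}
```

Collecting everything into an array keeps the sketch short. For a large host, processing each page as it arrives would avoid holding every repository in memory.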

listReposByCollection

List repositories that contain a specific collection.
Endpoint: com.atproto.sync.listReposByCollection
Parameters:
  • collection (string, required): NSID of the collection (e.g., app.bsky.feed.post)
  • limit (integer): Maximum repositories to return
  • cursor (string): Pagination cursor

getRepoStatus

Get the sync status of a repository.
Endpoint: com.atproto.sync.getRepoStatus
Parameters:
  • did (string, required): DID of the repository
Response:
  • did (string, required): Repository DID
  • active (boolean, required): Whether the repository is active
  • status (string): Repository status (e.g., takendown, suspended, deactivated)
  • rev (string): Current revision
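
To illustrate how the optional fields relate to the active flag, here is a minimal sketch that turns a getRepoStatus response into a log line. The RepoStatus interface mirrors the fields listed above. The describeRepo helper and the did:example identifiers are hypothetical.

```typescript
// Response shape as documented above; `status` and `rev` are optional.
interface RepoStatus {
  did: string
  active: boolean
  status?: string
  rev?: string
}

// Hypothetical helper: summarise a status response for logging.
function describeRepo(s: RepoStatus): string {
  if (s.active) {
    return `${s.did} is active at rev ${s.rev ?? 'unknown'}`
  }
  // `status` explains why an inactive repository is unavailable, when given.
  return `${s.did} is inactive (${s.status ?? 'no status given'})`
}

console.log(describeRepo({ did: 'did:example:alice', active: false, status: 'takendown' }))
```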

Commit Operations

getHead

Get the current head commit of a repository. Deprecated in favor of getLatestCommit.
Endpoint: com.atproto.sync.getHead
Parameters:
  • did (string, required): DID of the repository
Response:
  • root (string, required): CID of the current repository head

getLatestCommit

Get the latest commit for a repository.
Endpoint: com.atproto.sync.getLatestCommit
Parameters:
  • did (string, required): DID of the repository
Response:
  • cid (string, required): CID of the latest commit
  • rev (string, required): Revision identifier
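
A common use of this response is checking whether a repository has moved on since it was last seen. The sketch below assumes that rev values are TID strings of equal length whose lexicographic order matches their creation order. The hasAdvanced helper and the sample rev and cid values are hypothetical.

```typescript
// Response shape as documented above.
interface LatestCommit {
  cid: string
  rev: string
}

// Hypothetical helper. Assumption: revs are TID strings, so plain string
// comparison orders them from oldest to newest.
function hasAdvanced(previous: LatestCommit, current: LatestCommit): boolean {
  return current.rev > previous.rev
}

// Made-up sample values for illustration only.
const seen: LatestCommit = { cid: 'bafyexample1', rev: '3jzfcijpj2z2a' }
const latest: LatestCommit = { cid: 'bafyexample2', rev: '3jzfcijpj2z2b' }
console.log(hasAdvanced(seen, latest))
```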

getCheckout

Get a checkout of a repository. Deprecated in favor of getRepo.
Endpoint: com.atproto.sync.getCheckout
Parameters:
  • did (string, required): DID of the repository
Response: Binary CAR file of the repository

Block Operations

getBlocks

Get blocks from a repository.
Endpoint: com.atproto.sync.getBlocks
Parameters:
  • did (string, required): DID of the repository
  • cids (array, required): Array of block CIDs to retrieve
Response: Binary CAR file containing requested blocks
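
When calling this endpoint over plain HTTP, the cids array has to be flattened into the query string. The sketch below assumes the XRPC convention of repeating the parameter name once per array element. The buildGetBlocksUrl helper and the sample CIDs are hypothetical and not part of any SDK.

```typescript
// Hypothetical helper: builds a getBlocks URL, repeating `cids` once per CID.
function buildGetBlocksUrl(host: string, did: string, cids: string[]): string {
  const parts = [`did=${encodeURIComponent(did)}`]
  for (const cid of cids) {
    parts.push(`cids=${encodeURIComponent(cid)}`)
  }
  return `${host}/xrpc/com.atproto.sync.getBlocks?${parts.join('&')}`
}

// Made-up CIDs for illustration only.
const blocksUrl = buildGetBlocksUrl('https://bsky.social', 'did:example:alice', ['bafyA', 'bafyB'])
console.log(blocksUrl)
```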

Blob Operations

getBlob

Get a blob from a repository.
Endpoint: com.atproto.sync.getBlob
Parameters:
  • did (string, required): DID of the repository
  • cid (string, required): CID of the blob
Response: Binary blob data with appropriate Content-Type header
Example:
const response = await agent.com.atproto.sync.getBlob({
  did: 'did:plc:z72i7hdynmk6r22z27h6tvur',
  cid: 'bafkreiabcdef...'
})

// Response is binary image/video/etc data
const blobData = response.data

curl "https://bsky.social/xrpc/com.atproto.sync.getBlob?did=did:plc:...&cid=bafkrei..." \
  -o image.jpg
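
Since the blob's media type is carried only in the Content-Type header, a client saving blobs to disk may want to derive a file extension from it. The following is a minimal sketch of that idea. The blobFilename helper, the EXTENSIONS table, and the sample CID are hypothetical, and the table covers only a few illustrative types.

```typescript
// Illustrative mapping only; extend it for the media types you expect.
const EXTENSIONS: Record<string, string> = {
  'image/jpeg': 'jpg',
  'image/png': 'png',
  'image/webp': 'webp',
  'video/mp4': 'mp4',
}

// Hypothetical helper: choose a filename for a blob from its CID and Content-Type.
function blobFilename(cid: string, contentType: string): string {
  // Drop parameters such as "; charset=..." and normalise case.
  const mime = (contentType.split(';')[0] ?? '').trim().toLowerCase()
  const ext = EXTENSIONS[mime]
  // Fall back to the bare CID when the type is not in the table.
  return ext === undefined ? cid : `${cid}.${ext}`
}

console.log(blobFilename('bafkreiexample', 'image/jpeg'))
```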

listBlobs

List blobs in a repository.
Endpoint: com.atproto.sync.listBlobs
Parameters:
  • did (string, required): DID of the repository
  • since (string): Only include blobs since this revision
  • limit (integer): Maximum blobs to return
  • cursor (string): Pagination cursor
Response:
  • cursor (string): Next page cursor
  • cids (array, required): Array of blob CIDs

Subscriptions

subscribeRepos

Subscribe to repository update events (firehose).
Endpoint: com.atproto.sync.subscribeRepos
Protocol: WebSocket
Parameters:
  • cursor (integer): Start from a specific event sequence number
Events:
  • commit: Repository commit with new/updated/deleted records
  • identity: Identity update (for example, a handle change)
  • account: Account status change
  • handle: Handle update (deprecated in favor of identity)
  • migrate: Repository migration (deprecated in favor of account)
  • tombstone: Repository deleted (deprecated in favor of account)
Example:
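import { Subscription } from '@atproto/xrpc-server'

// Subscription is an async iterable over decoded firehose events
const sub = new Subscription({
  service: 'wss://bsky.network',
  method: 'com.atproto.sync.subscribeRepos',
  getParams: () => ({}), // return { cursor } here to resume from a sequence number
  validate: (value) => value // pass events through without lexicon validation
})

for await (const evt of sub) {
  // Each event carries a $type of the form '<method>#<event name>'
  if (evt.$type === 'com.atproto.sync.subscribeRepos#commit') {
    console.log('New commit from', evt.repo)
    console.log('Operations:', evt.ops)
  } else if (evt.$type === 'com.atproto.sync.subscribeRepos#identity') {
    console.log('Identity update:', evt.did)
  } else if (evt.$type === 'com.atproto.sync.subscribeRepos#account') {
    console.log('Account status:', evt.status)
  }
}
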
Commit Event Structure:
{
  repo: string,          // DID of repository
  rev: string,           // Revision
  seq: number,           // Sequence number
  since: string | null,  // Previous revision
  time: string,          // Timestamp
  blocks: Uint8Array,    // CAR file with blocks
  ops: [
    {
      action: 'create' | 'update' | 'delete',
      path: string,        // collection/rkey
      cid: string | null   // Record CID (null for deletes)
    }
  ],
  blobs: string[],       // Blob CIDs referenced
  commit: string,        // Commit CID
  prev: string           // Previous commit CID
}
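
Consumers usually need to split each op's path into its collection and record key before deciding what to index. The sketch below works on the ops array from the commit structure above. The RepoOp type mirrors that structure, while parseOpPath, createdIn, and the sample record keys are hypothetical.

```typescript
// One entry of the `ops` array in a commit event.
type RepoOp = {
  action: 'create' | 'update' | 'delete'
  path: string
  cid: string | null
}

// Split an op path of the form "collection/rkey" into its two parts.
function parseOpPath(path: string): { collection: string; rkey: string } {
  const slash = path.indexOf('/')
  if (slash <= 0 || slash === path.length - 1) {
    throw new Error(`Malformed op path: ${path}`)
  }
  return { collection: path.slice(0, slash), rkey: path.slice(slash + 1) }
}

// Keep only the ops that create a record in the given collection.
function createdIn(ops: RepoOp[], collection: string): RepoOp[] {
  return ops.filter(
    (op) => op.action === 'create' && parseOpPath(op.path).collection === collection,
  )
}

// Made-up ops for illustration only.
const sampleOps: RepoOp[] = [
  { action: 'create', path: 'app.bsky.feed.post/3kabc', cid: 'bafyexample1' },
  { action: 'delete', path: 'app.bsky.feed.post/3kabd', cid: null },
  { action: 'create', path: 'app.bsky.feed.like/3kabe', cid: 'bafyexample2' },
]
console.log(createdIn(sampleOps, 'app.bsky.feed.post'))
```

Comparing the parsed collection exactly, rather than searching the path for a substring, avoids matching a different collection whose name merely contains the one you want.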

Crawling

notifyOfUpdate

Notify a crawling service of a repository update.
Endpoint: com.atproto.sync.notifyOfUpdate
Authentication: Not required
Parameters:
  • hostname (string, required): Hostname of the PDS that has updates

requestCrawl

Request that a crawling service (such as a relay) crawl the repositories hosted on a PDS.
Endpoint: com.atproto.sync.requestCrawl
Authentication: Not required
Parameters:
  • hostname (string, required): Hostname of the PDS to crawl

Host Management

listHosts

List known PDS hosts.
Endpoint: com.atproto.sync.listHosts
Parameters:
  • limit (integer): Maximum hosts to return
  • cursor (string): Pagination cursor
Response:
  • cursor (string): Next page cursor
  • hosts (array, required): Array of host information

getHostStatus

Get the status of a PDS host.
Endpoint: com.atproto.sync.getHostStatus
Parameters:
  • hostname (string, required): Hostname of the PDS
Response:
  • status (string, required): Host status: active, idle, offline, throttled, or banned

Type Definitions

hostStatus

Possible values for PDS host status:
  • active: Host is actively serving requests
  • idle: Host is online but not active
  • offline: Host is not responding
  • throttled: Host is rate-limited
  • banned: Host is banned from network
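
In client code, the values above can be modelled as a string union with a runtime guard, so that an unexpected status from a host is caught instead of silently accepted. This is a minimal sketch. The names HOST_STATUSES, HostStatus, and isHostStatus are hypothetical and do not come from any SDK.

```typescript
// The five hostStatus values listed above.
const HOST_STATUSES = ['active', 'idle', 'offline', 'throttled', 'banned'] as const

type HostStatus = (typeof HOST_STATUSES)[number]

// Runtime guard: narrows an arbitrary string to HostStatus.
function isHostStatus(value: string): value is HostStatus {
  return (HOST_STATUSES as readonly string[]).indexOf(value) !== -1
}

const reported: string = 'throttled'
if (isHostStatus(reported)) {
  // Inside this branch, `reported` has type HostStatus.
  console.log('Known host status:', reported)
}
```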

Common Use Cases

Building a Firehose Consumer

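import { Subscription } from '@atproto/xrpc-server'
import { cborToLexRecord, readCar } from '@atproto/repo'

const sub = new Subscription({
  service: 'wss://bsky.network',
  method: 'com.atproto.sync.subscribeRepos',
  validate: (value) => value // pass events through without lexicon validation
})

for await (const evt of sub) {
  // Only commit events carry record operations
  if (evt.$type !== 'com.atproto.sync.subscribeRepos#commit') continue

  // Parse the CAR file to get record data
  const car = await readCar(evt.blocks)

  for (const op of evt.ops) {
    // op.path has the form collection/rkey
    if (op.action === 'create' && op.path.startsWith('app.bsky.feed.post/')) {
      // Get the record from the CAR; skip the op if its block is missing
      const bytes = car.blocks.get(op.cid)
      if (!bytes) continue
      const record = cborToLexRecord(bytes)

      console.log('New post from', evt.repo)
      console.log('Text:', record.text)
    }
  }
}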

Crawling a Repository

import { writeFile } from 'fs/promises'
import { cborToLexRecord, readCar } from '@atproto/repo'

// Get the full repository
const repo = await agent.com.atproto.sync.getRepo({
  did: 'did:plc:z72i7hdynmk6r22z27h6tvur'
})

// Save to file
await writeFile('repo.car', repo.data)

// Parse the CAR file
const car = await readCar(repo.data)

// Process records. The CAR also holds non-record blocks (such as the commit),
// which have no matching $type and are skipped by the check below.
car.blocks.forEach((bytes) => {
  const record = cborToLexRecord(bytes)
  if (record.$type === 'app.bsky.feed.post') {
    console.log('Post:', record.text)
  }
})

Downloading Blobs

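import { mkdir, writeFile } from 'fs/promises'

// List blobs in a repository (first page only; pass the returned cursor to fetch more)
const blobs = await agent.com.atproto.sync.listBlobs({
  did: 'did:plc:z72i7hdynmk6r22z27h6tvur'
})

// Make sure the output directory exists
await mkdir('blobs', { recursive: true })

// Download each blob
for (const cid of blobs.data.cids) {
  const blob = await agent.com.atproto.sync.getBlob({
    did: 'did:plc:z72i7hdynmk6r22z27h6tvur',
    cid
  })

  // Save blob to file
  await writeFile(`blobs/${cid}`, blob.data)
}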

Monitoring Repository Updates

// Get the current commit (CID and revision)
const before = await agent.com.atproto.sync.getLatestCommit({
  did: 'did:plc:z72i7hdynmk6r22z27h6tvur'
})

// ... wait for changes ...

// Check for updates
const after = await agent.com.atproto.sync.getLatestCommit({
  did: 'did:plc:z72i7hdynmk6r22z27h6tvur'
})

if (before.data.rev !== after.data.rev) {
  // Repository has been updated
  // Get only the new commits: `since` takes a revision, not a CID
  const updates = await agent.com.atproto.sync.getRepo({
    did: 'did:plc:z72i7hdynmk6r22z27h6tvur',
    since: before.data.rev
  })
}
