The Sync Engine: How AACFlow Keeps 170+ Data Sources Fresh Without Melting Your Server

A knowledge base with stale data is worse than no knowledge base at all. An AI agent that acts on last week's CRM records will make decisions that cost real money — wrong follow-ups, missed deals, incorrect inventory counts.

The problem isn't "getting data in." It's keeping it fresh — across 170+ connectors, each with different APIs, pagination schemes, rate limits, and quirks — without melting your infrastructure.

This is the story of AACFlow's sync engine. It processes millions of documents daily, detects changes by comparing per-document content hashes, and handles failure with the discipline of a database replication system.

The Core Loop: What Happens When You Click "Sync"

When a user triggers a sync, manually or on a schedule, the engine runs the following loop:

1. Acquire lock: a distributed Redis lock prevents duplicate syncs on the same connector
2. Resolve auth: OAuth tokens are refreshed if needed; API keys are decrypted
3. Validate config: a quick test call confirms the connector can reach the source
4. Call listDocuments: page by page, with cursor pagination
5. Hash and compare: for each document, compare content hash against stored version
6. Fetch deferred content: documents flagged contentDeferred: true get full content via
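
The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AACFlow's actual code: sync_once, iter_documents, and content_hash are hypothetical names, an in-memory set stands in for the distributed Redis lock, SHA-256 is an assumed hash function, and the auth and config-validation steps are left out. Deferred documents are only collected here, since how their content is fetched is outside this sketch.

```python
import hashlib
from typing import Callable, Dict, Iterator, List, Optional, Set, Tuple

# Hypothetical page shape: (documents, next_cursor). A cursor of None means "no more pages".
Page = Tuple[List[dict], Optional[str]]
ListDocuments = Callable[[Optional[str]], Page]


def content_hash(text: str) -> str:
    """Hash a document body. SHA-256 is an assumption, not a confirmed choice."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def iter_documents(list_documents: ListDocuments) -> Iterator[dict]:
    """Walk every page by following the cursor until the source returns None."""
    cursor: Optional[str] = None
    while True:
        docs, cursor = list_documents(cursor)
        yield from docs
        if cursor is None:
            break


def sync_once(
    connector_id: str,
    list_documents: ListDocuments,
    stored_hashes: Dict[str, str],
    held_locks: Set[str],
) -> Tuple[List[str], List[str]]:
    """Run one pass of the loop and return (changed_ids, deferred_ids)."""
    # Acquire lock: an in-memory set stands in for the distributed Redis lock.
    if connector_id in held_locks:
        raise RuntimeError(f"sync already running for {connector_id}")
    held_locks.add(connector_id)
    try:
        changed: List[str] = []
        deferred: List[str] = []
        for doc in iter_documents(list_documents):
            if doc.get("contentDeferred"):
                # The listing carried no body; remember the ID for a later fetch.
                deferred.append(doc["id"])
                continue
            # Hash and compare: only documents whose hash differs count as changed.
            digest = content_hash(doc["content"])
            if stored_hashes.get(doc["id"]) != digest:
                stored_hashes[doc["id"]] = digest
                changed.append(doc["id"])
        return changed, deferred
    finally:
        # Always release the lock, even if a page request raised.
        held_locks.discard(connector_id)
```

Calling sync_once a second time against unchanged data returns no changed IDs, which is the point of comparing hashes before doing any further work.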

