Alexandr Chibilyaev reveals how AACFlow always-on agents work: event listeners, polling intervals, webhook triggers, hosting infrastructure, health monitoring, and automatic failure recovery.
An agent that only works when you click "run" is a tool. An agent that operates continuously — watching for events, polling for changes, responding to triggers — is infrastructure.
This is the difference between "AI that helps when I ask" and "AI that handles things I don't even know about." The former is nice to have. The latter changes how a business operates.
AACFlow's always-on agents are the foundation of our 24/7 AI operations. They run continuously — on our cloud or on your infrastructure — executing workflows in response to real-world events, on schedules, and as background processes that never stop.
Diadoc sends a webhook when a counterparty signs a document → the document reconciliation agent runs
GitHub sends a webhook when a PR is opened → the code review agent analyzes the diff
Stripe / CloudPayments sends a webhook on payment success → the invoicing agent generates the invoice
The webhook system natively supports 15+ providers (Gmail, Outlook, GitHub, Slack, Discord, Telegram, and major CRM/e-commerce platforms). The platform automatically handles each provider's webhook registration and lifecycle management.
// Simplified webhook flow
Webhook received → Validate signature → Decode payload →
Match against registered workflows → Execute agent → Return response
Webhooks are the fastest trigger: latency from event to agent execution is typically under 2 seconds. They're also the most efficient, because the agent runs only when something actually happens instead of polling empty feeds.
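To make the flow concrete, here is a minimal Python sketch of the validate, decode, match, and dispatch steps. It is an illustration under assumptions, not AACFlow's actual implementation: the `WORKFLOWS` registry, the event names, the `handle_webhook` function, and the choice of an HMAC-SHA256 hex signature are all hypothetical, and real providers each define their own signature scheme.

```python
import hashlib
import hmac
import json

# Hypothetical registry: (provider, event type) -> workflows to run.
WORKFLOWS = {
    ("github", "pull_request.opened"): ["code_review_agent"],
    ("diadoc", "document.signed"): ["document_reconciliation_agent"],
}


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Compare an HMAC-SHA256 hex digest of the raw body in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def handle_webhook(provider: str, secret: bytes, body: bytes, signature_hex: str) -> dict:
    """Validate, decode, and match one webhook delivery."""
    # 1. Validate the signature before trusting anything in the body.
    if not verify_signature(secret, body, signature_hex):
        return {"status": 401, "executed": []}
    # 2. Decode the payload; malformed JSON or bad encoding is a client error.
    try:
        payload = json.loads(body)
    except ValueError:
        return {"status": 400, "executed": []}
    if not isinstance(payload, dict):
        return {"status": 400, "executed": []}
    # 3. Match against registered workflows (assumed "event" field).
    matched = WORKFLOWS.get((provider, payload.get("event", "")), [])
    # 4. A real system would enqueue the agent runs here; the sketch
    #    only reports which workflows matched.
    return {"status": 200, "executed": list(matched)}
```

Verifying the signature against the raw bytes, before parsing, keeps unauthenticated input away from the decoder and the workflow matcher.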
Some data sources don't support webhooks. Or the event you care about isn't an API event — it's a condition: "alert me if any deal has been stagnant for more than 7 days" or "check if inventory on any marketplace dropped below the safety threshold."
Polling intervals solve this. An agent is configured with a cron-like schedule:
Every 15 minutes: check inventory levels across all marketplaces
Every hour: scan all CRM deals for follow-up gaps
Every 6 hours: reconcile Chestny Znak codes against physical inventory
Every day at 9:00 AM: generate the competitive intelligence briefing
Every Monday at 8:00 AM: produce the weekly lost deal analysis report
The polling system is built on a distributed scheduler. Each workspace's polling agents are evaluated against their schedules. When a schedule fires, the agent workflow is queued for execution.
Polling is the most flexible trigger — any condition that can be checked via the knowledge base or an API call can become a scheduled agent. But it's also the most resource-intensive, since the agent runs on schedule regardless of whether there's anything to do. The agent itself should be smart about this: on each poll, check if there's actual work to perform, and exit early if not.
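The "exit early" advice can be sketched as follows. This is a simplified illustration: it uses fixed intervals rather than full cron expressions, and the `PollingAgent` class with its `find_work` and `act` callbacks is a hypothetical structure, not the platform's API. The idea is to keep the cheap check separate from the expensive action so an empty poll costs almost nothing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class PollingAgent:
    name: str
    interval: timedelta
    find_work: Callable[[], list]      # cheap check: items that need action
    act: Callable[[list], None]        # expensive step (LLM calls, alerts)
    last_run: Optional[datetime] = None

    def is_due(self, now: datetime) -> bool:
        """True on the first poll or once a full interval has elapsed."""
        return self.last_run is None or now - self.last_run >= self.interval

    def poll(self, now: datetime) -> str:
        if not self.is_due(now):
            return "not_due"
        self.last_run = now
        work = self.find_work()
        if not work:
            return "no_work"           # exit early: nothing to do this cycle
        self.act(work)
        return "acted"


# Usage: flag SKUs below a safety threshold every 15 minutes.
stock = {"sku-1": 3, "sku-2": 50}
alerts: list = []
agent = PollingAgent(
    name="inventory",
    interval=timedelta(minutes=15),
    find_work=lambda: [sku for sku, qty in stock.items() if qty < 10],
    act=alerts.extend,
)
```

With this split, only the `act` step would incur token costs, and it runs only when `find_work` returns something.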
Always-on agents can also be triggered manually — by a user clicking "run" in the dashboard, by another agent via the A2A protocol, or by an external system calling the AACFlow API.
This hybrid model is important. An agent that primarily runs on webhooks might need manual triggering for backfilling historical data. An agent that runs on a schedule might need ad-hoc execution when a special situation arises. The infrastructure supports all three trigger modes (webhook, schedule, and manual) simultaneously on the same agent.
An always-on agent isn't a script running on someone's laptop. It's a hosted service with defined infrastructure requirements:
AACFlow Cloud — agents run on our infrastructure. No setup required. The agent is always available, automatically scaled, and monitored by our operations team. This is the default for most users.
Self-Hosted (Docker Compose) — agents run on the user's own infrastructure using our Docker Compose deployment. The same webhook receiver, polling scheduler, and execution engine run inside the customer's Docker containers. Deployable via Coolify in under 20 minutes.
Private Cloud — for enterprise customers, agents run in a dedicated single-tenant VPC. Fully isolated infrastructure with customer-managed networking, encryption keys, and access controls. Suitable for regulated industries handling compliance-sensitive data.
The architecture is identical across all three deployment models. An agent built for AACFlow Cloud runs identically on a self-hosted server or in a private cloud. No code changes. No "enterprise version." Just the same engine, deployed where the customer needs it.
Always-on capability is a premium feature priced at $20/month per always-on agent (in addition to the base plan). This covers:
Infrastructure — compute, memory, and storage for continuous agent operation
Webhook ingestion — receiving, validating, and processing webhook events at any volume
Polling scheduler — distributed cron execution with sub-second precision
Health monitoring — automatic detection and recovery from agent failures
Execution logs — full observability into every always-on agent run
Token costs for LLM calls during agent execution are separate — billed through AACFlow credits or the user's own API keys (BYOK). The $20/month is purely for the always-on infrastructure, not the AI compute.
For comparison: a virtual assistant or part-time operations person handling the same monitoring and response work would cost 50-100x more. The economics of always-on agents are transformative for businesses that operate continuously — which is to say, all of them.
An always-on agent that silently fails is worse than no agent at all — because you're relying on it without knowing it's broken. Our health monitoring system prevents this:
Heartbeat checks. Every always-on agent emits a heartbeat at a configurable interval. If the heartbeat stops, the monitoring system escalates.
Execution monitoring. Every agent execution is tracked: start time, end time, exit status. If an agent execution exceeds its expected duration, it's flagged. If it exits with an error, it's logged with full context.
Automatic restart. If an agent process crashes, the system attempts to restart it automatically — up to a configurable maximum number of restarts per hour. After the threshold, human intervention is requested.
Alerting. When an agent is unhealthy for more than 5 minutes, alerts are sent: in-app notification, email, and (if configured) Telegram or Slack messages to the workspace admins.
Degradation, not failure. If an always-on agent is unhealthy, the platform continues operating. Other agents are unaffected. Manual triggers still work. Only the specific failing agent is impacted. This isolation is critical for production reliability.
The health monitoring system is itself monitored. A meta-monitor checks that the monitoring system is running. If the monitor's monitor fails... well, at some point you need a human. But we've built enough redundancy that the human is rarely needed.
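The heartbeat check and the restart threshold described above can be sketched in a few lines. This is an assumption-laden illustration, not the platform's monitoring code: the `AgentHealth` class, its field names, and the one-hour sliding window are hypothetical choices for how "a configurable maximum number of restarts per hour" might be tracked.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentHealth:
    heartbeat_timeout: float       # seconds of silence before "unhealthy"
    max_restarts_per_hour: int
    last_heartbeat: float = 0.0
    restarts: deque = field(default_factory=deque)  # recent restart timestamps

    def record_heartbeat(self, now: float) -> None:
        self.last_heartbeat = now

    def is_healthy(self, now: float) -> bool:
        """Healthy while the last heartbeat is within the timeout."""
        return now - self.last_heartbeat <= self.heartbeat_timeout

    def on_crash(self, now: float) -> str:
        """Decide whether to restart automatically or ask for a human."""
        # Forget restarts that have left the one-hour sliding window.
        while self.restarts and now - self.restarts[0] >= 3600:
            self.restarts.popleft()
        if len(self.restarts) >= self.max_restarts_per_hour:
            return "escalate"      # restart budget exhausted
        self.restarts.append(now)
        return "restart"
```

Timestamps are passed in rather than read from the clock, which keeps the logic easy to test. A sliding window also avoids the burst of restarts that a fixed hourly counter would allow right after it resets.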
Always-on agents consume resources — compute, memory, database connections, LLM tokens. The platform manages this through several mechanisms:
Per-workspace concurrency limits. A workspace can run up to N always-on agents simultaneously (configurable by plan). This prevents a single workspace from overwhelming shared infrastructure.
Execution queuing. If multiple triggers fire simultaneously for the same agent, executions are queued and processed sequentially. This prevents race conditions and resource spikes.
Rate limit awareness. Agents that call external APIs inherit the connector's rate limiting configuration. If the connector for a government API is capped at 0.5 req/s, an always-on agent polling that API won't accidentally hammer it with concurrent requests.
Warm vs. cold starts. Frequently-triggered agents maintain warm execution contexts for low latency. Rarely-triggered agents cold-start when needed — higher latency but zero resource consumption when idle.
Multi-tenant isolation. On AACFlow Cloud, always-on agents from different workspaces are isolated at the process level. A runaway agent from Workspace A cannot impact Workspace B.
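Two of the mechanisms above, per-workspace concurrency limits and sequential execution per agent, compose naturally. The sketch below shows one way to express them with `asyncio` primitives. The `WorkspaceExecutor` class is hypothetical and runs in a single process; a distributed platform would need a shared queue instead.

```python
import asyncio
from collections import defaultdict


class WorkspaceExecutor:
    """Create inside a running event loop (e.g. within an async function)."""

    def __init__(self, max_concurrent_agents: int):
        # Workspace-wide cap on simultaneously running agents.
        self._slots = asyncio.Semaphore(max_concurrent_agents)
        # One lock per agent: overlapping triggers for the same agent wait
        # their turn. asyncio.Lock wakes waiters in the order they arrived.
        self._agent_locks = defaultdict(asyncio.Lock)

    async def run(self, agent_id: str, job):
        """Run `job` (a zero-argument coroutine function) for one agent."""
        async with self._agent_locks[agent_id]:
            async with self._slots:
                return await job()
```

Taking the per-agent lock before the workspace slot means queued runs of one busy agent do not hold slots that other agents could use.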
What it does: queries the knowledge base for inventory levels across Wildberries, Ozon, and Avito. Flags any product below safety stock on any marketplace. Checks for orders older than 24 hours that haven't shipped. Verifies that yesterday's Chestny Znak sales were reported.
Impact: reduced overselling incidents by 92%, eliminated missed Chestny Znak reporting deadlines.
Trigger: webhook from Diadoc when a new contract is received
What it does: queries FNS EGRUL for the counterparty's registration status, checks FNS for tax debts, queries FedResurs for bankruptcy filings, checks the FSSP database for enforcement proceedings. Posts a risk summary to the deal in Bitrix24. Flags high-risk counterparties for manual review.
Impact: caught 3 fraudulent counterparties in the first month, prevented an estimated $40,000 in bad contracts.
What it does: scans all active deals in AmoCRM. Identifies deals with no activity in 7+ days. For each, drafts a personalized follow-up message based on deal context and posts it as a deal comment with @mention to the responsible manager. If no action within 24 hours, escalates to team lead.
Impact: follow-up consistency improved from 62% to 90%. Recovered an average of 3 "forgotten" deals per week that resulted in closed business.
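The core check behind this agent is simple enough to sketch. The `Deal` record and both function names below are hypothetical, and a real agent would read deals from the CRM rather than an in-memory list. The 7-day and 24-hour thresholds come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Deal:
    id: int
    manager: str
    last_activity: datetime


def find_stale_deals(deals, now, max_idle=timedelta(days=7)):
    """Return deals with no recorded activity for at least `max_idle`."""
    return [d for d in deals if now - d.last_activity >= max_idle]


def needs_escalation(reminder_posted_at, last_activity, now,
                     grace=timedelta(hours=24)):
    """Escalate if the manager has not acted since the reminder was posted
    and the grace period has elapsed."""
    no_action_since = last_activity <= reminder_posted_at
    return no_action_since and now - reminder_posted_at >= grace
```

Drafting the personalized follow-up is the expensive, LLM-backed part, so it makes sense to run it only for deals that `find_stale_deals` returns.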
What it does: checks the reporting calendar (maintained in the knowledge base from Kontur.Extern and regulatory sources). Identifies all tax declarations, statistical reports, and compliance filings due this week. Generates a prioritized task list with deadlines, required documents, and submission instructions. Posts to the accounting team's Telegram channel.
Impact: zero missed filing deadlines since deployment. The accounting team lead described it as "having a compliance officer who never takes a day off."
The best always-on agent is one you forget exists — until you realize it's been handling critical operations flawlessly for months.
A salesperson who always gets follow-up reminders on time doesn't think about the agent. They just think "the CRM is really good at reminding me." An operations manager who never sees an overselling incident doesn't credit the inventory sync agent. They just think "our inventory system works well."
This is the goal. Always-on agents should fade into the background — reliable, silent, always working. Like electricity. You only notice it when it stops.
Identify a repetitive monitoring task — something a human checks regularly: "are there any overdue invoices?", "did any deals go stale?", "are inventory levels okay?"
Build the agent workflow — define the check logic, the action to take when a condition is met, and the notification channel
Configure the trigger — webhook (fastest, most efficient) or polling interval (most flexible)
Test in manual mode — run the agent manually a few times, verify it works correctly
Switch to always-on — enable the agent as always-on, configure the health monitoring, and let it run
Start with one agent. Observe it for a week. Build trust. Then expand. Before long, you'll have a fleet of always-on agents handling the monitoring, alerting, and responding that used to consume hours of human attention — freeing your team to focus on work that actually requires human judgment.