Architecture
System architecture, component layout, database schema, NATS stream topology, and Redis caching strategy.
This document describes the high-level system architecture, component integration, database schema, caching strategy, and queue topology of the Meta Business MCP platform.
High-Level Architecture
The system is built as a single Go binary that operates in two modes:
| Mode | Transport | Purpose |
|---|---|---|
| MCP Server | stdio (stdin/stdout) | Exposes 24 structured tools to AI clients (Gemini, Claude, Cursor) |
| HTTP Server | TCP :8080 | Webhook receiver, health checks, Prometheus metrics |
             ┌───────────────────────────────┐
             │           AI Agent            │
             └──────────────┬────────────────┘
                            │ Stdio (MCP Protocol)
                            ▼
┌────────────────────────────────────────────────────────┐
│                   Meta Business MCP                    │
│                                                        │
│   ┌───────────────────┐        ┌───────────────────┐   │
│   │    MCP Server     ├───────►│ Compliance Engine │   │
│   └───────────────────┘        └─────────┬─────────┘   │
│                                          ▼             │
│   ┌───────────────────┐        ┌───────────────────┐   │
│   │ Webhook Receiver  │        │   Policy Engine   │   │
│   └─────────▲─────────┘        └─────────┬─────────┘   │
│             │                            ▼             │
│             │                  ┌───────────────────┐   │
│             │                  │   Orchestrator    │   │
│             │                  └─────────┬─────────┘   │
└─────────────┼────────────────────────────┼─────────────┘
              │ Webhook                    │ POST /messages
              │                            ▼
┌─────────────┴────────────────────────────┴─────────────┐
│             Meta Cloud API (or Mock :8081)             │
└────────────────────────────────────────────────────────┘
Component Breakdown
| Service | Package | Role |
|---|---|---|
| MCP Server | pkg/mcp/ | Registers 24 MCP tools and handles their calls from AI agents. Split into 6 handler files (~2,000 lines). |
| Compliance Engine | pkg/compliance/ | Enforces the 24h care window, opt-out/opt-in state, and frequency caps. |
| Policy Engine | pkg/policy/ | Evaluates dynamic business rules (time limits, tag exclusions) from the DB. |
| Conversation State Engine | pkg/state/ | Manages care window TTL with Redis caching and PostgreSQL fallback. |
| User Intelligence | pkg/userintel/ | Tracks opt-ins, tags (e.g. vip), and interaction timelines. |
| Template Manager | pkg/template/ | Syncs approved templates from Meta, validates variables, caches locally. |
| Delivery Orchestrator | pkg/delivery/ | Enqueues messages to NATS JetStream; manages the background worker pool. |
| Scheduler | pkg/delivery/ | In-process poll-based scheduler for scheduled messages and campaigns. |
| Rate Limiter | pkg/ratelimit/ | Per-customer token bucket rate limits via Redis Lua scripts. |
| Error Intelligence | pkg/errorintel/ | Maps Meta error codes to retry/no-retry classifications and explanations. |
| Campaign Module | pkg/campaign/ | Campaign schema, audience filters, and tier-gated operations. |
| Webhook Receiver | pkg/webhook/ | Parses and dispatches inbound Meta events (messages, status, templates). |
| Observability | pkg/observability/ | Custom Prometheus metrics. |
Caching Strategy
The system uses Redis to cache conversation states to achieve sub-millisecond latencies for compliance checks:
- Cache Key: conv:<customer_id>:<channel> (e.g. conv:628119989630:whatsapp)
- TTL: Dynamically set to match the exact remaining duration of the 24-hour care window.
- Graceful Fallback: If Redis is offline, the state engine automatically queries PostgreSQL directly instead of returning an error to the caller.
Phone number format: The cache and the database store phone numbers without the leading + prefix. Inputs that arrive with a + prefix are normalized at the MCP tool boundary.
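A normalization step of this kind might look like the sketch below. `NormalizePhone` is a hypothetical helper; stripping the + follows the text above, while the whitespace trimming and the digits-only validation are assumptions added for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// NormalizePhone trims surrounding whitespace, removes one leading "+",
// and rejects any value that is not purely ASCII digits afterwards.
// The validation rule is an assumption, not documented platform behavior.
func NormalizePhone(input string) (string, error) {
	s := strings.TrimPrefix(strings.TrimSpace(input), "+")
	if s == "" {
		return "", fmt.Errorf("empty phone number")
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return "", fmt.Errorf("invalid character %q in phone number", r)
		}
	}
	return s, nil
}

func main() {
	// Both spellings normalize to the same stored form.
	for _, in := range []string{"+628119989630", "628119989630"} {
		n, err := NormalizePhone(in)
		fmt.Println(n, err)
	}
}
```

Normalizing once at the boundary keeps cache keys and database rows consistent, so the same customer never maps to two different conv: keys.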
Queue Topology
The platform uses NATS JetStream streams configured with a WorkQueue retention policy (messages are removed from NATS once acknowledged by workers).
META_MCP_DELIVERY Stream
| Property | Value |
|---|---|
| Subjects | whatsapp.messages.outbound, whatsapp.messages.retry |
| Consumer | Durable Pull Consumer (delivery-workers) |
| Max Deliver | 3 attempts |
| Ack Wait | 30s |
| Backoff Delay | Native exponential backoff (NakWithDelay at 1s, then 5s) |
META_MCP_CAMPAIGN Stream
| Property | Value |
|---|---|
| Subjects | whatsapp.campaigns.trigger |
| Consumer | campaign-workers |
Database Schema (PostgreSQL)
The schema is channel-agnostic. WhatsApp is the active channel. Messenger and Instagram are activation targets, not roadmap commitments.
Persistence Layers
| Layer | Technology | Usage |
|---|---|---|
| Primary DB | PostgreSQL 16 (pgxpool) | All persistent data: conversations, messages, customers, templates, policies, campaigns, audit logs. |
| Cache | Redis 7 | Care window TTL state; token-bucket counters via Lua scripts; graceful PG fallback. |
| Message Broker | NATS JetStream | Async delivery queue with WorkQueue retention and pull consumers. |