Architecture Overview

High-level overview of the Vows Social AI system architecture.

For complete technical details, see docs/ARCHITECTURE.md.

System Overview

graph TB
    User[👤 User] --> Mobile[📱 Mobile App]
    User --> Web[🌐 Web App]

    Mobile --> API[⚡ Cloudflare Workers API]
    Web --> API

    API --> Orchestrator[🎯 Orchestrator DO]
    Orchestrator --> ML[🧠 ML Inference<br/>Fly.io]
    Orchestrator --> Qdrant[📊 Qdrant<br/>Vector DB]
    Orchestrator --> Supabase[💾 Supabase<br/>PostgreSQL]

    Instagram[📷 Instagram] --> Discovery[🔍 Discovery Agent]
    Discovery --> Qdrant
    Discovery --> Supabase

Core Components

1. Foundation Model (The Brain)

A custom transformer that learns user preferences from interaction sequences.

Architecture: 384 → 128 → 384 dimensions
- Input: Sentence-BERT embeddings (pre-trained)
- Output: 128-dimensional user representation
- Training: User interaction sequences

Purpose: Understand each user's unique wedding vision
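
To make the 384 → 128 → 384 shape concrete, here is a minimal sketch in plain Python. It is not the transformer itself: mean pooling stands in for attention, the weights are random rather than trained, and the helper names (`encode_user`, `project_to_content_space`) are hypothetical. It also assumes the final 384-dimensional stage maps the user representation back into the content embedding space, which this overview does not state.

```python
import random

CONTENT_DIM = 384  # Sentence-BERT embedding size, from the text
USER_DIM = 128     # user representation size, from the text


def make_linear(in_dim, out_dim, rng):
    """Random (out_dim x in_dim) matrix; a stand-in for trained weights."""
    scale = in_dim ** -0.5
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def mean_pool(sequence):
    """Average a sequence of equal-length vectors (placeholder for attention)."""
    n = len(sequence)
    return [sum(column) / n for column in zip(*sequence)]


def encode_user(interactions, w_down):
    """Interaction sequence (n x 384) -> 128-dim user representation."""
    return apply_linear(w_down, mean_pool(interactions))


def project_to_content_space(user_vec, w_up):
    """128-dim user representation -> 384-dim vector (assumed purpose)."""
    return apply_linear(w_up, user_vec)


rng = random.Random(0)
w_down = make_linear(CONTENT_DIM, USER_DIM, rng)
w_up = make_linear(USER_DIM, CONTENT_DIM, rng)

# Five fake interactions, each a 384-dim content embedding.
interactions = [[rng.gauss(0.0, 1.0) for _ in range(CONTENT_DIM)] for _ in range(5)]
user_vec = encode_user(interactions, w_down)
content_query = project_to_content_space(user_vec, w_up)
print(len(user_vec), len(content_query))  # 128 384
```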

See: Foundation Model Deep Dive

2. Orchestrator (The Coordinator)

Cloudflare Durable Object that manages per-user feed ranking.

Responsibilities:
- Fetch user embedding from ML service
- Query Qdrant for candidate content
- Apply Thompson Sampling for ranking
- Track interactions and update rewards

Performance target: <500ms p99 latency (current value TBD)
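
The responsibilities above can be sketched as a small control flow. The real Orchestrator is a Cloudflare Durable Object, so this Python version only illustrates the order of the steps. The function names, the callable-based wiring, the `feed_size` of 20, and the Beta(1, 1) starting parameters are all assumptions; the candidate count of 100 comes from the feed generation diagram.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
BetaParams = Tuple[float, float]  # (alpha, beta)


def build_feed(
    user_id: str,
    get_user_embedding: Callable[[str], Vector],
    search_candidates: Callable[[Vector, int], List[str]],
    rank: Callable[[List[str]], List[str]],
    candidate_count: int = 100,
    feed_size: int = 20,
) -> List[str]:
    """Run the orchestration steps in order and return ranked content ids."""
    embedding = get_user_embedding(user_id)                      # ML service
    candidates = search_candidates(embedding, candidate_count)  # Qdrant
    return rank(candidates)[:feed_size]                          # ranking step


def record_interaction(params: Dict[str, BetaParams], content_id: str, liked: bool) -> None:
    """Update the stored reward parameters after a user interaction."""
    alpha, beta = params.get(content_id, (1.0, 1.0))  # assumed uniform prior
    params[content_id] = (alpha + 1.0, beta) if liked else (alpha, beta + 1.0)


# Stand-ins for the real services, for illustration only.
def fake_embedding(user_id: str) -> Vector:
    return [0.0] * 128


def fake_search(vector: Vector, k: int) -> List[str]:
    return [f"post-{i}" for i in range(k)]


def reverse_rank(ids: List[str]) -> List[str]:
    return list(reversed(ids))  # placeholder for Thompson Sampling


feed = build_feed("user-1", fake_embedding, fake_search, reverse_rank,
                  candidate_count=5, feed_size=3)
print(feed)  # ['post-4', 'post-3', 'post-2']

params: Dict[str, BetaParams] = {}
record_interaction(params, "post-4", liked=True)
print(params)  # {'post-4': (2.0, 1.0)}
```

Passing the services in as callables keeps the flow testable without a network; in production each would be a call to the ML service, Qdrant, or the ranking logic.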

See: Orchestrator Documentation

3. Multi-Agent Crew

Six specialized agents working together:

| Agent | Purpose | Technology |
| --- | --- | --- |
| Discovery | Find content from Instagram | Instagram API |
| Quality Guardian | Filter low-quality posts | Gemini scoring |
| Personal Archivist | Remember user preferences | Cloudflare KV |
| Serendipity Engine | Diversify feed | Diversity metrics |
| Engagement Forecaster | Predict user actions | Foundation model |
| Orchestrator | Coordinate everything | MAGRPO |

See: Multi-Agent System

4. Thompson Sampling (Exploration/Exploitation)

A Bayesian bandit algorithm that balances:
- Exploitation: Show content you'll probably love
- Exploration: Discover new styles and ideas

How it works:
- Each content item has alpha/beta parameters (Beta distribution)
- Sample reward estimate for each candidate
- Rank by sampled rewards
- Update parameters based on user interactions
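
The steps above can be sketched with Python's standard library. This is a minimal illustration under assumptions, not the project's implementation: it treats each interaction as a binary reward (liked or not), starts unseen items at Beta(1, 1), and uses hypothetical names (`thompson_rank`, `update`).

```python
import random
from typing import Dict, List, Tuple

BetaParams = Tuple[float, float]  # (alpha, beta)
PRIOR: BetaParams = (1.0, 1.0)    # uniform prior for unseen items (assumption)


def thompson_rank(candidates: List[str], params: Dict[str, BetaParams],
                  rng: random.Random) -> List[str]:
    """Sample one reward estimate per candidate and sort by it, highest first."""
    sampled = {}
    for content_id in candidates:
        alpha, beta = params.get(content_id, PRIOR)
        sampled[content_id] = rng.betavariate(alpha, beta)
    return sorted(candidates, key=sampled.__getitem__, reverse=True)


def update(params: Dict[str, BetaParams], content_id: str, reward: bool) -> None:
    """Positive interactions increment alpha; negative ones increment beta."""
    alpha, beta = params.get(content_id, PRIOR)
    params[content_id] = (alpha + 1.0, beta) if reward else (alpha, beta + 1.0)


params: Dict[str, BetaParams] = {
    "a": (50.0, 2.0),  # mostly positive history
    "b": (2.0, 50.0),  # mostly negative history
}
ranking = thompson_rank(["a", "b", "c"], params, random.Random(42))
print(ranking)

update(params, "c", reward=True)
print(params["c"])  # (2.0, 1.0)
```

Item `c` has no history, so its sampled value is uniform on [0, 1] and it can land anywhere in the ranking. That randomness is the exploration half of the trade-off; items with a long positive history, such as `a`, are sampled close to their observed success rate.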

Benefits:
- Adapts quickly to each user
- Keeps exploring, which reduces the risk of getting stuck in a filter bubble
- Proven at scale (Instagram, Pinterest, TikTok)

See: Thompson Sampling Details

5. Data Layer

Three databases working together:

Qdrant (Vector DB)
- Content embeddings (384-dim)
- User embeddings (128-dim)
- Semantic similarity search
- 1GB free tier

Supabase (PostgreSQL)
- User profiles and metadata
- Interaction history
- Content metadata
- 500MB free tier

Cloudflare KV
- Thompson Sampling parameters (alpha/beta)
- Agent memory and state
- Cache for hot data
- Free tier (subject to Cloudflare's read, write, and storage limits)

See: Database Schema

Data Flow

Feed Generation Request

sequenceDiagram
    participant User
    participant API
    participant Orchestrator
    participant ML
    participant Qdrant

    User->>API: GET /api/feed
    API->>Orchestrator: Request feed
    Orchestrator->>ML: Get user embedding
    ML->>Orchestrator: 128-dim vector
    Orchestrator->>Qdrant: Search similar content
    Qdrant->>Orchestrator: 100 candidates
    Orchestrator->>Orchestrator: Thompson Sampling rank
    Orchestrator->>API: Ranked feed
    API->>User: Personalized content

See: Complete Data Flow

Content Discovery Pipeline

graph LR
    A[Instagram Curators] --> B[Discovery Agent]
    B --> C{Quality > 0.7?}
    C -->|Yes| D[Qdrant + Supabase]
    C -->|No| E[Discard]
    D --> F[Available in Feed]
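
The quality gate in the diagram can be sketched as a simple filter. The `Post` type, the `filter_by_quality` helper, and the example scores are hypothetical; in the real pipeline the score would come from the Quality Guardian. The comparison is strict, matching the "Quality > 0.7?" condition.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

QUALITY_THRESHOLD = 0.7  # from the diagram: keep posts with quality > 0.7


@dataclass
class Post:
    post_id: str
    caption: str


def filter_by_quality(posts: Iterable[Post], score: Callable[[Post], float],
                      threshold: float = QUALITY_THRESHOLD) -> List[Post]:
    """Keep only posts whose score is strictly above the threshold."""
    return [post for post in posts if score(post) > threshold]


# Made-up scores standing in for the real quality scorer.
fake_scores = {"p1": 0.91, "p2": 0.70, "p3": 0.42}
posts = [
    Post("p1", "Garden ceremony at sunset"),
    Post("p2", "Table setting"),
    Post("p3", "Blurry reception photo"),
]

kept = filter_by_quality(posts, lambda post: fake_scores[post.post_id])
print([post.post_id for post in kept])  # ['p1']
```

A post scoring exactly 0.70 is discarded here because the condition is "greater than", not "at least".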

Technology Stack

Compute Layer

Cloudflare Workers - Serverless API endpoints
Cloudflare Durable Objects - Stateful per-user orchestration
Fly.io - ML inference service (PyTorch)

ML/AI Layer

PyTorch - Foundation model training
Sentence-BERT - Content embeddings
Qdrant - Vector similarity search
Gemini - Quality scoring

Data Layer

Supabase - PostgreSQL for structured data
Cloudflare KV - Thompson Sampling state
Cloudflare R2 - Model checkpoints

Why These Choices?

See ADRs:
- ADR-0001: AI-First Architecture
- ADR-0002: Free Tier First
- ADR-0003: Cloudflare Native Stack

Scalability Strategy

Phase 1: Free Tier (0-1K users)

Sentence-BERT embeddings (pre-trained)
Simple user representations
Thompson Sampling only
Manual Instagram monitoring

Phase 2: Growth (1K-100K users)

Custom foundation model
Full multi-agent system
MAGRPO coordination
Automated discovery

Phase 3: Scale (100K+ users)

Workers AI for edge inference
ENR for Thompson Sampling
Multiple content sources
Real-time training

Performance Targets

| Metric | Target | Current |
| --- | --- | --- |
| Feed latency (p99) | <500ms | TBD |
| Discovery throughput | 1K posts/hour | TBD |
| Embedding generation | <100ms | TBD |
| Model training time | <2 hours | TBD |
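
Once measurements exist, the p99 latency target could be checked with a sketch like the following. It uses the nearest-rank definition of a percentile, which is one common convention and an assumption here; the `percentile` helper and the sample values are illustrative, not measured data.

```python
import math
from typing import List, Sequence

FEED_LATENCY_TARGET_MS = 500  # from the targets table


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Synthetic latencies of 1..100 ms, for illustration only.
latencies_ms: List[float] = list(range(1, 101))
p99 = percentile(latencies_ms, 99)
print(p99)                           # 99
print(p99 < FEED_LATENCY_TARGET_MS)  # True
```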

Security & Privacy

User data: Encrypted at rest (Supabase)
API keys: Cloudflare Secrets
Embeddings: No PII stored
Interactions: Anonymizable for training

Monitoring

Cloudflare Analytics - Request metrics
Supabase Logs - Database queries
Fly.io Metrics - ML service performance
Custom: Thompson Sampling regret, model drift

Next Steps

Deep Dive: System Architecture Diagrams
Implementation: Phase 1 Plan
Components: Component Documentation
API: API Reference

Questions? See the complete Technical Architecture or check the ADRs.