Architecture Overview

High-level overview of the Vows Social AI system architecture.

For complete technical details, see docs/ARCHITECTURE.md.

System Overview

graph TB
    User[👤 User] --> Mobile[📱 Mobile App]
    User --> Web[🌐 Web App]

    Mobile --> API[⚡ Cloudflare Workers API]
    Web --> API

    API --> Orchestrator[🎯 Orchestrator DO]
    Orchestrator --> ML[🧠 ML Inference<br/>Fly.io]
    Orchestrator --> Qdrant[📊 Qdrant<br/>Vector DB]
    Orchestrator --> Supabase[💾 Supabase<br/>PostgreSQL]

    Instagram[📷 Instagram] --> Discovery[🔍 Discovery Agent]
    Discovery --> Qdrant
    Discovery --> Supabase

Core Components

1. Foundation Model (The Brain)

A custom transformer that learns user preferences from interaction sequences.

Architecture: 384 → 128 → 384 dimensions

  • Input: Sentence-BERT embeddings (pre-trained)
  • Output: 128-dimensional user representation
  • Training: User interaction sequences
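The dimension flow above can be written as a shape contract. This is only a sketch: the symbols $f_\theta$, $g_\phi$ and the sequence length $T$ are hypothetical names, and reading the final 384 as a projection back into the content embedding space is an assumption rather than something this page confirms.

```latex
\begin{aligned}
x_1, \dots, x_T &\in \mathbb{R}^{384}
  && \text{Sentence-BERT embeddings of the content a user interacted with} \\
u = f_\theta(x_1, \dots, x_T) &\in \mathbb{R}^{128}
  && \text{user representation (the 128-dimensional output)} \\
\hat{x} = g_\phi(u) &\in \mathbb{R}^{384}
  && \text{assumed: projection back into the content embedding space}
\end{aligned}
```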

Purpose: Understand each user's unique wedding vision

See: Foundation Model Deep Dive

2. Orchestrator (The Coordinator)

Cloudflare Durable Object that manages per-user feed ranking.

Responsibilities:

  • Fetch user embedding from ML service
  • Query Qdrant for candidate content
  • Apply Thompson Sampling for ranking
  • Track interactions and update rewards

Performance target: <500ms p99 latency

See: Orchestrator Documentation

3. Multi-Agent Crew

Six specialized agents working together:

| Agent | Purpose | Technology |
| --- | --- | --- |
| Discovery | Find content from Instagram | Instagram API |
| Quality Guardian | Filter low-quality posts | Gemini scoring |
| Personal Archivist | Remember user preferences | Cloudflare KV |
| Serendipity Engine | Diversify feed | Diversity metrics |
| Engagement Forecaster | Predict user actions | Foundation model |
| Orchestrator | Coordinate everything | MAGRPO |

See: Multi-Agent System

4. Thompson Sampling (Exploration/Exploitation)

Bayesian bandit algorithm that balances:

  • Exploitation: Show content you'll probably love
  • Exploration: Discover new styles and ideas

How it works:

  • Each content item has alpha/beta parameters (Beta distribution)
  • Sample reward estimate for each candidate
  • Rank by sampled rewards
  • Update parameters based on user interactions
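The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the `BetaArm` class, the `rank_candidates` function, the uniform Beta(1, 1) prior, and the binary liked/skipped reward are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class BetaArm:
    """Beta(alpha, beta) belief about how often one content item is liked."""

    alpha: float = 1.0  # assumed prior: Beta(1, 1), i.e. uniform
    beta: float = 1.0

    def sample(self, rng: random.Random) -> float:
        """Draw one plausible reward estimate from the current belief."""
        return rng.betavariate(self.alpha, self.beta)

    def update(self, liked: bool) -> None:
        """Binary reward update: a like raises alpha, a skip raises beta."""
        if liked:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def rank_candidates(arms: dict[str, BetaArm], rng: random.Random) -> list[str]:
    """Sample every arm once and return item ids, highest sample first."""
    samples = {item_id: arm.sample(rng) for item_id, arm in arms.items()}
    return sorted(samples, key=samples.__getitem__, reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical content ids, used only for illustration.
    arms = {"rustic-barn": BetaArm(), "beach-sunset": BetaArm()}
    arms["rustic-barn"].update(liked=True)
    print(rank_candidates(arms, rng))
```

Because the ranking uses a random draw rather than the mean of each distribution, items with little data still surface occasionally, which is where the exploration comes from.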

Benefits:

  • Adapts quickly to each user
  • Keeps exploring, which reduces the risk of a filter bubble
  • Proven at scale (Instagram, Pinterest, TikTok)

See: Thompson Sampling Details

5. Data Layer

Three databases working together:

Qdrant (Vector DB)

  • Content embeddings (384-dim)
  • User embeddings (128-dim)
  • Semantic similarity search
  • 1GB free tier

Supabase (PostgreSQL)

  • User profiles and metadata
  • Interaction history
  • Content metadata
  • 500MB free tier

Cloudflare KV

  • Thompson Sampling parameters (alpha/beta)
  • Agent memory and state
  • Cache for hot data
  • Free tier, subject to daily read/write quotas

See: Database Schema
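To make the "semantic similarity search" item under Qdrant concrete, here is a brute-force sketch of what a vector database computes. It assumes cosine similarity (one of the distance metrics Qdrant supports; this page does not say which one the system uses) and does not call the Qdrant client. The function names are hypothetical. Note that the query vector and the stored vectors must share a dimension, so the sketch rejects mismatches.

```python
import math
from typing import Mapping, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query: Vector, items: Mapping[str, Vector], k: int) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    scored = sorted(
        items.items(),
        key=lambda pair: cosine_similarity(query, pair[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in scored[:k]]
```

A real vector database replaces the full scan with an approximate index, but the ranking criterion is the same.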

Data Flow

Feed Generation Request

sequenceDiagram
    participant User
    participant API
    participant Orchestrator
    participant ML
    participant Qdrant

    User->>API: GET /api/feed
    API->>Orchestrator: Request feed
    Orchestrator->>ML: Get user embedding
    ML->>Orchestrator: 128-dim vector
    Orchestrator->>Qdrant: Search similar content
    Qdrant->>Orchestrator: 100 candidates
    Orchestrator->>Orchestrator: Thompson Sampling rank
    Orchestrator->>API: Ranked feed
    API->>User: Personalized content
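
The request flow in the diagram can be sketched as one function with its dependencies injected. This is a language-neutral illustration in Python, not the Orchestrator's actual code (which runs as a Cloudflare Durable Object). The function name, the three callables, and the `feed_size` default of 20 are assumptions; the 100-candidate default comes from the diagram.

```python
import random
from typing import Callable, Sequence

Vector = Sequence[float]


def generate_feed(
    user_id: str,
    get_user_embedding: Callable[[str], Vector],           # stands in for the ML service
    search_candidates: Callable[[Vector, int], list[str]],  # stands in for Qdrant
    get_beta_params: Callable[[str], tuple[float, float]],  # stands in for the alpha/beta store
    rng: random.Random,
    n_candidates: int = 100,
    feed_size: int = 20,  # assumed page size, not from the source
) -> list[str]:
    """Embed the user, fetch candidates, then rank them by Thompson Sampling."""
    user_vector = get_user_embedding(user_id)
    candidate_ids = search_candidates(user_vector, n_candidates)

    sampled = []
    for content_id in candidate_ids:
        alpha, beta = get_beta_params(content_id)
        # One draw per candidate from its Beta(alpha, beta) belief.
        sampled.append((rng.betavariate(alpha, beta), content_id))

    sampled.sort(reverse=True)  # highest sampled reward first
    return [content_id for _, content_id in sampled[:feed_size]]
```

Passing the services in as callables keeps the ranking logic testable without a network, which also makes it straightforward to time each step against the latency target.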

See: Complete Data Flow

Content Discovery Pipeline

graph LR
    A[Instagram Curators] --> B[Discovery Agent]
    B --> C{Quality > 0.7?}
    C -->|Yes| D[Qdrant + Supabase]
    C -->|No| E[Discard]
    D --> F[Available in Feed]
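
The quality gate in the diagram amounts to a threshold filter. The sketch below assumes a hypothetical `score_fn` that returns a quality score between 0 and 1 for a post (standing in for the Gemini scoring step) and keeps only posts scoring strictly above 0.7, as the diagram's `Quality > 0.7?` condition reads.

```python
from typing import Callable, Iterable, TypeVar

Post = TypeVar("Post")

QUALITY_THRESHOLD = 0.7  # from the diagram


def split_by_quality(
    posts: Iterable[Post],
    score_fn: Callable[[Post], float],  # hypothetical scorer, e.g. a Gemini call
    threshold: float = QUALITY_THRESHOLD,
) -> tuple[list[Post], list[Post]]:
    """Return (kept, discarded); a post is kept only if its score > threshold."""
    kept: list[Post] = []
    discarded: list[Post] = []
    for post in posts:
        if score_fn(post) > threshold:
            kept.append(post)  # would be written to Qdrant + Supabase
        else:
            discarded.append(post)
    return kept, discarded
```

Returning the discarded posts as well makes it easy to log rejection rates when tuning the threshold.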

Technology Stack

Compute Layer

  • Cloudflare Workers - Serverless API endpoints
  • Cloudflare Durable Objects - Stateful per-user orchestration
  • Fly.io - ML inference service (PyTorch)

ML/AI Layer

  • PyTorch - Foundation model training
  • Sentence-BERT - Content embeddings
  • Qdrant - Vector similarity search
  • Gemini - Quality scoring

Data Layer

  • Supabase - PostgreSQL for structured data
  • Cloudflare KV - Thompson Sampling state
  • Cloudflare R2 - Model checkpoints

Why These Choices?

See ADRs:

  • ADR-0001: AI-First Architecture
  • ADR-0002: Free Tier First
  • ADR-0003: Cloudflare Native Stack

Scalability Strategy

Phase 1: Free Tier (0-1K users)

  • Sentence-BERT embeddings (pre-trained)
  • Simple user representations
  • Thompson Sampling only
  • Manual Instagram monitoring

Phase 2: Growth (1K-100K users)

  • Custom foundation model
  • Full multi-agent system
  • MAGRPO coordination
  • Automated discovery

Phase 3: Scale (100K+ users)

  • Workers AI for edge inference
  • ENR for Thompson Sampling
  • Multiple content sources
  • Real-time training

Performance Targets

| Metric | Target | Current |
| --- | --- | --- |
| Feed latency (p99) | <500ms | TBD |
| Discovery throughput | 1K posts/hour | TBD |
| Embedding generation | <100ms | TBD |
| Model training time | <2 hours | TBD |
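
Since the "Current" column is still TBD, here is one way the p99 feed-latency target could be checked from recorded request timings. It is a sketch using the nearest-rank percentile definition, which is an assumption (monitoring tools may interpolate differently), and the function names are hypothetical.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Smallest sample such that at least pct percent of samples are <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_feed_latency_target(latencies_ms: Sequence[float], target_ms: float = 500.0) -> bool:
    """True if the p99 of the recorded latencies is strictly below the target."""
    return percentile_nearest_rank(latencies_ms, 99) < target_ms
```

With 100 samples, the p99 is the 99th smallest value, so two slow requests out of 100 are enough to miss the target.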

Security & Privacy

  • User data: Encrypted at rest (Supabase)
  • API keys: Cloudflare Secrets
  • Embeddings: No PII stored
  • Interactions: Anonymizable for training

Monitoring

  • Cloudflare Analytics - Request metrics
  • Supabase Logs - Database queries
  • Fly.io Metrics - ML service performance
  • Custom: Thompson Sampling regret, model drift

Next Steps


Questions? See the complete Technical Architecture or check the ADRs.