Architecture Overview
High-level overview of the Vows Social AI system architecture.
For complete technical details, see docs/ARCHITECTURE.md.
System Overview
graph TB
User[👤 User] --> Mobile[📱 Mobile App]
User --> Web[🌐 Web App]
Mobile --> API[⚡ Cloudflare Workers API]
Web --> API
API --> Orchestrator[🎯 Orchestrator DO]
Orchestrator --> ML[🧠 ML Inference<br/>Fly.io]
Orchestrator --> Qdrant[📊 Qdrant<br/>Vector DB]
Orchestrator --> Supabase[💾 Supabase<br/>PostgreSQL]
Instagram[📷 Instagram] --> Discovery[🔍 Discovery Agent]
Discovery --> Qdrant
Discovery --> Supabase
Core Components
1. Foundation Model (The Brain)
A custom transformer that learns user preferences from interaction sequences.
Architecture: 384 → 128 → 384 dimensions
- Input: Sentence-BERT embeddings (pre-trained)
- Output: 128-dimensional user representation
- Training: User interaction sequences
Purpose: Understand each user's unique wedding vision
See: Foundation Model Deep Dive
2. Orchestrator (The Coordinator)
Cloudflare Durable Object that manages per-user feed ranking.
Responsibilities:
- Fetch user embedding from ML service
- Query Qdrant for candidate content
- Apply Thompson Sampling for ranking
- Track interactions and update rewards
Performance target: <500ms p99 latency
See: Orchestrator Documentation
3. Multi-Agent Crew
Six specialized agents working together:
| Agent | Purpose | Technology |
|---|---|---|
| Discovery | Find content from Instagram | Instagram API |
| Quality Guardian | Filter low-quality posts | Gemini scoring |
| Personal Archivist | Remember user preferences | Cloudflare KV |
| Serendipity Engine | Diversify feed | Diversity metrics |
| Engagement Forecaster | Predict user actions | Foundation model |
| Orchestrator | Coordinate everything | MAGRPO |
See: Multi-Agent System
4. Thompson Sampling (Exploration/Exploitation)
Bayesian bandit algorithm that balances:
- Exploitation: Show content you'll probably love
- Exploration: Discover new styles and ideas
How it works:
- Each content item has alpha/beta parameters (Beta distribution)
- Sample a reward estimate for each candidate
- Rank by sampled rewards
- Update parameters based on user interactions
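The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the project's implementation: the `Arm` class, the `rank_candidates` and `update` helpers, the uniform Beta(1, 1) prior, and the binary reward (1 for an engagement, 0 for a skip) are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Arm:
    """Beta(alpha, beta) belief about one content item's reward rate (hypothetical type)."""
    content_id: str
    alpha: float = 1.0  # assumed prior pseudo-count of positive interactions
    beta: float = 1.0   # assumed prior pseudo-count of negative interactions


def rank_candidates(arms, rng):
    """Draw one sample per arm and sort by the sampled reward, highest first."""
    sampled = [(rng.betavariate(arm.alpha, arm.beta), arm) for arm in arms]
    sampled.sort(key=lambda pair: pair[0], reverse=True)
    return [arm for _, arm in sampled]


def update(arm, reward):
    """Binary update: reward 1 raises alpha, reward 0 raises beta."""
    if reward not in (0, 1):
        raise ValueError("reward must be 0 or 1")
    arm.alpha += reward
    arm.beta += 1 - reward


rng = random.Random(42)  # seeded only so the sketch is repeatable
arms = [Arm("rustic-barn"), Arm("beach-sunset"), Arm("city-loft")]
feed = rank_candidates(arms, rng)
update(feed[0], 1)  # assume the user engaged with the top-ranked item
```

Because ranking uses a random draw rather than the posterior mean, items with few observations still surface occasionally (exploration), while items with many positive interactions win most draws (exploitation).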
Benefits:
- Adapts quickly to each user
- Keeps exploring, which reduces the risk of a filter bubble
- Proven at scale (Instagram, Pinterest, TikTok)
See: Thompson Sampling Details
5. Data Layer
Three databases working together:
Qdrant (Vector DB)
- Content embeddings (384-dim)
- User embeddings (128-dim)
- Semantic similarity search
- 1GB free tier
Supabase (PostgreSQL)
- User profiles and metadata
- Interaction history
- Content metadata
- 500MB free tier
Cloudflare KV
- Thompson Sampling parameters (alpha/beta)
- Agent memory and state
- Cache for hot data
- Free tier (subject to daily read/write and storage quotas)
See: Database Schema
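The semantic similarity search listed under Qdrant can be pictured as a nearest-neighbour lookup over embedding vectors. The sketch below is a brute-force illustration in plain Python, assuming cosine similarity as the metric and toy 3-dimensional vectors. The `cosine_similarity` and `top_k` helpers are hypothetical names, not part of any Qdrant client, and Qdrant itself uses an approximate index rather than a full scan. The query and stored vectors are assumed to share the same dimension.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, items, k):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = sorted(
        items.items(),
        key=lambda entry: cosine_similarity(query, entry[1]),
        reverse=True,
    )
    return [content_id for content_id, _ in scored[:k]]


# Toy content embeddings; real ones would be 384-dimensional.
content = {
    "boho-florals": [1.0, 0.0, 0.0],
    "black-tie": [0.0, 1.0, 0.0],
    "garden-party": [0.9, 0.1, 0.0],
}
nearest = top_k([1.0, 0.0, 0.0], content, k=2)
```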
Data Flow
Feed Generation Request
sequenceDiagram
participant User
participant API
participant Orchestrator
participant ML
participant Qdrant
User->>API: GET /api/feed
API->>Orchestrator: Request feed
Orchestrator->>ML: Get user embedding
ML->>Orchestrator: 128-dim vector
Orchestrator->>Qdrant: Search similar content
Qdrant->>Orchestrator: 100 candidates
Orchestrator->>Orchestrator: Thompson Sampling rank
Orchestrator->>API: Ranked feed
API->>User: Personalized content
See: Complete Data Flow
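The feed generation sequence above can be sketched as a single function that calls each collaborator in turn. This is an illustration under stated assumptions, written in Python for brevity even though the Orchestrator runs as a Durable Object. The `EmbeddingService` and `VectorStore` interfaces, the `generate_feed` function, the `feed_size` default, and the in-memory fakes are hypothetical. Only the 128-dim user vector and the 100-candidate limit come from the diagram.

```python
from typing import Callable, List, Protocol, Sequence


class EmbeddingService(Protocol):
    """Stand-in for the ML inference service (hypothetical interface)."""
    def user_embedding(self, user_id: str) -> Sequence[float]: ...


class VectorStore(Protocol):
    """Stand-in for the vector database (hypothetical interface)."""
    def search(self, vector: Sequence[float], limit: int) -> List[str]: ...


def generate_feed(
    user_id: str,
    ml: EmbeddingService,
    store: VectorStore,
    rank: Callable[[List[str]], List[str]],
    candidates: int = 100,  # candidate count shown in the diagram
    feed_size: int = 20,    # assumed page size, not from the source
) -> List[str]:
    vector = ml.user_embedding(user_id)                      # Orchestrator -> ML
    candidate_ids = store.search(vector, limit=candidates)   # Orchestrator -> Qdrant
    return rank(candidate_ids)[:feed_size]                   # ranking step, then back to the API


class FakeML:
    def user_embedding(self, user_id):
        return [0.0] * 128  # placeholder 128-dim vector


class FakeStore:
    def search(self, vector, limit):
        return [f"post-{i}" for i in range(limit)]


# A trivial ranker stands in for Thompson Sampling here.
page = generate_feed("user-1", FakeML(), FakeStore(), rank=lambda ids: list(reversed(ids)))
```

Passing the ranker in as a callable keeps the flow testable with fakes and leaves the ranking strategy swappable.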
Content Discovery Pipeline
graph LR
A[Instagram Curators] --> B[Discovery Agent]
B --> C{Quality > 0.7?}
C -->|Yes| D[Qdrant + Supabase]
C -->|No| E[Discard]
D --> F[Available in Feed]
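The quality gate in the pipeline above amounts to a threshold filter. A minimal sketch, assuming each discovered post already carries a quality score between 0 and 1: the `Post` type and `partition_by_quality` helper are hypothetical, and only the 0.7 threshold comes from the diagram.

```python
from dataclasses import dataclass

QUALITY_THRESHOLD = 0.7  # from the diagram: keep posts with quality > 0.7


@dataclass
class Post:
    post_id: str
    quality: float  # assumed score in [0, 1], however it is produced


def partition_by_quality(posts, threshold=QUALITY_THRESHOLD):
    """Split posts into (accepted, discarded) using a strict greater-than test."""
    accepted = [p for p in posts if p.quality > threshold]
    discarded = [p for p in posts if not p.quality > threshold]
    return accepted, discarded


discovered = [Post("a", 0.9), Post("b", 0.7), Post("c", 0.2)]
accepted, discarded = partition_by_quality(discovered)
```

With a strict comparison, a post scoring exactly 0.7 is discarded, which matches the `Quality > 0.7?` condition as written.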
Technology Stack
Compute Layer
- Cloudflare Workers - Serverless API endpoints
- Cloudflare Durable Objects - Stateful per-user orchestration
- Fly.io - ML inference service (PyTorch)
ML/AI Layer
- PyTorch - Foundation model training
- Sentence-BERT - Content embeddings
- Qdrant - Vector similarity search
- Gemini - Quality scoring
Data Layer
- Supabase - PostgreSQL for structured data
- Cloudflare KV - Thompson Sampling state
- Cloudflare R2 - Model checkpoints
Why These Choices?
See ADRs:
- ADR-0001: AI-First Architecture
- ADR-0002: Free Tier First
- ADR-0003: Cloudflare Native Stack
Scalability Strategy
Phase 1: Free Tier (0-1K users)
- Sentence-BERT embeddings (pre-trained)
- Simple user representations
- Thompson Sampling only
- Manual Instagram monitoring
Phase 2: Growth (1K-100K users)
- Custom foundation model
- Full multi-agent system
- MAGRPO coordination
- Automated discovery
Phase 3: Scale (100K+ users)
- Workers AI for edge inference
- ENR for Thompson Sampling
- Multiple content sources
- Real-time training
Performance Targets
| Metric | Target | Current |
|---|---|---|
| Feed latency (p99) | <500ms | TBD |
| Discovery throughput | 1K posts/hour | TBD |
| Embedding generation | <100ms | TBD |
| Model training time | <2 hours | TBD |
Security & Privacy
- User data: Encrypted at rest (Supabase)
- API keys: Cloudflare Secrets
- Embeddings: No PII stored
- Interactions: Anonymizable for training
Monitoring
- Cloudflare Analytics - Request metrics
- Supabase Logs - Database queries
- Fly.io Metrics - ML service performance
- Custom: Thompson Sampling regret, model drift
Next Steps
- Deep Dive: System Architecture Diagrams
- Implementation: Phase 1 Plan
- Components: Component Documentation
- API: API Reference
Questions? See the complete Technical Architecture or check the ADRs.