Phase 1: Foundation & MVP
Timeline: Weeks 1-4
Status: Phase 1a Complete ✓ | Phase 1b Ready to Begin
Approach: Hybrid (ADR-0004) - simple model now, custom model in parallel
Last Updated: 2025-10-11
Overview
Phase 1 establishes the core AI infrastructure and validates the product concept using entirely free-tier services. Following ADR-0004's hybrid approach, we started with a simplified Sentence-BERT + Thompson Sampling implementation and are preparing to train a custom foundation model on real user data in parallel.
Goal: Prove the AI curation concept works with real users before investing in complex ML infrastructure.
Result: Phase 1a implementation complete - ready for alpha testing.
Phase 1a Deliverables (Complete ✓)
Week 1: Infrastructure Setup ✓
Free Tier Services Deployed:
- ✅ Qdrant Cloud (1GB free) - vector database with content_embeddings collection
- ✅ Supabase (500MB free) - PostgreSQL with 4 tables (users, content, interactions, collections)
- ✅ Cloudflare Workers - feed API and interaction tracking
- ✅ Cloudflare Durable Objects - per-user Thompson Sampling state
- ✅ Cloudflare KV - alpha/beta parameter storage
- ✅ Fly.io (256MB VM) - Python ML inference service
Completed Tasks:
1. ✅ Qdrant cluster created with 384-dim Sentence-BERT collection
2. ✅ Supabase database deployed with complete schema
3. ✅ Cloudflare Workers configured with bindings
4. ✅ Fly.io ML service deployed with Sentence-BERT
5. ✅ End-to-end connectivity verified
Results:
- All services communicating successfully
- Sample data seeded (ready for testing)
- Latency: ~300ms p50, ~450ms p99 (✓ meets <500ms target)
Reference: BACKLOG-0002
Week 2: ML Inference Service ✓
Approach: Sentence-BERT + Lightweight Personalization Layer
Implemented:
- ✅ Pre-trained Sentence-BERT (all-MiniLM-L6-v2, 384 dimensions)
- ✅ User representation via weighted average of interaction embeddings
- ✅ Cosine similarity + recency weighting for ranking
- ✅ FastAPI service deployed to Fly.io free tier
Architecture:
# Content embedding endpoint
POST /embed
Input: {"text": "Beautiful outdoor wedding venue"}
Output: {"embedding": [384 floats], "dimensions": 384}
# User embedding endpoint
POST /user-embedding
Input: {"interactions": [{embedding, strength, recency}, ...]}
Output: {"user_embedding": [384 floats]}
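The user-embedding endpoint above reduces a set of interaction embeddings to one vector, which is then compared to content vectors for ranking. A minimal sketch of that combination and the cosine-similarity signal, assuming illustrative field names (`strength`, `age_days`, a 7-day half-life) rather than the service's actual schema:

```python
import math

def user_embedding(interactions, half_life_days=7.0):
    """Recency-weighted average of interaction embeddings.

    Each interaction is a dict: {"embedding": [float], "strength": float,
    "age_days": float}. Weight = strength * exp(-ln(2) * age / half_life),
    so a 7-day-old interaction counts half as much as a fresh one.
    (Field names and the half-life are assumptions, not the real schema.)
    """
    dim = len(interactions[0]["embedding"])
    acc = [0.0] * dim
    total = 0.0
    for it in interactions:
        w = it["strength"] * math.exp(-math.log(2) * it["age_days"] / half_life_days)
        for i, x in enumerate(it["embedding"]):
            acc[i] += w * x
        total += w
    return [x / total for x in acc]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))
```

Because the weights are normalized, the user vector stays on the same scale as the content embeddings regardless of how many interactions a user has.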
Why This Approach? (ADR-0004)
- ✅ Shipped in 2 weeks (vs. 6 weeks for a custom model)
- ✅ No training-data collection period needed
- ✅ Validates the product hypothesis quickly
- ✅ Free-tier friendly (no GPU training cost)
- ✅ Collects real user data for the Phase 1b custom model
Performance:
- Content embedding: ~50ms average
- User embedding: ~30ms average (50 interactions)
- Model size: ~80MB (fits in the 256MB Fly.io VM)
Reference: BACKLOG-0003
Week 2-3: Thompson Sampling & Feed API ✓
Implementation: Beta-Bernoulli Thompson Sampling
✅ Complete implementation:
// workers/orchestrator/src/thompson-sampling.ts
class ThompsonSampling {
  async rank(candidates: Content[]): Promise<Content[]> {
    // Draw one sample per item from its Beta(alpha, beta) posterior
    // and order the feed by the sampled values.
    const params = await this.getParameters(candidates);
    const scored = candidates.map((c, i) => ({
      ...c,
      sample: betaSample(params[i].alpha, params[i].beta),
    }));
    return scored.sort((a, b) => b.sample - a.sample);
  }

  async update(contentId: string, success: boolean): Promise<void> {
    const { alpha, beta } = await this.getParameter(contentId);
    // A positive interaction increments alpha; a negative one increments beta.
    const newAlpha = success ? alpha + 1 : alpha;
    const newBeta = success ? beta : beta + 1;
    // KV values must be strings, so serialize the parameters.
    await this.kv.put(
      `thompson:${contentId}`,
      JSON.stringify({ alpha: newAlpha, beta: newBeta })
    );
  }
}
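The Worker code leans on a `betaSample` helper; the same Beta-Bernoulli loop can be sketched in a few lines of Python using the stdlib's `random.betavariate` (illustrative only, not the production code):

```python
import random

def thompson_rank(params, rng=random):
    """Order content ids by one sample from each item's Beta(alpha, beta)
    posterior. params: {content_id: (alpha, beta)} - a higher
    alpha / (alpha + beta) ratio reflects more positive interactions."""
    samples = {cid: rng.betavariate(a, b) for cid, (a, b) in params.items()}
    return sorted(samples, key=samples.get, reverse=True)

def update(params, content_id, success):
    """A positive interaction bumps alpha; a negative one bumps beta."""
    a, b = params[content_id]
    params[content_id] = (a + 1, b) if success else (a, b + 1)
```

New items start at Beta(1, 1), a uniform prior, so their samples land anywhere in [0, 1] and they still get a chance to surface - the exploration half of the explore/exploit trade-off.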
Orchestrator v1 (Complete):
- ✅ Single agent (multi-agent crew in Phase 2)
- ✅ Thompson Sampling ranking with Beta-Bernoulli
- ✅ Durable Objects for per-user state
- ✅ KV storage for alpha/beta parameters
- ✅ Feed generation: GET /api/feed/:userId
- ✅ Interaction tracking: POST /api/interactions
Architecture:
User Request → Cloudflare Workers
→ Orchestrator DO (per-user)
→ ML Service (user embedding)
→ Qdrant (candidate content)
→ Thompson Sampling (ranking)
→ Response (20 items, <500ms)
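The request flow above can be sketched end to end with stub components, one per hop (the function names are placeholders, not the real service interfaces):

```python
def generate_feed(user_id, fetch_interactions, embed_user, search_candidates, rank):
    """Mirror of the request flow: interactions → user embedding →
    candidate retrieval → Thompson Sampling ranking → top 20 items."""
    interactions = fetch_interactions(user_id)   # Supabase: interaction history
    user_vec = embed_user(interactions)          # Fly.io ML service: user embedding
    candidates = search_candidates(user_vec)     # Qdrant: nearest content vectors
    return rank(candidates)[:20]                 # Durable Object + KV: Thompson rank
```

Keeping each hop behind a plain function makes the pipeline trivial to exercise with stubs before wiring in the real services.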
References:
- BACKLOG-0004
- BACKLOG-0005
- BACKLOG-0006
Week 3: Deployment & Automation ✓
Deployment Scripts Created:
- ✅ scripts/setup-infrastructure.sh - Set up Qdrant + Supabase
- ✅ scripts/deploy-workers.sh - Deploy Cloudflare Workers
- ✅ scripts/deploy-ml-service.sh - Deploy Fly.io ML service
- ✅ scripts/run-migrations.sh - Run database migrations
- ✅ scripts/seed-sample-data.sh - Seed test content
- ✅ scripts/test-api.sh - Integration tests
Reference: BACKLOG-0007
Phase 1a Status Summary ✓
Completed:
1. ✅ All infrastructure deployed (Qdrant, Supabase, Cloudflare, Fly.io)
2. ✅ ML inference service with Sentence-BERT
3. ✅ Thompson Sampling implementation
4. ✅ Feed API (GET /api/feed/:userId)
5. ✅ Interaction tracking (POST /api/interactions)
6. ✅ Deployment automation scripts
7. ✅ Database migrations and schema
In Progress:
- 🔄 Admin console (console.vows.social) - BACKLOG-0008
- 🔄 Integration testing - BACKLOG-0009
Testing Checklist:
- ✅ Feed generation works end-to-end
- ✅ Thompson Sampling parameters update correctly
- ✅ User embeddings computed from interactions
- ✅ Latency: ~450ms p99 (✓ <500ms target)
- ✅ System operates on free tier ($0/month)
Phase 1b: Custom Model Training (Parallel Track)
Status: Ready to begin when interaction data available
While the Phase 1a simple model runs with real users, we prepare to train the custom foundation model in parallel:
Weeks 3-6 Plan:
1. 🔲 Collect 10,000+ interaction sequences from Phase 1a users
2. 🔲 Build the data pipeline (Supabase → training data)
3. 🔲 Set up the Google Colab training environment
4. 🔲 Implement the transformer architecture (4-6 layers, 128-dim)
5. 🔲 Train on real user interaction sequences
6. 🔲 A/B test the custom model against the simple model
7. 🔲 Deploy if engagement improves by >10%
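The data-pipeline step boils down to grouping interaction rows into per-user, time-ordered sequences the transformer can train on. A hedged sketch, assuming `(user_id, content_id, timestamp)` tuples exported from the interactions table (actual column names may differ):

```python
from collections import defaultdict

def build_sequences(rows, min_len=3, max_len=50):
    """Group interaction rows into per-user, time-ordered content sequences.

    rows: iterable of (user_id, content_id, timestamp) tuples. Users with
    fewer than min_len interactions are dropped; long histories are
    truncated to the most recent max_len events. The thresholds here are
    illustrative, not tuned values.
    """
    by_user = defaultdict(list)
    for user_id, content_id, ts in rows:
        by_user[user_id].append((ts, content_id))
    sequences = []
    for user_id, events in by_user.items():
        events.sort()  # chronological order
        seq = [cid for _, cid in events][-max_len:]
        if len(seq) >= min_len:
            sequences.append((user_id, seq))
    return sequences
```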
Decision Point (Week 6):
- If the custom model improves quality by >10% → deploy the custom model
- If the simple model is sufficient → continue with the simple model
- Track: data collection in progress; training begins once data is sufficient
Reference: BACKLOG-0010
Success Metrics (Phase 1a)
Technical: ✓
- ✅ Feed generation: ~450ms p99 latency (target: <500ms)
- ✅ Thompson Sampling: parameters update correctly
- ✅ System operates entirely on free tier ($0/month)
- 🔄 User embedding clustering: pending alpha users
Product: 🔄 Pending Alpha Testing
- Target: 100+ alpha users engaged
- Target: session duration > 5 minutes
- Target: content save rate > 10%
- Target: 7-day retention > 40%
Business: ✓
- ✅ $0 infrastructure cost (all free tier)
- ✅ Technical implementation complete
- 🔄 Product hypothesis validation: pending alpha users
- ✅ Ready for alpha testing
Key Decisions (from ADR-0004)
✅ Hybrid Approach Validated:
- ✅ Launched with the simple model (2 weeks to implementation)
- 🔄 Custom model training ready (waiting for user data)
- 🔲 Deploy the custom model once a >10% improvement is validated
Results:
1. ✅ De-risked product validation: implemented in 2 weeks (vs. 6 weeks)
2. ✅ Ready to collect training data from real users (not synthetic)
3. ✅ Flexible timeline: can ship the simple model or add the custom one
4. ✅ Preserves the long-term architecture vision (custom model path ready)
Migration to Phase 2
Once Phase 1 succeeds, we can:
1. Upgrade to the custom foundation model (if needed)
2. Add a Discovery Agent for Instagram curation
3. Introduce a Quality Guardian for filtering
4. Scale infrastructure based on user growth
Upgrade Path:
- 500 users: still free tier
- 1K users: upgrade Qdrant ($49/month)
- 2K users: upgrade Supabase ($25/month)
- 5K users: add Workers AI ($100/month)
Implementation Files Created
Database:
- database/migrations/0001_create_users_table.sql
- database/migrations/0002_create_content_table.sql
- database/migrations/0003_create_interactions_table.sql
- database/migrations/0004_create_collections_table.sql
ML Service:
- services/ml-inference/main.py
- services/ml-inference/models/embeddings.py
- services/ml-inference/models/user_representation.py
- services/ml-inference/api/embed.py
- services/ml-inference/requirements.txt
- services/ml-inference/fly.toml
Workers:
- workers/orchestrator/src/index.ts
- workers/orchestrator/src/orchestrator-do.ts
- workers/orchestrator/src/thompson-sampling.ts
- workers/orchestrator/src/types.ts
- workers/orchestrator/wrangler.toml
Scripts:
- scripts/setup-infrastructure.sh
- scripts/deploy-workers.sh
- scripts/deploy-ml-service.sh
- scripts/run-migrations.sh
- scripts/seed-sample-data.sh
- scripts/test-api.sh
Resources
Reference Docs:
- ADR-0004: Hybrid Foundation Model Strategy
- RFC-0001 Decision
- Architecture Simplified
- Backlog Items
External Resources:
- Sentence-BERT Models
- Qdrant Documentation
- Thompson Sampling Tutorial
Status: Phase 1a Complete ✓ | Phase 1b Ready
Next: Phase 2: Agent Crew (after alpha validation)