How It Works - Visual Guide
A plain-English, visual explanation of how Vows Social curates your perfect wedding feed.
The Big Picture
Think of Vows Social as a team of specialized experts working together to find content you'll love:
graph LR
You[👰 You<br/>Browse & Save] --> AI[🧠 AI Learns<br/>Your Taste]
AI --> Agents[🤖 Agents<br/>Find Matches]
Agents --> Rank[🎯 Ranking<br/>Algorithm]
Rank --> Feed[📱 Perfect<br/>Feed]
Feed --> You
style You fill:#FFE0F0
style AI fill:#E0F2FE
style Agents fill:#F3E8FF
style Rank fill:#FEF3C7
style Feed fill:#D1FAE5
Let's break down each step...
Step 1: You Interact 👰
Every time you browse wedding content, you're teaching our AI:
- 💾 Save content → "I love this style"
- 👀 View for 30+ seconds → "This is interesting"
- 📤 Share with partner → "This is important"
- ⏭️ Skip quickly → "Not my style"
sequenceDiagram
actor Sarah as 👰 Sarah
participant App as vows.social
participant DB as Database
Sarah->>App: Saves rustic barn venue 🏚️
App->>DB: Log interaction:<br/>user=sarah, content=barn_42,<br/>action=save, timestamp=now
Note over DB: All interactions<br/>stored for learning
Real Example:
Sarah saves 5 rustic barn venues, 3 sage green color palettes, and 2 outdoor ceremony photos. Our AI now knows: Sarah likes rustic + outdoor + earth tones.
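To make this concrete, here's a minimal sketch of how an interaction like Sarah's save might be recorded and turned into a learning signal. The field names and reward values are illustrative assumptions, not the production schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Illustrative reward values - the real weights are an assumption here.
REWARDS = {"save": 1.0, "share": 1.0, "long_view": 0.5, "skip": 0.0}

@dataclass
class Interaction:
    user_id: str        # e.g. "sarah"
    content_id: str     # e.g. "barn_42"
    action: str         # "save", "share", "long_view", or "skip"
    timestamp: str

def log_interaction(user_id: str, content_id: str, action: str) -> dict:
    """Build the record that gets stored for learning."""
    event = Interaction(
        user_id=user_id,
        content_id=content_id,
        action=action,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    record = asdict(event)
    record["reward"] = REWARDS[action]  # save -> 1.0, skip -> 0.0
    return record

print(log_interaction("sarah", "barn_42", "save"))
```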
Step 2: AI Learns Your Style 🧠
The Two-Tower Model
We use the same architecture as Pinterest and YouTube - a Two-Tower Model:
graph TB
subgraph "User Tower 🧑"
History[Sarah's History<br/>📚 Last 50 interactions] --> UserEncoder[User Encoder<br/>🧠 Neural Network]
UserEncoder --> UserEmbed[User Embedding<br/>📊 128 numbers]
end
subgraph "Content Tower 🎨"
Content[Venue Photo<br/>🏚️ Image + Metadata] --> ContentEncoder[Content Encoder<br/>🧠 SigLIP 2 Model]
ContentEncoder --> ContentEmbed[Content Embedding<br/>📊 384 numbers]
end
UserEmbed --> Match[🎯 Match Score<br/>Dot Product]
ContentEmbed --> Match
Match --> Score[📈 How well they match<br/>0.0 to 1.0]
style History fill:#FFE0F0
style Content fill:#E0F2FE
style Match fill:#FEF3C7
style Score fill:#D1FAE5
In Plain English:
- User Tower looks at everything Sarah has saved and creates a "taste profile" (128 numbers)
- Content Tower understands every piece of content (image, style, colors) as 384 numbers
- Match Score calculates how well they fit together (dot product = multiplication + sum)
Example:
Sarah's Taste Profile (simplified):
- rustic_score: 0.92
- modern_score: 0.15
- outdoor_score: 0.88
- sage_green: 0.95
Barn Venue #42:
- rustic_score: 0.89
- modern_score: 0.10
- outdoor_score: 0.92
- sage_green: 0.91
Match = (0.92×0.89) + (0.15×0.10) + (0.88×0.92) + (0.95×0.91)
= 0.82 + 0.02 + 0.81 + 0.86
= 2.51 → High match! ✅
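The same arithmetic as a minimal Python sketch. One note on scale: the raw dot product of these unnormalized toy vectors lands above 1.0; the 0.0-1.0 range in the diagram assumes the real embeddings are normalized (and projected into a shared space) before scoring.

```python
def match_score(user_vec: list[float], content_vec: list[float]) -> float:
    """Dot product = multiply matching dimensions, then add them up."""
    assert len(user_vec) == len(content_vec), "both vectors must share a dimension"
    return sum(u * c for u, c in zip(user_vec, content_vec))

# Sarah's simplified taste profile: rustic, modern, outdoor, sage_green
sarah = [0.92, 0.15, 0.88, 0.95]
# Barn Venue #42, scored on the same four dimensions
barn_42 = [0.89, 0.10, 0.92, 0.91]

print(round(match_score(sarah, barn_42), 2))  # -> 2.51, a high match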
Multimodal Understanding (Images + Text)
We use SigLIP 2, the state-of-the-art 2025 model for understanding images:
graph LR
Image[🖼️ Venue Photo] --> SigLIP[SigLIP 2<br/>400M parameters<br/>Google AI]
Text[📝 Description<br/>"Rustic barn venue"] --> SigLIP
SigLIP --> Understanding[🧠 Understanding<br/>• Style: Rustic<br/>• Setting: Outdoor<br/>• Colors: Earth tones<br/>• Lighting: Natural]
Understanding --> Vector[📊 384-number vector<br/>Computer-friendly format]
style Image fill:#E0F2FE
style Text fill:#FFE0F0
style SigLIP fill:#F3E8FF
style Understanding fill:#FEF3C7
style Vector fill:#D1FAE5
Why This Matters:
- Understands visual style (composition, lighting, colors)
- Reads text descriptions and tags
- Works across 89 languages (Jina CLIP v2 variant)
- Fine-grained understanding (not just "wedding photo")
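Here's a hedged sketch of turning a venue photo plus its description into one of those vectors, using the Hugging Face transformers interface for SigLIP-style models. The checkpoint name and the simple average fusion are assumptions for illustration, not the exact production setup.

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

# Checkpoint name is an assumption for illustration.
CHECKPOINT = "google/siglip2-base-patch16-224"

processor = AutoProcessor.from_pretrained(CHECKPOINT)
model = AutoModel.from_pretrained(CHECKPOINT)

def embed_content(image_path: str, description: str) -> torch.Tensor:
    """Embed an image and its text, then average them into one content vector."""
    image = Image.open(image_path)
    inputs = processor(
        text=[description], images=[image], padding="max_length", return_tensors="pt"
    )
    with torch.no_grad():
        image_vec = model.get_image_features(pixel_values=inputs["pixel_values"])
        text_vec = model.get_text_features(input_ids=inputs["input_ids"])
    combined = (image_vec + text_vec) / 2  # simple fusion - an assumption
    return torch.nn.functional.normalize(combined, dim=-1)[0]

vector = embed_content("barn_venue.jpg", "Rustic barn venue with sage green accents")
print(vector.shape)  # vector size depends on the checkpoint used
```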
Step 3: Agents Find Perfect Matches 🤖
Six specialized AI agents work like a team of wedding planners:
graph TB
Orchestrator[🎯 Orchestrator<br/>"Coordinate the team"]
Discovery[🔍 Discovery Agent<br/>"Find hidden gems"]
Quality[✨ Quality Guardian<br/>"Ensure high standards"]
Archivist[📖 Personal Archivist<br/>"Remember the journey"]
Serendipity[🎲 Serendipity Engine<br/>"Add variety"]
Forecaster[⏰ Engagement Forecaster<br/>"Perfect timing"]
Orchestrator --> Discovery
Orchestrator --> Quality
Orchestrator --> Archivist
Orchestrator --> Serendipity
Orchestrator --> Forecaster
Discovery --> Scores[Agent Scores]
Quality --> Scores
Archivist --> Scores
Serendipity --> Scores
Forecaster --> Scores
Scores --> Orchestrator
style Orchestrator fill:#8B5CF6,color:#fff
style Discovery fill:#06B6D4,color:#fff
style Quality fill:#F59E0B,color:#fff
style Archivist fill:#10B981,color:#fff
style Serendipity fill:#EC4899,color:#fff
style Forecaster fill:#8B5CF6,color:#fff
What Each Agent Does
🔍 Discovery Agent
Role: Find exceptional vendors before they're popular
Example:
"New photographer in Sydney posted amazing work yesterday. Only 500 Instagram followers but composition is stunning. Recommend to Sarah before they book up."
✨ Quality Guardian
Role: Ensure only high-quality content reaches you
Checks:
- Image quality (blur, exposure, composition)
- Professionalism (real weddings vs staged shoots)
- Authenticity (genuine vendors vs stock photos)
Example:
"This venue photo is blurry and poorly lit. Quality score: 0.3/1.0. REJECT."
📖 Personal Archivist
Role: Remember your journey and planning phase
Tracks:
- Wedding date (months until wedding)
- Planning phase (inspiration → vendors → details)
- Style evolution (preferences change over time)
- Saved collections
Example:
"Sarah is 4 months out. She's past inspiration phase, now booking vendors. Prioritize photographers and florists with availability."
🎲 Serendipity Engine
Role: Prevent filter bubbles by introducing variety
Why It Matters: Without this, Sarah only sees rustic content and misses an elegant ballroom she'd actually love.
Strategy:
- 80% proven taste (rustic barns)
- 20% exploration (elegant venues, modern styles)
Example:
"Sarah loves rustic, but let's show her one elegant venue. If she saves it, we've discovered a secondary style she likes!"
⏰ Engagement Forecaster
Role: Predict perfect timing for notifications
Analyzes:
- Time of day Sarah is most active
- Days of week she engages
- Planning phase timing (venue search → vendor booking → details)
Example:
"Sarah usually browses 7-9 PM on weekdays. She just saved 3 venues. Send 'New venues in your style' notification tomorrow at 7:15 PM."
Step 4: Thompson Sampling Ranks Everything 🎯
This is where the magic happens. Thompson Sampling is the same algorithm Instagram and Pinterest use.
The Problem We're Solving
You want two things:
1. Exploitation - Show content you'll probably love (safe bets)
2. Exploration - Try new content you might love (discoveries)
Too much exploitation = filter bubble (only see what you know you like).
Too much exploration = bad recommendations (random content).
How Thompson Sampling Works
Think of it as a smart gambling strategy:
graph TB
Content[🏚️ Venue Content<br/>Has been shown to Sarah 7 times]
Stats[📊 Track Record<br/>✅ Saved: 5 times<br/>❌ Skipped: 2 times]
Beta[🎲 Beta Distribution<br/>α=5 (successes), β=2 (failures)]
Sample[🎯 Sample Score<br/>Random number from distribution<br/>Example: 0.82]
Quality[✨ Quality Score<br/>From Quality Guardian<br/>Example: 0.90]
Final[📈 Final Score<br/>0.82 × 0.90 = 0.74]
Content --> Stats
Stats --> Beta
Beta --> Sample
Sample --> Final
Quality --> Final
style Content fill:#E0F2FE
style Stats fill:#FFE0F0
style Beta fill:#F3E8FF
style Sample fill:#FEF3C7
style Quality fill:#FBCFE8
style Final fill:#D1FAE5
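In code, that scoring step is only a couple of lines. This sketch assumes a Beta(1,1) prior (so the sample is still valid when α and β start at 0) and uses the numbers from the diagram above.

```python
import random

def thompson_score(saves: int, skips: int, quality: float) -> float:
    """Sample from Beta(saves+1, skips+1), then weight by the quality score."""
    sample = random.betavariate(saves + 1, skips + 1)  # Beta(1,1) prior assumed
    return sample * quality

# The venue from the diagram: shown 7 times, saved 5, skipped 2, quality 0.90
print(round(thompson_score(saves=5, skips=2, quality=0.90), 2))
# Different on every run; the diagram shows one draw of 0.82 × 0.90 ≈ 0.74
```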
Real Example: Ranking Sarah's Feed
Let's rank 3 venues for Sarah:
graph TB
subgraph "Venue A: Rustic Barn"
A1[Track Record<br/>✅ 5 saves<br/>❌ 2 skips] --> A2[Sample: 0.82<br/>High certainty]
A2 --> A3[Quality: 0.90]
A3 --> A4[Final: 0.74<br/>🥇 RANK #1]
end
subgraph "Venue B: New Photographer"
B1[Track Record<br/>✅ 0 saves<br/>❌ 0 skips<br/>Never shown!] --> B2[Sample: 0.73<br/>High uncertainty]
B2 --> B3[Quality: 0.85]
B3 --> B4[Final: 0.62<br/>🥈 RANK #2]
end
subgraph "Venue C: Elegant Ballroom"
C1[Track Record<br/>✅ 3 saves<br/>❌ 4 skips] --> C2[Sample: 0.51<br/>Lower certainty]
C2 --> C3[Quality: 0.95]
C3 --> C4[Final: 0.48<br/>🥉 RANK #3]
end
style A4 fill:#D1FAE5
style B4 fill:#FEF3C7
style C4 fill:#FED7AA
What Happened:
- Venue A (Rustic Barn) - Ranked #1
  - Sarah has saved this style many times (high α=5)
  - High confidence it's a good match
  - Safe bet ✅
- Venue B (New Photographer) - Ranked #2
  - Never shown before (α=0, β=0)
  - High uncertainty = try it!
  - Discovery opportunity 🔍
- Venue C (Elegant Ballroom) - Ranked #3
  - Mixed results (3 saves, 4 skips)
  - Lower confidence
  - But still shown for diversity 🎲
The Beauty: The system automatically balances exploration and exploitation!
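Here is the same ranking as a sketch in code, using the three venues' track records and quality scores from above (again assuming a Beta(1,1) prior so the never-shown photographer gets a valid sample). Because scores are sampled, the exact order can change between runs - that variation is the exploration.

```python
import random

venues = {
    "A: Rustic Barn":      {"saves": 5, "skips": 2, "quality": 0.90},
    "B: New Photographer": {"saves": 0, "skips": 0, "quality": 0.85},
    "C: Elegant Ballroom": {"saves": 3, "skips": 4, "quality": 0.95},
}

def final_score(saves: int, skips: int, quality: float) -> float:
    """Thompson sample (with a Beta(1,1) prior) weighted by quality."""
    return random.betavariate(saves + 1, skips + 1) * quality

ranking = sorted(venues, key=lambda v: final_score(**venues[v]), reverse=True)
print(ranking)  # most runs: A first, C last - but B sometimes jumps ahead
```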
Self-Learning in Action
After Sarah interacts:
sequenceDiagram
participant Sarah as 👰 Sarah
participant Feed as Feed
participant TS as Thompson Sampling
Feed->>Sarah: Shows Venue B (new photographer)
Sarah->>Feed: ❤️ Saves it!
Feed->>TS: Update α_B: 0 → 1
Note over TS: Next time, Venue B<br/>will rank higher!
Feed->>Sarah: Shows Venue C (elegant ballroom)
Sarah->>Feed: ⏭️ Skips it
Feed->>TS: Update β_C: 4 → 5
Note over TS: Next time, Venue C<br/>will rank lower
No manual tuning needed. The system learns from every interaction.
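The update itself is tiny. Here's a sketch of the bookkeeping behind that sequence, using a plain dict keyed by content id (the stats structure and threshold are assumptions).

```python
# Per-content Thompson stats: alpha counts positive signals, beta counts negatives.
stats = {"venue_b": {"alpha": 0, "beta": 0}, "venue_c": {"alpha": 3, "beta": 4}}

def update_reward(content_id: str, reward: float) -> None:
    """A save (reward 1.0) bumps alpha; a skip (reward 0.0) bumps beta."""
    if reward >= 0.5:
        stats[content_id]["alpha"] += 1
    else:
        stats[content_id]["beta"] += 1

update_reward("venue_b", 1.0)  # Sarah saves the new photographer: alpha 0 -> 1
update_reward("venue_c", 0.0)  # Sarah skips the ballroom: beta 4 -> 5
print(stats)
```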
Step 5: Feed Gets Smarter Every Day 📊
Three Learning Mechanisms
graph TB
subgraph "Real-Time Learning ⚡"
Interaction[User Interaction] --> Thompson[Thompson Sampling<br/>Updates α/β instantly]
Thompson --> NextFeed[Next Feed Request<br/>Already improved!]
end
subgraph "Nightly Training 🌙 (2 AM)"
Batch[Daily Interactions] --> TwoTower[Two-Tower Model<br/>Retrains on A100 GPU]
TwoTower --> BetterEmbeds[Better Embeddings<br/>Deployed automatically]
end
subgraph "Agent Training 🤖 (2 AM)"
Episodes[Agent Episodes] --> RLlib[Ray RLlib<br/>Multi-Agent PPO]
RLlib --> BetterAgents[Better Policies<br/>Agents collaborate better]
end
style Interaction fill:#D1FAE5
style Batch fill:#E0F2FE
style Episodes fill:#F3E8FF
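For the nightly jobs, here's a hedged sketch of how a 2 AM retraining run could be scheduled on Modal. The app name and the `retrain_two_tower` body are placeholders; the `modal.Cron` scheduling pattern is the only point being illustrated.

```python
import modal

app = modal.App("vows-nightly-training")  # app name is a placeholder

@app.function(schedule=modal.Cron("0 2 * * *"), gpu="A100", timeout=60 * 60)
def retrain_two_tower():
    """Placeholder for the nightly Two-Tower retraining job."""
    # 1. Pull the day's interactions from the database
    # 2. Retrain the Two-Tower model on the GPU
    # 3. Re-embed users and deploy the refreshed embeddings
    print("retraining on yesterday's interactions...")
```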
Example: Sarah's First Week
| Day | What Happens | Learning |
|---|---|---|
| Day 1 | Sarah saves 5 rustic barns | Thompson α increases for rustic content |
| Day 2 | Feed shows 80% rustic, 20% other | Sarah saves 1 elegant venue (surprise!) |
| Day 3 | Thompson explores elegant style more | Two-Tower trains overnight, learns Sarah → rustic+elegant |
| Day 4 | Feed now shows both styles | Sarah's embedding updated, better matches |
| Day 5 | Agents learn Sarah is 4 months out | Engagement Forecaster prioritizes vendor bookings |
| Day 7 | Two-Tower retrains with 7 days data | User embedding refined, content matches improve |
Result: After 1 week, Sarah's feed is dramatically more personalized than it was on Day 1.
Complete Flow: Request to Response
Let's watch Sarah's morning coffee browse:
sequenceDiagram
actor Sarah as 👰☕ Sarah
participant App as vows.social
participant API as Modal API
participant Orch as Orchestrator
participant TwoTower as Two-Tower Model
participant Qdrant as Vector DB
participant Agents as Agent Crew
participant TS as Thompson Sampling
Note over Sarah: Opens app, 8:00 AM
Sarah->>App: Request feed
App->>API: GET /feed/for-you?user=sarah
Note over API,Orch: Step 1: Understand Sarah
API->>Orch: generate_feed(sarah, limit=20)
Orch->>TwoTower: get_user_embedding(sarah)
TwoTower->>TwoTower: Load Sarah's 50 recent interactions<br/>Run neural network
TwoTower-->>Orch: [0.92, -0.43, 0.88, ...] (128 dims)
Note over Orch,Qdrant: Step 2: Find similar content
Orch->>Qdrant: vector_search(sarah_embedding, limit=100)
Qdrant->>Qdrant: ANN search in 384-dim space<br/>Find nearest neighbors
Qdrant-->>Orch: 100 candidates [venues, photos, vendors]
Note over Orch,Agents: Step 3: Agent evaluation
Orch->>Agents: evaluate_batch(candidates, sarah_context)
par Discovery Agent
Agents->>Agents: Score content freshness
and Quality Guardian
Agents->>Agents: Score visual quality
and Personal Archivist
Agents->>Agents: Check planning phase fit
and Serendipity Engine
Agents->>Agents: Measure diversity
and Engagement Forecaster
Agents->>Agents: Predict engagement
end
Agents-->>Orch: agent_scores {discovery: 0.85, quality: 0.92, ...}
Note over Orch,TS: Step 4: Rank with Thompson Sampling
Orch->>TS: rank(candidates, agent_scores)
loop For each candidate
TS->>TS: Sample from Beta(α, β)<br/>Multiply by quality<br/>Sort
end
TS-->>Orch: ranked_ids [42, 87, 19, ...]
Orch-->>API: {items: [...], metadata: {...}}
API-->>App: JSON with 20 ranked items
App-->>Sarah: Beautiful, personalized feed 🎨
Note over Sarah: Sarah saves a venue
Sarah->>App: ❤️ Saves venue #42
App->>API: POST /interactions {action: save, content: 42}
API->>TS: update_reward(42, reward=1.0)
TS->>TS: α_42 += 1 ✅
Note over TS: Next feed will rank<br/>similar content higher!
Total Time: < 500ms from request to response 🚀
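Stitching the steps together, here's a hedged sketch of what that feed request could look like in code. The function names come from the sequence diagram; their bodies are toy stand-ins so the sketch runs on its own, not the real implementations.

```python
import random

# Function names follow the sequence diagram; the bodies are toy stand-ins.
def get_user_embedding(user_id: str) -> list[float]:
    return [random.random() for _ in range(128)]            # Two-Tower user tower

def vector_search(embedding: list[float], limit: int) -> list[str]:
    return [f"content_{i}" for i in range(limit)]           # Qdrant ANN search

def evaluate_batch(candidates: list[str]) -> dict[str, float]:
    return {cid: random.random() for cid in candidates}     # agent crew scores

def rank(candidates: list[str], agent_scores: dict[str, float]) -> list[str]:
    return sorted(candidates,                               # Thompson Sampling
                  key=lambda cid: random.betavariate(2, 2) * agent_scores[cid],
                  reverse=True)

def generate_feed(user_id: str, limit: int = 20) -> list[str]:
    """Steps 1-4 from the sequence diagram, in order."""
    user_embedding = get_user_embedding(user_id)            # 1. understand Sarah
    candidates = vector_search(user_embedding, limit=100)   # 2. find similar content
    agent_scores = evaluate_batch(candidates)               # 3. agent evaluation
    return rank(candidates, agent_scores)[:limit]           # 4. rank, return top 20

print(generate_feed("sarah")[:3])
```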
Why This Architecture Works
1. Industry-Proven Approaches
We use the same techniques as the giants:
| Company | What They Use | What We Use |
|---|---|---|
| Pinterest & YouTube | Two-Tower Model | ✅ Two-Tower Model |
| Instagram & Pinterest | Thompson Sampling | ✅ Thompson Sampling |
| TikTok | Contextual Bandits | ✅ Beta-Bernoulli Bandit |
| Google | SigLIP for images | ✅ SigLIP 2 (2025 SOTA) |
| OpenAI | Multi-Agent PPO | ✅ Ray RLlib PPO |
2. Self-Learning Without Manual Tuning
No one at Vows manually adjusts weights or tunes parameters. The system learns from user behavior:
- Thompson Sampling learns from every save/skip
- Two-Tower Model retrains nightly on interactions
- Agent Policies optimize via Multi-Agent PPO
3. Full Observability
Every decision is logged and traceable via LangSmith:
graph LR
Decision[Agent Decision] --> LangSmith[🔬 LangSmith]
LangSmith --> Trace[Full Trace<br/>• Input<br/>• Reasoning<br/>• Output]
Trace --> Debug[Debug Issues<br/>• Why this content?<br/>• Agent scores?<br/>• Match score?]
style Decision fill:#F3E8FF
style LangSmith fill:#F59E0B,color:#fff
style Trace fill:#FEF3C7
style Debug fill:#D1FAE5
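Here's a small sketch of how a decision function can be made traceable with the LangSmith Python SDK. The scoring logic and function name are illustrative, and LangSmith credentials must be configured in the environment for traces to actually be recorded.

```python
from langsmith import traceable  # needs LangSmith credentials in the environment

@traceable(name="serendipity_agent")  # each call becomes a trace in LangSmith
def score_diversity(user_id: str, content_id: str, recent_styles: list[str]) -> float:
    """Illustrative scoring logic: reward content outside the user's recent styles."""
    is_new_style = content_id.split("_")[0] not in recent_styles
    return 0.95 if is_new_style else 0.40

# Inputs, the returned score, and timing all show up in the trace,
# which is how a "why did I see this?" question gets answered.
print(score_diversity("sarah", "modern_loft_3", recent_styles=["rustic", "outdoor"]))
```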
Example Debug:
User complaint: "Why did I see modern venues? I prefer rustic."
LangSmith Trace:
1. Personal Archivist score: 0.45 (user prefers rustic)
2. Serendipity Engine score: 0.95 (diversity injection)
3. Final rank: #18 (shown for variety, not primary)
Action: Working as intended - diversity prevents filter bubble.
4. Unified Python Stack
All ML/AI code runs on Modal (Python):
- No JavaScript/Python split (unlike a Cloudflare Workers + Fly.io setup)
- GPU access for embeddings (SigLIP 2 on A10G)
- Serverless scaling ($0 when idle)
- Single deployment platform
What Makes This Unique?
vs Pinterest
- Pinterest: Generic recommendations for millions
- Vows: Hyper-personalized for couples planning ONE wedding
vs Instagram
- Instagram: Algorithmic feed you don't control
- Vows: AI learns YOUR specific taste, no ads, pure curation
vs Google Search
- Google: You search for what you know exists
- Vows: Discovers vendors and ideas you didn't know existed
Next Steps
Ready to dive deeper?
- Architecture Overview - Technical system design
- Technology Stack - Detailed component breakdown
- Implementation Plan - How we're building this
- Visual Architecture - Complete end-to-end guide
Or start contributing:
- Development Guides - Git workflow, testing, deployment
- Backlog - What we're building next
Questions? Check the PRD or ADRs for architectural decisions! 🎨