Foundation Model
The Personalization Foundation Model is the single source of truth for all user and content understanding in Vows Social. It learns from interaction sequences to create embeddings that power personalized recommendations.
Overview
Purpose: Unified model for user and content representation
Architecture: Pinterest-style Two-Tower transformer trained on interaction sequences (full version), or pre-trained Sentence-BERT plus a lightweight weighted-aggregation layer (Phase 1)
Key Benefit: One model serves all downstream tasks - ranking, discovery, quality assessment, and personalization.
Phase 1: Simplified Approach
Following RFC-0001, Phase 1 uses a lightweight implementation:
Content Embeddings
Pre-trained Sentence-BERT (all-MiniLM-L6-v2):
- 384-dimensional embeddings
- No training required
- High quality out-of-the-box
- 80MB model size
from sentence_transformers import SentenceTransformer
# Load pre-trained model (free!)
encoder = SentenceTransformer('all-MiniLM-L6-v2')
# Embed wedding content (tags is assumed to be a list of strings, so join it first)
content_embedding = encoder.encode(
    f"{title} {description} {vendor_name} {' '.join(tags)}"
)
# Returns: 384-dim vector
User Embeddings
Weighted Aggregation of interaction history:
from typing import List

import numpy as np

def get_user_embedding(user_interactions: List[Interaction]) -> np.ndarray:
    """
    Aggregate content embeddings weighted by:
    - Interaction strength (view < save < share)
    - Recency (exponential decay)

    Assumes at least one interaction; users with none take the cold-start path.
    """
    embeddings = []
    weights = []
    for interaction in user_interactions:
        # Get content embedding
        content_embed = get_content_embedding(interaction.content_id)
        # Compute weight
        strength = INTERACTION_WEIGHTS[interaction.type]  # view=1, save=3, share=5
        recency = np.exp(-DECAY_RATE * interaction.days_ago)
        weight = strength * recency
        embeddings.append(content_embed)
        weights.append(weight)
    # Weighted average (np.average normalizes the weights itself)
    user_embedding = np.average(embeddings, axis=0, weights=weights)
    return user_embedding  # 384-dim vector
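To make the interplay between interaction strength and recency concrete, here is a minimal worked sketch of the weighting alone, using only the standard library. The strengths mirror the view=1, save=3, share=5 comment above; `DECAY_RATE = 0.1` per day and the helper names are assumptions chosen for illustration, not values taken from the production configuration.

```python
import math

# Strengths follow the view=1, save=3, share=5 comment in the function above.
INTERACTION_WEIGHTS = {"view": 1.0, "save": 3.0, "share": 5.0}
# Assumed decay rate per day, for illustration only.
DECAY_RATE = 0.1


def interaction_weight(kind: str, days_ago: float) -> float:
    """Interaction strength multiplied by exponential recency decay."""
    return INTERACTION_WEIGHTS[kind] * math.exp(-DECAY_RATE * days_ago)


def normalized_weights(history):
    """Turn (kind, days_ago) pairs into weights that sum to 1."""
    raw = [interaction_weight(kind, days) for kind, days in history]
    total = sum(raw)
    return [w / total for w in raw]


history = [("share", 30), ("save", 7), ("view", 0)]
weights = normalized_weights(history)
for (kind, days), w in zip(history, weights):
    print(f"{kind:5s} {days:3d} days ago -> {w:.3f}")
```

With this assumed decay rate, a share from 30 days ago ends up weighted below a view from today, while a week-old save carries the most weight. The decay rate therefore controls how quickly old strong signals yield to fresh weak ones.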
Cold Start Handling
New users with no interaction history:
def get_cold_start_embedding(user: User) -> np.ndarray:
    """Use onboarding preferences as seed"""
    if user.onboarding_preferences:
        # Embed preference text
        pref_text = " ".join(user.onboarding_preferences)
        return encoder.encode(pref_text)
    # Ultimate fallback: popular content average
    return get_popular_content_average()
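The fallback helper `get_popular_content_average()` is not defined in this document. One plausible reading, sketched below under stated assumptions, is the mean embedding of the most-interacted content items. The function name, the dictionary inputs, and the `top_n` cutoff are all hypothetical.

```python
from typing import Dict, List, Sequence


def popular_content_average(
    embeddings: Dict[str, Sequence[float]],
    interaction_counts: Dict[str, int],
    top_n: int = 100,
) -> List[float]:
    """Mean embedding of the top_n most-interacted content items."""
    # Rank content IDs by interaction count, most popular first.
    ranked = sorted(interaction_counts, key=interaction_counts.get, reverse=True)
    top_ids = [cid for cid in ranked if cid in embeddings][:top_n]
    if not top_ids:
        raise ValueError("no popular content with embeddings available")
    dim = len(embeddings[top_ids[0]])
    # Element-wise mean across the selected embeddings.
    return [
        sum(embeddings[cid][i] for cid in top_ids) / len(top_ids)
        for i in range(dim)
    ]
```

Because the result changes slowly, it could be computed on a schedule and cached rather than recomputed per request.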
Ranking via Similarity
from sklearn.metrics.pairwise import cosine_similarity

def rank_candidates(
    user_embedding: np.ndarray,
    candidate_embeddings: List[np.ndarray]
) -> List[float]:
    """Cosine similarity scoring: one score per candidate, in input order"""
    scores = cosine_similarity(
        user_embedding.reshape(1, -1),
        candidate_embeddings
    )[0]
    return scores.tolist()
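Scores alone do not give a ranked feed; the caller still has to sort candidates and keep the best ones. The dependency-free sketch below shows that step with a hand-written cosine similarity. The `top_k` helper and the `(content_id, embedding)` pair format are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    user_embedding: Sequence[float],
    candidates: Sequence[Tuple[str, Sequence[float]]],
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Return the k (content_id, score) pairs with the highest similarity."""
    scored = [(cid, cosine(user_embedding, emb)) for cid, emb in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


candidates = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
print(top_k([1.0, 0.0], candidates, k=2))
```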
Full Architecture (Future)
The production foundation model will be a custom transformer:
Model Architecture
Input: Sequence of user interactions
  [interaction_1, interaction_2, ..., interaction_n]
  Each interaction:
    [action_embed, content_embed, duration, device, time_of_day, ...]
Transformer:
  12-24 layers
  Multi-head self-attention
  Sparse attention for long sequences
  Sliding window training
Training Objective:
  Next-token prediction
  "Given user's history, predict next action"
Outputs:
  1. User Embedding (final hidden state, 128-dim)
  2. Content Embeddings (learned during training, 384-dim)
  3. Action Probabilities
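The outline mentions sliding window training and a next-action objective without specifying how training examples are cut from a user's history. One common interpretation, sketched here as an assumption rather than the planned implementation, pairs each action with a bounded window of the actions that preceded it. The function name and the `window` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    actions: Sequence[str], window: int
) -> List[Tuple[List[str], str]]:
    """Build (history, next_action) pairs with at most `window` history items."""
    pairs = []
    for i in range(1, len(actions)):
        start = max(0, i - window)  # clip the window at the sequence start
        pairs.append((list(actions[start:i]), actions[i]))
    return pairs


sequence = ["view", "view", "save", "share"]
for history, target in sliding_windows(sequence, window=2):
    print(history, "->", target)
```

Each pair is one training example: the model sees `history` and is scored on how well it predicts `target`.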
Training Strategy
Data:
- User interaction sequences (view, save, share, skip)
- Temporal context (time of day, day of week, planning phase)
- Device signals (mobile vs desktop, session length)
Loss Function:
def foundation_model_loss(
    predicted_next: Tensor,
    actual_next: Tensor,
    user_embed: Tensor,
    content_embed: Tensor
) -> Tensor:
    # Next-token prediction (main objective)
    reconstruction_loss = cross_entropy(predicted_next, actual_next)
    # Contrastive learning (quality signal)
    # Pull together: user + engaged content
    # Push apart: user + skipped content
    contrastive_loss = compute_contrastive(user_embed, content_embed)
    return reconstruction_loss + 0.1 * contrastive_loss
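`compute_contrastive` is left undefined above. As one possible shape for it, the sketch below uses a margin-based (hinge) formulation on cosine similarity: engaged content should score at least `margin` higher than skipped content for the same user. This is a substituted, simplified technique for illustration; the function name, the three-vector signature, and the `margin` value are assumptions, and a real version would operate on batched tensors.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def contrastive_margin_loss(
    user: Sequence[float],
    engaged: Sequence[float],
    skipped: Sequence[float],
    margin: float = 0.2,
) -> float:
    """Hinge loss: zero once engaged content beats skipped content by `margin`."""
    pos = cosine(user, engaged)   # similarity to pull up
    neg = cosine(user, skipped)   # similarity to push down
    return max(0.0, margin - pos + neg)
```

The loss is zero when the user embedding is already clearly closer to engaged content, so gradient only flows from pairs the model still confuses.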
Training Infrastructure:
- Google Colab (free GPU for 12 hours)
- Save checkpoints to HuggingFace Hub
- Incremental training on new data
Benefits Over Simple Model
- Long-range Dependencies: Learns from months of user history
- Temporal Patterns: Understands planning phase progression
- Nuanced Preferences: Captures subtle style evolutions
- Better Cold Start: Metadata-based embeddings cover content that has little or no interaction history
Cold Start Strategy (Full Model)
def compute_content_embedding(
    content: Content,
    interaction_count: int
) -> np.ndarray:
    # Metadata-based (always available)
    metadata_embed = embed_metadata(
        content.text,
        content.images,
        content.tags,
        content.vendor
    )
    # ID-based (learned from interactions)
    id_embed = lookup_learned_embedding(content.id)
    # Blend based on interaction count.
    # A plain sigmoid(count / 100) starts at 0.5 for count = 0, so rescale it
    # to start at 0 (metadata only) and approach 1 (learned embedding only).
    weight = 2 * sigmoid(interaction_count / 100) - 1  # 0 to 1
    return (1 - weight) * metadata_embed + weight * id_embed
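The blend weight decides how quickly the learned ID embedding takes over from the metadata embedding, so it is worth checking numerically. A plain logistic sigmoid of a non-negative count starts at 0.5 rather than 0, which would let an untrained ID embedding contribute half of the result for brand-new content; the rescaled form `2 * sigmoid(x) - 1` starts at 0. The sketch below compares the two. The helper names are assumptions for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def plain_weight(interaction_count: int) -> float:
    """sigmoid(count / 100): ranges from 0.5 toward 1 for counts >= 0."""
    return sigmoid(interaction_count / 100)


def rescaled_weight(interaction_count: int) -> float:
    """2 * sigmoid(count / 100) - 1: ranges from 0 toward 1 for counts >= 0."""
    return 2.0 * sigmoid(interaction_count / 100) - 1.0


for count in (0, 100, 500):
    print(count, round(plain_weight(count), 3), round(rescaled_weight(count), 3))
```

With the rescaled form, content with no interactions relies entirely on metadata, and the learned embedding dominates only after several hundred interactions.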
API Reference
Inference Service (Fly.io)
Endpoint: POST /embed
Request:
{
  "type": "content",
  "text": "Stunning botanical garden wedding venue in Melbourne",
  "images": ["url1", "url2"],
  "tags": ["venue", "garden", "melbourne"]
}
Response:
{
  "embedding": [0.123, -0.456, ...], // 384-dim
  "model_version": "sentence-bert-v1",
  "latency_ms": 45
}
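A client for this endpoint could look roughly like the following standard-library sketch. The host in `BASE_URL` is a placeholder, not the real deployment, and the helper names are hypothetical; only the path, the request fields, and the 384-dimension expectation come from the examples above. The request is built but not sent, so the sketch runs without network access.

```python
import json
import urllib.request
from typing import List

BASE_URL = "https://inference.example.com"  # placeholder host (assumption)


def build_embed_request(
    text: str, images: List[str], tags: List[str]
) -> urllib.request.Request:
    """Build (but do not send) a POST /embed request for a content item."""
    payload = {"type": "content", "text": text, "images": images, "tags": tags}
    return urllib.request.Request(
        f"{BASE_URL}/embed",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(body: str, expected_dim: int = 384) -> List[float]:
    """Extract the embedding and check that it has the expected dimension."""
    embedding = json.loads(body)["embedding"]
    if len(embedding) != expected_dim:
        raise ValueError(f"expected {expected_dim} dims, got {len(embedding)}")
    return embedding
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and feeding the decoded body to `parse_embed_response`.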
User Embedding
Endpoint: GET /user/:id/embedding
Response:
{
  "embedding": [0.789, -0.234, ...], // 384-dim
  "interaction_count": 127,
  "last_updated": "2025-10-11T10:30:00Z",
  "model_version": "simple-v1"
}
Performance
Phase 1 (Simple Model)
| Metric | Target | Actual |
|---|---|---|
| Embedding latency | < 100ms | ~50ms |
| Memory usage | < 256MB | ~100MB |
| Model size | < 100MB | 80MB |
| Cold start quality | Within 10% of warm | TBD |
Full Model (Future)
| Metric | Target |
|---|---|
| Embedding quality | Cosine sim > 0.85 |
| Next-action prediction | > 70% accuracy |
| Training time | < 12 hours (Colab) |
| Inference latency | < 200ms |
Migration Path
- Week 1-3: Simple model in production
- Week 4-6: Train custom model on real data
- Week 7: A/B test simple vs custom
- Week 8: Deploy custom if it wins
Success Criteria for Custom Model:
- Engagement > 10% improvement
- Latency stays < 500ms
- Cold start quality maintained
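The three success criteria can be turned into an explicit go/no-go check for the Week 8 decision. The sketch below is one way to encode them, under several assumptions not stated in this document: engagement improvement is measured as relative lift, latency is taken at the 95th percentile, and "maintained" means the custom model's cold start quality is not lower than the simple model's. The `ArmMetrics` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ArmMetrics:
    """Summary metrics for one arm of the A/B test (hypothetical fields)."""
    engagement_rate: float
    p95_latency_ms: float
    cold_start_quality: float


def custom_model_wins(
    simple: ArmMetrics,
    custom: ArmMetrics,
    min_lift: float = 0.10,
    max_latency_ms: float = 500.0,
) -> bool:
    """Apply the three success criteria to the A/B test results."""
    # Relative engagement improvement of custom over simple.
    lift = (custom.engagement_rate - simple.engagement_rate) / simple.engagement_rate
    return (
        lift > min_lift
        and custom.p95_latency_ms < max_latency_ms
        and custom.cold_start_quality >= simple.cold_start_quality
    )
```

A real decision would also need a significance test on the engagement difference, which this check leaves out.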
Related Components
- Orchestrator - Uses embeddings for ranking
- Discovery Agent - Uses embeddings for vendor search
- Personal Archivist - Uses style evolution signals