Foundation Model

The Personalization Foundation Model is the single source of truth for all user and content understanding in Vows Social. It learns from interaction sequences to create embeddings that power personalized recommendations.


Overview

Purpose: Unified model for user and content representation

Architecture: Pinterest-style Two-Tower transformer trained on interaction sequences (full version) or Sentence-BERT + lightweight layer (Phase 1)

Key Benefit: One model serves all downstream tasks - ranking, discovery, quality assessment, and personalization.


Phase 1: Simplified Approach

Following RFC-0001, Phase 1 uses a lightweight implementation:

Content Embeddings

Pre-trained Sentence-BERT (all-MiniLM-L6-v2):

- 384-dimensional embeddings
- No training required
- High quality out of the box
- 80MB model size

from sentence_transformers import SentenceTransformer

# Load pre-trained model (free!)
encoder = SentenceTransformer('all-MiniLM-L6-v2')

# Embed wedding content
content_embedding = encoder.encode(
    f"{title} {description} {vendor_name} {tags}"
)
# Returns: 384-dim vector

User Embeddings

Weighted Aggregation of interaction history:

import numpy as np
from typing import List

def get_user_embedding(user_interactions: List[Interaction]) -> np.ndarray:
    """
    Aggregate content embeddings weighted by:
    - Interaction strength (view < save < share)
    - Recency (exponential decay)
    """
    embeddings = []
    weights = []

    for interaction in user_interactions:
        # Get content embedding
        content_embed = get_content_embedding(interaction.content_id)

        # Compute weight
        strength = INTERACTION_WEIGHTS[interaction.type]  # view=1, save=3, share=5
        recency = np.exp(-DECAY_RATE * interaction.days_ago)
        weight = strength * recency

        embeddings.append(content_embed)
        weights.append(weight)

    # Weighted average
    weights = np.array(weights) / sum(weights)
    user_embedding = np.average(embeddings, axis=0, weights=weights)

    return user_embedding  # 384-dim vector
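
The INTERACTION_WEIGHTS and DECAY_RATE constants referenced above are not defined in this doc. A plausible configuration consistent with the inline comments (the decay value is an assumption, chosen to give a roughly 30-day half-life):

# Weights from the comment above: view=1, save=3, share=5
INTERACTION_WEIGHTS = {
    "view": 1.0,
    "save": 3.0,
    "share": 5.0,
}

# Assumption: exp(-DECAY_RATE * 30) ≈ 0.5, i.e. a ~30-day half-life
DECAY_RATE = 0.023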

Cold Start Handling

New users with no interaction history:

def get_cold_start_embedding(user: User) -> np.ndarray:
    """Use onboarding preferences as seed"""
    if user.onboarding_preferences:
        # Embed preference text
        pref_text = " ".join(user.onboarding_preferences)
        return encoder.encode(pref_text)

    # Ultimate fallback: popular content average
    return get_popular_content_average()
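
get_popular_content_average is not shown above; a minimal sketch, assuming a hypothetical helper get_top_content_embeddings that returns embeddings of the most-viewed content:

import numpy as np

def get_popular_content_average(top_n: int = 100) -> np.ndarray:
    """Mean embedding of the most popular content items."""
    popular_embeddings = get_top_content_embeddings(top_n)  # assumed helper
    return np.mean(popular_embeddings, axis=0)  # 384-dim vector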

Ranking via Similarity

def rank_candidates(
    user_embedding: np.ndarray,
    candidate_embeddings: List[np.ndarray]
) -> np.ndarray:
    """Cosine similarity scoring: one score per candidate, higher = closer match"""
    from sklearn.metrics.pairwise import cosine_similarity

    scores = cosine_similarity(
        user_embedding.reshape(1, -1),
        candidate_embeddings
    )[0]

    return scores
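
Putting the Phase 1 pieces together, a sketch of how a feed request might be served (load_recent_interactions, load_user, and load_candidates are hypothetical data-access helpers):

# 1. Build the user embedding from recent interactions
interactions = load_recent_interactions(user_id)
user_embedding = (
    get_user_embedding(interactions)
    if interactions
    else get_cold_start_embedding(load_user(user_id))
)

# 2. Score a candidate pool and keep the top results
candidate_ids, candidate_embeddings = load_candidates()
scores = rank_candidates(user_embedding, candidate_embeddings)

top_k = 20
ranked = sorted(zip(candidate_ids, scores), key=lambda x: x[1], reverse=True)[:top_k]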

Full Architecture (Future)

The production foundation model will be a custom transformer:

Model Architecture

Input: Sequence of user interactions
       [interaction_1, interaction_2, ..., interaction_n]

Each interaction:
       [action_embed, content_embed, duration, device, time_of_day, ...]

Transformer:
       12-24 layers
       Multi-head self-attention
       Sparse attention for long sequences
       Sliding window training

Training Objective:
       Next-token prediction
       "Given user's history, predict next action"

Outputs:
       1. User Embedding (final hidden state, 128-dim)
       2. Content Embeddings (learned during training, 384-dim)
       3. Action Probabilities
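
A minimal PyTorch sketch of the model described above. The class name, vocabulary sizes, context feature width, and the input-projection step are assumptions; sparse attention and sliding-window training are omitted:

import torch
import torch.nn as nn

class FoundationModel(nn.Module):
    """Sequence transformer over user interactions (sketch)."""

    def __init__(self, num_actions=4, num_contents=100_000,
                 content_dim=384, hidden_dim=128, context_dim=8,
                 num_layers=12, num_heads=4):
        super().__init__()
        # Learned content embeddings (output 2 above, 384-dim)
        self.content_embed = nn.Embedding(num_contents, content_dim)
        self.action_embed = nn.Embedding(num_actions, hidden_dim)
        # Project [action_embed | content_embed | context features] to model width
        self.input_proj = nn.Linear(hidden_dim + content_dim + context_dim, hidden_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Next-action prediction head (output 3 above)
        self.action_head = nn.Linear(hidden_dim, num_actions)

    def forward(self, action_ids, content_ids, context):
        # action_ids, content_ids: (batch, seq); context: (batch, seq, context_dim)
        x = torch.cat([self.action_embed(action_ids),
                       self.content_embed(content_ids),
                       context], dim=-1)
        # Causal mask so each step only attends to earlier interactions
        seq_len = action_ids.size(1)
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=x.device),
            diagonal=1)
        h = self.encoder(self.input_proj(x), mask=mask)
        user_embedding = h[:, -1, :]         # output 1: final hidden state, 128-dim
        action_logits = self.action_head(h)  # output 3: next-action logits per step
        return user_embedding, action_logits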

Training Strategy

Data:

- User interaction sequences (view, save, share, skip)
- Temporal context (time of day, day of week, planning phase)
- Device signals (mobile vs desktop, session length)
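
A hypothetical record schema for one step in a training sequence, covering these signals (all field names and example values are illustrative):

from dataclasses import dataclass
from datetime import datetime

@dataclass
class InteractionEvent:
    content_id: str
    action: str            # "view" | "save" | "share" | "skip"
    timestamp: datetime    # source of time-of-day / day-of-week features
    planning_phase: str    # e.g. "venue", "vendors", "details"
    device: str            # "mobile" | "desktop"
    session_length_s: int
    duration_s: float      # dwell time on the content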

Loss Function:

import torch.nn.functional as F
from torch import Tensor

def foundation_model_loss(
    predicted_next: Tensor,
    actual_next: Tensor,
    user_embed: Tensor,
    content_embed: Tensor
) -> Tensor:
    # Next-token prediction (main objective)
    reconstruction_loss = F.cross_entropy(predicted_next, actual_next)

    # Contrastive learning (quality signal)
    # Pull together: user + engaged content
    # Push apart: user + skipped content
    contrastive_loss = compute_contrastive(user_embed, content_embed)

    return reconstruction_loss + 0.1 * contrastive_loss
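
compute_contrastive is not defined in this doc. A minimal in-batch InfoNCE-style sketch, using in-batch negatives rather than explicit skipped-content negatives, and assuming both embeddings have been projected to a shared dimension upstream:

import torch
import torch.nn.functional as F
from torch import Tensor

def compute_contrastive(user_embed: Tensor, content_embed: Tensor,
                        temperature: float = 0.07) -> Tensor:
    """Row i of user_embed should match row i of content_embed;
    all other rows in the batch act as negatives."""
    user = F.normalize(user_embed, dim=-1)
    content = F.normalize(content_embed, dim=-1)
    logits = user @ content.t() / temperature        # (batch, batch) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)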

Training Infrastructure:

- Google Colab (free GPU for 12 hours)
- Save checkpoints to HuggingFace Hub
- Incremental training on new data

Benefits Over Simple Model

  1. Long-range Dependencies: Learns from months of user history
  2. Temporal Patterns: Understands planning phase progression
  3. Nuanced Preferences: Captures subtle style evolutions
  4. Better Cold Start: Metadata-based embeddings for new users

Cold Start Strategy (Full Model)

import numpy as np

def compute_content_embedding(
    content: Content,
    interaction_count: int
) -> np.ndarray:
    # Metadata-based (always available)
    metadata_embed = embed_metadata(
        content.text,
        content.images,
        content.tags,
        content.vendor
    )

    # ID-based (learned from interactions)
    id_embed = lookup_learned_embedding(content.id)

    # Blend based on interaction count: sigmoid starts at 0.5 with no
    # interactions and approaches 1 as interactions accumulate
    weight = 1 / (1 + np.exp(-interaction_count / 100))

    return (1 - weight) * metadata_embed + weight * id_embed

API Reference

Inference Service (Fly.io)

Endpoint: POST /embed

Request:

{
  "type": "content",
  "text": "Stunning botanical garden wedding venue in Melbourne",
  "images": ["url1", "url2"],
  "tags": ["venue", "garden", "melbourne"]
}

Response:

{
  "embedding": [0.123, -0.456, ...],  // 384-dim
  "model_version": "sentence-bert-v1",
  "latency_ms": 45
}

User Embedding

Endpoint: GET /user/:id/embedding

Response:

{
  "embedding": [0.789, -0.234, ...],  // 384-dim
  "interaction_count": 127,
  "last_updated": "2025-10-11T10:30:00Z",
  "model_version": "simple-v1"
}
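
A minimal client sketch for both endpoints using the requests library (the service hostname and user id are placeholders):

import requests

BASE_URL = "https://personalization.example.fly.dev"  # placeholder hostname

# Embed a piece of content
resp = requests.post(f"{BASE_URL}/embed", json={
    "type": "content",
    "text": "Stunning botanical garden wedding venue in Melbourne",
    "images": ["url1", "url2"],
    "tags": ["venue", "garden", "melbourne"],
})
content_embedding = resp.json()["embedding"]  # 384-dim list of floats

# Fetch a user's current embedding
user_resp = requests.get(f"{BASE_URL}/user/123/embedding")
user_embedding = user_resp.json()["embedding"]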


Performance

Phase 1 (Simple Model)

| Metric             | Target             | Actual |
|--------------------|--------------------|--------|
| Embedding latency  | < 100ms            | ~50ms  |
| Memory usage       | < 256MB            | ~100MB |
| Model size         | < 100MB            | 80MB   |
| Cold start quality | Within 10% of warm | TBD    |

Full Model (Future)

| Metric                 | Target             |
|------------------------|--------------------|
| Embedding quality      | Cosine sim > 0.85  |
| Next-action prediction | > 70% accuracy     |
| Training time          | < 12 hours (Colab) |
| Inference latency      | < 200ms            |

Migration Path

- Week 1-3: Simple model in production
- Week 4-6: Train custom model on real data
- Week 7: A/B test simple vs custom
- Week 8: Deploy custom if it wins

Success Criteria for Custom Model:

- Engagement > 10% improvement
- Latency stays < 500ms
- Cold start quality maintained


Resources