Foundation Model API
API for generating embeddings and user representations using the Personalization Foundation Model.
Base URL
https://ml-inference.vows.social (Fly.io service)
Endpoints
Embed Content
Generate embedding for wedding content.
Endpoint: POST /embed/content
Request:
{
  "text": "Beautiful botanical garden wedding venue in Melbourne",
  "images": ["https://example.com/image1.jpg"],
  "tags": ["venue", "garden", "melbourne"],
  "vendorType": "venue"
}
Response:
{
  "embedding": [0.123, -0.456, ...], // 384-dim vector
  "metadata": {
    "model": "sentence-bert-v1",
    "dimension": 384,
    "latency_ms": 45
  }
}
cURL Example:
curl -X POST https://ml-inference.vows.social/embed/content \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "text": "Modern minimalist wedding photographer",
    "tags": ["photographer", "modern"]
  }'
Get User Embedding
Generate or retrieve user embedding based on interaction history.
Endpoint: POST /embed/user
Request:
{
  "userId": "user-123",
  "interactions": [
    {
      "contentId": "content-1",
      "action": "save",
      "duration": 5.2,
      "timestamp": "2025-10-11T10:00:00Z"
    },
    {
      "contentId": "content-2",
      "action": "view",
      "duration": 3.5,
      "timestamp": "2025-10-11T10:05:00Z"
    }
  ],
  "decayRate": 0.01 // Optional, default 0.01
}
Response:
{
  "embedding": [0.789, -0.234, ...], // 384-dim vector
  "metadata": {
    "model": "simple-aggregation-v1",
    "interactionCount": 127,
    "dominantStyle": "modern_minimalist",
    "confidence": 0.85,
    "latency_ms": 120
  }
}
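For intuition, the decayRate parameter suggests that simple-aggregation-v1 weights each interaction's content embedding by recency (and likely by action type) before averaging. A minimal TypeScript sketch of that idea; the action weights and exact formula are illustrative assumptions, not the service's actual implementation:

// Illustrative only: exponential-decay aggregation of content embeddings
// into a user embedding. Action weights and the formula are assumptions.
interface Interaction {
  embedding: number[];        // 384-dim content embedding
  action: 'view' | 'save' | 'share';
  ageDays: number;            // days since the interaction
}

const ACTION_WEIGHTS: Record<Interaction['action'], number> = {
  view: 1.0,
  save: 3.0,
  share: 5.0,
};

function aggregateUserEmbedding(
  interactions: Interaction[],
  decayRate = 0.01,           // mirrors the API's decayRate default
  dim = 384,
): number[] {
  const sum = new Array(dim).fill(0);
  let totalWeight = 0;

  for (const { embedding, action, ageDays } of interactions) {
    const weight = ACTION_WEIGHTS[action] * Math.exp(-decayRate * ageDays);
    for (let i = 0; i < dim; i++) sum[i] += weight * embedding[i];
    totalWeight += weight;
  }

  // Weighted mean, then L2-normalize so cosine similarity behaves well.
  const mean = sum.map((v) => (totalWeight > 0 ? v / totalWeight : 0));
  const norm = Math.hypot(...mean) || 1;
  return mean.map((v) => v / norm);
}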
Retrieve Cached Embedding
Retrieve a previously computed user embedding from the cache.
Endpoint: GET /embed/user/:userId
Response:
{
  "embedding": [0.789, -0.234, ...],
  "metadata": {
    "cachedAt": "2025-10-11T10:30:00Z",
    "validUntil": "2025-10-11T11:30:00Z",
    "interactionCount": 127
  }
}
Cache Invalidation:
- User embedding is cached for 1 hour
- Invalidated on significant interactions (save, share)
- Force refresh with ?refresh=true
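For example, to bypass the cache and force a recompute:
curl -X GET "https://ml-inference.vows.social/embed/user/user-123?refresh=true" \
  -H "Authorization: Bearer $API_KEY"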
Batch Embed Content
Embed multiple content items in a single request.
Endpoint: POST /embed/batch
Request:
{
  "items": [
    {
      "id": "content-1",
      "text": "Garden wedding venue Melbourne",
      "tags": ["venue", "garden"]
    },
    {
      "id": "content-2",
      "text": "Modern wedding photographer Sydney",
      "tags": ["photographer", "modern"]
    }
  ]
}
Response:
{
  "embeddings": {
    "content-1": [0.1, -0.2, ...],
    "content-2": [0.3, -0.4, ...]
  },
  "metadata": {
    "batchSize": 2,
    "totalLatency_ms": 80,
    "avgLatency_ms": 40
  }
}
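cURL Example (same request as above):
curl -X POST https://ml-inference.vows.social/embed/batch \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "items": [
      {"id": "content-1", "text": "Garden wedding venue Melbourne", "tags": ["venue", "garden"]},
      {"id": "content-2", "text": "Modern wedding photographer Sydney", "tags": ["photographer", "modern"]}
    ]
  }'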
Cold Start Embedding
Generate embedding for new user with no history.
Endpoint: POST /embed/cold-start
Request:
{
  "userId": "user-new-123",
  "preferences": {
    "styles": ["modern", "minimalist"],
    "region": "Melbourne",
    "budget": "medium",
    "weddingDate": "2026-06-15"
  }
}
Response:
{
  "embedding": [0.5, -0.3, ...],
  "metadata": {
    "method": "preference_based",
    "confidence": 0.6,
    "note": "Will improve with user interactions"
  }
}
Compute Similarity
Calculate similarity between embeddings.
Endpoint: POST /embed/similarity
Request:
{
  "embedding1": [0.1, -0.2, ...],
  "embedding2": [0.3, -0.4, ...],
  "metric": "cosine" // cosine, euclidean, or dot
}
Response:
{
  "similarity": 0.87,
  "metric": "cosine",
  "interpretation": "highly_similar" // highly_similar, similar, somewhat_similar, different
}
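For reference, cosine similarity can also be computed client-side when both vectors are already in hand; a minimal TypeScript sketch (not the SDK's implementation):

// Cosine similarity: dot(a, b) / (||a|| * ||b||). Returns a value in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions must match');
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
}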
Model Versions
Current Models
Phase 1 (Simple):
- sentence-bert-v1 - Pre-trained Sentence-BERT
  - Dimension: 384
  - No training required
  - Fast inference (~50ms)
Phase 2 (Custom):
- foundation-model-v1 - Custom transformer
  - Dimension: 128 (user) / 384 (content)
  - Trained on interaction data
  - Better personalization
Model Selection
Specify model in request:
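The exact field name isn't confirmed in this document; a plausible shape adds a model field alongside the usual payload:
{
  "text": "Modern wedding venue Melbourne",
  "tags": ["venue", "modern"],
  "model": "foundation-model-v1"
}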
Or use header:
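The header name below is an assumption, shown for illustration only:
curl -X POST https://ml-inference.vows.social/embed/content \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Model: foundation-model-v1" \
  -d '{"text": "Modern wedding venue", "tags": ["venue"]}'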
Advanced Features
Style Classification
Get style classification along with embedding.
Request:
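The request shape for style classification isn't spelled out here; a plausible sketch, assuming a hypothetical includeStyle option on POST /embed/content:
{
  "text": "Romantic garden wedding venue with string lights",
  "tags": ["venue", "garden"],
  "includeStyle": true
}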
Response:
{
  "embedding": [...],
  "style": {
    "primary": "romantic",
    "secondary": "garden",
    "confidence": 0.82,
    "scores": {
      "romantic": 0.82,
      "modern": 0.15,
      "rustic": 0.12,
      "bohemian": 0.08
    }
  }
}
Explain Embedding
Get human-readable explanation of what the embedding captures.
Request:
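The request shape isn't spelled out here either; one plausible sketch, assuming a hypothetical explain option on POST /embed/user:
{
  "userId": "user-123",
  "explain": true
}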
Response:
{
  "explanation": {
    "dominantFeatures": [
      "Modern aesthetic with clean lines",
      "Garden/outdoor setting preference",
      "Natural light emphasis"
    ],
    "styleProfile": "modern_garden",
    "vendorTypeAffinity": {
      "venue": 0.9,
      "photographer": 0.7,
      "florist": 0.6
    }
  }
}
Performance
Latency Targets
| Operation | Target | Typical |
|---|---|---|
| Single embed | < 100ms | ~50ms |
| Batch embed (10) | < 200ms | ~150ms |
| User embedding | < 150ms | ~120ms |
| Similarity | < 10ms | ~5ms |
Optimization Tips
- Batch requests - Embed multiple items at once
- Cache embeddings - Content embeddings rarely change (see the sketch after this list)
- Use simple model - Sentence-BERT for speed
- Compress vectors - Use PCA for storage
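The caching tip in practice: a minimal sketch of an in-memory content-embedding cache in front of the SDK client from the usage examples below. The cache wrapper itself is illustrative, not part of the SDK.

import { VowsMLClient } from '@vows/ml-sdk';

const client = new VowsMLClient({ apiKey: process.env.API_KEY });

// Content embeddings rarely change, so cache them by content id and
// only call the embedding service on a miss.
const embeddingCache = new Map<string, number[]>();

async function getContentEmbedding(
  id: string,
  text: string,
  tags: string[]
): Promise<number[]> {
  const cached = embeddingCache.get(id);
  if (cached) return cached;

  const { embedding } = await client.embed.content({ text, tags });
  embeddingCache.set(id, embedding);
  return embedding;
}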
Error Handling
Common Errors
Invalid Input:
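The exact error code string is an assumption; the shape mirrors the other errors:
{
  "error": {
    "code": "INVALID_INPUT",
    "message": "Request body failed validation"
  }
}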
Model Not Found:
{
  "error": {
    "code": "MODEL_NOT_FOUND",
    "message": "Model 'custom-v2' not available",
    "availableModels": ["sentence-bert-v1", "foundation-model-v1"]
  }
}
Timeout:
{
  "error": {
    "code": "INFERENCE_TIMEOUT",
    "message": "Model inference took too long",
    "timeout_ms": 5000
  }
}
Usage Examples
TypeScript
import { VowsMLClient } from '@vows/ml-sdk';
const client = new VowsMLClient({
  apiKey: process.env.API_KEY
});

// Embed content
const contentEmbedding = await client.embed.content({
  text: "Modern wedding venue Melbourne",
  tags: ["venue", "modern"]
});

// Get user embedding
const userEmbedding = await client.embed.user({
  userId: "user-123",
  interactions: recentInteractions
});

// Compute similarity
const similarity = await client.embed.similarity(
  userEmbedding.embedding,
  contentEmbedding.embedding
);
console.log(`Similarity: ${similarity}`);
Python
import os
from vows_ml import VowsMLClient
client = VowsMLClient(api_key=os.environ['API_KEY'])
# Embed content
embedding = client.embed.content(
    text="Garden wedding photographer",
    tags=["photographer", "garden"]
)

# Get user embedding
user_emb = client.embed.user(
    user_id="user-123",
    interactions=recent_interactions
)

# Find similar users
similar = client.embed.find_similar(
    embedding=user_emb,
    collection="users",
    limit=10
)
cURL
# Embed content
curl -X POST https://ml-inference.vows.social/embed/content \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Modern wedding venue", "tags": ["venue"]}'

# Get user embedding
curl -X POST https://ml-inference.vows.social/embed/user \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "user-123",
    "interactions": [...]
  }'

# Compute similarity
curl -X POST https://ml-inference.vows.social/embed/similarity \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "embedding1": [0.1, -0.2, ...],
    "embedding2": [0.3, -0.4, ...],
    "metric": "cosine"
  }'
Related APIs
- Orchestrator API - Uses embeddings for ranking
- Vector Store API - Stores and searches embeddings
- Discovery API - Uses embeddings for vendor search