# Free Tier Setup

Run the entire Vows Social AI platform on free-tier cloud services. Perfect for MVP validation.

For complete details, see docs/FREE_TIER_IMPLEMENTATION.md.

## Free Tier Services
| Service | Free Tier | Usage | Cost at Scale |
|---|---|---|---|
| Qdrant Cloud | 1GB storage | Vector DB | $0.10/GB after |
| Fly.io | 3 VMs, 256MB | ML inference | $0.02/hour after |
| Supabase | 500MB DB, 2GB bandwidth | PostgreSQL | $25/mo after |
| Cloudflare Workers | 100K req/day | API endpoints | $5/10M req after |
| Cloudflare KV | 100K reads/day | Thompson Sampling state | $0.50/1M after |
| Cloudflare R2 | 10GB storage | Model checkpoints | $0.015/GB after |
| Google Colab | Free GPU (12hrs) | Model training | $10/mo for Pro |
Total Monthly Cost: $0 for the first 1,000 users

## Setup Guide

### 1. Qdrant Cloud (Vector Database)

Sign up: https://cloud.qdrant.io

Free Tier: 1GB storage, unlimited queries
Setup:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(
    url="https://your-cluster.qdrant.io",
    api_key="your-api-key",
)

# Create collections (use VectorParams; a bare dict is interpreted
# as a named-vectors mapping and fails validation)
client.create_collection(
    collection_name="content_embeddings",
    vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE),
)
client.create_collection(
    collection_name="user_embeddings",
    vectors_config=models.VectorParams(size=128, distance=models.Distance.COSINE),
)
```
Optimization:

- Use 384-dim Sentence-BERT (not 768-dim)
- Compress old embeddings
- Archive inactive content
- 1GB = ~2.5K content items + 1K users
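The "compress old embeddings" tactic can be sketched with half-precision packing, which halves the bytes per archived vector (a stdlib sketch; Qdrant also offers built-in scalar quantization that achieves a similar saving server-side):

```python
import struct

def pack_f32(vec):
    """Pack a vector as float32 (4 bytes per dimension)."""
    return struct.pack(f"{len(vec)}f", *vec)

def pack_f16(vec):
    """Pack as float16: half the bytes, ~3 significant digits,
    which is plenty for cold/archived embeddings."""
    return struct.pack(f"{len(vec)}e", *vec)

vec = [0.1] * 384
full, half = pack_f32(vec), pack_f16(vec)
print(len(full), len(half))  # 1536 768
```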
### 2. Fly.io (ML Inference)

Sign up: https://fly.io

Free Tier: 3 shared-CPU VMs, 256MB RAM each
Setup:

```bash
# Install flyctl
brew install flyctl
fly auth login

# Deploy ML service
cd services/ml-inference
fly launch
fly deploy

# Configure secrets
fly secrets set QDRANT_API_KEY=xxx
fly secrets set SUPABASE_URL=xxx
```
Optimization:

- Use 1 VM initially (save 2 for redundancy)
- Load models lazily
- Cache embeddings in Cloudflare KV
- Use Workers AI for edge inference later
- 256MB RAM = Sentence-BERT only (no custom model)
### 3. Supabase (PostgreSQL)

Sign up: https://supabase.com

Free Tier: 500MB database, 2GB bandwidth/month
Setup:

```bash
# Create a project at https://supabase.com/dashboard

# Run migrations
psql $SUPABASE_URL < migrations/001_initial_schema.sql

# Configure Row Level Security
psql $SUPABASE_URL < migrations/002_rls_policies.sql
```
Schema Optimization:

```sql
-- Store only essential user data
CREATE TABLE users (
    id UUID PRIMARY KEY,
    created_at TIMESTAMP,
    preferences JSONB
);

-- Store interactions efficiently
CREATE TABLE interactions (
    id UUID PRIMARY KEY,
    user_id UUID,
    content_id UUID,
    action VARCHAR(20),
    duration INT,
    created_at TIMESTAMP
);

-- Archive old interactions monthly
-- Keep only last 90 days hot
```
Optimization:

- Archive interactions >90 days
- Store embeddings in Qdrant (not Supabase)
- Use JSONB efficiently
- Index strategically
- 500MB = ~100K interactions + 10K content items
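The monthly archive job amounts to copy-then-delete on the `interactions` table. SQLite stands in for Postgres here so the sketch runs anywhere; against Supabase the same two statements would run via `psql` or a Postgres driver:

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interactions (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.execute("CREATE TABLE interactions_archive (id INTEGER PRIMARY KEY, created_at TEXT)")

now = datetime(2025, 6, 1)
rows = [(1, (now - timedelta(days=200)).isoformat()),   # cold row
        (2, (now - timedelta(days=10)).isoformat())]    # hot row
conn.executemany("INSERT INTO interactions VALUES (?, ?)", rows)

# ISO-8601 strings compare lexicographically, so a plain < works
cutoff = (now - timedelta(days=90)).isoformat()

# Copy rows older than 90 days into the archive, then drop them from hot
conn.execute(
    "INSERT INTO interactions_archive "
    "SELECT * FROM interactions WHERE created_at < ?", (cutoff,))
conn.execute("DELETE FROM interactions WHERE created_at < ?", (cutoff,))

hot = conn.execute("SELECT COUNT(*) FROM interactions").fetchone()[0]
cold = conn.execute("SELECT COUNT(*) FROM interactions_archive").fetchone()[0]
print(hot, cold)  # 1 1
```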
### 4. Cloudflare Workers

Sign up: https://dash.cloudflare.com

Free Tier: 100K requests/day, 10ms CPU time
Setup:

```bash
npm install -g wrangler
wrangler login

cd workers/orchestrator
wrangler dev     # Local development
wrangler deploy  # Deploy to production ("wrangler publish" on Wrangler v2 and earlier)

# Configure secrets
wrangler secret put QDRANT_API_KEY
wrangler secret put SUPABASE_URL
wrangler secret put ML_SERVICE_URL
```
Optimization:

- Cache user embeddings in KV
- Batch Qdrant queries
- Use Durable Objects for stateful ranking
- Keep CPU time <10ms per request
- 100K req/day ≈ 4K requests/hour = hundreds of active users
### 5. Cloudflare KV

Free Tier: 100K reads/day, 1K writes/day, 1GB storage

Usage:
```js
// Store Thompson Sampling parameters
await KV.put(`thompson:${contentId}:${userId}`, JSON.stringify({
  alpha: 1.0,
  beta: 1.0
}));

// Cache user embeddings
await KV.put(`user_embedding:${userId}`, embedding, {
  expirationTtl: 3600 // 1 hour
});
```
Optimization:

- Cache hot embeddings
- Store Thompson Sampling params only
- Use TTL aggressively
- Batch updates (the 1K writes/day cap, not storage, is the tight limit)
- 1GB = ~1M Thompson Sampling param sets
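The `alpha`/`beta` pair stored per key parameterizes a Beta posterior over engagement probability: ranking samples a score from it, and each observed interaction bumps one parameter. A minimal Python sketch of the mechanics (the production version would live in the Worker):

```python
import random

def thompson_score(alpha: float, beta: float, rng: random.Random) -> float:
    """Sample an engagement estimate from the Beta(alpha, beta) posterior."""
    return rng.betavariate(alpha, beta)

def update(alpha: float, beta: float, engaged: bool) -> tuple[float, float]:
    """One Bernoulli observation: success bumps alpha, failure bumps beta."""
    return (alpha + 1, beta) if engaged else (alpha, beta + 1)

params = (1.0, 1.0)  # uniform prior, matching the KV payload above
for engaged in [True, True, False]:
    params = update(*params, engaged)

rng = random.Random(42)
score = thompson_score(*params, rng)
print(params)  # (3.0, 2.0)
```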
### 6. Cloudflare R2

Free Tier: 10GB storage, 1M reads/month

Usage: store production model checkpoints uploaded from Colab training runs; R2 exposes an S3-compatible API, so any S3 client can read and write them.

Optimization:

- Store only production models
- Delete old checkpoints
- Compress models
- 10GB = ~10-20 PyTorch models
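The "delete old checkpoints" rule can be a pure key-sorting job, assuming date-stamped object keys (the `checkpoints/model-YYYYMMDD.pt` format is an assumption for this sketch); the returned keys would then be deleted through R2's S3-compatible API:

```python
def checkpoints_to_delete(keys: list[str], keep: int = 2) -> list[str]:
    """Return the R2 object keys to delete, keeping the `keep` newest.

    Date-stamped keys like 'checkpoints/model-20250601.pt' sort
    lexicographically by recency, so a reverse sort suffices.
    """
    return sorted(keys, reverse=True)[keep:]

keys = [
    "checkpoints/model-20250301.pt",
    "checkpoints/model-20250401.pt",
    "checkpoints/model-20250501.pt",
]
print(checkpoints_to_delete(keys, keep=2))  # ['checkpoints/model-20250301.pt']
```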
### 7. Google Colab (Training)

Free Tier: T4 GPU, 12-hour sessions

Usage:
```python
# In a Colab notebook
from google.colab import drive
drive.mount('/content/drive')

# Train model (FoundationModel, trainer, and train_loader come from the
# project's training code)
model = FoundationModel()
trainer.fit(model, train_loader)

# Save to R2 (upload_to_r2 is a project helper)
upload_to_r2(model.state_dict())
```
Optimization:

- Train during off-peak hours
- Use checkpointing (12hr limit)
- Mixed precision training
- Distill models
- Free tier sufficient for Phase 1
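Checkpointing around the 12-hour cutoff amounts to: resume from the last saved epoch, save atomically after each one so a mid-write disconnect can't corrupt the file. A stdlib sketch using `pickle` as a stand-in for `torch.save` (in Colab the path would point at the mounted Drive):

```python
import os
import pickle
import tempfile

def save_checkpoint(path, epoch, state):
    """Write to a temp file, then atomically swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump({"epoch": epoch, "state": state}, f)
    os.replace(tmp, path)

def load_checkpoint(path):
    """Resume from the last checkpoint, or start fresh at epoch 0."""
    if not os.path.exists(path):
        return {"epoch": 0, "state": None}
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.mkdtemp(), "ckpt.pkl")
ckpt = load_checkpoint(path)                # fresh session: epoch 0
for epoch in range(ckpt["epoch"], 3):       # picks up where we left off
    save_checkpoint(path, epoch + 1, {"loss": 1.0 / (epoch + 1)})
print(load_checkpoint(path)["epoch"])  # 3
```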
## Resource Monitoring

### Qdrant (1GB limit)

```python
# Exact storage usage is shown in the Qdrant Cloud dashboard;
# point counts from the client are a useful proxy
for name in ["content_embeddings", "user_embeddings"]:
    info = client.get_collection(collection_name=name)
    print(name, info.points_count)
```
### Supabase (500MB limit)

```sql
-- Check database size
SELECT pg_size_pretty(pg_database_size('postgres'));

-- Check largest tables
SELECT schemaname, tablename,
       pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 10;
```
### Cloudflare (Daily limits)

Request counts, KV operations, and CPU time are shown per Worker in the Cloudflare dashboard; free-tier quotas reset daily.
## Scaling Strategy

### At 1K Users (~$0/month)

- ✓ All services free tier
- ✓ Sentence-BERT embeddings only
- ✓ Simple Thompson Sampling
- ✓ Manual Instagram monitoring
### At 5K Users (~$25/month)
- Upgrade Supabase to Pro ($25/mo)
- Still free: Qdrant, Fly.io, Cloudflare
- Consider custom foundation model
### At 10K Users (~$100/month)
- Upgrade Fly.io (~$50/mo for better VMs)
- Upgrade Qdrant (~$25/mo for 5GB)
- Cloudflare still free
- Full multi-agent system
### At 100K Users (~$500-1K/month)
- Qdrant: ~$100/mo
- Fly.io: ~$200/mo
- Cloudflare: ~$100/mo
- Supabase: ~$100/mo
- Workers AI for edge inference
## Free Tier Limits

### What's Constrained?

- Qdrant 1GB
  - Limits the content corpus to ~2.5K items
  - Solution: archive old content, curate for quality
- Fly.io 256MB RAM
  - Can't run large custom models
  - Solution: start with Sentence-BERT, upgrade when revenue justifies it
- Supabase 500MB
  - Limits interaction history
  - Solution: archive to R2 monthly
- Cloudflare 100K req/day
  - Caps daily active users
  - Solution: upgrade to paid ($5 per 10M requests)
### What's NOT Constrained?

- ✓ Cloudflare KV: effectively unlimited for our use case
- ✓ Cloudflare R2: 10GB is plenty for model checkpoints
- ✓ Google Colab: sufficient for training
- ✓ Compute latency: Workers are fast enough
## Cost Projections
| Users | Qdrant | Fly.io | Supabase | Cloudflare | Total/mo |
|---|---|---|---|---|---|
| 1K | $0 | $0 | $0 | $0 | $0 |
| 5K | $0 | $0 | $25 | $0 | $25 |
| 10K | $25 | $50 | $25 | $0 | $100 |
| 50K | $50 | $100 | $50 | $50 | $250 |
| 100K | $100 | $200 | $100 | $100 | $500 |
Revenue assumptions: at $5-10 per paying user/month, roughly 100 paying users ($500-1K MRR) cover the infrastructure bill even at 100K-user scale.
## Troubleshooting

### "Qdrant quota exceeded"
- Archive old embeddings
- Reduce embedding dimensions
- Delete unused content
### "Fly.io out of memory"
- Reduce model size
- Use model quantization
- Upgrade to paid tier
### "Supabase storage full"
- Archive old interactions
- Move large blobs to R2
- Upgrade to Pro
### "Workers CPU limit exceeded"
- Reduce computation per request
- Cache more aggressively
- Use Durable Objects for stateful work
## Next Steps
- Quick Start Guide - Set everything up
- Architecture - Understand the system
- Phase 1 Implementation - Start building
Complete details: docs/FREE_TIER_IMPLEMENTATION.md