Multi-Agent Collaboration
Agent Hierarchy
graph TB
    User[User Request] --> Orchestrator

    subgraph "Orchestrator (Lead Agent)"
        Orchestrator[Orchestrator]
        Thompson[Thompson Sampling<br/>✓ KEPT - Core Algorithm]
        RLlib[Ray RLlib<br/>Multi-Agent PPO]
    end

    subgraph "Specialized Agents"
        Discovery[Discovery Agent]
        Quality[Quality Guardian]
        Archivist[Personal Archivist]
        Serendipity[Serendipity Engine]
        Forecaster[Engagement Forecaster]
    end

    subgraph "Observability"
        LangSmith[LangSmith<br/>Agent Tracing]
    end

    Orchestrator --> Thompson
    Orchestrator --> RLlib

    RLlib --> Discovery
    RLlib --> Quality
    RLlib --> Archivist
    RLlib --> Serendipity
    RLlib --> Forecaster

    LangSmith -.monitors.-> Discovery
    LangSmith -.monitors.-> Quality
    LangSmith -.monitors.-> Archivist
    LangSmith -.monitors.-> Serendipity
    LangSmith -.monitors.-> Forecaster

    Discovery --> Content[Content Pool]
    Quality --> Scores[Quality Scores]
    Archivist --> Memory[User Memory]
    Serendipity --> Diversity[Diversity Metrics]
    Forecaster --> Notifications[Push Notifications]

    Content --> Orchestrator
    Scores --> Orchestrator
    Memory --> Orchestrator
    Diversity --> Orchestrator
    Notifications --> User
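
The hierarchy names Thompson Sampling as the orchestrator's core algorithm but does not say what it chooses between. As a minimal sketch, assuming the arms are the specialized agents and the reward is a binary "user engaged" signal (both are assumptions, as are the `BetaBandit` class and its method names), a Beta-Bernoulli version could look like this:

```python
import random


class BetaBandit:
    """Beta-Bernoulli Thompson Sampling over a fixed set of arms.

    Hypothetical helper: the arm names and the binary reward are
    illustrative assumptions, not taken from the system itself.
    """

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # One [alpha, beta] pair per arm, starting from a uniform Beta(1, 1) prior.
        self.stats = {arm: [1.0, 1.0] for arm in arms}

    def select(self):
        """Draw one sample from each arm's posterior and pick the largest."""
        samples = {
            arm: self.rng.betavariate(alpha, beta)
            for arm, (alpha, beta) in self.stats.items()
        }
        return max(samples, key=samples.get)

    def update(self, arm, engaged):
        """Record one observation: a success bumps alpha, a failure bumps beta."""
        if engaged:
            self.stats[arm][0] += 1.0
        else:
            self.stats[arm][1] += 1.0


if __name__ == "__main__":
    bandit = BetaBandit(["discovery", "serendipity", "archivist"], seed=42)
    chosen = bandit.select()              # which agent's content to surface next
    bandit.update(chosen, engaged=True)   # the user interacted with it
    print(chosen, bandit.stats[chosen])
```

Sampling from the posterior, rather than always taking the highest mean, is what keeps under-explored agents in rotation while the estimates are still uncertain.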
Multi-Agent PPO Training Flow
sequenceDiagram
    participant User
    participant Orchestrator
    participant Agents as Specialized Agents
    participant RLlib as Ray RLlib (Multi-Agent PPO)
    participant LangSmith

    Orchestrator->>Agents: Delegate subtasks
    LangSmith->>Agents: Trace agent decisions
    Agents->>Orchestrator: Return results
    Orchestrator->>User: Present feed
    User->>Orchestrator: Interaction (reward signal)
    Orchestrator->>RLlib: Report episode data
    RLlib->>RLlib: PPO policy updates
    RLlib->>Agents: Updated policy networks
    LangSmith->>LangSmith: Log training metrics

    loop Scheduled Training (2 AM daily)
        RLlib->>RLlib: Batch policy optimization
        RLlib->>LangSmith: Training metrics
    end
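
The flow above has the orchestrator turning user interactions into reward signals, reporting them as episode data, and running batch optimization at 2 AM daily. The sketch below illustrates only the bookkeeping around that loop: buffering steps, summing reward per agent, and computing the next scheduled run. It makes no Ray RLlib calls, and `Step`, `EpisodeBuffer`, `returns_per_agent`, and `next_training_run` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Step:
    agent: str     # which specialized agent acted
    action: str    # what it proposed (illustrative free-form label)
    reward: float  # reward derived from the user's interaction


@dataclass
class EpisodeBuffer:
    """Accumulates steps until the next batch optimization run."""

    steps: list = field(default_factory=list)

    def record(self, agent, action, reward):
        self.steps.append(Step(agent, action, reward))

    def drain(self):
        """Return everything collected so far and start a fresh buffer."""
        batch, self.steps = self.steps, []
        return batch


def returns_per_agent(batch):
    """Sum the reward each agent earned in a batch of steps."""
    totals = {}
    for step in batch:
        totals[step.agent] = totals.get(step.agent, 0.0) + step.reward
    return totals


def next_training_run(now, hour=2):
    """Next occurrence of the daily training slot strictly after `now`."""
    run = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if run <= now:
        run += timedelta(days=1)
    return run


if __name__ == "__main__":
    buffer = EpisodeBuffer()
    buffer.record("discovery", "show_item", 1.0)
    buffer.record("serendipity", "show_item", 0.0)
    print(returns_per_agent(buffer.drain()))
    print(next_training_run(datetime.now()))
```

Draining the buffer at each run keeps every batch disjoint from the last, so a step is never counted in two optimization passes.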
Observability Architecture
graph LR
    subgraph "Modal Platform"
        Agents[Multi-Agent System]
        RLlib[Ray RLlib Training]
    end

    subgraph "LangSmith (5K traces/month free)"
        Traces[Agent Traces]
        Metrics[Training Metrics]
        Dashboard[Dashboard UI]
    end

    subgraph "console.vows.social"
        AdminDash[Admin Dashboard]
        RLMetrics[RL Performance]
    end

    Agents --> Traces
    RLlib --> Metrics
    Traces --> Dashboard
    Metrics --> Dashboard
    Dashboard --> AdminDash
    Metrics --> RLMetrics
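
The diagram lists a free tier of 5K traces per month. One way to stay inside a budget like that is to trace only a fraction of requests. The sketch below is an assumption about how that could be done, not a description of the deployed system: it uses no LangSmith API, and `sample_rate`, `should_trace`, and the request-ID scheme are hypothetical.

```python
import hashlib

MONTHLY_TRACE_BUDGET = 5_000  # from the diagram label "5K traces/month free"


def sample_rate(expected_requests_per_month, budget=MONTHLY_TRACE_BUDGET):
    """Fraction of requests to trace so the expected total fits the budget."""
    if expected_requests_per_month <= 0:
        return 1.0
    return min(1.0, budget / expected_requests_per_month)


def should_trace(request_id, rate):
    """Deterministically decide whether a request is traced.

    Hashing the ID (instead of calling a random generator) means every
    agent handling the same request reaches the same decision.
    """
    if rate >= 1.0:
        return True
    if rate <= 0.0:
        return False
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rate


if __name__ == "__main__":
    rate = sample_rate(expected_requests_per_month=100_000)
    print(rate, should_trace("req-123", rate))
```

Because the decision depends only on the request ID, a traced request is traced end to end across all agents rather than in disconnected fragments.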