Daemon Architecture
The Rust daemon that runs on each node
The StellarStack daemon is a lightweight Rust application that runs on each node, managing Docker containers and communicating with the control plane.
Responsibilities
- Docker container management
- Direct WebSocket server for console access
- Metrics collection and reporting
- File system operations
- Health monitoring
- Secure communication with control plane
Architecture
// Daemon components
├── main.rs                    // Entry point, config loading
├── api/
│   ├── mod.rs
│   ├── websocket.rs           // WebSocket server for console
│   └── grpc.rs                // Optional gRPC for internal comms
├── docker/
│   ├── mod.rs
│   ├── container.rs           // Container lifecycle
│   ├── images.rs              // Image management
│   └── networks.rs            // Network management
├── redis/
│   ├── mod.rs
│   ├── subscriber.rs          // Command listener
│   └── publisher.rs           // Event publisher
├── metrics/
│   ├── mod.rs
│   ├── collector.rs           // System metrics
│   └── container_stats.rs     // Per-container stats
├── auth/
│   ├── mod.rs
│   ├── token.rs               // JWT validation
│   └── mtls.rs                // mTLS for API comm
└── files/
    ├── mod.rs
    └── manager.rs             // File operations
Key Features
- Async Runtime: Built on Tokio for high performance
- Zero-copy: Efficient memory usage where possible
- Graceful Shutdown: Proper cleanup on termination
- Auto-reconnect: Automatic Redis reconnection
- Local State: SQLite for offline queue persistence
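The auto-reconnect behaviour listed above can be illustrated with a small retry loop. This is only a sketch under stated assumptions: the source does not say which backoff strategy the daemon uses, so exponential backoff with a cap is an illustrative choice, and `backoff_delay` and `connect_with_retry` are hypothetical names. It uses blocking `std::thread::sleep` to stay dependency-free; a Tokio-based daemon would presumably await an async timer instead.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Delay before the given retry attempt (0-based): doubles each time, capped at `max`.
/// Hypothetical helper; the real daemon's policy is not documented.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}

/// Calls `connect` until it succeeds or `max_attempts` calls have failed,
/// sleeping with exponential backoff between failures.
/// At least one attempt is always made.
fn connect_with_retry<T, E>(
    mut connect: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
    base: Duration,
    max: Duration,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match connect() {
            Ok(conn) => return Ok(conn),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

In this sketch the `connect` closure would wrap whatever call opens the Redis connection; capping the delay keeps a long outage from pushing the retry interval arbitrarily high.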
Daemon Lifecycle
                ┌─────────────┐
                │   OFFLINE   │
                └──────┬──────┘
                       │
                (Registration)
                       │
                       ▼
                ┌─────────────┐
       ┌────────│  STARTING   │────────┐
       │        └──────┬──────┘        │
       │               │               │
   (Failed)       (Connected)      (Timeout)
       │               │               │
       ▼               ▼               ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│    ERROR    │ │   ONLINE    │ │  UNHEALTHY  │
└─────────────┘ └──────┬──────┘ └──────┬──────┘
                       │               │
              (Missed heartbeats)      │
                       │               │
                       └───────┬───────┘
                               │
                         (Recovery or)
                        (Admin action)
                               │
                               ▼
                        ┌─────────────┐
                        │  DRAINING   │──── (Move servers)
                        └──────┬──────┘
                               │
                      (All servers moved)
                               │
                               ▼
                        ┌─────────────┐
                        │   OFFLINE   │
                        └─────────────┘
Heartbeat & Health
The daemon sends a heartbeat every 30 seconds:
{
  "type": "heartbeat",
  "nodeId": "node_xxxxx",
  "timestamp": 1702156800,
  "status": "healthy",
  "metrics": {
    "cpuUsage": 45.2,
    "memoryUsage": 68.5,
    "diskUsage": 32.1,
    "activeContainers": 12,
    "networkRx": 1024000,
    "networkTx": 512000
  }
}
Health Status:
- Unhealthy: After 3 missed heartbeats (90s)
- Offline: After 5 missed heartbeats (150s)
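The thresholds above can be expressed as a small classification function on the receiving side. This is a minimal sketch, not the control plane's actual code: `NodeHealth`, `missed_heartbeats`, and `classify` are hypothetical names, and it assumes the boundaries are inclusive (exactly 90 s of silence already counts as 3 missed heartbeats), which the source does not specify.

```rust
use std::time::Duration;

/// Heartbeat interval and thresholds as stated in the documentation.
const HEARTBEAT_INTERVAL: Duration = Duration::from_secs(30);
const UNHEALTHY_AFTER_MISSED: u32 = 3; // 90 s
const OFFLINE_AFTER_MISSED: u32 = 5; // 150 s

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NodeHealth {
    Healthy,
    Unhealthy,
    Offline,
}

/// Number of whole heartbeat intervals that have elapsed since the last heartbeat.
fn missed_heartbeats(since_last: Duration) -> u32 {
    (since_last.as_secs() / HEARTBEAT_INTERVAL.as_secs()) as u32
}

/// Maps the time since the last heartbeat to a health status.
/// Assumption: thresholds are inclusive.
fn classify(since_last: Duration) -> NodeHealth {
    match missed_heartbeats(since_last) {
        n if n >= OFFLINE_AFTER_MISSED => NodeHealth::Offline,
        n if n >= UNHEALTHY_AFTER_MISSED => NodeHealth::Unhealthy,
        _ => NodeHealth::Healthy,
    }
}
```

Deriving the missed count from elapsed time, rather than keeping a counter, means the status can be recomputed at any moment from a single stored timestamp.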
Command Types
type CommandType =
  | "server.create"
  | "server.start"
  | "server.stop"
  | "server.restart"
  | "server.kill"
  | "server.delete"
  | "server.reinstall"
  | "files.read"
  | "files.write"
  | "files.delete"
  | "backup.create"
  | "backup.restore"
  | "node.update"
  | "node.drain"
  | "node.shutdown";
Configuration
The daemon is configured via daemon.toml:
[node]
id = "node_xxxxx"
api_token = "strk_node_xxxxx"
[redis]
url = "redis://redis.stellarstack.app:6379"
[server]
host = "0.0.0.0"
port = 5000
[docker]
socket = "/var/run/docker.sock"
[logging]
level = "info"
file = "/var/log/stellarstack/daemon.log"
WebSocket Server
The daemon runs a WebSocket server for direct console access:
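// Assumes axum and jsonwebtoken. Config, AuthError, ConsoleTokenPayload, and
// handle_console are defined elsewhere in the daemon; AuthError must implement
// IntoResponse and From<jsonwebtoken::errors::Error> for this to compile.
use std::sync::Arc;
use axum::{
    extract::{ws::WebSocketUpgrade, State},
    http::{header::AUTHORIZATION, HeaderMap},
    response::IntoResponse,
};
use jsonwebtoken::{decode, DecodingKey, Validation};
pub async fn console_handler(
    ws: WebSocketUpgrade,
    State(config): State<Arc<Config>>, // assumed source of `jwt_secret`
    headers: HeaderMap,
) -> Result<impl IntoResponse, AuthError> {
    // Extract the bearer token from the Authorization header
    let token = headers
        .get(AUTHORIZATION)
        .and_then(|v| v.to_str().ok())
        .and_then(|v| v.strip_prefix("Bearer "))
        .ok_or(AuthError::MissingToken)?;
    // Validate the JWT; `decode` returns TokenData<T>, so keep only the payload
    let claims = decode::<ConsoleTokenPayload>(
        token,
        &DecodingKey::from_secret(config.jwt_secret.as_bytes()),
        &Validation::default(),
    )?
    .claims;
    // Attach to container PTY
    Ok(ws.on_upgrade(move |socket| handle_console(socket, claims)))
}
Security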
mTLS Configuration
pub struct MtlsConfig {
    pub ca_cert: Certificate,   // Control plane CA
    pub node_cert: Certificate, // Node's certificate (signed by CA)
    pub node_key: PrivateKey,   // Node's private key
}
impl MtlsConfig {
    pub fn client_config(&self) -> ClientConfig {
        // Configure mTLS for outbound connections to control plane
        todo!()
    }
    pub fn server_config(&self) -> ServerConfig {
        // Configure mTLS for inbound connections (optional)
        todo!()
    }
}
Token Types
| Token Type | Format | Purpose | Lifetime |
|---|---|---|---|
| Registration | strk_reg_xxxxx | One-time node registration | 24 hours |
| Node API | strk_node_xxxxx | Daemon ↔ Redis communication | 1 year |
| Console | strk_console_xxxxx | User ↔ Daemon WebSocket | 5 minutes |
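The table above maps naturally onto a small enum that recognises a token by its prefix and reports its lifetime. This is a sketch under assumptions, not the daemon's actual token handling: `TokenType`, `from_token`, and `lifetime` are hypothetical names, "1 year" is taken to mean 365 days, and only the prefix is inspected (no signature or expiry check is performed).

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenType {
    Registration,
    NodeApi,
    Console,
}

impl TokenType {
    /// Classifies a token by its documented prefix; returns None for unknown formats.
    fn from_token(token: &str) -> Option<Self> {
        if token.starts_with("strk_reg_") {
            Some(Self::Registration)
        } else if token.starts_with("strk_node_") {
            Some(Self::NodeApi)
        } else if token.starts_with("strk_console_") {
            Some(Self::Console)
        } else {
            None
        }
    }

    /// Lifetime from the table. Assumption: one year is counted as 365 days.
    fn lifetime(self) -> Duration {
        const HOUR: u64 = 60 * 60;
        match self {
            Self::Registration => Duration::from_secs(24 * HOUR),
            Self::NodeApi => Duration::from_secs(365 * 24 * HOUR),
            Self::Console => Duration::from_secs(5 * 60),
        }
    }
}
```

Keeping the prefix check and the lifetime in one place makes it harder for the two to drift apart if a token type is added later.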