Architecture Components

Detailed documentation of Noid's core components and their interactions.

CLI Component

The command-line interface provides the primary user interaction layer.

Responsibilities

  • Parse and validate user commands
  • Communicate with Noid server via HTTP/WebSocket
  • Format and display responses
  • Manage local configuration (.noid files)
  • Handle authentication tokens

Key Features

  • Context awareness: Uses .noid file for active VM
  • Interactive console: WebSocket-based terminal access
  • Error handling: User-friendly error messages
  • Auto-completion: Tab completion for commands

Server Component

The central control plane orchestrating all VM operations.

API Layer

REST Endpoints:

Endpoint                 Method   Purpose
/vms                     GET      List all VMs
/vms                     POST     Create new VM
/vms/:name               GET      Get VM details
/vms/:name               DELETE   Destroy VM
/vms/:name/exec          POST     Execute command
/vms/:name/checkpoints   GET      List checkpoints
/vms/:name/checkpoints   POST     Create checkpoint
/vms/:name/restore       POST     Restore from checkpoint

WebSocket Endpoints:

Endpoint         Purpose
/console/:name   Interactive console
/logs/:name      Stream VM logs
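
A client could keep these routes in one typed table instead of scattering path strings through the code. The following is a minimal sketch, not Noid's actual client: the names `routes`, `wsPaths`, and `vmPath` are hypothetical, and only the methods and paths come from the tables above.

```typescript
type HttpMethod = "GET" | "POST" | "DELETE";

interface RestRoute {
  method: HttpMethod;
  path: string;
}

// Encode the VM name so that characters such as spaces or slashes
// cannot change the shape of the path.
function vmPath(name: string, suffix: string = ""): string {
  return `/vms/${encodeURIComponent(name)}${suffix}`;
}

// One entry per REST endpoint listed above (hypothetical helper names).
const routes = {
  listVms: (): RestRoute => ({ method: "GET", path: "/vms" }),
  createVm: (): RestRoute => ({ method: "POST", path: "/vms" }),
  getVm: (name: string): RestRoute => ({ method: "GET", path: vmPath(name) }),
  destroyVm: (name: string): RestRoute => ({ method: "DELETE", path: vmPath(name) }),
  exec: (name: string): RestRoute => ({ method: "POST", path: vmPath(name, "/exec") }),
  listCheckpoints: (name: string): RestRoute => ({ method: "GET", path: vmPath(name, "/checkpoints") }),
  createCheckpoint: (name: string): RestRoute => ({ method: "POST", path: vmPath(name, "/checkpoints") }),
  restore: (name: string): RestRoute => ({ method: "POST", path: vmPath(name, "/restore") }),
};

// WebSocket endpoints take the VM name as the last path segment.
const wsPaths = {
  console: (name: string): string => `/console/${encodeURIComponent(name)}`,
  logs: (name: string): string => `/logs/${encodeURIComponent(name)}`,
};
```

For example, `routes.exec("my-vm")` yields a POST to `/vms/my-vm/exec`.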

VM Manager

Manages the complete VM lifecycle:

Creation Flow:

  1. Validate VM parameters
  2. Allocate resources (IP, storage)
  3. Copy/reflink base rootfs
  4. Configure Firecracker
  5. Start VM process
  6. Register VM in state store

Execution Flow:

  1. Receive exec request
  2. Connect to VM via vsock
  3. Send command with environment
  4. Stream stdout/stderr
  5. Capture exit code
  6. Return results
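
Steps 4 to 6 of the execution flow amount to folding a stream of output chunks into one result. The sketch below assumes a hypothetical `ExecEvent` shape; the actual vsock wire format is not described in this document.

```typescript
// Hypothetical event shape for data arriving from the VM.
type ExecEvent =
  | { kind: "stdout"; data: string }
  | { kind: "stderr"; data: string }
  | { kind: "exit"; code: number };

interface ExecResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Accumulate output until the exit event arrives, then return the result.
function collectExecResult(events: ExecEvent[]): ExecResult {
  let stdout = "";
  let stderr = "";
  for (const event of events) {
    switch (event.kind) {
      case "stdout":
        stdout += event.data;
        break;
      case "stderr":
        stderr += event.data;
        break;
      case "exit":
        return { stdout, stderr, exitCode: event.code };
    }
  }
  // A stream that ends without an exit code is treated as a failure.
  throw new Error("exec stream ended without an exit event");
}
```

A real implementation would consume the events as they arrive rather than from an array, but the accumulation logic would look the same.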

Destruction Flow:

  1. Stop VM process (SIGTERM)
  2. Wait for graceful shutdown
  3. Force kill if timeout (SIGKILL)
  4. Release IP address
  5. Clean up filesystem
  6. Remove from state store
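
The first three steps of the destruction flow follow the usual "ask, wait, then force" pattern. A minimal sketch, assuming a hypothetical `VmProcess` abstraction over the Firecracker child process and an assumed 5-second grace period:

```typescript
// Hypothetical abstraction; not part of Noid's documented API.
interface VmProcess {
  kill(signal: "SIGTERM" | "SIGKILL"): void;
  // Resolves to true if the process exited within timeoutMs, false otherwise.
  waitForExit(timeoutMs: number): Promise<boolean>;
}

async function stopVmProcess(
  proc: VmProcess,
  graceMs: number = 5000, // assumed default
): Promise<"graceful" | "forced"> {
  // Step 1: request a clean shutdown.
  proc.kill("SIGTERM");
  // Step 2: wait for the process to exit on its own.
  if (await proc.waitForExit(graceMs)) {
    return "graceful";
  }
  // Step 3: the grace period expired, so force the kill.
  proc.kill("SIGKILL");
  if (!(await proc.waitForExit(graceMs))) {
    throw new Error("process did not exit after SIGKILL");
  }
  return "forced";
}
```

Releasing the IP, cleaning up the filesystem, and removing the state entry (steps 4 to 6) should only run once this function has returned.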

Checkpoint Service

Handles snapshot creation and restoration.

Checkpoint Creation:

1. Pause VM (freeze process)
2. Call Firecracker snapshot API
   - Memory → /checkpoints/{vm}/{id}/memory.snap
   - vCPU and device state saved to a snapshot file (disk contents are handled in step 3)
3. Copy/reflink rootfs
4. Store metadata (name, label, timestamp)
5. Resume VM

Restore Process:

1. Stop existing VM (if in-place restore)
2. Copy checkpoint rootfs
3. Configure Firecracker with snapshot
4. Load memory state
5. Resume VM from snapshot
6. Assign new IP

Network Manager

Manages VM networking and IP allocation.

IP Pool Management:

  • DHCP-style IP allocation from pool
  • Track assigned IPs
  • Release on VM destruction
  • Reassign on checkpoint restore
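
The pool behaviour described above (allocate, track, release) can be sketched as a small class. The class name `IpPool`, the constructor arguments, and the address range are illustrative assumptions; Noid's actual `IPAllocator` may differ.

```typescript
class IpPool {
  private readonly free: string[] = [];
  // VM name -> assigned IP.
  private readonly assigned = new Map<string, string>();

  // Assumed layout: a three-octet prefix such as "172.16.0" plus a host range.
  constructor(prefix: string, firstHost: number, lastHost: number) {
    for (let host = firstHost; host <= lastHost; host++) {
      this.free.push(`${prefix}.${host}`);
    }
  }

  // Hand out the next free address, or the existing one if the VM already has it.
  allocate(vm: string): string {
    const existing = this.assigned.get(vm);
    if (existing !== undefined) {
      return existing;
    }
    const ip = this.free.shift();
    if (ip === undefined) {
      throw new Error("IP pool exhausted");
    }
    this.assigned.set(vm, ip);
    return ip;
  }

  // Return the VM's address to the pool; a no-op for unknown VMs.
  release(vm: string): void {
    const ip = this.assigned.get(vm);
    if (ip === undefined) {
      return;
    }
    this.assigned.delete(vm);
    this.free.push(ip);
  }
}
```

Under this sketch, reassigning on checkpoint restore is a `release` followed by an `allocate` for the restored VM.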

Network Configuration:

  • Create TAP devices for VMs
  • Configure bridge networking
  • Set up NAT for internet access
  • Manage routing rules
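
As an illustration of the first three items, the host-side setup can be expressed as a list of standard `ip` and `iptables` invocations. This is a sketch only: the function names, the TAP and bridge names, and the choice of an `iptables` MASQUERADE rule are assumptions, not Noid's documented behaviour.

```typescript
// Commands that create a TAP device and attach it to an existing bridge.
function tapSetupCommands(tapName: string, bridge: string): string[] {
  // Linux limits interface names to 15 characters.
  if (tapName.length === 0 || tapName.length > 15) {
    throw new Error(`invalid interface name: "${tapName}"`);
  }
  return [
    `ip tuntap add dev ${tapName} mode tap`,
    `ip link set dev ${tapName} master ${bridge}`,
    `ip link set dev ${tapName} up`,
  ];
}

// One-time NAT rule so VMs on the given subnet can reach the internet
// through the host's uplink interface.
function natCommand(subnetCidr: string, uplink: string): string {
  return `iptables -t nat -A POSTROUTING -s ${subnetCidr} -o ${uplink} -j MASQUERADE`;
}
```

Returning the commands as data keeps the sketch testable; a server would pass them to a process runner and undo them when the VM is destroyed.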

Storage Manager

Handles filesystem operations.

Rootfs Management:

# With btrfs
cp --reflink=always base/rootfs.ext4 vms/{name}/rootfs.ext4

# Without btrfs
cp base/rootfs.ext4 vms/{name}/rootfs.ext4

Checkpoint Storage:

  • Organized by VM name
  • Metadata in JSON
  • Automatic cleanup of old checkpoints
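
The cleanup policy itself is not specified here. One plausible policy is "keep the N newest checkpoints per VM", sketched below. The `CheckpointMeta` fields mirror the metadata named in the checkpoint creation steps (name, label, timestamp); the ISO 8601 timestamp format and the `keep` parameter are assumptions.

```typescript
interface CheckpointMeta {
  name: string;
  label: string;
  timestamp: string; // assumed to be ISO 8601, e.g. "2024-01-01T00:00:00Z"
}

// Return the checkpoints that fall outside the `keep` newest ones.
// The input array is left unmodified.
function selectExpired(checkpoints: CheckpointMeta[], keep: number): CheckpointMeta[] {
  const newestFirst = checkpoints
    .slice()
    .sort((a, b) => Date.parse(b.timestamp) - Date.parse(a.timestamp));
  return newestFirst.slice(Math.max(0, keep));
}
```

The caller would then delete the directory of each returned checkpoint and drop its metadata entry.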

Firecracker VMM

Lightweight virtual machine monitor from AWS.

Architecture

┌───────────────────────────┐
│   Firecracker Process     │
│  ┌─────────────────────┐  │
│  │   API Server        │  │ ← HTTP API
│  │   (REST)            │  │
│  └─────────────────────┘  │
│  ┌─────────────────────┐  │
│  │   VMM Core          │  │
│  │   - vCPU threads    │  │
│  │   - Device emulation│  │
│  │   - Snapshot/restore│  │
│  └─────────────────────┘  │
│           │               │
│           ▼               │
│  ┌─────────────────────┐  │
│  │   KVM Interface     │  │
│  └─────────────────────┘  │
└───────────┬───────────────┘
            │
            ▼
  ┌─────────────────┐
  │  Linux KVM      │
  │  (Kernel)       │
  └─────────────────┘

Key Features

  • Fast boot: about 125 ms from the start request to guest user space
  • Small footprint: under 5 MiB of VMM memory overhead per microVM
  • Secure: Minimal attack surface
  • Snapshot support: Pause/resume VMs

Device Emulation

Firecracker emulates minimal devices:

  • virtio-block: Disk I/O
  • virtio-net: Network I/O
  • Serial console: TTY access
  • vsock: VM-host communication

State Management

In-Memory State

Server maintains runtime state:

interface ServerState {
  vms: Map<string, VMState>;
  ipPool: IPAllocator;
  checkpoints: Map<string, Checkpoint[]>;
}

interface VMState {
  name: string;
  process: ChildProcess;
  config: VMConfig;
  status: 'starting' | 'running' | 'stopped';
  ip: string;
  createdAt: Date;
}
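
The `status` field implies a small state machine. The source lists the states but not the legal transitions, so the table below is an assumption for illustration (for example, it treats `stopped` as terminal).

```typescript
type VMStatus = "starting" | "running" | "stopped";

// Assumed transition table: a VM may fail during startup, and a stopped
// VM is never reused.
const allowedTransitions: Record<VMStatus, VMStatus[]> = {
  starting: ["running", "stopped"],
  running: ["stopped"],
  stopped: [],
};

function canTransition(from: VMStatus, to: VMStatus): boolean {
  return allowedTransitions[from].indexOf(to) !== -1;
}
```

Checking `canTransition` before updating `VMState.status` would catch bugs such as marking a destroyed VM as running again.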

Persistent State

Stored on disk:

  • VM metadata: /var/lib/noid/vms/{name}/metadata.json
  • Checkpoints: /var/lib/noid/checkpoints/{vm}/{id}/
  • Server config: /etc/noid/config.toml

Communication Protocols

CLI ↔ Server

HTTP/REST:

  • Standard CRUD operations
  • JSON request/response
  • Token-based authentication

WebSocket:

  • Console: bidirectional terminal
  • Logs: server-to-client streaming

Server ↔ Firecracker

HTTP API:

# Start VM
PUT /machine-config
PUT /boot-source
PUT /drives/rootfs
PUT /network-interfaces/eth0
PUT /actions {"action_type": "InstanceStart"}

# Create snapshot
PUT /snapshot/create
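
The request bodies are omitted above. The sketch below fills them in with field names from the public Firecracker API as I understand it; verify them against the Firecracker version in use. The `BootParams` type and both function names are hypothetical. Firecracker requires the VM to be paused before a snapshot is taken, hence the `PATCH /vm` calls around `/snapshot/create`.

```typescript
interface FcRequest {
  method: "PUT" | "PATCH";
  path: string;
  body: Record<string, unknown>;
}

// Hypothetical parameter bundle for one VM.
interface BootParams {
  vcpuCount: number;
  memSizeMib: number;
  kernelPath: string;
  bootArgs: string;
  rootfsPath: string;
  tapName: string;
}

// Requests to configure and start a VM, in the order listed above.
function startSequence(p: BootParams): FcRequest[] {
  return [
    { method: "PUT", path: "/machine-config", body: { vcpu_count: p.vcpuCount, mem_size_mib: p.memSizeMib } },
    { method: "PUT", path: "/boot-source", body: { kernel_image_path: p.kernelPath, boot_args: p.bootArgs } },
    { method: "PUT", path: "/drives/rootfs", body: { drive_id: "rootfs", path_on_host: p.rootfsPath, is_root_device: true, is_read_only: false } },
    { method: "PUT", path: "/network-interfaces/eth0", body: { iface_id: "eth0", host_dev_name: p.tapName } },
    { method: "PUT", path: "/actions", body: { action_type: "InstanceStart" } },
  ];
}

// Requests to pause the VM, take a full snapshot, and resume it.
function snapshotSequence(statePath: string, memPath: string): FcRequest[] {
  return [
    { method: "PATCH", path: "/vm", body: { state: "Paused" } },
    { method: "PUT", path: "/snapshot/create", body: { snapshot_type: "Full", snapshot_path: statePath, mem_file_path: memPath } },
    { method: "PATCH", path: "/vm", body: { state: "Resumed" } },
  ];
}
```

Each request would be sent as JSON over the Firecracker process's API socket, stopping at the first non-success response.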

vsock:

  • Direct VM communication
  • Command execution
  • File transfer

Error Handling

Retry Logic

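// Resolve after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `operation`, retrying retryable failures with linear backoff (1s, 2s, ...).
// `isRetryable(error)` is assumed to be defined elsewhere in the server.
async function withRetry(operation, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on the last attempt, or on errors that retrying cannot fix.
      if (attempt === maxAttempts || !isRetryable(error)) {
        throw error;
      }
      await sleep(1000 * attempt);
    }
  }
}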

Cleanup on Failure

  • Failed VM creation → cleanup partial state
  • Failed checkpoint → rollback changes
  • Server crash → orphaned VMs cleaned on restart
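
Cleaning up partial state after a failed VM creation is commonly done by pairing each step with an undo action and unwinding in reverse order. The sketch below is synchronous for brevity and uses hypothetical names (`Step`, `runWithRollback`); the real steps would be asynchronous.

```typescript
interface Step {
  name: string;
  run(): void;
  undo(): void;
}

// Run the steps in order. If one fails, undo the completed steps in
// reverse order, then rethrow the original error.
function runWithRollback(steps: Step[]): void {
  const completed: Step[] = [];
  try {
    for (const step of steps) {
      step.run();
      completed.push(step);
    }
  } catch (error) {
    let step = completed.pop();
    while (step !== undefined) {
      try {
        step.undo();
      } catch (undoError) {
        // Best effort: keep unwinding even if one undo fails.
      }
      step = completed.pop();
    }
    throw error;
  }
}
```

For VM creation, the steps would be "allocate IP" (undo: release it), "copy rootfs" (undo: delete it), "start process" (undo: kill it), and so on.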

Monitoring & Logging

Server Logs

[INFO] VM created: my-vm (172.16.0.10)
[INFO] Checkpoint created: my-vm/golden
[ERROR] VM creation failed: insufficient memory
[WARN] Slow checkpoint creation: 5.2s (expected <1s)

Metrics

  • VM creation time
  • Checkpoint size and duration
  • Memory usage per VM
  • Network throughput
  • Error rates
