Architecture Overview

Learn about Noid's architecture, design principles, and how the components work together.

System Architecture

Noid consists of several key components working together to provide fast, isolated microVM environments:

┌─────────────────────────────────────────────────┐
│                  Noid CLI                       │
│         (Command-line Interface)                │
└────────────────┬────────────────────────────────┘
                 │ HTTP/WebSocket
                 ▼
┌─────────────────────────────────────────────────┐
│              Noid Server                        │
│  ┌──────────────────────────────────────────┐  │
│  │         API Layer (REST + WS)            │  │
│  └──────────────────────────────────────────┘  │
│  ┌──────────────────────────────────────────┐  │
│  │       VM Management Service              │  │
│  └──────────────────────────────────────────┘  │
│  ┌──────────────────────────────────────────┐  │
│  │      Checkpoint Service                  │  │
│  └──────────────────────────────────────────┘  │
└────────────────┬────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────┐
│           Firecracker VMM                       │
│  ┌────────┐  ┌────────┐  ┌────────┐            │
│  │  VM 1  │  │  VM 2  │  │  VM 3  │  ...       │
│  └────────┘  └────────┘  └────────┘            │
└─────────────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────┐
│              Host Kernel (Linux)                │
│           KVM, Network, Storage                 │
└─────────────────────────────────────────────────┘

Core Components

1. Noid CLI

The command-line interface for managing VMs:

  • User-friendly commands (create, destroy, exec, etc.)
  • Communication with Noid server via REST API
  • Interactive console via WebSocket
  • Local configuration management

2. Noid Server

The central control plane:

  • API Layer: REST endpoints and WebSocket handlers
  • VM Manager: Lifecycle management for VMs
  • Checkpoint Service: Snapshot creation and restoration
  • Network Manager: IP allocation and routing
  • Storage Manager: Rootfs and checkpoint storage

3. Firecracker VMM

AWS's open-source virtual machine monitor:

  • Lightweight microVM creation (< 125ms boot time)
  • Minimal memory overhead (~5MB per VM)
  • Strong isolation via KVM
  • Snapshot/restore support

4. Host System

Linux kernel with KVM support:

  • Hardware virtualization (KVM)
  • Network namespaces for isolation
  • Storage (btrfs recommended for instant snapshots)
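Since KVM is a hard requirement, a preflight check is worth illustrating. This is a minimal sketch, assuming only that the host exposes the standard /dev/kvm device node; Noid's actual startup checks are not specified here:

```python
import os

def host_supports_kvm() -> bool:
    """Return True if the host exposes /dev/kvm with read/write
    access, which Firecracker needs to start microVMs."""
    return os.access("/dev/kvm", os.R_OK | os.W_OK)
```

The same check is what Firecracker itself effectively performs at startup: without an accessible /dev/kvm, no microVM can boot.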

Design Principles

1. Speed

  • Sub-second VM creation using golden snapshots
  • Instant checkpoints with btrfs copy-on-write
  • Minimal overhead with Firecracker's lightweight design

2. Isolation

  • Process isolation: Each VM runs in its own process
  • Network isolation: Separate network namespaces
  • Filesystem isolation: Independent rootfs for each VM
  • Resource limits: CPU and memory constraints

3. Simplicity

  • Simple CLI: Intuitive commands
  • Minimal configuration: Works out of the box
  • Clear abstractions: VMs and checkpoints

4. Reproducibility

  • Deterministic environments: Checkpoints capture complete state
  • Golden snapshots: Reusable base images
  • Version control: Track checkpoint history

Request Flow

VM Creation

1. CLI → Server: POST /vms
2. Server validates request
3. Server copies base rootfs
4. Server configures Firecracker
5. Firecracker starts VM with KVM
6. Server assigns IP address
7. Server returns VM details → CLI
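The client side of this flow can be sketched as follows. The `/vms` endpoint comes from step 1; the payload fields (`name`, `vcpus`, `mem_mib`) and the stub transport are illustrative assumptions, not Noid's confirmed API schema:

```python
def create_vm(post, name, vcpus=1, mem_mib=256):
    """Issue POST /vms and return the server's VM description.

    `post` is any callable(path, body) -> dict, e.g. a thin wrapper
    around an HTTP client pointed at the Noid server. Steps 2-6
    (validation, rootfs copy, Firecracker config, boot, IP
    assignment) all happen server-side before the response returns.
    """
    return post("/vms", {"name": name, "vcpus": vcpus, "mem_mib": mem_mib})

# Usage with a stub transport standing in for a real server:
def fake_post(path, body):
    assert path == "/vms"
    return {"name": body["name"], "ip": "172.16.0.10", "state": "running"}

vm = create_vm(fake_post, "demo")
```

Injecting the transport keeps the flow testable without a running server; the real CLI would pass an HTTP wrapper instead of `fake_post`.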

Command Execution

1. CLI → Server: POST /vms/:name/exec
2. Server connects to VM via vsock
3. Server sends command
4. VM executes command
5. Server captures output
6. Server returns result → CLI
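A matching sketch for command execution. The endpoint follows step 1; the request/response shape (`argv`, `exit_code`, `stdout`) is an assumption for illustration:

```python
def exec_in_vm(post, vm_name, argv):
    """POST /vms/:name/exec and return (exit_code, stdout).

    Server-side, the command is relayed to the guest over vsock
    (steps 2-5); the client only sees the captured result.
    """
    resp = post(f"/vms/{vm_name}/exec", {"argv": argv})
    return resp["exit_code"], resp["stdout"]

# Stub transport for illustration:
def stub_post(path, body):
    assert path == "/vms/demo/exec"
    return {"exit_code": 0, "stdout": "hello\n"}

code, out = exec_in_vm(stub_post, "demo", ["echo", "hello"])
```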

Checkpoint Creation

1. CLI → Server: POST /vms/:name/checkpoints
2. Server pauses VM
3. Firecracker creates snapshot
   - Memory snapshot → file
   - Disk snapshot → btrfs reflink (instant)
4. Server resumes VM
5. Server stores checkpoint metadata
6. Server returns checkpoint → CLI
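Steps 2-4 map onto Firecracker's pause/snapshot/resume API, issued over its per-VM API socket. A sketch of that call sequence; the bodies follow Firecracker's snapshot API, but the `vmstate.snap` filename is an assumption, and the disk reflink in step 3 is done by the server outside Firecracker:

```python
def checkpoint_calls(ckpt_dir):
    """Ordered Firecracker API calls behind a memory checkpoint.

    Returns (method, path, body) tuples in the order the server
    would issue them.
    """
    return [
        # Step 2: pause the guest so the captured state is consistent.
        ("PATCH", "/vm", {"state": "Paused"}),
        # Step 3: write the vmstate and guest-memory snapshot files.
        ("PUT", "/snapshot/create", {
            "snapshot_type": "Full",
            "snapshot_path": f"{ckpt_dir}/vmstate.snap",
            "mem_file_path": f"{ckpt_dir}/memory.snap",
        }),
        # Step 4: resume the guest.
        ("PATCH", "/vm", {"state": "Resumed"}),
    ]

calls = checkpoint_calls("/var/lib/noid/checkpoints/vm-1/checkpoint-1")
```

The guest is only paused between the first and last call, which is why checkpoint creation stays under ~100ms.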

Storage Architecture

Rootfs Management

/var/lib/noid/
├── base/
│   └── rootfs.ext4          # Base root filesystem
├── vms/
│   ├── vm-1/
│   │   └── rootfs.ext4      # VM 1's filesystem
│   ├── vm-2/
│   │   └── rootfs.ext4      # VM 2's filesystem
│   └── ...
└── checkpoints/
    ├── vm-1/
    │   ├── checkpoint-1/
    │   │   ├── memory.snap  # Memory state
    │   │   ├── rootfs.ext4  # Disk state
    │   │   └── metadata.json
    │   └── ...
    └── ...

With btrfs:

  • VM rootfs created via cp --reflink (instant)
  • Checkpoints use reflinks (no copy overhead)

Without btrfs:

  • VM rootfs copied normally (~30s)
  • Checkpoints perform full copy
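Both cases collapse into one command: GNU `cp --reflink=auto` performs an instant copy-on-write clone on btrfs and silently falls back to a full copy elsewhere. A minimal sketch:

```python
import subprocess

def clone_rootfs(src, dst):
    """Copy a rootfs image, reflinking when the filesystem allows.

    On btrfs this is a copy-on-write clone and returns almost
    immediately; on filesystems without reflink support it degrades
    to the full copy described above.
    """
    subprocess.run(["cp", "--reflink=auto", src, dst], check=True)
```

Using `--reflink=auto` rather than `--reflink=always` means the same code path works on both storage setups, at the cost of the slower copy when btrfs is absent.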

Network Architecture

IP Allocation

┌──────────────┐
│   Host       │
│  (bridge)    │
│ 172.16.0.1   │
└──────┬───────┘
       │
       ├── VM 1: 172.16.0.10
       ├── VM 2: 172.16.0.11
       ├── VM 3: 172.16.0.12
       └── ...

  • VMs connected via TAP devices
  • DHCP assigns IPs from the pool
  • NAT provides internet access
  • VMs can communicate with each other
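For illustration, the iproute2 commands that plumb one VM's TAP device into the host bridge might look like this. The device and bridge names are assumptions; Noid's actual naming scheme is not specified here:

```python
def tap_setup_cmds(tap, bridge):
    """iproute2 commands to create a TAP device and attach it to
    the host bridge. Running them for real requires root."""
    return [
        ["ip", "tuntap", "add", tap, "mode", "tap"],  # create the TAP device
        ["ip", "link", "set", tap, "master", bridge],  # enslave it to the bridge
        ["ip", "link", "set", tap, "up"],              # bring the link up
    ]

cmds = tap_setup_cmds("tap-vm1", "noid-br0")
```

Firecracker is then pointed at the TAP device name, and the guest's traffic reaches the bridge (and NAT) through it.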

Performance Characteristics

Operation               Time       Notes
----------------------  ---------  ----------------------
VM creation (cold)      ~30-60s    Copying base rootfs
VM creation (golden)    ~5-10s     From checkpoint
VM boot                 <125ms     Firecracker boot time
Checkpoint creation     <100ms     With btrfs
Checkpoint restore      <50ms      Memory + disk restore
Command execution       ~10-50ms   Latency overhead

Scalability

Single Host Limits:

  • VMs: 100-1000+ depending on resources
  • Memory: ~5MB overhead per VM + allocated memory
  • CPU: Oversubscription supported
  • Disk: Limited by storage capacity

Resource Sharing:

  • VMs share the host kernel's KVM hypervisor; each guest boots its own kernel
  • Copy-on-write for filesystem
  • Network bandwidth shared
  • CPU time-sliced

Security Model

Isolation Layers

  1. KVM: Hardware-level isolation
  2. Firecracker: Minimal attack surface
  3. Network: Separate namespaces
  4. Filesystem: Isolated rootfs

Attack Surface

  • Minimal: Firecracker has ~50K lines of code
  • Regular security updates
  • No direct VM-to-host access
  • API authentication required

Fault Tolerance

VM Failures

  • VMs are isolated; one failure doesn't affect others
  • Server monitors VM health
  • Automatic cleanup of crashed VMs

Server Failures

  • VMs are lost on server restart
  • Checkpoints persist on disk
  • Restore from checkpoints after recovery
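After the server comes back up, restoring a VM means starting a fresh Firecracker process and issuing its snapshot-load call. A sketch, assuming the body shape of Firecracker's snapshot-load API; the file names under the checkpoint directory are illustrative:

```python
def restore_call(ckpt_dir):
    """Firecracker API call that loads a saved VM from disk and
    resumes it in one step."""
    return ("PUT", "/snapshot/load", {
        "snapshot_path": f"{ckpt_dir}/vmstate.snap",
        "mem_backend": {
            "backend_type": "File",
            "backend_path": f"{ckpt_dir}/memory.snap",
        },
        "resume_vm": True,
    })

method, path, body = restore_call("/var/lib/noid/checkpoints/vm-1/checkpoint-1")
```

Because checkpoints live on disk, this recovery path works even though every running VM was lost with the old server process.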

Next Steps