Alpina Tech deploys and manages Fly Machines for teams that need fast, lightweight compute with full control over machine lifecycle. We handle provisioning, sizing, autoscaling, and multi-region placement, giving you Firecracker microVMs that boot in milliseconds, scale to zero when idle, and run anywhere on Fly.io's global network.
Machine Provisioning & Configuration
We deploy Fly Machines sized and configured for your workload:
- CPU and memory allocation per machine: shared vCPUs for lightweight services, dedicated CPUs for compute-heavy tasks
- GPU machine provisioning for ML inference, image processing, and video transcoding workloads
- Persistent volume attachment for stateful applications, databases, and file storage
- Machine metadata and labeling for organized multi-service deployments
- Init and restart policies: configuring how machines behave on crash, exit, and OOM events
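To make the sizing and restart-policy choices above concrete, here is a minimal sketch of how a machine configuration takes shape when created through the Fly Machines API. The app image, sizes, and metadata values are placeholders; the payload shape follows the API's `config` object (`guest`, `restart`, `metadata`), but consult the Machines API reference for the authoritative schema.

```python
# Sketch: building a create-machine request body for the Fly Machines API.
# Image name, sizes, and metadata are illustrative placeholders.

def machine_payload(image, cpu_kind="shared", cpus=1, memory_mb=256,
                    restart_policy="on-failure", metadata=None):
    """Return a request body for POST /v1/apps/{app}/machines."""
    return {
        "config": {
            "image": image,
            "guest": {
                "cpu_kind": cpu_kind,   # "shared" or "performance" (dedicated)
                "cpus": cpus,
                "memory_mb": memory_mb,
            },
            # Restart policy governs crash/exit behavior: "no", "always", "on-failure"
            "restart": {"policy": restart_policy},
            # Metadata labels keep multi-service deployments organized
            "metadata": metadata or {},
        }
    }

payload = machine_payload("registry.fly.io/my-app:deployment-123",
                          metadata={"service": "api"})
```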
Autoscaling & Scale-to-Zero
Fly Machines provide granular control over compute scaling. We configure:
- Scale-to-zero for development, staging, and low-traffic services: machines stop when idle and start on incoming requests within ~300ms
- Horizontal autoscaling: adding machines based on request concurrency, CPU load, or custom metrics
- Standby machines in target regions for instant failover without cold-start latency
- Minimum and maximum machine counts per region for predictable capacity
- Scheduled scaling for workloads with known traffic patterns: scaling up before peak hours, down during off-hours
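Scale-to-zero and concurrency-based autoscaling are driven by a few `fly.toml` settings. The fragment below is an illustrative example, not a drop-in config: the port and limit values are placeholders, and key names should be checked against the current fly.toml reference.

```toml
# Illustrative fly.toml fragment: scale-to-zero for an HTTP service.
# internal_port and concurrency limits depend on your application.
[http_service]
  internal_port = 8080
  auto_stop_machines = "stop"   # stop idle machines ("suspend" is also available)
  auto_start_machines = true    # Fly Proxy starts a stopped machine on incoming requests
  min_machines_running = 0      # allow full scale-to-zero in the primary region

  [http_service.concurrency]
    type = "requests"
    soft_limit = 200            # above this, the proxy prefers other machines
    hard_limit = 250            # above this, no new requests are routed here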
Process Groups & Multi-Service Architecture
Fly Machines support multiple process types within a single application. We architect:
- Web process group for HTTP-serving machines with Fly Proxy routing
- Worker process group for background job processing, with sizing and scaling separate from web machines
- Cron and scheduler machines that run on a timer and stop between executions
- One-off machines for database migrations, data processing, and batch jobs
- Independent scaling per process group β scale web and worker machines based on different metrics
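Process groups are declared in `fly.toml`. The sketch below shows the general pattern, assuming hypothetical `bin/server` and `bin/worker` entrypoints; only the group attached to `http_service` receives routed traffic, while the worker group runs headless.

```toml
# Illustrative fly.toml fragment: separate web and worker process groups.
# Commands are placeholders for your own entrypoints.
[processes]
  web = "bin/server"
  worker = "bin/worker --queue default"

# Only the web group sits behind Fly Proxy; workers get no public routing.
[http_service]
  internal_port = 8080
  processes = ["web"]
```

Each group can then be scaled on its own, for example `fly scale count web=3 worker=1`.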
Machines API & Orchestration
For dynamic workloads, Fly Machines expose a REST API for programmatic control:
- Machine creation and destruction via API for on-demand compute: CI runners, preview environments, sandboxed execution
- Custom orchestration scripts for multi-step workflows with dependent machines
- Webhook integration for machine lifecycle events: start, stop, exit, and error notifications
- Terraform provider for infrastructure-as-code management of Fly Machines
- flyctl CLI automation for deployment pipelines and operational scripts
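As a rough sketch of what programmatic lifecycle control looks like, the helper below composes Machines API requests. The base URL and endpoint paths follow the public API at `api.machines.dev`; the app name, machine ID, and token are placeholder values, and the actual HTTP call is left to your client of choice.

```python
# Sketch: composing Fly Machines API lifecycle requests.
# App name, machine ID, and token below are placeholders.

API_BASE = "https://api.machines.dev/v1"

def machine_request(app, action, machine_id=None, token="FLY_API_TOKEN"):
    """Compose (method, url, headers) for a machine lifecycle call."""
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    if action == "create":
        return ("POST", f"{API_BASE}/apps/{app}/machines", headers)
    if action in ("start", "stop"):
        return ("POST", f"{API_BASE}/apps/{app}/machines/{machine_id}/{action}", headers)
    if action == "destroy":
        return ("DELETE", f"{API_BASE}/apps/{app}/machines/{machine_id}", headers)
    raise ValueError(f"unknown action: {action}")

method, url, _ = machine_request("ci-runners", "start", machine_id="17811953c92e18")
```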
Multi-Region Placement & Networking
We deploy machines across Fly.io's global network:
- Region selection based on user geography and latency requirements
- Private networking via 6PN: machines communicate over a WireGuard mesh without public internet exposure
- Fly Proxy configuration with connection handlers, concurrency limits, and health checks
- Cross-region volume replication strategies for stateful multi-region deployments
- Anycast IP routing: users connect to the nearest healthy machine automatically
We extend these setups with custom routing logic for region-specific traffic handling.
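On the 6PN private network, machines reach each other through Fly's internal DNS rather than public addresses. The helper below sketches the hostname conventions (`<app>.internal` for all machines, `<region>.<app>.internal` for region-scoped lookups); the app and region names are examples only.

```python
# Sketch: private 6PN hostnames via Fly's internal DNS conventions.
# "my-api" and "fra" are example names, not real services.

def internal_host(app, region=None):
    """Return the private-network hostname for an app, optionally region-scoped."""
    return f"{region}.{app}.internal" if region else f"{app}.internal"

# All machines in the app, any region:
all_regions = internal_host("my-api")        # "my-api.internal"
# Only machines placed in Frankfurt:
fra_only = internal_host("my-api", "fra")    # "fra.my-api.internal"
```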
How We Approach Fly Machines Projects
Workload Analysis We evaluate your compute requirements (CPU, memory, GPU, disk) and map them to Fly Machine specs. Shared vCPUs handle most web services; dedicated CPUs and GPUs serve specialized workloads. We size based on actual metrics and load testing.
Architecture Design We define process groups, scaling policies, and region placement in fly.toml and deployment scripts. Your infrastructure is versioned in Git and reproducible across environments.
Staged Deployment Machines deploy to a single region first. We validate performance, networking, volume persistence, and autoscaling behavior before expanding to additional regions.
Optimization & Handoff We tune machine sizes based on real traffic, configure scale-to-zero for cost savings, and set up standby machines for critical services. Your team receives flyctl workflows, API integration examples, and operational runbooks.
Technology Stack with Fly Machines
Compute & Runtime
- Fly Machines: Firecracker microVMs with sub-second boot times
- Shared vCPU: cost-efficient compute for web services and lightweight workloads
- Dedicated CPU: consistent performance for compute-heavy applications
- GPU Machines: A100 and L40S GPUs for ML inference and media processing
Networking & Routing
- Fly Proxy: automatic TLS, connection handling, and load balancing
- 6PN Private Network: WireGuard mesh between machines across all regions
- Anycast IPs: global IP addresses routing to the nearest healthy machine
- Flycast: private load balancing for internal services without public exposure
Storage & Data
- Persistent Volumes: NVMe-backed block storage attached to individual machines
- Tigris Object Storage: S3-compatible storage on Fly.io's network
- LiteFS: distributed SQLite replication across machines
- Fly Postgres: managed PostgreSQL running on Fly Machines
Business Benefits
- Sub-second boot times: Fly Machines start in ~300ms thanks to Firecracker microVM technology. Incoming requests wake stopped machines fast enough that users never notice scale-to-zero latency.
- Pay only for running compute: machines that scale to zero cost nothing when stopped. Development environments, preview deployments, and low-traffic services incur charges only when actively serving requests.
- Full machine lifecycle control: the Machines API lets you create, start, stop, and destroy VMs programmatically. Build custom orchestration for CI runners, sandboxed code execution, or on-demand preview environments.
- VM-level isolation: each Fly Machine runs in its own Firecracker microVM with a dedicated kernel, giving stronger security isolation than containers without the overhead of traditional virtual machines.
- Multi-region by default: deploy machines to any of Fly.io's 30+ regions. Anycast routing sends users to the nearest machine with no CDN configuration and no geographic load balancing setup.
- Granular per-process scaling: scale web, worker, and cron processes independently. Web machines autoscale on request concurrency while worker machines scale on queue depth, each process group sized to its own demand.
Page Updated: 2026-03-11