Ismat Samadov

WebAssembly Serverless Is Eating Kubernetes at the Edge

Wasm cold starts in 40 microseconds vs 100ms for containers. 20x density advantage. 95% cost reduction. Production at Amazon, Adobe, Cloudflare.

Infrastructure · Performance · Cloud · Open Source



© 2026 Ismat Samadov


A European retailer replaced their AWS Lambda functions with WebAssembly modules running at the edge. Same workload. 95% cost reduction. Not because Lambda was expensive. Because over 80% of container spend is on idle infrastructure — and WebAssembly's instant scale-to-zero means you only pay when code is actually executing.

Fastly's Compute platform cold-starts a Wasm module in under 50 microseconds. A Docker container on the same workload? 100-300 milliseconds. That's not a percentage improvement. That's three orders of magnitude. And it changes what's architecturally possible at the edge.


The Numbers Are Real

WebAssembly isn't theoretical anymore. 41% of developers are using it in production. 28% are piloting or planning adoption. Cloudflare Workers processes 10 million+ Wasm-powered requests per second, with 34% of Workers deployments now including Wasm components — up from 12% in 2023.

The adoption drivers are consistent across surveys: 47% cite faster execution, 46% cite cross-platform compatibility, and 45% cite improved security. Wasm 3.0 became a W3C standard in September 2025, and WASI 0.3.0 shipped in early 2026 with native async I/O support.

Nearly 40% of new enterprise applications will use edge computing capabilities by 2026. The Wasm ecosystem is gearing up: the community is working toward a Component Model 1.0 release that aims to provide a stable platform "for decades."

What's emerging is "Serverless 2.0" — stateful, distributed applications that scale instantly globally, replicating data and logic to the nearest edge node. Not just functions-as-a-service, but entire application backends running at the edge with sub-millisecond latency. And Wasm is the runtime making it technically possible.

39% of developers still consider Wasm "not applicable" to their work. That number will change fast as the tooling matures. Two years ago, Wasm was a browser technology. Today it's running payment processing at American Express and video rendering at Amazon Prime Video.


Cold Start: The Benchmark That Matters Most

Cold start latency determines whether serverless is viable for latency-sensitive workloads. Here's where Wasm obliterates containers:

| Runtime | Median Cold Start | P99 Cold Start |
|---|---|---|
| Wasm (Wasmtime AOT) | 0.04ms (40 microseconds) | 0.12ms |
| Wasm (V8/Workers) | 0.5ms | 2ms |
| V8 Isolate (JavaScript) | 1.5ms | 5ms |
| gVisor Container | 50ms | 150ms |
| runc Container | 100ms | 300ms |
| Firecracker microVM | 125ms | 200ms |

Wasm startup is 100-3,000x faster than container-based alternatives. The reason: a Wasm module is a pre-compiled binary. There's no filesystem to mount, no process namespace to create, no network stack to configure. The runtime loads the module into memory and starts executing.

Binary size tells the same story:

| Workload | Wasm Module | Container Image |
|---|---|---|
| Hello World | 100-500 KB | 10-15 MB |
| Typical workload | 1-5 MB | 30-100 MB |

On a 128GB RAM host, you can run 15,000 Wasm instances versus 750 containers — a 20x density advantage. Wasmer reports running over half a million apps across just a handful of servers.
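The density claim is back-of-envelope arithmetic, and it's worth making the assumptions explicit. A sketch, using assumed resident-memory footprints (roughly 8 MB per Wasm instance and 170 MB per container, which approximately reproduce the figures above; real footprints vary by workload):

```rust
// Back-of-envelope density math. The per-instance footprints are
// assumptions chosen to roughly match the 15,000-vs-750 figure above,
// not measurements.
fn instances_per_host(host_ram_gb: u64, per_instance_mb: u64) -> u64 {
    host_ram_gb * 1024 / per_instance_mb
}

fn main() {
    // ~8 MB resident per Wasm instance vs ~170 MB per container.
    println!("wasm instances:      {}", instances_per_host(128, 8));
    println!("container instances: {}", instances_per_host(128, 170));
}
```

Memory is the binding constraint here, not CPU: idle Wasm instances cost almost nothing to keep warm, which is what makes the density gap translate into real consolidation.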


Execution Performance: Closer to Native Than You'd Think

The cold start numbers get all the attention, but steady-state performance matters too. Wasm runs at 10-18% overhead versus native execution:

| Workload | Native (Rust) | Wasm | Overhead |
|---|---|---|---|
| JSON parsing (10KB) | 1.2M ops/s | 1.05M ops/s | 12.5% |
| AES-256-GCM encrypt | 4.8M ops/s | 4.2M ops/s | 12.5% |
| SHA-256 hash (1KB) | 6.1M ops/s | 5.3M ops/s | 13.1% |
| HTTP request routing | 3.4M ops/s | 3.0M ops/s | 11.8% |

For CPU-bound tasks, Wasm is 5-15x faster than optimized JavaScript. That's not a theoretical benchmark — it's real-world production performance. Google migrated the Google Sheets calculation engine to WasmGC and it runs 2x faster than JavaScript. Figma's C++ graphics engine compiled to Wasm achieved 3x load time improvement with parsing running 20x faster.


The Platform War: Who's Winning at the Edge

Fermyon Spin + Akamai

Akamai acquired Fermyon in December 2025 to compete with Cloudflare Workers. Fermyon's Spin 3.0 claims "tens of thousands of WebAssembly binaries can run in a single Spin instance while keeping startup times under a millisecond."

Spin supports Rust, TypeScript, Python, .NET, and Go. It's built on Wasmtime with component dependencies, selective deployments, and OpenTelemetry observability. SpinKube bridges the gap between Wasm and Kubernetes by enabling Wasm workloads through the Container Runtime Interface.

# Create a new Spin app
spin new -t http-rust my-edge-function
cd my-edge-function

# Build and deploy
spin build
spin deploy
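The scaffold that `spin new` generates is driven by a manifest. A minimal `spin.toml` looks roughly like the sketch below — this follows the v2 manifest format as I understand it, and the app name, route, and target triple are illustrative, not canonical:

```toml
# spin.toml — sketch of a Spin v2 application manifest (illustrative)
spin_manifest_version = 2

[application]
name = "my-edge-function"
version = "0.1.0"

[[trigger.http]]
route = "/..."                  # wildcard: this component handles all paths
component = "my-edge-function"

[component.my-edge-function]
source = "target/wasm32-wasip1/release/my_edge_function.wasm"

[component.my-edge-function.build]
command = "cargo build --target wasm32-wasip1 --release"
```

The manifest is also where component dependencies and selective-deployment settings live, which is how a single Spin instance hosts many independent binaries.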

The Fermyon acquisition matters strategically. Akamai's existing EdgeWorkers (V8-based) were "more complex and less integrated" than Cloudflare Workers. Fermyon's Wasmtime sandboxing also offers "stronger security boundaries" than V8 isolates.

Cloudflare Workers

The incumbent. 10 million+ requests per second. 34% of deployments now include Wasm. Recently added Python Workers with Wasm snapshots, reducing heavy package load from ~10s to ~1s. Python Workers start 2.4x faster than AWS Lambda and 3x faster than Google Cloud Run.

Fastly Compute

The performance leader. Under 50 microsecond cold starts. Uses Wasmtime directly (not V8). More restrictive — Rust and JavaScript/TypeScript only — but faster for workloads that fit.

wasmCloud (CNCF)

A distributed application runtime using NATS for messaging. Focus on polyglot, multi-cloud/edge deployment. Used by Adobe, BMW, and MachineMetrics in production. wasmCloud's differentiator is its "actor model" approach — components communicate through well-defined interfaces, making it possible to swap implementations without recompiling.

The Runtimes Powering It All

Behind these platforms, three runtimes compete:

  • Wasmtime (Bytecode Alliance): The reference runtime, Rust-based. Powers Fermyon Spin and Fastly Compute. AOT compilation delivers the best cold start (40 microseconds median). This is the safe default for production.
  • WasmEdge (CNCF Sandbox): Claims 100x faster startup and 20% faster runtime than Linux containers. Focus on edge AI and automotive use cases. Strong in the Chinese tech ecosystem.
  • Wasmer: Offers Wasmer Edge with half a million apps running on a handful of servers. Promotes WASIX — a fuller POSIX compatibility layer that's non-standard but more practical for porting existing applications. Recently open-sourced Edge.js for safe Node.js workloads.

The runtime choice matters less than you'd think for most workloads. All three execute the same Wasm binaries. The differences are in ecosystem integration, advanced features (threading, SIMD), and the specific edges they're optimized for. For most teams, pick Wasmtime (widest ecosystem support) unless you have a specific reason for WasmEdge (edge AI) or Wasmer (POSIX porting).


Production Case Studies

Amazon Prime Video: 37,000 lines of Rust compiled to Wasm, supporting 8,000+ unique device types. Average frame times dropped from 28ms to 18ms. Worst-case frames: 45ms to 25ms.

Adobe: Using CNCF wasmCloud to integrate Wasm with multi-tenant Kubernetes. Quote from their engineers: "A lot of people are running Kubernetes, but when you run this kind of multi-tenant setup, while it's operationally excellent, it can be very expensive to run." Cold starts under 1ms using wasmCloud.

American Express: Deploying "the largest commercial WebAssembly implementation" — a Function-as-a-Service platform replacing traditional containers.

Google Sheets: Migrated calculation engine to WasmGC. 2x faster than JavaScript.

Figma: C++ graphics engine compiled to Wasm. 3x load time improvement. Parsing runs 20x faster.

MachineMetrics: Deployed wasmCloud for "dynamic fault-tolerance" across edge and cloud. Manufacturing IoT data processing with guaranteed delivery even when edge connectivity is intermittent.

The pattern across these case studies: Wasm isn't replacing entire application architectures. It's replacing the specific compute layer where startup time, binary size, and sandboxing matter most. Amazon Prime Video uses Wasm for the rendering engine, not the content delivery pipeline. Adobe uses wasmCloud for multi-tenant compute isolation, not for their databases.


When Wasm Replaces Kubernetes (And When It Doesn't)

This is where the "eating Kubernetes" headline needs nuance. Wasm doesn't replace Kubernetes. It replaces specific workloads that Kubernetes handles poorly.

Choose Wasm when:

  • Cold start latency matters (serverless, edge, event-driven)
  • High-density multi-tenant isolation is needed
  • Plugin systems for third-party code execution
  • Edge devices with limited resources — Kubernetes consumes 30-35% of device resources just for platform operations
  • Event-driven, short-lived workloads where scale-to-zero is critical

Keep Kubernetes when:

  • Full POSIX compatibility is needed
  • Stateful, long-running workloads (databases, caches, queues)
  • GPU support is required (Wasm runtimes don't support GPU)
  • Complex existing orchestration pipelines
  • Your team's operational knowledge is entirely Kubernetes-based

The hybrid approach (what Adobe and BMW are doing):

  • Run Wasm alongside containers on the same Kubernetes nodes using runwasi (integrates Wasmtime/WasmEdge into Kubernetes CRI)
  • Use SpinKube for Wasm workloads managed by Kubernetes
  • Let Kubernetes handle orchestration, networking, and scheduling
  • Let Wasm handle the actual compute at the edge

This isn't an either/or choice. It's a spectrum. The edge nodes run Wasm for latency-sensitive compute. The central cluster runs Kubernetes for stateful services. The two interoperate through standard APIs.

Here's a practical decision tree:

Is your workload...
├── Short-lived (under 30 seconds)?
│   ├── Latency-sensitive? → Wasm at the edge
│   └── Not latency-sensitive? → Lambda/Cloud Functions
├── Long-running?
│   ├── Stateful (database, cache)? → Kubernetes
│   └── Stateless worker? → Either (Wasm for cost, K8s for ecosystem)
└── Multi-tenant plugin execution?
    └── → Wasm (sandboxing is the killer feature)
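The tree above can be encoded literally. A toy Rust sketch — the four booleans and the placement names are illustrative simplifications, since real platform choices involve more dimensions than this:

```rust
// A literal encoding of the decision tree above (illustrative only;
// real platform choices involve more dimensions than four booleans).
#[derive(Debug, PartialEq)]
enum Placement {
    WasmAtEdge,     // short-lived + latency-sensitive
    CloudFunctions, // short-lived, latency-tolerant
    Kubernetes,     // long-running + stateful
    Either,         // stateless worker: Wasm for cost, K8s for ecosystem
    WasmSandbox,    // multi-tenant plugin execution
}

struct Workload {
    multi_tenant_plugins: bool,
    short_lived: bool,
    latency_sensitive: bool,
    stateful: bool,
}

fn place(w: &Workload) -> Placement {
    if w.multi_tenant_plugins {
        Placement::WasmSandbox // sandboxing is the killer feature
    } else if w.short_lived {
        if w.latency_sensitive {
            Placement::WasmAtEdge
        } else {
            Placement::CloudFunctions
        }
    } else if w.stateful {
        Placement::Kubernetes
    } else {
        Placement::Either
    }
}

fn main() {
    let edge_fn = Workload {
        multi_tenant_plugins: false,
        short_lived: true,
        latency_sensitive: true,
        stateful: false,
    };
    assert_eq!(place(&edge_fn), Placement::WasmAtEdge);
}
```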

The multi-tenant plugin case is underappreciated. If you run third-party code — user-uploaded functions, webhook processors, custom integrations — Wasm's sandboxing model is significantly stronger than container isolation. Each Wasm module runs in a capability-based sandbox where it can only access explicitly granted resources. No filesystem access by default. No network access by default. No ambient authority. This is why Shopify, Figma, and Cloudflare all use Wasm for running untrusted user code.
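The no-ambient-authority model can be sketched in plain Rust. This is a toy illustration of the principle, not a real WASI host: the guest sees only what the host explicitly pre-opened, so there is no "open any path" call to deny — ungranted resources simply don't exist from the guest's point of view.

```rust
use std::collections::HashMap;

// Toy model of capability-based security (not a real WASI host).
struct Host {
    preopened: HashMap<String, String>, // path -> contents
}

impl Host {
    fn new() -> Self {
        Self { preopened: HashMap::new() }
    }

    // The host grants a capability by pre-opening a resource.
    fn grant(&mut self, path: &str, contents: &str) {
        self.preopened.insert(path.to_string(), contents.to_string());
    }

    // The guest can only read through granted handles; everything
    // else is invisible, not merely forbidden.
    fn guest_read(&self, path: &str) -> Option<&str> {
        self.preopened.get(path).map(|s| s.as_str())
    }
}

fn main() {
    let mut host = Host::new();
    host.grant("/data/config.json", "{\"region\": \"eu\"}");
    assert!(host.guest_read("/data/config.json").is_some());
    assert!(host.guest_read("/etc/passwd").is_none()); // never granted
}
```

Real WASI hosts work the same way in spirit: directories, sockets, and clocks are handles the embedder passes in, not global names the module can reach for.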


WASI: The Missing OS Layer

WebAssembly System Interface (WASI) is what makes server-side Wasm possible. Without WASI, Wasm modules can't access files, networks, or system clocks.

WASI 0.2 stabilized filesystem, sockets, HTTP, CLI, clocks, and random interfaces. WASI 0.3.0 (2026) adds the critical missing piece: async I/O and threading support. Previous versions only supported blocking I/O, which was a dealbreaker for high-concurrency server workloads.

The Component Model is the next frontier. It enables language-agnostic composition: a Rust library consumed by a JavaScript app, seamlessly, without FFI or serialization. Think of it as microservices without the network calls.
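Those cross-language boundaries are described in WIT, the Component Model's interface definition language. A minimal sketch — the package, world, and function names here are hypothetical:

```
// math.wit — a hypothetical component interface. A Rust component
// exporting this world could be consumed from a JavaScript app
// without FFI glue or manual serialization.
package example:math;

world calculator {
  export add: func(a: s32, b: s32) -> s32;
}
```

The toolchain generates the bindings on both sides from this one file, which is what makes "a Rust library consumed by a JavaScript app" a build step rather than an integration project.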

Solomon Hykes (Docker founder) famously tweeted in 2019: "If WASM+WASI existed in 2008, we wouldn't have needed to created Docker" [sic]. He later clarified that he doesn't believe Wasm will replace containers — Docker itself published a blog post on coexistence. But the statement captured something true: Wasm solves the portability problem better than containers for a specific class of workloads.


What's Still Broken

Honesty time. Wasm isn't ready for everything.

No GPU support. If your workload needs GPU compute (ML inference, graphics), Wasm isn't an option today; the wasi-nn proposal points toward hardware-accelerated inference, but it's still early-stage. This is the single biggest limitation for AI-at-the-edge use cases.

Limited multi-threading. WASI 0.3 adds async support, but the threads proposal isn't universally deployed. CPU-bound parallel workloads still favor containers.

57% of developers have never touched Wasm. The CNCF survey shows that Wasm experience actually declined from 2022 to 2023 (50% no experience to 57% no experience). The ecosystem gap is real.

Large modules defeat the purpose. A well-optimized Wasm module is 100-500 KB. But poorly compiled modules can be 5+ MB with cold starts of 500ms-5s — negating the entire advantage. Wasm performance is only as good as your build pipeline.
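For Rust, most of that build-pipeline discipline comes down to a few release-profile settings plus a post-processing pass. A typical starting point, not a universal recipe — all of these are standard Cargo profile options, but the right trade-offs depend on your workload:

```toml
# Cargo.toml — release profile tuned for small Wasm binaries
[profile.release]
opt-level = "z"   # optimize aggressively for size
lto = true        # link-time optimization prunes dead code
strip = true      # drop debug symbols from the output
codegen-units = 1 # slower builds, smaller and faster output
panic = "abort"   # skip the unwinding machinery entirely
```

Running the resulting module through Binaryen's `wasm-opt -Oz` typically shrinks it further. The difference between a default debug build and a tuned release build can be the difference between a 5 MB module and a 300 KB one.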

Rust tax. To get the best Wasm performance, you need Rust or C++. JavaScript/TypeScript compilation to Wasm is improving but adds overhead. Python support is even more nascent. The developer experience gap between writing a Cloudflare Worker in JavaScript and writing a Fermyon Spin app in Rust is significant.

Containers have a 10+ year head start in tooling, CI/CD integration, monitoring, and operational knowledge. Every SRE team knows how to debug a container. Far fewer know how to debug a Wasm module.

Ecosystem immaturity. The number of Wasm-native libraries is a fraction of what npm or PyPI offer. Need to parse XML? There's probably a Wasm-compatible Rust crate. Need to connect to Redis with connection pooling? You might be writing that adapter yourself. The library gap is closing but it's still a real friction point for teams evaluating Wasm for production workloads.

The WASI fragmentation risk. Wasmer's WASIX provides fuller POSIX compatibility but is non-standard. WASI 0.2 (Preview 2) is the Bytecode Alliance standard. If you build on WASIX today, you might be locked into Wasmer. If you build on WASI 0.2, you get portability but fewer system capabilities. The community needs to converge — and it hasn't yet.


Getting Started: Your First Edge Function

If you want to try Wasm at the edge today, the fastest path is Cloudflare Workers with Rust:

# Install wrangler (Cloudflare's CLI)
npm install -g wrangler

# Create a new Rust Worker
wrangler init my-edge-fn --type rust
cd my-edge-fn

// src/lib.rs — a simple edge function
use worker::*;

#[event(fetch)]
async fn main(req: Request, _env: Env, _ctx: Context) -> Result<Response> {
    let url = req.url()?;
    let path = url.path();

    match path {
        "/api/hello" => Response::ok("Hello from the edge!"),
        "/api/time" => {
            let now = Date::now().as_millis();
            Response::ok(format!("Server time: {}", now))
        }
        _ => Response::error("Not found", 404),
    }
}

# Deploy to 300+ edge locations worldwide
wrangler deploy

That function runs in under 1ms cold start, at 300+ edge locations, with automatic scaling and zero infrastructure management. If you're already writing TypeScript, Cloudflare Workers supports that natively too — no Rust required. The Wasm compilation happens behind the scenes.

For teams that want more control, Fermyon Spin is the better starting point:

# Install Spin
curl -fsSL https://developer.fermyon.com/downloads/install.sh | bash

# Create a TypeScript HTTP handler
spin new -t http-ts my-function
cd my-function
spin build && spin up

Spin runs locally for development and deploys to Fermyon Cloud (now Akamai) or any Kubernetes cluster with SpinKube.


What I Actually Think

WebAssembly at the edge is real. The cold start advantage is so dramatic — 40 microseconds versus 100+ milliseconds — that it creates genuinely new architectural possibilities. Functions that execute in the time a container takes to start. Real-time processing at locations where running Kubernetes would eat a third of your compute budget just for platform overhead.

But "eating Kubernetes" is premature. The 2026 reality is coexistence. Adobe runs wasmCloud alongside Kubernetes. BMW does the same. SpinKube exists specifically because Kubernetes isn't going away — it's getting Wasm as a complementary runtime.

The winning strategy is surgical adoption. Don't replace your entire infrastructure. Identify the workloads where cold start and density matter — edge functions, serverless APIs, plugin execution, multi-tenant isolation — and move those to Wasm. Keep Kubernetes for everything else.

I think Akamai's Fermyon acquisition is the most important market signal of 2025. It means the CDN providers are betting on Wasm as the edge compute primitive. Cloudflare Workers proved the model. Fastly Compute proved the performance. Fermyon + Akamai bring it to the second-largest CDN in the world.

The developer experience is still the bottleneck. When writing a Wasm edge function is as easy as writing a Next.js API route, adoption will explode. We're not there yet. But Spin 3.0 with TypeScript support and Component Model composability is getting close.

The economics are the real story here, though. That European retailer's 95% cost reduction isn't magic — it's math. Containers sit idle. Wasm modules don't exist until a request arrives. In a serverless model where you pay per invocation, the cold start speed directly translates to cost: faster startup means less billable compute time per request, and scale-to-zero means no cost during idle periods. For workloads with variable traffic — which is most workloads — this changes the economics from "provision for peak" to "pay for actual use."
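The shape of that math can be made concrete. The sketch below uses made-up prices — they are not any provider's actual rates — because the structure, not the numbers, is the point: provisioned capacity bills for wall-clock uptime whether or not requests arrive, while scale-to-zero bills only for execution time.

```rust
// Illustrative cost arithmetic. All prices here are assumptions for
// the sake of the comparison, not real provider rates.
fn provisioned_monthly(hours_up: f64, dollars_per_hour: f64) -> f64 {
    // Billed for uptime, idle or not.
    hours_up * dollars_per_hour
}

fn scale_to_zero_monthly(invocations: f64, ms_per_invocation: f64, dollars_per_ms: f64) -> f64 {
    // Billed only while code is actually executing.
    invocations * ms_per_invocation * dollars_per_ms
}

fn main() {
    // One always-on container for a month vs the same traffic
    // (1M requests at 5ms each) on a scale-to-zero runtime.
    let container = provisioned_monthly(720.0, 0.05);
    let wasm = scale_to_zero_monthly(1_000_000.0, 5.0, 4e-7);
    let savings = 100.0 * (1.0 - wasm / container);
    println!("container ${container:.2}, wasm ${wasm:.2}, savings {savings:.0}%");
}
```

Under these assumed rates the provisioned container costs $36 for the month and the scale-to-zero path about $2 — the same order of savings the retailer reported, driven entirely by not paying for idle hours. Faster cold starts feed the same equation: less billable time per request.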

The security model deserves more attention than it gets. Wasm's capability-based sandboxing is fundamentally different from container isolation. A container shares the kernel with the host. A Wasm module shares nothing. It can't read files it wasn't given explicit access to. It can't make network connections it wasn't granted. Every system resource must be explicitly provided by the host. For multi-tenant platforms running untrusted code, this isn't just better isolation — it's a different security model entirely.

Give it two more years. By 2028, I think every major cloud provider will offer a Wasm-native serverless platform alongside their container services. Not instead of. Alongside. And for the edge, Wasm will be the default — because nobody wants to wait 125 milliseconds for a Firecracker microVM when they could wait 40 microseconds for a Wasm module.

The trajectory is clear even if the timeline is uncertain. Wasm at the edge is following the same adoption curve as containers in 2014: production use cases proving the model, major acquisitions validating the market, and a developer experience that's improving rapidly but isn't quite there yet. The teams that start experimenting now will have a significant operational advantage when the ecosystem matures.


Sources

  1. WebAssembly at the Edge: The Runtime That Makes Stealth Cloud Possible
  2. Components vs. Containers: Fight? — CNCF
  3. WebAssembly Hits 4.5% Adoption — byteiota
  4. The Power of WASM in Cloudflare Workers — Medium
  5. State of WebAssembly 2026 — Dev Newsletter
  6. WASI 1.0: You Won't Know When WebAssembly Is Everywhere — The New Stack
  7. Serverless WebAssembly 2026 — Sternhost
  8. Akamai Acquires Fermyon — DEVCLASS
  9. Introducing Spin 3.0 — DEV Community
  10. Cloudflare Wasm Python Snapshot — InfoQ
  11. WebAssembly's Second Act at KubeCon EU 2026
  12. Wasmer: The World After Containers
  13. WebAssembly at the Edge — Cosmonic
  14. WASI and the Component Model: Current Status — eunomia
  15. WebAssembly Component Model — GitHub
  16. Docker vs WebAssembly — Docker Blog
  17. The Five-Millisecond Cloud: Rust + WebAssembly — Gothar
  18. WebAssembly Gaining Adoption — ainvest
  19. WebAssembly Adoption: It's Complicated — The New Stack
  20. Akamai and Fermyon Launch Edge-Native Serverless — theCUBE
  21. Solomon Hykes on Wasm — Twitter/X