Speaker Notes: - Welcome everyone to "The Road to Multitenancy" - Introduce yourself: Lewis Denham-Parry from Edera - Set context: Platform engineering challenges in multi-tenant environments - Preview: We'll explore the trade-offs and introduce a better solution - Estimated time: 20-30 minutes with Q&A - Encourage questions throughout or save for end
Speaker Notes: - Multi-tenancy: running multiple customers/teams on shared infrastructure - Core challenge: untrusted workloads (you don't control what they run) - Security: tenant A shouldn't access or affect tenant B - Performance: isolation mechanisms often add overhead - Scale: Kubernetes promises density, but security limits it - Cost: shared resources save money vs dedicated infrastructure - The trilemma: traditionally you pick 2 of 3 (security, performance, scale) - Platform engineers face this daily - it's not theoretical
Speaker Notes: - Kubernetes was designed for bin-packing workloads onto nodes - Core value: efficient resource utilization through sharing - But sharing creates security concerns in multi-tenant scenarios - All containers on a node share the Linux kernel - Kernel vulnerability = all tenants on that node at risk - Container escape: break out and access host or other containers - The fundamental contradiction: isolation vs density - Using separate machines for each tenant? That's pre-Kubernetes thinking - We need a better solution that preserves both goals
Speaker Notes: - Most conservative approach: one tenant per machine/cluster - Security is straightforward: physical/VM isolation - No shared kernel = no kernel attack surface between tenants - BUT: this is expensive and wasteful - Example: 100 tenants = 100 machines, even if most are idle - Resource utilization typically 20-30%, leaving 70-80% of capacity wasted - Operational complexity: managing 100 clusters vs 1 - Infrastructure costs scale linearly with tenant count - This approach works, but it's economically unsustainable - Defeats the whole point of using Kubernetes for efficiency
Speaker Notes: - Default Kubernetes setup: containerd or CRI-O runtime - All containers share the Linux kernel on the host - Container = process isolation using namespaces and cgroups - But namespaces weren't designed for security boundaries - One kernel vulnerability can compromise all containers - Historical examples: Dirty COW, the runc escape (CVE-2019-5736), other kernel exploits - Container escape: break out of namespace and access host - Privilege escalation: gain root on host from container - Resource exhaustion: one tenant starves others (noisy neighbor) - Fine for trusted workloads (your own apps) - NOT acceptable for untrusted multi-tenant scenarios - Compliance and security teams rightfully reject this
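A quick way to demonstrate the shared-kernel point live: any container on a node reports the node's own kernel version, because it never boots one of its own. A minimal sketch (the pod name and image tag are just illustrative choices):

```yaml
# kernel-check.yaml - the container reports the node's kernel version,
# because containers share the host kernel rather than booting their own.
apiVersion: v1
kind: Pod
metadata:
  name: kernel-check
spec:
  restartPolicy: Never
  containers:
    - name: check
      image: busybox:1.36
      command: ["uname", "-r"]
# Run and compare:
#   kubectl apply -f kernel-check.yaml
#   kubectl logs kernel-check      # matches `uname -r` on the node itself
```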
Speaker Notes: - Kata Containers: lightweight VMs that look like containers - Each pod gets its own VM with its own guest kernel - Uses hardware virtualization (KVM) with a lightweight VMM such as QEMU, Cloud Hypervisor, or Firecracker - Strong isolation: kernel vulnerability in one VM doesn't affect others - Kubernetes compatible: implements CRI, drop-in replacement - Security win: finally proper isolation for multi-tenancy - BUT: performance trade-offs - VM startup overhead: 150-300ms with modern, optimized configurations, up to 1-2s with older configurations (vs milliseconds for containers) - Memory: each VM reserves memory for its guest kernel (~100MB overhead) - High-churn workloads (serverless, batch jobs) suffer most - Infrastructure: need nested virtualization in cloud, specific host setup - Good solution, but sacrifices the speed and density we want - Performance has improved significantly with recent optimizations
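If someone asks what "drop-in replacement" looks like in practice, the opt-in mechanism is a RuntimeClass once Kata is installed on the nodes. A sketch assuming the conventional `kata` handler name from the upstream install docs (verify it against your containerd or CRI-O configuration):

```yaml
# Register the Kata runtime; the handler name must match the containerd/CRI-O config.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
---
# Opt a workload into VM-backed isolation by naming the RuntimeClass.
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-workload
spec:
  runtimeClassName: kata
  containers:
    - name: app
      image: nginx:1.27
```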
Speaker Notes: - gVisor (Google's contribution): userspace kernel approach - Every system call goes through gVisor's "Sentry" process - Implements subset of Linux kernel in Go (in userspace) - Reduces attack surface: app never directly accesses host kernel - Better than shared kernel, but not as strong as VMs - Smaller footprint than Kata: no full VM overhead - BUT: performance tax on system calls - Syscall interception adds latency (microseconds per call) - Performance varies widely: <1% overhead for CPU-bound workloads, 10-30%+ for I/O-heavy applications - At Ant Group production: 70% of apps have <1% overhead, 25% have <3% overhead [Source: gVisor.dev - Running gVisor in Production at Scale in Ant, Dec 2021] - Compatibility: doesn't support all syscalls (some apps won't run) - Debugging: syscall stack traces become complex - Used by Google Cloud Run and some serverless platforms - Trade-off: better than nothing, but still costs performance for syscall-heavy workloads
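gVisor follows the same RuntimeClass pattern, with `runsc` as the handler name used in the upstream docs; inside a gVisor pod, `dmesg` reports the Sentry booting, which makes a nice on-stage proof that syscalls are no longer reaching the host kernel. A sketch under those assumptions:

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc            # gVisor's OCI runtime, per the upstream docs
---
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed
spec:
  runtimeClassName: gvisor
  containers:
    - name: app
      image: nginx:1.27
# Verify the sandbox is active:
#   kubectl exec sandboxed -- dmesg | head -1   # prints a "Starting gVisor..." banner
```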
Speaker Notes: - Firecracker: AWS's answer to lightweight isolation - Powers AWS Lambda - production-proven at massive scale (trillions of requests/month) - Note: Fargate's use of Firecracker is disputed by some sources, so we focus on Lambda where it's confirmed - MicroVMs: stripped-down VMs with minimal device emulation - Fast startup: ~125ms, a significant improvement over older Kata configurations - Memory: ~5MB overhead vs ~100MB for traditional VMs - KVM virtualization: hardware-level isolation guarantee - BUT: still has a VM layer, just optimized - Nested virtualization: needs specific host configuration in cloud environments - Purpose-built for serverless: not a general-purpose container runtime - Trade-off: better than traditional VMs for startup time, but still not container-native - Good for Function-as-a-Service, less ideal for long-running workloads
Speaker Notes: - Bare metal: the ultimate isolation approach - separate physical servers - Maximum isolation: physical network boundaries, no shared CPU/memory/kernel - Predictable performance: no virtualization overhead, no noisy neighbors - Full hardware access: GPUs, specialized hardware, direct I/O - No hypervisor tax: applications run at native hardware speed - BUT: this is the most expensive and least scalable option - Resource utilization: typical 10-30% (70-90% wasted capacity) - Infrastructure costs: $100-$500/month per server, multiplied by tenant count - Provisioning time: minutes to hours vs seconds for containers - Scaling: adding 100 tenants = buying 100 servers - This approach only makes sense for specialized workloads: - High-security government/financial workloads - GPU-intensive ML training with dedicated hardware - Compliance requirements mandating physical separation - For most multi-tenant platforms, bare metal defeats the purpose - Including this to show the full spectrum of isolation options
Speaker Notes: - Let's visualize what we've learned across these approaches - Separate machines: secure and performant per tenant, but doesn't scale - Shared kernel: scales great, but insecure for multi-tenancy - Kata: good security, but performance suffers (VM overhead) - gVisor: middle ground, but still a performance penalty - Firecracker: better performance than Kata, but still VM-based (moderate scale) - Bare metal: maximum isolation and performance, but worst scalability and cost - Notice the pattern: every solution compromises something - Security OR performance OR scale - pick 2, sacrifice 1 - Complexity column: all add operational overhead - Even across all of these approaches, the market gap remains - No solution delivers all three until now - Platform engineers are stuck with trade-offs - This is where Edera enters the picture - (Pause before next slide for impact)
Speaker Notes: - Introducing Edera: a different approach to the problem - Key insight: focus on the container runtime layer - Runtime sits between Kubernetes and the containers - This is where isolation decisions are made - By innovating at the runtime, we can optimize both security AND performance - Lightweight VMs (zones): each with its own kernel, but without traditional VM overhead - Uses paravirtualization to avoid VM startup and memory penalties - Near-native performance: minimal overhead through optimized hypercalls - Kubernetes native: implements CRI interface, drop-in compatible - Minimal changes: don't need to redesign your platform - This is the "best of all worlds" solution - Let's look at how it actually works
Speaker Notes: - Technical architecture: how Edera achieves security + performance - CRI compatible: works with any Kubernetes distribution (EKS, GKE, AKS, vanilla) - Zone isolation: each container gets its own "zone" (lightweight VM) with full Linux kernel - Type-1 hypervisor: microkernel written in Rust for minimal attack surface - Paravirtualization: guest kernel uses hypercalls for privileged operations - Unlike gVisor (intercepts all syscalls), Edera delegates through hypervisor - Unlike Kata/Firecracker (traditional VMs), Edera uses paravirtualization for efficiency - Resource guarantees: per-tenant CPU/memory/I/O limits enforced by hypervisor - Network segmentation: automatic tenant isolation at network layer - Secure compute profiles: eBPF-based security policies - Paravirtualized syscalls: 3% faster than Docker, avoids costly emulation - Gateway network control: protect-network service mediates all packet routing - Result: VM-level security isolation without traditional VM performance penalty
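Since Edera plugs in through the CRI, workloads should opt in the same way as with the other runtimes. The sketch below is for illustration only: the RuntimeClass and handler names (`edera`) are assumptions rather than values taken from Edera's documentation, so check edera.dev for the real names.

```yaml
# Sketch only: "edera" as RuntimeClass and handler name is an assumption,
# not a documented value. The point is that zones are selected per workload
# via a RuntimeClass, with no changes to the containers themselves.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: edera
handler: edera
---
apiVersion: v1
kind: Pod
metadata:
  name: tenant-a-app
spec:
  runtimeClassName: edera   # the pod runs in its own zone with its own kernel
  containers:
    - name: app
      image: nginx:1.27
```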
Speaker Notes: - Let's break down the concrete benefits for platform teams - SECURITY: VM-level tenant isolation with each zone having its own kernel - Reduced attack surface: Type-1 hypervisor with microkernel design - Container escape protection: hypervisor boundary prevents cross-zone access - Zero-trust network: no lateral movement between tenants - PERFORMANCE: this is where Edera shines vs Kata/gVisor - Near-native: < 5% overhead on most workloads (vs 10-30% for gVisor, startup delays for Kata) - Cold starts: ~750ms vs 1.9s for Kata, 2.5x faster (critical for serverless, batch) - Memory: minimal overhead per zone through paravirtualization - Paravirtualization advantage: avoids traditional VM overhead while maintaining isolation - 3% faster syscalls than Docker, 0.9% slower CPU - essentially native performance
Speaker Notes: - OPERATIONAL: platform engineers' favorite part - Kubernetes native: kubectl, Helm, GitOps all work unchanged - Simple deployment: update container runtime, no architecture redesign - Minimal changes: existing workloads run without modification - Finally a solution that doesn't force painful trade-offs - This is the trifecta: secure, fast, AND simple to deploy - You don't have to sacrifice one for the other anymore
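To make "minimal changes" concrete for the audience: for an existing Deployment, the migration is a single line in the pod template. This sketch reuses the assumed `edera` RuntimeClass name from above; the Deployment and image names are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: existing-service
spec:
  replicas: 3
  selector:
    matchLabels: { app: existing-service }
  template:
    metadata:
      labels: { app: existing-service }
    spec:
      runtimeClassName: edera   # the only added line (assumed RuntimeClass name);
      containers:               # containers, probes, volumes, and pipelines stay as they were
        - name: app
          image: registry.example.com/existing-service:1.4.2
```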
Speaker Notes: - What does this mean for platform engineering teams in practice? - Developer experience: devs can deploy without waiting for security reviews - Self-service platforms: safe to give tenants direct k8s access - Cost optimization: 3-5x higher density vs separate machines - Cluster consolidation: 100 tenants on 20 nodes vs 100 clusters - Operations: single control plane, unified monitoring, simpler upgrades - Compliance: pass security audits without sacrificing speed - Use case 1: SaaS platforms - customer workloads are inherently untrusted - Use case 2: CI/CD - running arbitrary build scripts safely - Use case 3: Dev environments - developers testing risky code - Use case 4: Edge - limited resources, need density AND security - Bottom line: build platforms that are both secure and fast - No more "we can't do that for security reasons" blockers
Speaker Notes: - First two key use case categories for Edera Containers - UNTRUSTED CODE: This is the classic multi-tenancy problem - Example: GitHub Actions, GitLab Runners - running arbitrary user code - Code evaluation platforms: LeetCode, HackerRank, online IDEs - Production sandboxes: allow customers to run custom code in your SaaS - Key requirement: isolation without sacrificing speed - MULTI-TENANCY: Shared infrastructure scenarios - SaaS platforms: Shopify, Salesforce-style multi-tenant applications - Shared k8s clusters: avoid cluster-per-tenant cost explosion - Developer environments: give teams isolated namespaces with confidence - Key requirement: tenant isolation + resource efficiency
Speaker Notes: - Additional use case categories for Edera Containers - COMPLIANCE: Meeting regulatory requirements - PCI-DSS: payment processing workloads must be isolated - HIPAA: healthcare data workloads need strong boundaries - SOC 2: security audits require demonstrable isolation - Financial services: regulatory mandates for workload separation - Key requirement: auditable isolation that passes compliance - EDGE COMPUTING: Limited resources with security needs - Edge nodes: small servers with limited CPU/memory - IoT gateways: running third-party code at the edge - Retail/manufacturing: edge deployments in untrusted environments - Key requirement: lightweight isolation on constrained hardware - All of these work today with Edera - not theoretical use cases - For case studies, visit edera.dev
Speaker Notes: - GPU use cases: increasingly important as AI workloads grow - GPUS & AI INFRASTRUCTURE: The core problem - GPU sharing: GPUs are expensive ($10k-$50k each), need multi-tenancy - Training workloads: multiple teams training models on shared GPU clusters - Inference serving: serving multiple models/customers from shared GPUs - GPU security: GPUs have their own attack surface and side channels - GPU memory attacks: one tenant reading another's GPU memory - Side channels: timing attacks via shared GPU execution units - Edera isolates GPU access just like CPU/memory isolation - COMPLIANCE: Regulatory requirements for GPU workloads - Healthcare AI: training on patient data requires HIPAA compliance - Financial ML: fraud detection models under regulatory oversight - Government AI: defense and intelligence with strict security requirements - Research: universities with sensitive datasets (genomics, etc.) - Key issue: traditional isolation doesn't cover GPU attack surface
Speaker Notes: - KEY BENEFITS: What Edera for GPUs provides - GPU memory isolation: tenants can't access each other's GPU memory - Critical security boundary that most GPU platforms lack - Prevents sensitive model/data leakage between tenants - Performance: minimal overhead, near-native GPU throughput - Don't sacrifice speed for security - GPU operations run at near-hardware speed - Utilization: safely share expensive GPUs across multiple tenants - GPUs cost $10k-$50k each - sharing is essential - Enable multi-tenancy without compromising security - Monitoring: per-tenant GPU metrics and resource limits - Visibility into which tenant is using GPU resources - Enforce fair sharing and prevent resource hogging - This is cutting-edge: most GPU platforms don't have proper isolation - Shared GPU clusters today are often "trust-based" - not acceptable - Edera extends container isolation to GPU workloads - For GPU-specific case studies and benchmarks, visit edera.dev
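If a GPU question comes up, the workload-side view is the same pattern as the CPU case: a standard device-plugin resource request combined with the isolating RuntimeClass. A sketch, assuming the NVIDIA device plugin is installed and still using the assumed `edera` RuntimeClass name:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tenant-b-inference
spec:
  runtimeClassName: edera       # assumed RuntimeClass name, as in the earlier sketch
  containers:
    - name: inference
      image: registry.example.com/model-server:2.1   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1     # standard device-plugin request; GPU memory isolation
                                # is enforced by the runtime, not by the pod spec
```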
Speaker Notes: - Wrapping up: the multi-tenancy challenge has a solution - Key takeaway 1: the old trade-offs (separate machines, Kata, gVisor) force compromises - Key takeaway 2: runtime layer is the innovation point - not app layer, not orchestrator - Key takeaway 3: Edera proves you can have security AND performance - Key takeaway 4: platform engineers can finally build what they've always wanted - Next steps for the audience: - Visit edera.dev to learn more about the technology - GitHub has open source tools and examples - am-i-isolated: fun tool to test your current isolation - Run it on your clusters, see how containers can escape - Demonstrates the problem visually - We're building the future of secure container orchestration - The road to multitenancy doesn't require trade-offs anymore
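If you want the am-i-isolated demo ready for this slide, it can be run as an ordinary pod that reports what it can observe of the host. The image path below is an assumption, so confirm it against the project's GitHub README before the talk:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: am-i-isolated
spec:
  restartPolicy: Never
  containers:
    - name: check
      image: ghcr.io/edera-dev/am-i-isolated:latest   # assumed image path; see the GitHub README
# kubectl apply -f am-i-isolated.yaml && kubectl logs -f am-i-isolated
```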
Speaker Notes: - Thank the audience for their time and attention - Open the floor for questions - Common questions to expect: Q: "How does Edera compare to Firecracker/AWS Lambda's approach?" A: See slide 9 for Firecracker details. Key difference: Both use VMs for isolation. Firecracker has faster startup (~125ms vs ~750ms for Edera) using traditional KVM microVMs, while Edera uses paravirtualization to achieve better runtime performance (near-native, <5% overhead). Firecracker optimized for serverless/FaaS cold starts, Edera optimized for general container workloads. Both provide strong isolation, different performance optimization strategies. Q: "What's the actual performance overhead percentage?" A: < 5% for most workloads, compared to 10-30% for gVisor and VM startup delays for Kata. Q: "Does this work with existing Kubernetes deployments?" A: Yes, CRI-compatible. Update container runtime, workloads run unchanged. Q: "What's the learning curve for platform teams?" A: Minimal. If you know Kubernetes, you already know how to use it. Q: "Is it production-ready?" A: Visit edera.dev for current status and case studies. - Available after the talk for one-on-one discussions - Point them to resources on the slide for self-service learning - Thank event organizers and venue