CSE221 - lec17: Networking: IX & Snap
date: Dec 3, 2024
slug: cse221-lec17
status: Published
tags: System
type: Post
Datacenters
- applications must scale across many machines
- high network bandwidth: 100-400 Gbit/s (at 100 Gbit/s, a 1500-byte packet arrives roughly every 120 ns)
- storage: high-speed flash and non-volatile memory (NVM)
- must minimize per-operation overhead: kernel crossings are too expensive when the operations themselves are this fast and this frequent
Kernel Bypassing
- traditional network stack: packet processing happens in the kernel, so applications must make a system call to send or receive each packet.
- kernel bypass: a user-level library accesses the NIC directly, and the application reads and writes packets in shared memory, eliminating kernel crossings on the data path (see the sketch below).
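A minimal sketch of what the bypassed data path looks like, assuming a descriptor-ring layout (`rx_desc`, `RING_SIZE`) invented for illustration. A real bypass stack (e.g. DPDK, or IX's protected dataplane) would map the ring from the NIC through the IOMMU; here the ring is simulated in ordinary memory so the sketch compiles and runs:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RING_SIZE 8

struct rx_desc {
    volatile uint32_t ready;   /* set by the NIC when a packet lands */
    uint32_t len;              /* packet length in bytes */
    uint8_t  buf[2048];        /* packet payload (DMA target) */
};

static struct rx_desc ring[RING_SIZE];   /* stands in for NIC-visible memory */

static void handle_packet(const uint8_t *pkt, uint32_t len) {
    printf("got %u-byte packet: %.*s\n", len, (int)len, (const char *)pkt);
}

int main(void) {
    /* Pretend the NIC just DMA'd one packet into slot 0. */
    ring[0].len = 5;
    memcpy(ring[0].buf, "hello", 5);
    ring[0].ready = 1;

    /* The application polls the ring directly: no system call on the
     * data path, which is the whole point of kernel bypass. */
    for (uint32_t head = 0; head < RING_SIZE; head++) {
        if (!ring[head].ready)
            break;
        handle_packet(ring[head].buf, ring[head].len);
        ring[head].ready = 0;            /* return descriptor to the NIC */
    }
    return 0;
}
```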
Data path vs. Control path
- data path: sending and receiving packets (frequent, performance-critical)
- control path: management tasks such as setup and resource allocation (infrequent)
Isolation in modern NICs
- separate virtual queues per application
- IOMMU: translates DMA addresses, so the NIC can safely deliver packets into application virtual memory
IX: A Protected Dataplane Operating System for High Throughput and Low Latency
Challenges
- high packet rates
- resource efficiency
- protection: isolation between applications; the network stack itself must remain trusted
- microsecond tail latency: strict latency requirements, usually stated at high percentiles such as p99
IX Overview
Run to completion & Adaptive batching
- run to completion: process each packet from start to finish on one core, with no queueing between stages
- adaptive batching: dynamically vary the batch size based on load (a sketch of both ideas follows this list)
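A rough sketch of the combined idea, with the queue simulated so it runs; `rx_poll` and the batch bounds are invented for illustration and are not IX's actual interface:

```c
#include <stdio.h>
#include <stddef.h>

#define MIN_BATCH 1
#define MAX_BATCH 64

static int backlog = 100;        /* simulated number of queued packets */

/* Stand-in for polling the NIC RX queue: returns up to `max` packets. */
static size_t rx_poll(size_t max) {
    size_t n = (size_t)backlog < max ? (size_t)backlog : max;
    backlog -= (int)n;
    return n;
}

/* Run to completion: protocol processing plus the application callback
 * happen here, on this core, before the next packet is touched. */
static void process_to_completion(void) { /* ... */ }

int main(void) {
    size_t batch = MIN_BATCH;
    while (backlog > 0) {
        size_t n = rx_poll(batch);
        for (size_t i = 0; i < n; i++)
            process_to_completion();

        /* Adaptive batching: grow while the queue keeps each batch full
         * (we are behind); shrink back when it does not. */
        if (n == batch && batch < MAX_BATCH)
            batch *= 2;
        else if (n < batch && batch > MIN_BATCH)
            batch /= 2;

        printf("processed %zu, next batch size %zu\n", n, batch);
    }
    return 0;
}
```

Growing the batch amortizes per-poll overhead under load; shrinking it keeps latency low when traffic is light.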
Zero copy
- POSIX API: the stack copies packet data between kernel and user buffers
- zero copy: design the API so the application works directly on NIC buffers, avoiding the copy
- drawbacks: a harder programming model, and the stack can be harmed by an uncooperative program that never returns buffers (see the sketch after this list)
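A hedged sketch contrasting the two models; `zc_recv`/`zc_done` are invented names (IX's real libix API differs), but the buffer-ownership discipline is the same:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <stdio.h>

static uint8_t nic_buf[2048] = "payload";  /* stands in for a DMA buffer */

/* POSIX style: the stack copies packet bytes into the caller's buffer. */
static size_t posix_recv(uint8_t *dst, size_t cap) {
    size_t len = 7;
    memcpy(dst, nic_buf, len < cap ? len : cap);  /* the copy zero-copy avoids */
    return len;
}

/* Zero-copy style: hand the caller a pointer straight into the NIC buffer... */
static const uint8_t *zc_recv(size_t *len) { *len = 7; return nic_buf; }

/* ...and require the buffer back. A program that never calls this pins
 * buffers and starves the receive ring -- the drawback noted above. */
static void zc_done(const uint8_t *pkt) { (void)pkt; }

int main(void) {
    uint8_t copy[2048];
    posix_recv(copy, sizeof copy);

    size_t len;
    const uint8_t *pkt = zc_recv(&len);
    printf("zero-copy view: %.*s\n", (int)len, (const char *)pkt);
    zc_done(pkt);
    return 0;
}
```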
Synchronization free processing
- partition work across cores
- each core uses dedicated NIC queues
- Receive Side Scaling (RSS): the NIC hashes each packet's flow to a queue, so a flow's packets always land on the same core (see the sketch below)
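A toy illustration of RSS steering. Real NICs use a Toeplitz hash over the flow 4-tuple; the mixing function below is a stand-in:

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_QUEUES 4   /* one RX queue per core */

struct flow { uint32_t src_ip, dst_ip; uint16_t src_port, dst_port; };

/* Stand-in hash over the flow 4-tuple (real NICs use Toeplitz). */
static uint32_t rss_hash(const struct flow *f) {
    uint32_t h = f->src_ip ^ f->dst_ip ^
                 ((uint32_t)f->src_port << 16 | f->dst_port);
    h ^= h >> 16; h *= 0x45d9f3bu; h ^= h >> 16;   /* cheap mixer */
    return h;
}

int main(void) {
    struct flow a = { 0x0a000001, 0x0a000002, 12345, 80 };
    struct flow b = { 0x0a000003, 0x0a000002, 23456, 80 };
    /* A given flow always maps to the same queue, so the owning core's
     * per-flow state needs no locks. */
    printf("flow a -> queue %u\n", rss_hash(&a) % NUM_QUEUES);
    printf("flow b -> queue %u\n", rss_hash(&b) % NUM_QUEUES);
    return 0;
}
```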
Summary
- protected data plane using hardware virtualization
- run to completion, adaptive batching, RSS, zero copy
Snap: a Microkernel Approach to Host Networking
Goals (besides those of IX)
- development velocity
- maintainability, flexibility
- easier deployment
- easier optimization
Structure of Snap
LibOS Approach vs. Microkernel Approach

| LibOS (Exokernel) | Microkernel (Snap) |
| --- | --- |
| lower overhead, lower latency | better modularity, easier to update |
| easier to tailor the network stack per application | shared common stack |
| better isolation | decouples application scheduling from the network stack |
Threading
- Snap runs its engines on ordinary kernel threads, so the OS scheduler manages them like any other thread
IPC
- IPC overhead is low on multicore machines: the endpoints run on different cores and exchange messages over shared memory instead of context switching (a toy channel is sketched below)
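A toy version of such a channel: a single-producer/single-consumer ring in shared memory, busy-polled by an "engine" thread on another core, so no syscall sits on the data path. Snap's real channels are more elaborate; this layout is invented for illustration:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define RING 16                      /* slots; power of two */

static int slots[RING];
static atomic_uint head, tail;       /* producer advances head, consumer tail */

/* Application side: enqueue without any syscall or lock. */
static int push(int v) {
    unsigned h = atomic_load_explicit(&head, memory_order_relaxed);
    if (h - atomic_load_explicit(&tail, memory_order_acquire) == RING)
        return 0;                    /* ring full */
    slots[h % RING] = v;
    atomic_store_explicit(&head, h + 1, memory_order_release);
    return 1;
}

/* Snap-engine side: busy-poll the ring from its own core. */
static void *engine(void *arg) {
    (void)arg;
    for (int got = 0; got < 4;) {
        unsigned t = atomic_load_explicit(&tail, memory_order_relaxed);
        if (t == atomic_load_explicit(&head, memory_order_acquire))
            continue;                /* nothing yet; keep polling */
        printf("engine got %d\n", slots[t % RING]);
        atomic_store_explicit(&tail, t + 1, memory_order_release);
        got++;
    }
    return NULL;
}

int main(void) {
    pthread_t th;
    pthread_create(&th, NULL, engine, NULL);
    for (int i = 0; i < 4; i++)
        while (!push(i)) ;           /* spin if full */
    pthread_join(th, NULL);
    return 0;
}
```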
Snap Microkernel
- control plane vs. data plane separation
- engines: the basic execution units of network stack processing
- CPU scheduling of engines (the three modes are sketched after this list):
- dedicating cores: reserve several cores exclusively for Snap
- spreading engines: give each engine its own schedulable thread, spread across all cores
- compacting engines: pack engines onto as few cores as the load allows
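A toy sketch just to make the three placement policies concrete; the `assign` function and constants are invented:

```c
#include <stdio.h>

#define CORES   8
#define ENGINES 3

enum mode { DEDICATING, SPREADING, COMPACTING };

/* Print where each engine would run under each policy. */
static void assign(enum mode m) {
    for (int e = 0; e < ENGINES; e++) {
        switch (m) {
        case DEDICATING:   /* cores 0..ENGINES-1 are reserved for Snap */
            printf("engine %d pinned to reserved core %d\n", e, e);
            break;
        case SPREADING:    /* one schedulable thread per engine, any core */
            printf("engine %d floats over cores 0..%d\n", e, CORES - 1);
            break;
        case COMPACTING:   /* squeeze engines onto as few cores as load allows */
            printf("engine %d packed onto core 0\n", e);
            break;
        }
    }
}

int main(void) {
    assign(DEDICATING);
    assign(SPREADING);
    assign(COMPACTING);
    return 0;
}
```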
MicroQuanta Kernel Scheduling Class
- guarantees that Snap threads run for a configurable portion of time (runtime) out of every period (see the analogy sketched below)
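MicroQuanta is a custom Google scheduling class and is not in mainline Linux. As a rough analogy, Linux's SCHED_DEADLINE also reserves a runtime slice out of each period for a thread; the sketch below uses the raw `sched_setattr` syscall (it has no glibc wrapper), with illustrative values rather than Snap's actual configuration:

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE 6
#endif

/* sched_setattr has no glibc wrapper, so define the attr struct and
 * invoke the raw syscall (the standard pattern for SCHED_DEADLINE). */
struct sched_attr {
    uint32_t size;
    uint32_t sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;
    uint32_t sched_priority;
    uint64_t sched_runtime;
    uint64_t sched_deadline;
    uint64_t sched_period;
};

int main(void) {
    struct sched_attr attr = {
        .size           = sizeof(attr),
        .sched_policy   = SCHED_DEADLINE,
        .sched_runtime  =  2 * 1000 * 1000,   /* 2 ms of CPU...    */
        .sched_deadline = 10 * 1000 * 1000,   /* ...by 10 ms...    */
        .sched_period   = 10 * 1000 * 1000,   /* ...every 10 ms    */
    };
    /* Reserves 20% of a core per period for this thread -- the same
     * "runtime out of every period" contract MicroQuanta provides. */
    if (syscall(SYS_sched_setattr, 0, &attr, 0) != 0) {
        perror("sched_setattr (needs root and Linux >= 3.14)");
        return 1;
    }
    puts("running with a reserved CPU slice");
    return 0;
}
```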
Transparent Upgrade
- the blackout period (when packet processing is paused) is about 250 ms, short enough that Snap can be upgraded weekly (a toy state handoff is sketched below)
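A toy sketch of the handoff behind such an upgrade: the old instance stops polling, serializes engine state, and ships it to the freshly started binary, so the blackout lasts only as long as the transfer. The fork+pipe here stands in for launching the new release; the state layout is invented:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

struct engine_state { int next_seq; int flows; };  /* illustrative only */

int main(void) {
    int fd[2];
    if (pipe(fd) != 0) return 1;

    if (fork() == 0) {                    /* the freshly started release */
        struct engine_state st;
        read(fd[0], &st, sizeof st);      /* inherit live engine state */
        printf("upgraded: resuming at seq %d with %d flows\n",
               st.next_seq, st.flows);
        _exit(0);
    }

    /* Old release: stop polling (blackout begins) and hand off state. */
    struct engine_state st = { .next_seq = 42, .flows = 7 };
    write(fd[1], &st, sizeof st);
    wait(NULL);                           /* new release is live: blackout ends */
    return 0;
}
```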
Summary
- microkernel-based approach for easier updates
- different scheduling modes for engines
- widely deployed at Google