CSE221 - lec17: Networking: IX & Snap

date: Dec 3, 2024
slug: cse221-lec17
status: Published
tags: System
summary:
type: Post

Datacenters

  • applications must scale out across many machines
  • high network bandwidth: 100–400 Gbit/s (at 100 Gbit/s, a 1500-byte packet takes 1500 × 8 / 100×10⁹ s ≈ 120 ns on the wire)
  • storage: high-speed flash, NVM
  • must minimize overhead: operations are so fast and so frequent that a kernel crossing per operation is too expensive

Kernel Bypassing

notion image
  • traditional network stack: packet processing happens in the kernel, so applications must make a system call to send or receive packets.
  • kernel bypass: a user-level library accesses the NIC directly, and the application reads and writes packets through shared memory, eliminating kernel crossings (sketched below).
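
The essence of bypass is that the NIC's receive ring and packet buffers are mapped into the application's address space, so receiving a packet is a memory read rather than a system call. A minimal C sketch of the receive loop, assuming a made-up descriptor layout (`rx_desc` and `DESC_DONE` are hypothetical; every real NIC defines its own format):

```c
#include <stdint.h>

/* Hypothetical RX descriptor layout; real NICs define their own. */
struct rx_desc {
    volatile uint32_t status;   /* NIC sets DESC_DONE when the slot holds a packet */
    uint32_t len;               /* packet length in bytes */
    uint64_t buf_addr;          /* DMA address of the packet buffer */
};
#define DESC_DONE 0x1
#define RING_SIZE 256

/* Poll the mmap'd RX ring directly from user space: no system call,
   no interrupt, no kernel crossing per packet. */
void rx_poll(struct rx_desc *ring, void **bufs,
             void (*handle)(void *pkt, uint32_t len)) {
    static uint32_t head = 0;
    while (ring[head].status & DESC_DONE) {
        handle(bufs[head], ring[head].len);  /* packet already in shared memory */
        ring[head].status = 0;               /* hand the slot back to the NIC */
        head = (head + 1) % RING_SIZE;
    }
}
```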

Data Path vs. Control Path

  • data path: sending and receiving packets (the performance-critical part)
  • control path: management tasks, e.g. setting up queues and configuring protection

Isolation in modern NICs

  • separate virtual queues, so each application (or core) gets its own
  • IOMMU: translates the addresses the NIC uses for DMA, so the device can safely operate on application virtual addresses (see the VFIO sketch below)
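
On Linux, the VFIO interface exposes exactly this: user space maps a buffer into the device's IOMMU address space, after which the NIC can DMA to it at the chosen I/O virtual address. A sketch, with the container and group setup elided:

```c
#include <linux/vfio.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Map `len` bytes of application memory at `buf` so the device can DMA
   to/from it using the I/O virtual address `iova`. */
int map_for_dma(int container_fd, void *buf, size_t len, uint64_t iova) {
    struct vfio_iommu_type1_dma_map map = {
        .argsz = sizeof(map),
        .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
        .vaddr = (uint64_t)(uintptr_t)buf,  /* user virtual address */
        .iova  = iova,                      /* address the NIC will use */
        .size  = len,
    };
    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```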

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

Challenges

  • high packet rates
  • resource efficiency
  • protection: isolate applications from one another; the network stack itself is trusted
  • microsecond tail latency: strict latency requirements, usually stated as a high percentile such as p99

IX Overview

notion image

Run to completion & Adaptive batching

  • run to completion: process each packet from start to finish before starting the next
  • adaptive batching: dynamically vary the batch size based on load, so batching only kicks in under congestion (see the loop sketch below)
notion image
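
A sketch of how the two ideas combine into one event loop; `poll_rx` and `process` are hypothetical stand-ins for IX's actual dataplane internals:

```c
#define MAX_BATCH 64

/* Hypothetical helpers: poll_rx() fills pkts with up to n pending packets
   and returns how many it found; process() runs stack + app logic. */
int  poll_rx(void **pkts, int n);
void process(void *pkt);

void event_loop(void) {
    void *pkts[MAX_BATCH];
    for (;;) {
        /* Adaptive batching: take however many packets are pending, up to
           a cap. Under light load n is 1 (low latency); under heavy load
           the batch grows and amortizes per-iteration costs. */
        int n = poll_rx(pkts, MAX_BATCH);
        for (int i = 0; i < n; i++)
            process(pkts[i]);   /* run to completion: no intermediate queueing */
    }
}
```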

Zero copy

  • POSIX API: packets are copied between kernel and user buffers on both send and receive
  • zero copy: design the API so the application and the stack share packet buffers, avoiding the copy (sketched below)
    • drawback: a harder programming model, and shared buffers can be held indefinitely by an uncooperative program
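
A sketch of what a zero-copy receive API might look like; `zc_recv`/`zc_done` are hypothetical, not IX's actual interface, but they show the buffer-ownership contract that makes the model harder:

```c
#include <stddef.h>

/* Hypothetical zero-copy receive API. */
struct pkt { void *data; size_t len; };

int  zc_recv(struct pkt *p);   /* borrow a pointer into the NIC buffer pool */
void zc_done(struct pkt *p);   /* return the buffer so the NIC can reuse it */

void handle_request(const void *data, size_t len);

void serve_one(void) {
    struct pkt p;
    if (zc_recv(&p) == 0) {
        /* Read the payload in place: no copy into a private buffer. */
        handle_request(p.data, p.len);
        /* The drawback in code form: an app that never calls zc_done()
           permanently removes a buffer from the pool. */
        zc_done(&p);
    }
}
```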

Synchronization free processing

  • partition work across cores
  • each core uses dedicated NIC queues, so no locks or shared state are needed
    • Receive Side Scaling (RSS)
    • the NIC hashes each packet's flow onto a queue (sketched below)
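
A software sketch of the queue-selection idea; real NICs compute a Toeplitz hash over the 5-tuple in hardware with a configurable key, so this simple mix is only illustrative:

```c
#include <stdint.h>

struct flow {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* Hash the flow's 5-tuple to a queue index so that every packet of a
   given flow lands on the same queue, and therefore the same core. */
uint32_t rss_queue(const struct flow *f, uint32_t nqueues) {
    uint32_t h = f->src_ip ^ f->dst_ip ^ f->proto;
    h ^= ((uint32_t)f->src_port << 16) | f->dst_port;
    h *= 0x9e3779b1u;        /* multiplicative mixing */
    return h % nqueues;      /* same flow -> same queue -> same core */
}
```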

Summary

  • protected data plane using hardware virtualization
  • run to completion, adaptive batching, RSS, zero copy

Snap: a Microkernel Approach to Host Networking

Goals (besides those of IX)

  • development velocity
  • maintainability, flexibility
  • easier deployment
  • easier optimization

Structure of Snap

notion image

LibOS Approach vs. Microkernel Approach

LibOS (Exokernel):
  • lower overhead, lower latency
  • easier to tailor the network stack

Microkernel (Snap):
  • better modularity, easier to update
  • shared common stack
  • better isolation
  • decouples application scheduling from the network stack

Threading

  • Snap engines run on ordinary kernel threads, scheduled by the host kernel

IPC

  • IPC overhead is lower on multi-core machines: the application and the microkernel run on different cores, so messages pass through shared memory without a context switch (see the ring sketch below)
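
A sketch of such cross-core IPC as a shared-memory single-producer/single-consumer ring; the layout and names are illustrative, not Snap's:

```c
#include <stdatomic.h>
#include <stdint.h>

#define SLOTS 1024

/* One app core enqueues, one engine core dequeues: no locks, no syscalls. */
struct ring {
    _Atomic uint32_t head, tail;   /* free-running counters */
    void *msg[SLOTS];
};

int send_msg(struct ring *r, void *m) {   /* runs on the application core */
    uint32_t t = atomic_load_explicit(&r->tail, memory_order_relaxed);
    if (t - atomic_load_explicit(&r->head, memory_order_acquire) == SLOTS)
        return -1;                        /* ring full */
    r->msg[t % SLOTS] = m;
    atomic_store_explicit(&r->tail, t + 1, memory_order_release);
    return 0;
}

void *recv_msg(struct ring *r) {          /* runs on the engine core */
    uint32_t h = atomic_load_explicit(&r->head, memory_order_relaxed);
    if (h == atomic_load_explicit(&r->tail, memory_order_acquire))
        return NULL;                      /* ring empty */
    void *m = r->msg[h % SLOTS];
    atomic_store_explicit(&r->head, h + 1, memory_order_release);
    return m;
}
```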

Snap Microkernel

  • control vs. data plane
  • engines: the basic execution units of network-stack processing
  • CPU scheduling modes:
    • dedicated mode: dedicate several cores to Snap exclusively
    • spreading engines: spread Snap's engine threads across all cores
    • compacting engines: let Snap compete for only a small number of cores
notion image

MicroQuanta Kernel Scheduling Class

  • guarantees that Snap threads run for a configured slice of CPU time out of every period (a runtime/period guarantee; see the sketch below)
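
MicroQuanta itself is Google-internal, but mainline Linux's SCHED_DEADLINE expresses the same kind of runtime-out-of-every-period guarantee. A sketch for the calling thread, with the runtime/period values chosen purely for illustration:

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc has no wrapper for sched_setattr, so declare the attr struct. */
struct sched_attr {
    uint32_t size, sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;
    uint32_t sched_priority;
    uint64_t sched_runtime, sched_deadline, sched_period;  /* nanoseconds */
};
#define SCHED_DEADLINE 6

int reserve_cpu_share(void) {
    struct sched_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size           = sizeof(attr);
    attr.sched_policy   = SCHED_DEADLINE;
    attr.sched_runtime  =  900 * 1000;   /* run up to 0.9 ms ...   */
    attr.sched_period   = 1000 * 1000;   /* ... out of every 1 ms  */
    attr.sched_deadline = attr.sched_period;
    return syscall(SYS_sched_setattr, 0, &attr, 0);  /* 0 = current thread */
}
```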

Transparent Upgrade

  • upgrades are transparent to applications: the blackout period (when packet processing is paused) is about 250 ms, short enough that Snap can be upgraded weekly.

Summary

  • microkernel-based approach for easier updates
  • multiple scheduling modes for engines (dedicated, spreading, compacting)
  • widely used at Google

© Lifan Sun 2023 - 2025