CSE221 Notes
date
Oct 4, 2024
tags
System
00 - Introduction and logistics
Course Objectives & Logistics
- see course websites
Some questions to ponder about what an OS actually is:
What is an OS?
- provide abstractions to upper layers
- manage resources from lower layers
What is part of the OS vs. not?
- windowing environment?
- web browsers?
- network stack?
- memory allocation?
How similar are different OSes?
Do OSes change over time?
01 - Historical Perspectives: THE & Nucleus
The Structure of the “THE” Multiprogramming System
The Goals of the THE Multiprogramming System
- Reduce turnaround time (the time between submission and completion) for short programs
- Efficient resource usage (hardware resources such as CPU, devices, memory)
- Soundness with respect to both design and implementation
- Manage the complexity of the system, which is a prerequisite for the soundness goal above
Approach used to achieve the goals
The main idea is to build abstractions layer by layer. The whole system is a society of cooperating sequential processes that coordinate through a synchronization mechanism. Processes at different levels implement different abstractions.
- Level 0 - CPU Allocation: handles the clock interrupt and process scheduling. Above level 0, the physical processor loses its identity: each process behaves as if it had a processor of its own.
- Level 1 - Segment Controller (Memory Management): does the bookkeeping for virtual memory. Above this level, processes refer to a unit of information by segment id rather than by physical address.
- Level 2 - Message Interpreter (Console Management, virtualizing the console I/O device): allocates the console keyboard, through which the operator communicates with higher-level processes. Above this level, the physical console keyboard loses its identity: each process behaves as if it had its own console.
- Level 3 - I/O Buffering / Unbuffering (Communication with peripherals): manages communication with devices
- Level 4 - User Program
- Level 5 - Operator
Comments on layered-structure and program verification
The author stresses the importance of the layered structure for verifying the design and implementation of the system.
The layered structure helps manage the complexity of the system: it can be tested layer by layer, which makes it possible to test and debug such a complex system with relatively limited resources.
Synchronization via semaphore
Two uses:
- Mutex
- Private semaphore: similar to a condition variable today
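The two uses above can be sketched with Python's `threading.Semaphore`. This is only an illustration in a modern language, not the THE implementation; the function names `run_counter` and `run_handoff` are made up for this sketch.

```python
import threading

def run_counter(num_threads=4, increments=1000):
    """Mutual exclusion: a semaphore initialised to 1 guards a shared counter."""
    mutex = threading.Semaphore(1)
    state = {"count": 0}

    def worker():
        for _ in range(increments):
            mutex.acquire()       # P(mutex): enter the critical section
            state["count"] += 1
            mutex.release()       # V(mutex): leave the critical section

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]

def run_handoff():
    """Private semaphore: initialised to 0 and waited on only by its owner."""
    private = threading.Semaphore(0)
    log = []

    def owner():
        private.acquire()         # P(private): block until someone signals
        log.append("owner resumed")

    t = threading.Thread(target=owner)
    t.start()
    log.append("event produced")
    private.release()             # V(private): wake the owner
    t.join()
    return log
```

In `run_handoff` the owner cannot get past `acquire()` until the other thread calls `release()`, which is the "wait for an event" role that a condition variable plays today.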
Deadlock prevention
- Leverage the process hierarchy: a higher-level process may wait for a lower-level process, but never the reverse, so waiting can never form a cycle.
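A small sketch of why the hierarchy rule rules out circular waiting: if every wait edge points from a higher level to a strictly lower one, the wait-for graph cannot contain a cycle. The process names, level numbers, and helper functions below are hypothetical and chosen only for illustration.

```python
def rule_violations(level, waits_on):
    """Return wait edges (a, b) where a does not sit strictly above b."""
    return [(a, b)
            for a, targets in waits_on.items()
            for b in targets
            if level[a] <= level[b]]

def has_cycle(waits_on):
    """Depth-first search for a cycle in the wait-for graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY                    # node is on the current path
        for nxt in waits_on.get(node, ()):
            c = colour.get(nxt, WHITE)
            if c == GREY or (c == WHITE and visit(nxt)):
                return True                    # reached a node already on the path
        colour[node] = BLACK                   # fully explored, no cycle through it
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in waits_on)

# Hypothetical processes, with levels loosely modelled on the list of layers.
level = {"user": 4, "io": 3, "console": 2, "segment": 1}
ok = {"user": ["io"], "io": ["segment"], "console": ["segment"], "segment": []}
bad = dict(ok, segment=["user"])               # a low level waiting on a high level
```

With `ok`, `rule_violations` is empty and `has_cycle` is false; adding the upward edge in `bad` both breaks the rule and closes a cycle.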
Summary & take-away
- Layered OS Design
- Abstraction of Sequential Process
- Synchronization via semaphore
- Correctness as a goal (a structure that eases program verification)
Additional Notes: Layered OS today
Linux:
- User Layer
- User-Kernel Interface
- Kernel Layer
Windows:
- Apps
- Win32 Subsystem
- User-Kernel Interface
- Kernel
- HAL (Hardware Abstraction Layer)
Additional Notes: Hardware Trends
Metric | 1968 | 2024 | Improvement factor |
Clock speed | 400 kHz | 3 GHz | 7,500x |
Memory capacity | 32 KB | 16 GB | 500,000x |
Storage capacity | 512 KB | 10 TB | 20,000,000x |
Storage revolution time | 40 ms | 8 ms | 5x |
Some hardware has improved dramatically (clock speed and capacity), while some has improved very little (storage access time).
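The improvement factors can be re-derived from the 1968 and 2024 figures. The sketch below assumes binary units (1 KB = 2^10 bytes), which the notes do not specify, so the results only match the rounded table values approximately.

```python
KB, GB, TB = 2**10, 2**30, 2**40   # binary units: an assumption of this sketch

def factor(old, new):
    """How many times larger `new` is than `old`."""
    return new / old

clock = factor(400_000, 3_000_000_000)   # 400 kHz -> 3 GHz
memory = factor(32 * KB, 16 * GB)        # 32 KB -> 16 GB
storage = factor(512 * KB, 10 * TB)      # 512 KB -> 10 TB
revolution = factor(8, 40)               # times are better when smaller: old / new

for name, value in [("clock", clock), ("memory", memory),
                    ("storage", storage), ("revolution", revolution)]:
    print(f"{name}: {value:,.0f}x")
```

Capacity grew by five to seven orders of magnitude while the revolution time improved by a single-digit factor, which is the gap the table is meant to show.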
The Nucleus of a Multiprogramming System
The Goals of the Nucleus
- extensibility
- flexibility
Components of System Nucleus
- Fundamental Process Control
- IPC via Message Buffering
- Interrupt, exceptions handling
- Scheduling
Components of OSes on top of Nucleus
- Scheduling
- High-level Process Control
- Memory Management
Economy of Abstractions
Process & Message
- Process
- Internal: normal program execution
- External: wrap a hardware device in a process, so that it is accessed the same way as any other process
- Message
- buffer management
- synchronization
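A minimal sketch of message buffering between two processes, using Python threads and `queue.Queue` as stand-ins. The four function names follow the primitives usually associated with the Nucleus (send message, wait message, send answer, wait answer), but the signatures and the `Process` and `MessageBuffer` classes are simplifications invented for this sketch.

```python
import queue
import threading

class MessageBuffer:
    """One buffer carries a message and, later, the answer to it."""
    def __init__(self, sender, text):
        self.sender = sender
        self.text = text
        self.answer = queue.Queue(maxsize=1)

class Process:
    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()    # queue of pending message buffers

def send_message(sender, receiver, text):
    buf = MessageBuffer(sender.name, text)
    receiver.inbox.put(buf)           # does not block the sender
    return buf

def wait_message(receiver):
    return receiver.inbox.get()       # blocks until a message arrives

def send_answer(buf, text):
    buf.answer.put(text)

def wait_answer(buf):
    return buf.answer.get()           # blocks until the answer arrives

def demo():
    client, server = Process("client"), Process("server")

    def serve():
        buf = wait_message(server)
        send_answer(buf, buf.text.upper())

    t = threading.Thread(target=serve)
    t.start()
    buf = send_message(client, server, "read block 7")
    reply = wait_answer(buf)
    t.join()
    return reply
```

The blocking `get()` calls are where synchronization happens, and the `server` here could equally stand for an external process that wraps a device.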
Summary & take-away
- flexible & extensible OS design
- The idea of a system nucleus, with OSes layered on top of it
- Synchronization via Message Passing
- Small Number of Abstractions
02 - Historical Perspectives: TENEX & UNIX
TENEX, a Paged Time Sharing System for the PDP-10
Terminology
Some potentially confusing terminology:
- different terms for same thing: monitor := kernel
- same term for different things: virtual machine
- OS-Syscall Interface (in TENEX)
- Virtualizing Hardware (VMware)
- Language Runtime (JVM)
The Goals of the TENEX System
- State-of-the-art virtual machine
- paged virtual memory
- multi-process with IPC
- file system
- good human engineering
- modular, maintainable
- backward compatibility
- efficiency
Virtual Memory
TENEX terms and their modern equivalents:
- Address Space (same as today)
- Memory Maps / Table: Page Table
- BBN Pager: TLB (page table cache)
Page Sharing
Three types of pointer in a page table entry:
- Direct: points to a private page of a process
- Indirect: points to an entry in another process’s page table
- Shared: points to an entry in the system’s page table
Copy-on-write lets processes share pages without paying the cost of copying pages that are only read. The first write to a shared page traps into the OS, which makes a private copy for the writer.
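A toy simulation of copy-on-write sharing. It is not TENEX's mechanism (which works through the pointer types in the page table); the `Frame` and `AddressSpace` classes and the reference-count scheme are assumptions made to keep the sketch short.

```python
class Frame:
    """A physical page frame with a count of the mappings that point at it."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1

class AddressSpace:
    def __init__(self):
        self.table = {}                       # page number -> [frame, writable]

    def map_private(self, page, data):
        self.table[page] = [Frame(data), True]

    def share_from(self, other, page):
        """Map the same frame as `other`; both mappings become read-only."""
        frame = other.table[page][0]
        frame.refs += 1
        other.table[page][1] = False
        self.table[page] = [frame, False]

    def read(self, page, offset):
        return self.table[page][0].data[offset]

    def write(self, page, offset, value):
        entry = self.table[page]
        frame, writable = entry
        if not writable:                      # simulated write-protection trap
            if frame.refs > 1:                # still shared: copy before writing
                frame.refs -= 1
                frame = Frame(frame.data)
                entry[0] = frame
            entry[1] = True                   # sole owner now: writes go through
        frame.data[offset] = value
```

Reads never copy anything; only the first write to a shared page pays for a copy, and the last remaining owner gets write access back without copying.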
Backward Compatibility
TENEX’s approach to backward compatibility:
- Add a compatibility layer: all TENEX system calls are invoked through the new JSYS instruction, and legacy system calls are implemented on top of the new TENEX system calls.
Summary & take-away
- Virtual Memory - sharing & copy on write
- time sharing system
- support for backward compatibility
The UNIX Time-Sharing System
The Goals of UNIX
- Unifying abstraction
- Easy to use
- Easy to extend
- maintainability
- simplicity
File System
File Data Model
Before Unix:
- impose a structure on files
- file size must be declared in advance
Unix:
- treat a file as a sequence of uninterpreted bytes, with no imposed structure
- allow variable-length files
File System API
Before Unix:
- no buffer cache in kernel
- different APIs for different access methods (random, sequential, …), which meant a lot of work for programmers
Unix:
- buffer cache in the kernel, so I/O operations are buffered
- unified API for different access methods, which simplifies things for programmers
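A short sketch of the unified API, using Python's thin wrappers over the Unix system calls (`os.read`, `os.write`, `os.lseek`). Sequential and random access use the same calls; the only difference is whether the offset is moved first. The function name is made up for this sketch.

```python
import os
import tempfile

def sequential_and_random():
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")       # a file is just a sequence of bytes
        os.lseek(fd, 0, os.SEEK_SET)      # rewind to the start
        first = os.read(fd, 4)            # sequential read
        second = os.read(fd, 4)           # continues where the last read stopped
        os.lseek(fd, 2, os.SEEK_SET)      # random access: move the offset
        middle = os.read(fd, 3)
        return first, second, middle
    finally:
        os.close(fd)
        os.remove(path)
```

The call returns `(b"0123", b"4567", b"234")`: the kernel imposes no record structure, and the file grows to whatever length is written.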
File Names & Directory
Before Unix:
- one directory per user
- directories represented differently from regular files
Unix:
- hierarchical namespace
- a directory is represented the same way as a regular file, but its content is interpreted by the file system
- files exist independently of the directories that name them (the same file can be linked under more than one name)
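The independence of files from directory entries can be observed with hard links. This sketch assumes a file system that supports hard links (typical on Unix-like systems); the file names and the function name are arbitrary.

```python
import os
import tempfile

def link_demo():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("hello")
        os.link(a, b)                               # second name, same file
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        links = os.stat(a).st_nlink                 # directory entries for the file
        os.remove(a)                                # removes one name only
        with open(b) as f:
            survived = f.read()
        return same_inode, links, survived
```

Removing `a.txt` deletes a directory entry, not the file: the data stays reachable through `b.txt` until the last link is gone.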
Device
Before Unix:
- different sets of system calls for different kinds of devices
Unix:
- device-independent I/O in most cases (devices are wrapped as files)
- a special system call for device-specific functionality: ioctl
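As a small illustration of device-independent I/O, the null device can be opened, written, and read with exactly the calls used for regular files. `os.devnull` is the path of the null device on the current platform (`/dev/null` on Unix-like systems); the function name is made up for this sketch.

```python
import os

def device_io():
    fd = os.open(os.devnull, os.O_RDWR)      # a device, opened like any file
    try:
        written = os.write(fd, b"discarded") # the device accepts and drops the bytes
        data = os.read(fd, 16)               # reading reports end-of-file at once
        return written, data
    finally:
        os.close(fd)
```

The program needs no device-specific code path; only operations with no file equivalent would need something like `ioctl`.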
Protection Model
Before Unix:
- access control list of users (Multics does this)
Unix:
- six bits: rwx for the owner and rwx for everyone else
- much like today, except that today there are more bits (for example, a separate rwx triplet for the group)
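A sketch of how six permission bits can be decoded and checked. The bit layout (owner triplet in the high three bits, everyone else in the low three) is an assumption of this sketch, chosen to mirror the low bits of a modern mode word; the function names are hypothetical.

```python
FLAGS = (("r", 0b100), ("w", 0b010), ("x", 0b001))

def describe(bits):
    """Render six bits as two rwx triplets, owner first."""
    def triplet(t):
        return "".join(ch if t & mask else "-" for ch, mask in FLAGS)
    return triplet((bits >> 3) & 0b111) + triplet(bits & 0b111)

def allowed(bits, is_owner, want):
    """Check one permission ('r', 'w' or 'x') for the owner or for everyone else."""
    shift = 3 if is_owner else 0
    mask = dict(FLAGS)[want]
    return bool((bits >> shift) & mask)
```

For example, `describe(0b110100)` gives `"rw-r--"`: the owner may read and write, everyone else may only read.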
Process
Before Unix:
- one process per user
Unix:
- users can create child processes with the fork() and exec() system calls
- the API differs slightly from today’s fork()
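A minimal sketch of the fork-then-exec pattern using Python's wrappers over today's system calls (not the original Unix API). It assumes a Unix-like system where `os.fork` is available; `spawn` is a name made up for this sketch.

```python
import os
import sys

def spawn(argv):
    """Run argv in a child process and return its exit status."""
    pid = os.fork()                      # duplicate the calling process
    if pid == 0:                         # child branch
        try:
            os.execvp(argv[0], argv)     # replace the child's program image
        finally:
            os._exit(127)                # reached only if exec failed
    _, status = os.waitpid(pid, 0)       # parent waits for the child to finish
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

if __name__ == "__main__":
    print(spawn([sys.executable, "-c", "raise SystemExit(7)"]))
```

Splitting creation (`fork`) from program loading (`exec`) leaves a window in the child where the parent's code can adjust the environment before the new program starts.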
Shell
Before Unix:
- built into the kernel
Unix:
- runs as a user program, so it can be replaced or customized
- supports I/O redirection
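I/O redirection needs no kernel support beyond ordinary file descriptors, which is part of why the shell can live in user space. The sketch below shows roughly what a shell might do for `cmd > file`, assuming a Unix-like system; `run_redirected` is a hypothetical helper, not taken from any real shell.

```python
import os
import sys
import tempfile

def run_redirected(argv, out_path):
    """Run argv with its standard output redirected to out_path."""
    pid = os.fork()
    if pid == 0:                                  # child branch
        try:
            fd = os.open(out_path,
                         os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            if fd != 1:
                os.dup2(fd, 1)                    # make descriptor 1 refer to the file
                os.close(fd)
            os.execvp(argv[0], argv)              # the program just writes to stdout
        finally:
            os._exit(127)                         # reached only if setup or exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "out.txt")
        run_redirected([sys.executable, "-c", "print('hi')"], target)
        with open(target) as f:
            print(repr(f.read()))
```

The child program is unaware of the redirection: it writes to descriptor 1 as usual, and the shell decided beforehand what that descriptor points to.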
Summary & take-away
- nice unifying abstraction
- careful decomposition (deciding what belongs in the kernel and what should be implemented in user space)