1 Operating Systems: MemoryINF1060: Introduction to Operating Systems and Data Communication Operating Systems: Memory Pål Halvorsen 27/
2 Overview Memory management HierarchiesMultiprogramming and memory management Addressing A process’ memory Partitioning Paging and Segmentation Virtual memory Page replacement algorithms
3 Memory Management Memory management is concerned with managing the systems’ memory resources different levels of memory in a hierarchy providing a virtual view of memory giving the impression of having more than the amount of available bytes allocating space to processes protecting the memory regions
4 Memory Hierarchies We can’t access the disk each time we need dataTypical computer systems therefore have several different components where data may be stored different capacities different speeds less capacity gives faster access and higher cost per byte Lower levels have a copy of data in higher levels A typical memory hierarchy: tertiary storage (tapes) price speed secondary storage (disks) 2x 100x 107x CPU / 0.5 ns -- desk Cache / 1 ns -- shelf above (1 second) Memory / 50 ns – library in 2nd floor (1.5 minutes), if you really run Disk / 5 ms – US, NASA, Moon (3.5 mnds) main memory capacity cache(s)
5 So, use your memory efficiently….!Memory Hierarchies ? So, use your memory efficiently….! cache(s) main memory secondary storage (disks) tertiary storage (tapes) 5 ms 3.5 months 50 ns 1.5 minutes CPU / 0.5 ns -- desk Cache / 1 ns -- shelf above (1 second) Memory / 50 ns – library in 2nd floor (1.5 minutes), if you really run Disk / 5 ms – US, NASA, Moon (3.5 mnds) On die memory ns 2 s 0.3 ns 1 s
6 Storage Costs: Access Time vs Capacitycache main memory magnetic disks online tape offline 1015 1013 1011 109 107 105 typical capacity (bytes) 10-9 10-6 10-3 10 103 access time (sec) from Gray & Reuter
7 Storage Costs: Access Time vs Price104 102 100 10-2 cache main memory magnetic disks online tape offline dollars/Mbytes 10-9 10-6 10-3 10 103 access time (sec) from Gray & Reuter
8 Multiprogramming Several programs Memory Use of secondary storageconcurrently loaded into memory memory partitioning OS must arrange memory sharing Memory needed for different tasks within a process shared among processes process memory demand may change over time Use of secondary storage move (parts of) blocking processes from memory higher degree of multiprogramming possible makes sense if processes block for long times or only use parts of the code
9 Memory Management for MultiprogrammingSwapping: remove a process from memory with all of its state and data store it on a secondary medium (disk, flash RAM, other slow RAM, historically also tape) Overlays: manually replace parts of code and data programmer’s rather than OS’s work only for very old and memory-scarce systems Segmentation/paging: remove part of a process from memory store it on a secondary medium sizes of such parts are fixed
10 Absolute and Relative AddressingHardware often uses absolute addressing reading data by referencing the byte numbers in memory read absolute byte 0x000000ff fast! What about software? read absolute byte 0x000fffff (process A) result dependent of physical process location absolute addressing not convenient but, addressing within a process is determined during programming!!?? Relative addressing independent of process position in memory address is expressed relative to some base location dynamic address translation – find absolute address during run-time adding relative and base addresses 0x000… process A process A 0xfff…
11 Processes’ Memory On the Intel architecture a task partitions its available memory a text (code) segment read from program file for example by exec usually read-only can be shared a data segment initialized global variables (0 / NULL) uninitialized global variables heap dynamic memory e.g., allocated using malloc grows against higher addresses a stack segment variables in a function stored register states (e.g., calling function’s EIP) grows against lower addresses system data segment (PCB) segment pointers pid program and stack pointers … possibly more stacks for threads low address system data segment (PCB) … ...
12 Memory Layout Memory is usually divided into regionsoperating system occupies low memory system control resident routines the remaining area is used for transient operating system routines and application programs How to assign memory to concurrent processes? Memory partitioning Fixed partitioning Dynamic partitioning Simple paging Simple segmentation Virtual memory paging Virtual memory segmentation 0x000… system control information resident operating system (kernel) transient area (application programs – and transient operating system routines) 0xfff…
13 Fixed Partitioning Divide memory into static partitions at system initialization time (boot or earlier) Advantages very easy to implement can support swapping process Two fixed partitioning schemes equal-size partitions large programs cannot be executed (unless program parts are loaded from disk) small programs use entire partition (problem called “internal fragmentation”) unequal-size partitions large programs can be loaded at once less internal fragmentation require assignment of jobs to partitions one queue or one queue per partition Equal sized: Unequal sized: Operating system 8MB Operating system 8MB 2MB 4MB 6MB 12MB 16MB
14 Dynamic Partitioning Divide memory in run-timepartitions are created dynamically removed after jobs are finished External fragmentation increases with system running time Operating system 8MB Process 5 14MB 6MB 56MB free Process 1 20MB 36MB free 20MB free External fragmentation 22MB free Process 2 14MB Process 4 8MB 6MB free 14MB free 4MB free Process 3 18MB
15 Dynamic Partitioning Divide memory in run-timepartitions are created dynamically removed after jobs are finished External fragmentation increases with system running time Operating system 8MB Process 5 14MB 6MB Process 4 8MB Compaction removes fragments by moving data in memory takes time consumes processing resources Proper placement algorithm might reduce need for compaction first fit – simplest, fastest, typically the best next fit – problems with large segments best fit – slowest, lots of small fragments, therefore worst Process 4 8MB Process 3 18MB 6MB free 6MB free Process 3 18MB 6MB free 16MB free 6MB free 4MB free
16 Buddy System Mix of fixed and dynamic partitioningProcess 256kB Process 32kB Process 128kB Mix of fixed and dynamic partitioning partitions have sizes 2k, L ≤ k ≤ U Maintain a list of holes with sizes Assigning memory to a process: find smallest k so that process fits into 2k find a hole of size 2k if not available, split smallest hole larger than 2k recursively into halves 128kB Process 128kB 1MB 256kB 512kB 64kB 32kB Process 32kB 128kB 32kB 64kB 256kB Process 256kB 256kB Process 256kB 512kB 256kB
17 Segmentation Requiring that a process is placed in contiguous memory gives much fragmentation (and memory compaction is expensive) Segmentation different lengths determined by programmer memory frames Programmer (or compiler toolchain) organizes program in parts move control needs awareness of possible segment size limits Pros and Cons principle as in dynamic partitioning – can have different sizes no internal fragmentation less external fragmentation because on average smaller segments adds a step to address translation
18 Segmentation find segment table in registeroperating system find segment table in register extract segment number from address find segment address using segment number as index to segment table find absolute address within segment using relative address other regions/programs address process A, segment 0 segment number | offset process A, segment 1 segment table segment start address 0x…a… 0x…b… 0x…c… … + process A, segment 2
19 Paging Paging equal lengths determined by processor one page moved into one page (memory) frame Process is loaded into several frames (not necessarily consecutive) Fragmentation no external fragmentation little internal fragmentation (depends on frame size) Addresses are dynamically translated during run-time (similar to segmentation) Can combine segmentation and paging Process 5 Process 1 Process 4 Process 2 Process 1 Process 3
20 Virtual Memory The described partitioning schemes may be used in applications, but the modern OS also uses virtual memory: early attempt to give a programmer more memory than physically available older computers had relatively little main memory but, all instructions does not have to be in memory before execution starts break program into smaller independent parts load currently active parts when a program is finished with one part a new can be loaded memory is divided into equal-sized frames often called pages some pages reside in physical memory, others are stored on disk and retrieved if needed virtual addresses are translated to physical (in MMU) using a page table both Linux and Windows implements a flat linear 32-bit (4 GB) memory model on IA-32 Windows: 2 GB (high addresses) kernel, 2 GB (low addresses) user mode threads Linux: GB (high addresses) kernel, 3 GB (low addresses) user mode threads
21 Virtual Memory Memory lookup??? 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16virtual address space 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Memory lookup??? physical memory 7 1 5 4 13 2 3 18 3
22 Memory Lookup present Outgoing physical address bit Page table8 000 9 101 1 10 11 111 12 13 14 15 1 (0x6004, 24580) Example: 4 KB pages (12-bit offsets within page) 16 bit virtual address space 16 pages (4-bit index) 8 physical pages (3-bit index) 7 000 6 000 5 011 1 4 100 1 3 000 1 2 110 1 4-bit index into page table virtual page = 0010 = 2 1 001 010 1 12-bit offset Incoming virtual address (0x2004, 8196) 1 1 1 1 1 1 1
23 Memory Lookup PAGE FAULT present Outgoing physical address bitPage table 8 000 9 101 1 10 11 111 12 13 14 15 PAGE FAULT 7 000 6 000 5 011 1 4 100 1 3 000 1 2 110 4-bit index into page table virtual page = 0010 = 2 1 001 010 1 12-bit offset Incoming virtual address (0x2004, 8196) 1 1 1 1 1 1
24 Page Fault Handling Hardware traps to the kernel saving program counter and process state information Save general registers and other volatile information OS discover the page fault and tries to determine which virtual page is requested OS checks if the virtual page is valid and if protection is consistent with access Select a page to be replaced Check if selected page frame is ”dirty”, i.e., updated When selected page frame is ready, the OS finds the disk address where the needed data is located and schedules a disk operation to bring in into memory A disk interrupt is executed indicating that the disk I/O operation is finished, the page tables are updated, and the page frame is marked ”normal state” Faulting instruction is backed up and the program counter is reset Faulting process is scheduled, and OS returns to routine that made the trap to the kernel The registers and other volatile information is restored and control is returned to user space to continue execution as no page fault had occured
25 Page Replacement AlgorithmsPage fault OS has to select a page for replacement Modified (dirty) page write back to disk Not modified page just overwrite with new data How do we decide which page to replace? determined by the page replacement algorithm several algorithms exist: Random Other algorithms take into acount usage, age, etc. (e.g., FIFO, not recently used, least recently used, second chance, clock, …) which is best???
26 First In First Out (FIFO)All pages in memory are maintained in a list sorted by age FIFO replaces the oldest page, i.e., the first in the list A I H G F E D C J A I H G F E D I H G F E D C B D C B A H G F E D C B A Now the buffer is full, next page fault results in a replacement D C B A No change in the FIFO chain G F E D C B A E D C B A F E D C B A C B A A B A Reference string: A B C D A E F G H I A J Page most recently loaded Page first loaded, i.e., FIRST REPLACED Low overhead FIFO is rearly used in its pure form
27 Page most recently loadedSecond Chance Modification of FIFO R bit: when a page is referenced again, the R bit is set, and the page will be treated as a newly loaded page H G F E D C B A 1 Page I will be inserted, find a page to page out by looking at the first page loaded: -if R-bit = 0 replace -if R-bit = 1 clear R-bit, move page last, and finally look at the new first page A H G F E D C B Page A’s R-bit = 1 move last in chain and clear R-bit, look at new first page (B) I A H G F E D C Page B’s R-bit = 0 page out, shift chain left, and insert I last in the chain D C B A 1 The R-bit for page A is set D C B A G F E D C B A 1 E D C B A 1 F E D C B A 1 H G F E D C B A 1 Now the buffer is full, next page fault results in a replacement Reference string: A B C D A E F G H I Page most recently loaded Page first loaded R-bit Second chance is a reasonable algorithm, but inefficient because it is moving pages around the list
28 Clock More efficient implemention Second ChanceCircular list in form of a clock Pointer to the oldest page: R-bit = 0 replace and advance pointer R-bit = 1 set R-bit to 0, advance pointer until R-bit = 0, replace and advance pointer Reference string: A B C D A E F G H I A A 1 H B I G C F D E
29 Least Recently Used (LRU)Replace the page that has the longest time since last reference Based on the observation that pages that are heavily used in the last few instructions will probably be used again in the next few instructions Several ways to implement this algorithm
30 Least Recently Used (LRU)LRU as a linked list: H G F E A D C B Now the buffer is full, next page fault results in a replacement I C A H G F E D Page fault, replace LRU (B) with I A H G F E D C B Move A last in the chain (most recently used) C A H G F E D B Move C last in the chain (most recently used) A D B C Move A last in the chain (most recently used) G F E A D C B D C B A E A D C B F E A D C B Reference string: A B C D A E F G H A C I Page most recently used Page least recently used Expensive - maintaining an ordered list of all pages in memory: most recently used at front, least at rear update this list every memory reference !! Many other approaches: using aging and counters
31 Many Design Issues Page size Reference locality in time and spaceDemand paging vs. pre-paging Allocation policies: equal share vs. proportional share Replacement policies. local vs. global …
32 Allocation Policies Page fault frequency (PFF): Usually, more page frames fewer page faults page faults/sec PFF: # page frames assigned PFF is unacceptable high process needs more memory PFF might be too low process may have too much memory!!!?????? Solution ??: Reduce number of processes competing for memory reassign a page frame swap one or more to disk, divide up pages they held reconsider degree of multiprogramming
33 Example: Paging on Pentium
34 Paging on Pentium In protected mode, the currently executing process have a 4 GB address space (232) – viewed as 1 M (220) 4 KB pages The 4 GB address space is divided into 1 K page groups (1 level – page directory) Each page group has 1 K 4 KB pages (2 level – page table) Mass storage space is also divided into 4 KB blocks of information Uses control registers for paging information
35 Control Registers used for Paging on PentiumControl register 0 (CR0): Control register 1 (CR1) – does not exist, returns only zero Control register 2 (CR2) only used if CR0[PG]=1 & CR0[PE]=1 31 30 29 16 PG CD NW WP PE Not-Write-Through and Cache Disable: used to control internal cache Write-Protect: If CR0[WP] = 1, only OS may write to read-only pages Paging Enable: OS enables paging by setting CR0[PG] = 1 Protected Mode Enable: If CR0[PE] = 1, the processor is in protected mode 31 Page Fault Linear Address
36 Control Registers used for Paging on PentiumControl register 3 (CR3) – page directory base address: only used if CR0[PG]=1 & CR0[PE]=1 Control register 4 (CR4): 31 11 4 3 Page Directory Base Address PCD PWT Page Cache Disable: If CR3[PCD] = 1, caching is turned off A 4KB-aligned physical base address of the page directory Page Write-Through: If CR3[PWT] = 1, use write-through updated 31 4 PSE Page Size Extension: If CR4[PSE] = 1, the OS designer may designate some pages as 4 MB
37 Pentium Memory Lookup Incoming virtual address (0x1802038, 20979768)31 22 21 12 11 1 Page directory: 31 12 7 6 5 4 3 2 1 PT base address ... PS A U W P physical base address of the page table page size accessed present allowed to write user access allowed
38 Pentium Memory Lookup Incoming virtual address (0x1802038, 20979768)31 22 21 12 11 1 Index to page directory (0x6, 6) 31 12 7 6 5 4 3 2 1 ... Page table PF: Save pointer to instruction Move linear address to CR2 Generate a PF exception – jump to handler Programmer reads CR2 address Upper 10 CR2 bits identify needed PT Page directory entry is really a mass storage address Allocate a new page – write back if dirty Read page from storage device Insert new PT base address into page directory entry Return and restore faulting instruction Resume operation reading the same page directory entry again – now P = 1 ... 1 ...... Page Directory Base Address CR3:
39 Pentium Memory Lookup Incoming virtual address (0x1802038, 20979768)Page frame PF: Save pointer to instruction Move linear address to CR2 Generate a PF exception – jump to handler Programmer reads CR2 address Upper 10 CR2 bits identify needed PT Use middle 10 CR2 bit to determine entry in PT – holds a mass storage address Allocate a new page – write back if dirty Read page from storage device Insert new page frame base address into page table entry Return and restore faulting instruction Resume operation reading the same page directory entry and page table entry again – both now P = 1 31 22 21 12 11 1 Index to page directory (0x6, 6) Index to page table (0x2, 2) 31 12 7 6 5 4 3 2 1 ... ... 1 ...... Page table: Page Directory Base Address CR3: 31 12 7 6 5 4 3 2 1 ... ......
40 Pentium Memory Lookup Incoming virtual address (0x1802038, 20979768)31 22 21 12 11 1 Index to page directory (0x6, 6) Index to page table (0x2, 2) Page: 31 12 7 6 5 4 3 2 1 ... Page offset (0x38, 56) ... 1 ...... requested data Page Directory Base Address CR3: 31 12 7 6 5 4 3 2 1 ... ......
41 Pentium Page Fault CausesPage directory entry’s P-bit = 0: page group’s directory (page table) not in memory Page table entry’s P-bit = 0: requested page not in memory Attempt to write to a read-only page Insufficient page-level privilege to access page table or frame One of the reserved bits are set in the page directory or table entry
42 Summary Memory management is concerned with managing the systems’ memory resources allocating space to processes protecting the memory regions in the real world programs are loaded dynamically physical addresses it will get are not known to program – dynamic address translation program size at run-time is not known to kernel Each process usually has text, data and stack segments Systems like Windows and Unix use virtual memory with paging Many issues when designing a memory component