Paging in Operating System
Paging is a memory management scheme that eliminates the need for contiguous allocation of physical memory. This scheme permits the physical address space of a process to be non – contiguous.
- Logical Address or Virtual Address (represented in bits): An address generated by the CPU
- Logical Address Space or Virtual Address Space( represented in words or bytes): The set of all logical addresses generated by a program
- Physical Address (represented in bits): An address actually available on memory unit
- Physical Address Space (represented in words or bytes): The set of all physical addresses corresponding to the logical addresses
Example:
- If Logical Address = 31 bit, then Logical Address Space = 231 words = 2 G words (1 G = 230)
- If Logical Address Space = 128 M words = 27 * 220 words, then Logical Address = log2 227 = 27 bits
- If Physical Address = 22 bit, then Physical Address Space = 222 words = 4 M words (1 M = 220)
- If Physical Address Space = 16 M words = 24 * 220 words, then Physical Address = log2 224 = 24 bits
The mapping from virtual to physical address is done by the memory management unit (MMU) which is a hardware device and this mapping is known as paging technique.
- The Physical Address Space is conceptually divided into a number of fixed-size blocks, called frames.
- The Logical address Space is also splitted into fixed-size blocks, called pages.
- Page Size = Frame Size
Let us consider an example:
- Physical Address = 12 bits, then Physical Address Space = 4 K words
- Logical Address = 13 bits, then Logical Address Space = 8 K words
- Page size = frame size = 1 K words (assumption)
Address generated by CPU is divided into
- Page number(p): Number of bits required to represent the pages in Logical Address Space or Page number
- Page offset(d): Number of bits required to represent particular word in a page or page size of Logical Address Space or word number of a page or page offset.
Physical Address is divided into
- Frame number(f): Number of bits required to represent the frame of Physical Address Space or Frame number.
- Frame offset(d): Number of bits required to represent particular word in a frame or frame size of Physical Address Space or word number of a frame or frame offset.
The hardware implementation of page table can be done by using dedicated registers. But the usage of register for the page table is satisfactory only if page table is small. If page table contain large number of entries then we can use TLB(translation Look-aside buffer), a special, small, fast look up hardware cache.
- The TLB is associative, high speed memory.
- Each entry in TLB consists of two parts: a tag and a value.
- When this memory is used, then an item is compared with all tags simultaneously. If the item is found, then corresponding value is returned.
Main memory access time = m
If page table are kept in main memory,
Effective access time = m(for page table) + m(for particular page in page table)
Page Table Entries in Page Table
Page table has page table entries where each page table entry stores a frame number and optional status (like protection) bits. Many of status bits used in the virtual memory system. The most important thing in PTE is frame Number.
Page table entry has the following information:
- Frame Number: It gives the frame number in which the current page you are looking for is present. The number of bits required depends on the number of frames. Frame bit is also known as address translation bit.
Number of bits for frame = Size of physical memory/frame size - Present/Absent bit: Present or absent bit says whether a particular page you are looking for is present or absent. In case if it is not present, that is called Page Fault. It is set to 0 if the corresponding page is not in memory. Used to control page fault by the operating system to support virtual memory. Sometimes this bit is also known as valid/invalid bits.
- Protection bit: Protection bit says that what kind of protection you want on that page. So, these bit for the protection of the page frame (read, write etc).
- Referenced bit: Referenced bit will say whether this page has been referred in the last clock cycle or not. It is set to 1 by hardware when the page is accessed.
- Caching enabled/disabled: Some times we need the fresh data. Let us say the user is typing some information from the keyboard and your program should run according to the input given by the user. In that case, the information will come into the main memory. Therefore main memory contains the latest information which is typed by the user. Now if you try to put that page in the cache, that cache will show the old information. So whenever freshness is required, we don’t want to go for caching or many levels of the memory. The information present in the closest level to the CPU and the information present in the closest level to the user might be different. So we want the information has to be consistency, which means whatever information user has given, CPU should be able to see it as first as possible. That is the reason we want to disable caching. So, this bit enables or disable caching of the page.
- Modified bit: Modified bit says whether the page has been modified or not. Modified means sometimes you might try to write something on to the page. If a page is modified, then whenever you should replace that page with some other page, then the modified information should be kept on the hard disk or it has to be written back or it has to be saved back. It is set to 1 by hardware on write-access to page which is used to avoid writing when swapped out. Sometimes this modified bit is also called as the Dirty bit.
Inverted Page Table in Operating System
Most of the Operating Systems implement a separate pagetable for each process, i.e. for ‘n’ number of processes running on a Multiprocessing/ Timesharing operating system, there are ‘n’ number of pagetables stored in the memory. Sometimes when a process is very large in size and it occupies virtual memory then with the size of the process, it’s pagetable size also increases substantially.
Example: A process of size 2 GB with:
Page size = 512 Bytes
Size of page table entry = 4 Bytes, then
Number of pages in the process = 2 GB / 512 B = 222
PageTable Size = 222 * 22 = 224 bytes
Through this example, it can be concluded that for multiple processes running simultaneously in an OS, a considerable part of memory is occupied by page tables only.
Operating Systems also incorporate multilevel paging schemes which further increase the space required for storing the page tables and a large amount of memory is invested in storing them. The amount of memory occupied by the page tables can turn out to be a huge overhead and is always unacceptable as main memory is always a scarce resource. Various efforts are made to utilize the memory efficiently and to maintain a good balance in the level of multiprogramming and efficient CPU utilization.
Inverted Page Table
An alternate approach is to use the Inverted Page Table structure that consists of one-page table entry for every frame of the main memory. So the number of page table entries in the Inverted Page Table reduces to the number of frames in physical memory and a single page table is used to represent the paging information of all the processes.
Through the inverted page table, the overhead of storing an individual page table for every process gets eliminated and only a fixed portion of memory is required to store the paging information of all the processes together. This technique is called as inverted paging as the indexing is done with respect to the frame number instead of the logical page number.
Each entry in the page table contains the following fields.
- Page number: It specifies the page number range of the logical address.
- Process id: An inverted page table contains the address space information of all the processes in execution. Since two different processes can have similar set of virtual addresses, it becomes necessary in Inverted Page Table to store a process Id of each process to identify it’s address space uniquely. This is done by using the combination of PId and Page Number. So this Process Id acts as an address space identifier and ensures that a virtual page for a particular process is mapped correctly to the corresponding physical frame.
- Control bits: These bits are used to store extra paging-related information. These include the valid bit, dirty bit, reference bits, protection and locking information bits.
- Chained pointer: It may be possible sometime that two or more processes share a part of main memory. In this case, two or more logical pages map to same Page Table Entry then a chaining pointer is used to map the details of these logical pages to the root page table.
Working: The operation of an inverted page table is shown below.
The virtual address generated by the CPU contains the fields and each page table entry contains and the other relevant information required in paging related mechanism. When a memory reference takes place, this virtual address is matched by the memory-mapping unit and the Inverted Page table is searched to match the and the corresponding frame number is obtained. If the match is found at the ith entry then the physical address of the process, , is sent as the real address otherwise if no match is found then Segmentation Fault is generated.
Note: Number of Entries in Inverted page table = Number of frames in Physical address Space(PAS)
Examples: The Inverted Page table and its variations are implemented in various systems like PowerPC, UltraSPARC and the IA-64 architecture. An implementation of the Mach operating system on the RT-PC also uses this technique.
Advantages and Disadvantages
- Reduced memory space: Inverted Pagetables typically reduces the amount of memory required to store the page tables to a size bound of physical memory. The maximum number of entries could be the number of page frames in the physical memory.
- Longer lookup time: Inverted Page tables are sorted in order of frame number but the memory look-up takes place with respect to the virtual address, so, it usually takes a longer time to find the appropriate entry but often these page tables are implemented using hash data structures for a faster lookup.
- Difficult shared memory implementation: As the Inverted Page Table stores a single entry for each frame, it becomes difficult to implement the shared memory in the page tables. Chaining techniques are used to map more than one virtual address to the entry specified in order of frame number.
Multilevel Paging
- Multilevel Paging is a paging scheme that consists of two or more levels of page tables in a hierarchical manner. It is also known as hierarchical paging. The entries of the level 1 page table are pointers to a level 2 page table and entries of the level 2 page tables are pointers to a level 3 page table and so on. The entries of the last level page table store actual frame information. Level 1 contains a single-page table and the address of that table is stored in PTBR (Page Table Base Register).
Why it is required?
- Consider a 32-bit physical address space with page size = 4KB and let there are 220 = 1M total entries in the page table, page table entry size = 232/212 = 220 and adding some protection bits and dirty bit in the page table entry. Now page table size = 220 * 24 ≈ 3MB which should be in the physical memory and since each process has its own page table there are so much memory wastage only for storing page tables.
- One solution to the large memory requirements of page table is to use multilevel paging, only the outer most page table will reside in the main memory and other page tables will be brought to main memory as per the requirement because at a particular time we do not need complete page table, also we can save much memory space because outermost page table can fit in exactly one frame.
- In multilevel paging whatever may be levels of paging, all the page tables will be stored in the main memory. So it requires more than one memory access to get the physical address of the page frame. One access for each level is needed. Each page table entry except the last level page table entry contains the base address of the next level page table.
Reference to actual page frame:
- Reference to PTE in level 1 page table = PTBR value + Level 1 offset present in virtual address.
- Reference to PTE in level 2 page table = Base address (present in Level 1 PTE) + Level 2 offset (present in VA).
- Reference to PTE in level 3 page table= Base address (present in Level 2 PTE) + Level 3 offset (present in VA).
- Actual page frame address = PTE (present in level 3).
Generally, the page table size will be equal to the size of the page.
Assumptions:
Byte addressable memory and n is the number of bits used to represent virtual address.
Important formula:
Number of entries in page table:
= (virtual address space size) / (page size)
= Number of pages
Virtual address space size:
= 2n B
Size of page table:
<>= (number of entries in page table)*(size of PTE)
If page table size > desired size then create 1 more level.
Disadvantage:
Extra memory references to access address translation tables can slow programs down by a factor of two or more. Use translation look aside buffer (TLB) to speed up address translation by storing page table entries.
Example:
Q. Consider a virtual memory system with physical memory of 8GB, a page size of 8KB, and 46-bit virtual address. Assume every page table exactly fits into a single page. If page table entry size is 4B then how many levels of page tables would be required.
Explanation:
Page size = 8KB = 213 B
Virtual address space size = 246 B
PTE = 4B = 22 B
Number of pages or number of entries in page table,
= (virtual address space size) / (page size)
= 246B/213 B
= 233
Size of page table,
= (number of entries in page table)*(size of PTE)
= 233*22 B
= 235 B
To create one more level,
Size of page table > page size
Number of page tables in last level,
= 235 B / 213 B
= 222
The base address of these tables is stored in page table [second last level].
Size of page table [second last level]
= 222*22B
= 224B
To create one more level,
Size of page table [second last level] > page size
Number of page tables in second last level
= 224B/213 B
= 211
The base address of these tables are stored in page table [third last level]
Size of page table [third last level]
= 211*22 B
= 213 B
= page size