In the last three lessons, we discussed important real-time task scheduling techniques. We highlighted that timely production of results, in accordance with a physical clock, is vital to the satisfactory operation of a real-time system. We also pointed out that real-time operating systems are primarily responsible for ensuring that every real-time task meets its timeliness requirements. A real-time operating system in turn achieves this by using appropriate task scheduling techniques. Normally, real-time operating systems give programmers the flexibility to select an appropriate scheduling policy among several supported policies. Deployment of an appropriate task scheduling technique out of the supported techniques is therefore an important concern for every real-time programmer. To be able to determine the suitability of a scheduling algorithm for a given problem, a thorough understanding of the characteristics of the various real-time task scheduling algorithms is important. We therefore had a rather elaborate discussion on real-time task scheduling techniques and related issues such as sharing of critical resources and handling task dependencies.
In this lesson, we examine the important features that a real-time operating system is expected to support. We start by discussing the time services provided by real-time operating systems, since accurate and high-precision clocks are very important to the successful operation of any real-time application. Next, we point out the important features that a real-time operating system needs to support. Finally, we discuss the issues that would arise if we attempted to use a general purpose operating system, such as UNIX or Windows, in real-time applications.
Clocks and time services are among the basic facilities provided to programmers by every real-time operating system. The time services provided by an operating system are based on a software clock, called the system clock, maintained by the operating system. The system clock is maintained by the kernel based on the interrupts received from the hardware clock. Since hard real-time systems usually have timing constraints in the microseconds range, the system clock should have sufficiently fine resolution1 to support the necessary time services. However, designers of real-time operating systems find it very difficult to support very fine resolution system clocks. With current technology, the resolution of hardware clocks is usually finer than a nanosecond (contemporary processor speeds exceed 3GHz), but the clock resolution made available to programmers by modern real-time operating systems is of the order of several milliseconds or worse. Let us first investigate why real-time operating system designers find it difficult to maintain system clocks with sufficiently fine resolution. We then examine the various time services that are built on the system clock and made available to real-time programmers.
The hardware clock periodically generates interrupts (often called time service interrupts). After each clock interrupt, the kernel updates the software clock and also performs certain other work (explained in Sec. 4.1.1). A thread can get the current reading of the system clock by invoking a system call supported by the operating system (such as the POSIX clock_gettime()). The finer the resolution of the clock, the more frequent the time service interrupts need to be, and the larger the amount of processor time the kernel spends in responding to them. This overhead places a limit on how fine a system clock resolution a computer can support. Another issue that caps the resolution of the system clock is that the response time of the clock_gettime() system call is not deterministic. In fact, every system call (or, for that matter, any function call) has some associated jitter. The problem gets aggravated in the following situation.
The jitter is caused on account of interrupts having higher priority than system calls. When an interrupt occurs, the processing of a system call is stalled. The preemption time of system calls can also vary because many operating systems disable interrupts while processing a system call. The variation in response time (jitter) introduces an error in the accuracy of the time value that the calling thread obtains from the kernel. Recall that jitter was defined as the difference between the worst-case response time and the best-case response time (see Sec. 2.3.1). In commercially available operating systems, the jitter associated with system calls can be several milliseconds. A software clock resolution finer than this error is therefore not meaningful.
We now examine the different activities that are carried out by a handler routine after a clock interrupt occurs. Subsequently, we discuss how sufficiently fine resolution can be provided in the presence of jitter in function calls.
Clock Interrupt Processing
Fig. 31.1 Structure of a Timer Queue
1 Clock resolution denotes the time granularity provided by the clock of a computer. It corresponds to the duration of time that elapses between two successive clock ticks.
Each time a clock interrupt occurs, besides incrementing the software clock, the handler routine carries out the following activities:
Process timer events: Real-time operating systems maintain either per-process timer queues or a single system-wide timer queue. The structure of such a timer queue has been shown in Fig. 31.1. A timer queue contains all timers arranged in order of their expiration times. Each timer is associated with a handler routine. The handler routine is the function that should be invoked when the timer expires. At each clock interrupt, the kernel checks the timer data structures in the timer queue to see if any timer event has occurred. If it finds that a timer event has occurred, then it queues the corresponding handler routine in the ready queue.
Update ready list: Since the occurrence of the last clock event, some tasks might have arrived or become ready due to the fulfillment of the conditions they were waiting for. The tasks in the wait queue are checked, and those found to have become ready are moved to the ready queue. If a task with higher priority than the currently running task has become ready, then the currently running task is preempted and the scheduler is invoked.
Update execution budget: At each clock interrupt, the scheduler decrements the time slice (budget) remaining for the executing task. If the remaining budget becomes zero and the task is not complete, then the task is preempted and the scheduler is invoked to select another task to run.
Providing High Clock Resolution
We pointed out in Sec. 4.1 that there are two main difficulties in providing a high resolution timer. First, the overhead associated with processing the clock interrupt becomes excessive. Second, the jitter associated with the time lookup system call (clock_gettime()) is often of the order of several milliseconds; it is therefore not useful to provide a clock with a resolution any finer than this. However, some real-time applications need to deal with timing constraints of the order of a few nanoseconds. Is it at all possible to support time measurement with nanosecond resolution? A way to provide sufficiently fine clock resolution is to map a hardware clock into the address space of applications. An application can then read the hardware clock directly (through a normal memory read operation) without having to make a system call. On a Pentium processor, a user thread can be made to read the Pentium time stamp counter. This counter starts at 0 when the system is powered up and increments after each processor cycle. At today's processor speeds, this means that the counter increments several times during every nanosecond interval.
However, making the hardware clock readable by an application significantly reduces the portability of the application. Processors other than the Pentium may not have a high resolution counter, and in any case the memory address map and counter resolution would differ across processors.
We pointed out that the timer service is a vital service provided to applications by all real-time operating systems. Real-time operating systems normally support two main types of timers: periodic timers and aperiodic (or one shot) timers. We now discuss some basic concepts about these two types of timers.
Periodic Timers: Periodic timers are used mainly for sampling events at regular intervals or performing some activities periodically. Once a periodic timer is set, each time it expires the corresponding handler routine is invoked and the timer is reinserted into the timer queue. For example, a periodic timer may be set to 100 msec with its handler set to poll the temperature sensor every 100 msec.
Aperiodic (or One Shot) Timers: These timers are set to expire only once. Watchdog timers are popular examples of one shot timers.
Fig. 31.2 Use of a Watchdog Timer
Watchdog timers are used extensively in real-time programs to detect when a task misses its deadline and then to initiate exception handling upon a deadline miss. An example use of a watchdog timer is illustrated in Fig. 31.2. Here, a watchdog timer is set at the start of a certain critical function f() through a wd_start(t1) call. The wd_start(t1) call sets the watchdog timer to expire by t1, the specified deadline measured from the start of the task. If the function f() does not complete even after t1 time units have elapsed, then the watchdog timer fires, indicating that the task deadline must have been missed, and the exception handling procedure is initiated. If the task completes before the watchdog timer expires (i.e. the task completes within its deadline), then the watchdog timer is reset using a wd_tickle() call.
Features of a Real-Time Operating System
Before discussing commercial real-time operating systems, we must clearly understand the features normally expected of a real-time operating system; this understanding also lets us compare different real-time operating systems and appreciate the differences between a traditional operating system and a real-time operating system. In the following, we identify some important features required of a real-time operating system, especially those that are normally absent in traditional operating systems.
Clock and Timer Support: Clock and timer services with adequate resolution are among the most important issues in real-time programming. Hard real-time application development often requires timer services with a resolution of the order of a few microseconds, and even finer resolution may be required for certain special applications. Clocks and timers are a vital part of every real-time operating system. Traditional operating systems, on the other hand, often do not provide time services with sufficiently high resolution.
Real-Time Priority Levels: A real-time operating system must support static priority levels. A priority level supported by an operating system is called static when, once the programmer assigns a priority value to a task, the operating system does not change it by itself. Static priority levels are also called real-time priority levels. This is because, as we discuss in Sec. 4.3, traditional operating systems dynamically change the priority levels of tasks from the programmer-assigned values to maximize system throughput. Priority levels that are changed dynamically by the operating system are obviously not static priorities.
Fast Task Preemption: For successful operation of a real-time application, whenever a high priority critical task arrives, an executing low priority task should instantly yield the CPU to it. The time for which a higher priority task waits before it is allowed to execute is quantitatively expressed as the corresponding task preemption time. Contemporary real-time operating systems have task preemption times of the order of a few microseconds, whereas in traditional operating systems the worst-case task preemption time is usually of the order of a second. We discuss in the next section that this significantly large latency is caused by a non-preemptive kernel. It goes without saying that a real-time operating system needs a preemptive kernel and should have task preemption times of the order of a few microseconds.
Predictable and Fast Interrupt Latency: Interrupt latency is defined as the time delay between the occurrence of an interrupt and the running of the corresponding ISR (Interrupt Service Routine). In real-time operating systems, interrupt latency must be bounded, and the upper bound is expected to be less than a few microseconds. Low interrupt latency is achieved by performing the bulk of the ISR's activities in a deferred procedure call (DPC). A DPC is essentially a task that performs most of the ISR activity and is executed later at a certain priority value. Further, support for nested interrupts is usually desired. That is, a real-time operating system should not only be preemptive while executing kernel routines, but should be preemptive during interrupt servicing as well. This is especially important for hard real-time applications with sub-microsecond timing requirements.
Support for Resource Sharing Among Real-Time Tasks: If real-time tasks are allowed to share critical resources using the traditional resource sharing techniques, then the response times of tasks can become unbounded, leading to deadline misses. This is one compelling reason why every commercial real-time operating system should, at the minimum, provide the basic priority inheritance mechanism. Support for the priority ceiling protocol (PCP) is also desirable if large and moderate sized applications are to be supported.
Requirements on Memory Management: As far as general-purpose operating systems are concerned, it is rare to find one that does not support virtual memory and memory protection features. However, embedded real-time operating systems almost never support these features; only those meant for large and complex applications do. Real-time operating systems for large and medium sized applications are expected to provide virtual memory support, not only to meet the memory demands of the heavyweight tasks of the application, but also to let memory demanding non-real-time applications, such as text editors and e-mail software, run on the same platform. Virtual memory reduces the average memory access time, but degrades the worst-case memory access time. The penalty of using virtual memory is the overhead associated with storing the address translation table and performing the virtual to physical address translations. Moreover, fetching pages from secondary memory on demand incurs significant latency. Therefore, operating systems supporting virtual memory must provide real-time applications with some means of controlling paging, such as memory locking. Memory locking prevents a page from being swapped from memory to hard disk. In the absence of a memory locking feature, memory access times of even critical real-time tasks can show large jitter, as the access time would greatly depend on whether the required page is in physical memory or has been swapped out.
Memory protection is another important issue that needs to be carefully considered. Lack of memory protection among tasks leads to a single address space for all tasks. Arguments for having only a single address space include simplicity, saving memory bits, and lightweight system calls. For small embedded applications, the overhead of a few kilobytes of memory per process can be unacceptable. However, when no memory protection is provided by the operating system, the cost of developing and testing a program becomes very high as the complexity of the application increases. Maintenance cost also rises, since any change in one module requires retesting the entire system.
Embedded real-time operating systems usually do not support virtual memory; they create physically contiguous blocks of memory for an application upon request. However, memory fragmentation is a potential problem for a system that does not support virtual memory. Memory protection is also difficult to support in a non-virtual memory management system. For this reason, in many embedded systems the kernel and the user processes execute in the same address space, i.e. there is no memory protection. Hence, a system call and a function call within an application are indistinguishable. This makes debugging applications difficult, since a runaway pointer can corrupt the operating system code, making the system “freeze”.
Additional Requirements for Embedded Real-Time Operating Systems: Embedded applications usually have constraints on cost, size, and power consumption. Embedded real-time operating systems should be capable of diskless operation, since disks are often either too bulky to use or increase the cost of deployment. Further, embedded operating systems should minimize the total power consumption of the system. Embedded operating systems usually reside in ROM. For certain applications requiring faster response, it may be necessary to run the real-time operating system from RAM; since the access time of RAM is lower than that of ROM, this results in faster execution. Irrespective of whether ROM or RAM is used, memory ICs are expensive. Therefore, it is desirable for a real-time operating system for embedded applications to have as small a footprint (memory usage) as possible. Since embedded products are typically manufactured at large scale, every rupee saved on memory and other hardware requirements translates into millions in profit.
Unix as a Real-Time Operating System
Unix is a popular general purpose operating system that was originally developed for mainframe computers. However, UNIX and its variants have now permeated desktop and even handheld computers. Since UNIX and its variants are inexpensive and widely available, it is worthwhile to investigate whether Unix can be used in real-time applications. This investigation leads to some significant findings and gives us crucial insights into the Unix-based real-time operating systems that are currently commercially available.
The traditional UNIX operating system suffers from several shortcomings when used in real-time applications.
We elaborate on these problems in the following two subsections.
The two most troublesome problems that a real-time programmer faces while using Unix for real-time applications are the non-preemptive Unix kernel and the dynamically changing priorities of tasks.
One of the biggest problems that real-time programmers face while using Unix for real-time application development is that the Unix kernel cannot be preempted: all interrupts are disabled when any operating system routine runs. To set things in proper perspective, let us elaborate on this issue.
Application programs invoke operating system services through system calls. Examples of system calls include the operating system services for creating a process, interprocess communication, I/O operations, etc. After a system call is invoked by an application, the arguments supplied by the application are checked. Next, a special instruction called a trap (or a software interrupt) is executed. As soon as the trap instruction is executed, the handler routine changes the processor state from user mode to kernel mode (or supervisor mode), and execution of the required kernel routine starts. The change of mode during a system call is depicted schematically in Fig. 31.3.
Fig. 31.3 Invocation of an Operating System Service through System Call
At the risk of digressing from the focus of this discussion, let us understand an important operating systems concept. Certain operations, such as handling devices, creating processes, and file operations, need to be done in kernel mode only. That is, application programs are prevented from carrying out these operations directly and need to request the operating system (through a system call) to carry out the required operation. This restriction enables the kernel to enforce discipline among different programs in accessing these objects. If such operations were not performed in kernel mode, different application programs might interfere with each other's operation. An example of an operating system where all operations were performed in user mode is the once popular operating system DOS (though DOS is nearly obsolete now). In DOS, application programs are free to carry out any operation in user mode2, including crashing the system by deleting the system files. The instability this can bring about is clearly unacceptable in a real-time environment, and is usually considered unacceptable in general applications as well.
2 In fact, in DOS there is only one mode of operation, i.e. kernel mode and user mode are indistinguishable.
A process running in kernel mode cannot be preempted by other processes; in other words, the Unix kernel is non-preemptive. Unix does, however, preempt processes running in user mode. A consequence of this is that even when a low priority process makes a system call, higher priority processes have to wait until the system call completes. The longest system calls may take up to several hundred milliseconds to complete. Worst-case preemption times of several hundred milliseconds can easily cause high priority tasks with short deadlines, of the order of a few milliseconds, to miss their deadlines.
Let us now investigate why the Unix kernel was designed to be non-preemptive in the first place. Whenever an operating system routine starts to execute, all interrupts are disabled; they are enabled only after the operating system routine completes. This was a very efficient way of preserving the integrity of kernel data structures: it saved the overheads associated with setting and releasing locks, and resulted in lower average task preemption times. Though a non-preemptive kernel results in worst-case task response times of up to a second, this was acceptable to the Unix designers, who at that time did not foresee the use of Unix in real-time applications. Of course, it would have been possible to ensure the correctness of kernel data structures by using locks at appropriate places rather than disabling interrupts, but that would have increased the average task preemption time. In Sec. 4.4.4 we investigate how modern real-time operating systems make the kernel preemptive without unduly increasing the task preemption time.