Microsoft Windows NT and Sun Microsystems Solaris
A Comparison of Commercial Operating Systems
WPI CS 535 – Advanced Operating Systems
Professor Craig Wills, Fall 2000
Contents
Introduction
Background
  Sun Microsystems Solaris
    Architectural and System Overview
  Microsoft Windows NT
    Design Alternatives
    Operating System Characteristics
    Architectural and System Overview
Comparison
  Processes and Threads
    Solaris
    Windows NT
    Conclusion
  Memory Management
    Solaris
    Windows NT
    Conclusion
Wrap Up
Bibliography
A few key players dominate the
commercial market for server operating systems: Sun Microsystems, IBM, HP, and
recently, Microsoft. Microsoft’s
Windows NT/2000 (herein referred to as Windows NT) is a newcomer compared to Sun
Microsystems’ Solaris, but with each revision it comes closer to the power
of a UNIX operating system, and it is steadily gaining market
share. This paper discusses the design
decisions of the original Windows NT team to see how the team drew upon their
VMS background to build a powerful operating system. The Solaris Kernel Architecture will be presented, followed by
the Windows NT Executive and Kernel.
Advantages and disadvantages of each design and design trade-off are
examined. The overall architectures as
well as a few individual components of Windows NT and Solaris are compared and
contrasted. Pure performance will not
be covered in this review – performance is a hot topic of debate; each side
claims superior performance, but neither side has enough data to back their
claims.
Sun
Microsystems’ premier operating system, Solaris, began life known as
SunOS. SunOS was Sun’s original BSD
based UNIX. Solaris is the extension of
SunOS – it includes a graphical user interface, many windowing components, and
other operating system “accessories.”
So SunOS could be thought of as the kernel, with Solaris the entire
operating environment. Solaris 1.x was
the framework around SunOS 4.x. Solaris
2.0, over SunOS 5.0, was a major revision, and was a System V Release 4-based
UNIX. It was released during the third
quarter of 1992. Although the term
SunOS is missing from the name, the core of Solaris will always be SunOS. Solaris has gone through a large number of
updates and upgrades, always striving to be bigger and better. Seemingly on a cue from Microsoft, however,
Sun marketing decided to play games with the version numbers. Solaris 2.7 became known as Solaris 7, and
its upgrade is Solaris 8. [DHBA1]
Microsoft’s
Windows NT began life as nothing more than a portable version of OS/2 [K1]. The Windows NT team, however, had much
grander goals for their new operating system.
Built by a team from Digital’s VMS labs, Windows NT 3.1 was released
during the third quarter of 1993. [DHBA1] On the outside it looked like Windows, but on the inside it was something
Microsoft had never had before – a real operating system. Knowing the speed of the computer industry,
the team put performance low on the list of essential items for Windows NT. And it showed – Windows NT 3.1 was a hog on
system resources. But it was
extensible, portable, reliable, robust, and moderately compatible with all
MS-DOS and MS Windows applications of the time (any application that did not
compromise one of the previous design goals).
The upgrade, Windows NT 3.5, was moderately successful, but it was not
until Windows NT 3.51 that the world finally caught on to the power of Windows
NT. Also, Microsoft’s new Windows
application programming interface, the Win32 API, was finally starting to reach
a level of maturity. Windows NT 4.0 was
a significant upgrade. With a
completely redesigned graphical user interface (used first in Windows 95 but
designed by the Windows NT team), broader hardware and application support, and
more security, performance, and stability enhancements than ever before,
Windows NT 4.0 began to take the corporate world by storm. Windows NT 4.0 Workstation began appearing
on desktops, with Windows NT 4.0 Server powering the network. At first Windows NT 4.0 Server was used for
small jobs, such as file and printer sharing, but it soon grew to challenge the large UNIX systems
for jobs they had once held exclusively, such as email and web
serving. Microsoft’s latest version,
Windows 2000, is touted as Microsoft’s most important operating system
release ever. Originally Windows NT
5.0, Windows 2000 “Built on NT
Technology” has broad hardware support for technologies such as Plug and
Play as well as multimedia enhancements utilizing Microsoft’s DirectX
technologies. Windows 2000 is a key
update, but at its core is the same Windows NT technology that was designed in
from the beginning, showing that the original team stuck to their guns and never
lowered their sights.
Since Windows NT is a much newer technology, its design team had a large body of earlier work to draw on. Given the tools available to them and the research already done, I hope to discover what paths the designers of Windows NT took, and how close those paths are to the paths taken by the designers of System V Release 4 and Solaris 2.x.
To
begin, the background of the two operating systems is examined. The initial design goals are discussed,
where possible, and an architectural overview is given.
Sun’s UNIX operating system
began life as a port of BSD UNIX to the Sun-1 workstation. It was called SunOS. SunOS 1.0 was based on a port of BSD 4.1,
and released in 1982. As networked UNIX
systems began to grow, new technologies such as remote procedure calls, sharing
of data over networks, and network file systems were being developed, and
needed operating system support. SunOS
2.0 was released in 1984, and offered support for many of these features,
including the virtual file system framework, which makes NFS, Sun’s network
file system, possible. The next phase
of innovation was, again, because of greater demands put on the Sun platform –
in this case, applications needed better facilities for the sharing of data and
executable objects. This led to a major
re-architecting of the SunOS virtual memory system. The new virtual memory system, introduced as SunOS 4.0,
abstracted devices and objects as virtual memory, facilitated the mapping of
files, the sharing of memory, and the mapping of hardware devices into a
process. Shortly afterward, as the demand
for processing capacity outpaced the improvements in processor speed and
systems with multiple processors were developed, SunOS 4.1 saw the light of
day. SunOS 4.1 introduced asymmetric
multiprocessor support – the kernel could only run on one processor at a time,
but user processes could be scheduled on any available processor. Systems with multiple processors saw greater
system throughput than systems with a single processor, but scalability
declined rapidly as additional processors were added.
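One rough way to see why scalability falls off under this design is Amdahl's law. The sketch below assumes, purely for illustration, that a fixed fraction of the workload is kernel time that can only run on one processor at a time, while user time spreads evenly over all processors. The function name and the 20% figure are hypothetical, not measurements of SunOS 4.1.

```python
def asymmetric_speedup(processors: int, kernel_fraction: float) -> float:
    """Amdahl's law: kernel work is serialized, user work is spread evenly."""
    if processors < 1 or not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("need processors >= 1 and 0 <= kernel_fraction <= 1")
    user_fraction = 1.0 - kernel_fraction
    return 1.0 / (kernel_fraction + user_fraction / processors)


if __name__ == "__main__":
    # Hypothetical workload: 20% of the time is spent in the kernel.
    for n in (1, 2, 4, 8, 16):
        print(n, round(asymmetric_speedup(n, 0.20), 2))
```

Under this assumption the speedup can never exceed 1 / kernel_fraction, no matter how many processors are added, which matches the diminishing returns described above.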
Around this time, Sun was
participating in a joint development effort with AT&T, and the SunOS
virtual file system framework and virtual memory system became the core of UNIX
System V Release 4 (SVR4). SVR4 UNIX
incorporated the features from SVR3, SunOS, BSD, and Xenix. Of course, SVR4 UNIX was ported to Sun’s
hardware architecture, the SPARC. This
became the basis of SunOS 5.
With the predicted growth in
multiprocessor systems, Sun invested heavily in the development of a new
operating system kernel, with a primary focus on multiprocessor scalability. This new kernel, along with the SVR4
operating environment, became the basis for Solaris 2.0. This change in the base operating system was
accompanied by a name change – Solaris became the name of the operating
environment, of which SunOS, the base operating system, is a subset. So, the older SunOS retained the SunOS 4.x
versioning and adopted Solaris 1.x as the operating environment version. The SVR4-based environment adopted a SunOS
5.x versioning with the Solaris 2.x operating environment. Solaris 2.x, and thus SunOS 5.x, is the
backbone of the current Solaris operating environment. Solaris 2.0 was born in 1992, and is
currently the last major revision.
Solaris has gone through a number of minor revisions, with each minor
update adding significant functionality.
A significant update came in 1993, when Solaris was ported to the Intel
i386 architecture. Since then, Solaris
has continued to have two versions, one for Sun’s SPARC hardware line, and one
for Intel’s. Solaris version 2.7 saw a
minor name change, where for some reason the 2.x was dropped, and it was
released as Solaris 7. This continued
to version 2.8, the current version, which sells as Solaris 8. Solaris still supports two hardware
architectures, the SPARC (including all of its derivatives, such as the
UltraSPARC), and the Intel i386. [MM1]
The
Solaris kernel is the core of Solaris – it manages system hardware resources
and provides an execution environment for user applications. Like other operating system implementations,
the Solaris kernel provides a virtual machine environment that shields programs
from the underlying hardware, and allows multiple applications to execute in
parallel by giving each program its own virtual machine environment. The basic unit that provides this
environment is the process. A process
is an abstraction that contains the environment for a user program, and each
process is isolated from all other processes in the system. Processes provide a 32-bit address space for
applications (Solaris 7 became a 64-bit operating system, with support for
64-bit address spaces; however, Solaris 7 (and 8) can run in either 32-bit or
64-bit mode), and are used for isolation and resource management. A process has one or more threads of
execution, which share the virtual memory space of the process, and are used
for process execution and kernel tasks. Each process also contains one or more
lightweight processes – a virtual execution environment for each kernel thread
within the process. Like a process, multiple threads of execution within the
kernel share the kernel’s environment.
The kernel itself is completely multithreaded – it is implemented with
multiple threads of execution to allow for concurrency across multiple
processors.
During
Solaris development, a number of key features came to distinguish
Solaris from earlier UNIX implementations.
Solaris has support for symmetric multiprocessing – it can run on
systems ranging from single-processor systems to 64-processor servers, and
scales linearly from one to 64 processors.
The 64-bit kernel, introduced in Solaris 7, provides a 64-bit platform
and an LP64 execution environment (long integers and pointers are 64 bits wide). A 32-bit environment is also provided so
that 32-bit binaries can execute on a 64-bit kernel alongside 64-bit
applications. It has multiple platform
support, supporting all SPARC and Intel x86 based architectures. Over 90% of the Solaris source is platform
independent. Solaris has a modular
binary kernel – the kernel uses dynamic linking and dynamic modules to divide
the kernel into modular binaries.
Solaris has multithreaded process execution, in that a process can have
one or more threads of execution, and each thread can run concurrently on one
or more processors. The Solaris kernel
is completely multithreaded, allowing multiple threads of execution to be in
the kernel at one time. The kernel is
also fully preemptable, not requiring manipulation of hardware interrupt levels
to protect critical data. Solaris
provides a configurable scheduler environment and support for multiple
schedulers. The multiple schedulers can
operate concurrently, each with its own scheduling algorithms and priority
levels. Solaris offers a virtual file
system (VFS) framework that allows multiple file systems to be configured into
the system. The framework implements several disk-based file systems (such as
UNIX File System, MS-DOS File System, and CD-ROM File System), network file
systems (such as NFS V2 and V3), and pseudo file systems (such as PROCFS,
the process file system that abstracts processes as files). The VFS framework is integrated with the
virtual memory system to provide dynamic file system caching, in which available
free memory is used as a file system cache.
Special facilities in Solaris allow fine-grained processor control,
including binding processes to processors and configuring processors into
scheduling groups to partition system resources. The Demand-Paged Virtual Memory System feature of Solaris allows
systems to load applications on demand rather than loading whole executables or
library images into memory. This speeds
up application startup time and potentially reduces memory footprint. The modular Virtual Memory System separates
virtual memory functions into distinct layers: the address space layer, segment
drivers, and hardware-specific components (which are consolidated into the
Hardware Address Translation (HAT) layer).
The segment drivers allow the abstraction of memory as files, and files
can be memory-mapped into an address space.
Segment drivers also allow other abstractions, including physical memory
and devices, to appear in an address space.
The Modular Device I/O System in Solaris allows a hierarchy of buses and
devices to be installed and configured.
A Device Driver Interface shields device drivers from platform-specific
infrastructure, maximizing portability of device drivers. Solaris also includes integrated networking,
with the data link provider interface allowing multiple concurrent network
interfaces to be configured and used, and Solaris also includes an integrated
TCP/IP implementation (which sits on top of the data link provider
interface). Finally, the Solaris kernel
was designed and implemented to provide real-time capabilities. [MM1]
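Of the features listed above, file mapping is easy to demonstrate from user space. The following is a minimal sketch using Python's standard mmap module on a temporary file. It illustrates the general memory-mapped file abstraction that segment drivers make possible, not Solaris kernel code, and the function names are hypothetical.

```python
import mmap
import os
import tempfile


def uppercase_in_place(path: str) -> None:
    """Map a file into memory and modify it through ordinary slice writes."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as mapped:
            mapped[:] = mapped[:].upper()  # same length, so the mapping fits
            mapped.flush()                 # push modified pages back to the file


def demo() -> bytes:
    """Create a scratch file, edit it through the mapping, and read it back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"mapped file contents")
        uppercase_in_place(path)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

The program never calls read or write on the file while editing it; the change is made by storing into memory, and the virtual memory system carries it back to the file.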
The
core infrastructure is constructed using modular, well-defined interfaces that
support the addition of objects. The
kernel itself simply provides the core functionality that everything else
relies on. Keeping the
kernel small limits the amount of software that can potentially cause the
system to crash. The architecture
itself supports “change through evolution.”
It adapts to new environments instead of requiring rewrites that could
cause existing applications to suddenly stop working. [SDC1]
The
modular Solaris kernel is grouped into several key components. [MM1]

Solaris Internals, Figure 1.1
The System Call Interface
allows user processes to access kernel facilities. The system call layer consists of a common system call handler,
which vectors system calls into the appropriate kernel modules. Process Management provides facilities for
process creation, execution, management, and deletion. The scheduler loads the different scheduling
classes (more on this in the Process and Threads Comparison section), and helps
schedule threads. The Virtual Memory
System manages the mapping of physical memory to user processes and the
kernel. The Solaris Memory Management
layer is divided into two layers: the common memory management functions and
the hardware-specific mechanisms. The
hardware-specific portions are located in the Hardware Address Translation
layer. Solaris implements a Virtual
File System framework, which allows different file systems to be configured
into the kernel at the same time.
Regular disk-based file systems, network file systems, and pseudo file
systems are all implemented in this layer.
The Solaris I/O framework implements bus nexus node drivers (drivers for
specific bus architectures, e.g. a PCI bus) and device drivers (drivers for a
specific device on the bus) as a hierarchy of modules that reflects the physical
layout of the bus/device interconnect.
The Central Kernel Facilities module provides things like regular clock
interrupts, system timers, synchronization primitives, and loadable module
support. Solaris’s built-in Networking
subsystem is implemented as stream-based device drivers and streams modules. [MM1]
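The idea of a common system call handler that vectors calls into the appropriate module can be sketched as a dispatch table. This is an illustrative model only: the call numbers, class name, and stand-in handlers are hypothetical and do not correspond to real Solaris system call numbers or kernel interfaces.

```python
from typing import Callable, Dict

# Hypothetical system call numbers, chosen only for this sketch.
SYS_GETPID = 1
SYS_ADD = 2


class SyscallTable:
    """A common handler that vectors call numbers to registered handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[..., int]] = {}

    def register(self, number: int, handler: Callable[..., int]) -> None:
        self._handlers[number] = handler

    def dispatch(self, number: int, *args: int) -> int:
        handler = self._handlers.get(number)
        if handler is None:
            raise LookupError(f"unknown system call {number}")
        return handler(*args)


def build_table() -> SyscallTable:
    table = SyscallTable()
    table.register(SYS_GETPID, lambda: 42)       # stand-in for a process module
    table.register(SYS_ADD, lambda a, b: a + b)  # stand-in for another service
    return table
```

Callers see a single entry point, dispatch, while each service lives in its own handler, which is the separation the system call layer provides.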
The
kernel is implemented as a core set of functions, with additional subsystems
and services linked in as dynamically loadable modules. The module loader and kernel runtime linker
infrastructure do this, and modules can either be loaded at boot time or on
demand while the system is running.
Solaris 7 has support for seven types of loadable kernel modules:
scheduling classes, file systems, loadable system calls, loaders for executable
file formats, streams modules, bus or device drivers, and miscellaneous
modules. The core kernel provides
system calls, the scheduler, memory management, process management, the virtual
file system framework, kernel locking, clocks and timers, interrupt management,
boot and startup, trap management, and CPU management. Everything else in the system is a loadable
kernel module, such as the scheduling classes, the file systems (such as UFS,
NFS, and PROCFS), loadable system calls, executable formats (such as ELF and
COFF), stream modules, device and bus drivers, and miscellaneous modules (such
as the NFS Server, Interprocess Communication, and more). [MM1]
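Loading modules on demand can be modelled as a catalog of installed modules that are only instantiated the first time something asks for them. The sketch below is a simplified analogy under that assumption. ModuleLoader and its methods are hypothetical names, not the Solaris module loader or runtime linker interfaces.

```python
from typing import Callable, Dict, List


class ModuleLoader:
    """Keeps a catalog of loadable modules and loads each one on first use."""

    def __init__(self) -> None:
        self._catalog: Dict[str, Callable[[], object]] = {}
        self._loaded: Dict[str, object] = {}
        self.load_log: List[str] = []  # order in which modules were loaded

    def install(self, name: str, factory: Callable[[], object]) -> None:
        """Make a module available without loading it yet."""
        self._catalog[name] = factory

    def get(self, name: str) -> object:
        """Return the module, loading it the first time it is requested."""
        if name not in self._loaded:
            if name not in self._catalog:
                raise KeyError(f"no such module: {name}")
            self._loaded[name] = self._catalog[name]()
            self.load_log.append(name)
        return self._loaded[name]
```

A module that is installed but never requested costs nothing beyond its catalog entry, which is the benefit of keeping everything outside the core kernel loadable.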
The following design goals drove the original
Windows NT specification in 1989: [SR1]
At the start of the project, the Windows NT design
team adopted five design goals to help direct the thousands of decisions they
knew they would have to make. [SR1]
With
those goals in mind, the development team investigated several alternatives
during the design phase. Originally,
Windows NT was to have an OS/2 style user interface, and the OS/2 Application
Programming Interface (API) was going to be its primary programming
interface. However, due to the
popularity of Microsoft Windows, it was decided to refocus and develop the
Win32 API.
The
first design layered the POSIX API set over a slightly extended OS/2 API
set. However, it soon became clear that
this system would not be robust, easily maintained, or extensible. Also, once the development of the Win32 API
began, this was no longer even an option.
The
next design implemented the OS/2 and POSIX API sets directly in the Windows NT
Executive. Although an improvement, the
large number of oddly structured interfaces required by this design threatened
the goals of extensibility and maintainability.
The
third, and ultimately final, design implemented OS/2 and POSIX as protected
subsystems outside of the Executive. The result
was very nearly a client-server architecture.
After analysis and an extended mockup, it was clear that this design
would provide everything the operating system would require. [MSDN1]
Windows NT, arguably Microsoft’s first real
operating system, has all of the characteristics of a modern operating
system, including: [VM1]
Windows NT uses the process
abstraction to separate applications and the operating system. A process is a container for resources and
attributes for a running program. A
process has one or more threads of program execution. The thread is the base unit of execution and scheduling under
Windows NT, with all threads in the system being fundamentally equal.
To allow multiple threads of
execution to exist simultaneously, Windows NT multitasks – that is, it rapidly switches
between threads, giving each a share of processor time.
All threads in the system have an associated priority. The threads with the highest priority are
given first chance at the processor, followed by threads of a lower
priority. Threads of equal priority
have equal access to the processor, even if the running thread does not voluntarily
release the processor. Thus, Windows NT
implements pre-emptive multitasking.
Each thread is given a maximum amount of time to run, and when that time
expires, Windows NT will suspend the thread to give another a chance to run.
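The scheduling rules described above can be sketched as a small simulation. It assumes that all threads are ready at time zero, that a larger priority number means a higher priority, and that work is measured in abstract time units. The function name and the sample values are hypothetical, and the real Windows NT dispatcher is considerably more involved.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def run_schedule(threads: List[Tuple[str, int, int]], quantum: int) -> List[str]:
    """Simulate the order in which threads receive time slices.

    threads: (name, priority, work_units); a larger priority runs first.
    Returns the name of the thread that ran in each time slice.
    """
    ready: Dict[int, Deque[List]] = {}
    for name, priority, work in threads:
        ready.setdefault(priority, deque()).append([name, work])

    timeline: List[str] = []
    while ready:
        top = max(ready)                 # highest priority with ready threads
        entry = ready[top].popleft()
        timeline.append(entry[0])
        entry[1] -= quantum              # the slice ends and the thread is preempted
        if entry[1] > 0:
            ready[top].append(entry)     # back of the queue for its priority
        if not ready[top]:
            del ready[top]
    return timeline
```

With two threads at the same priority and one at a lower priority, the two equal threads alternate slice by slice, and the lower-priority thread runs only once both have finished.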
Each process has its own
protected 4 GB (gigabytes) of virtual memory.
Typically applications have access to 2 GB of their process’s address
space, while the system uses the remaining 2 GB. Virtual memory pages are loaded on reference. The memory model allows the same physical
memory to appear within the virtual address spaces of multiple processes.
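The arithmetic behind these figures is simple: 32-bit addresses give 2^32 bytes, or 4 GB, and the default split places the boundary at 2^31. The sketch below encodes only that split as described in the text. The constant and function names are hypothetical, and reserved regions inside each half are ignored.

```python
GB = 1 << 30
ADDRESS_SPACE = 4 * GB   # 2**32 bytes addressable with 32-bit pointers
USER_LIMIT = 2 * GB      # boundary of the default 2 GB / 2 GB split


def is_user_address(address: int) -> bool:
    """True if a 32-bit virtual address falls in the application half."""
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("not a 32-bit address")
    return address < USER_LIMIT
```

Under this split, 0x7FFFFFFF is the highest address in the application half and 0x80000000 is the first address in the system half.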
In a Windows NT symmetric
multiprocessing (SMP) system, all CPUs are equal. Thread scheduling and interrupt handling can be equally
distributed among all processors.
Windows 2000 currently supports up to 32 processors, but there is
nothing inherent in the multiprocessor design that limits it to 32. It is just a convenient limit because 32
processors can easily be represented as a bit mask using a native 32-bit data
type.
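The convenience of the 32-processor limit can be shown with a short sketch of a processor bit mask, where bit n stands for processor n. The helper names are hypothetical and this is not the Win32 affinity API, only the underlying representation.

```python
MAX_PROCESSORS = 32  # one bit per processor in a 32-bit word


def mask_for(processors) -> int:
    """Build a bit mask with one bit set for each processor number."""
    mask = 0
    for cpu in processors:
        if not 0 <= cpu < MAX_PROCESSORS:
            raise ValueError(f"processor {cpu} does not fit in a 32-bit mask")
        mask |= 1 << cpu
    return mask


def processors_in(mask: int):
    """List the processor numbers whose bits are set in the mask."""
    return [cpu for cpu in range(MAX_PROCESSORS) if mask & (1 << cpu)]
```

A set of processors then fits in a single machine word, and testing or changing membership is a single bitwise operation.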
Windows
NT achieves portability in two primary ways.
First, it is a very layered design, where only the lowest-level portions
are processor or platform specific.
Second, the majority of Windows NT is written in C, with portions
written in C++. Assembly language is
only used for parts that need to communicate directly with the hardware, or are
extremely performance sensitive. The
initial release of Windows NT supported the x86 and MIPS architectures, and
support for the Alpha followed shortly. In
version 3.51, PowerPC support was added.
Because of market demands, however, support for MIPS and PowerPC was
dropped before development began on Windows 2000. Later, Compaq withdrew support for the Alpha, and Windows 2000
became x86 only. The next architecture
to be supported is IA-64 (or Intel Architecture 64); AMD’s 64-bit offering
will likely be supported as well.
In the classic sense, Windows NT is not a
microkernel-based operating system. The
principal operating system components do not run as separate processes in their
own address spaces as they would in a microkernel. Under Windows NT these components run in kernel mode, wrapped in
a package called The Executive. The
Executive follows basic object-oriented design principles – each of the modules
within it is a separate entity without access to the private data of other entities,
and the entities use formal interfaces to access and modify data.
From the beginning, Windows NT was designed to
emulate other operating systems.
Although the primary subsystem is Win32, DOS, Win16 (Windows 3.1),
POSIX, and OS/2 are all supported through emulation. In practice, however, the extent of the emulation is, in some
cases, very small. Over the years POSIX
and OS/2 have become less important, and DOS and Win16 are slowly fading out
of existence (finally). Therefore, the
subsystem with the greatest support and constant improvements is the Win32
subsystem. The DOS and Win16 subsystems
(part of the Win32 subsystem) are still of some importance, but not nearly to the
extent of their more advanced cousin.
The POSIX and OS/2 subsystems, however, are mostly paper certificates,
and are not even useful as porting environments (explained later).
The
following is a view of the Windows 2000 system architecture: [SR1]

Inside Windows 2000, Third Edition. Figure 2-3.
The four basic User Mode sections are as
follows. The System Support Processes
are fixed processes that are always running (such as the logon manager) but are
not started by the Service Control Manager (and as such are not services). System Support Processes are the first user
level processes to start. Anything
started by the Service Control Manager is a Service Process. Services are similar to daemons under UNIX. They are ordinary applications that run
under a special non-interactive user account, and generally cannot interact
with the desktop. Services are generally
started when the operating system boots and are unaffected by user logons and
logoffs. User Applications can be one
of five types: Win32, Win16, MS-DOS, OS/2, or POSIX. Environmental Subsystems expose the native operating system
services through a set of well-documented APIs. Windows 2000 ships with three such environmental subsystems:
Win32 (for Win32, Win16, and MS-DOS applications), OS/2 (for OS/2 1.2
applications), and POSIX (POSIX.1, or IEEE POSIX standard 1003.1-1990).
[SR1] The Win32 subsystem is
actually special – Windows cannot run without it. Whereas the OS/2 and POSIX subsystems are started on demand, the
Win32 subsystem is always running.
An application is bound to one, and only one,
subsystem. Since the only APIs available
to a given application are those exported by the subsystem it runs in, migrating
an application across subsystems is not a straightforward process. In some cases, however, there are
third-party tools to aid in the process.
There is a single path to transition between User
Mode and Kernel Mode – through NTDLL.dll.
It contains two types of functions: system service dispatch stubs to the
Executive, and internal support functions used by subsystems, subsystem DLLs,
and other native images. The service
dispatch stubs provide an interface to the Executive system services. Most of these functions are available
through the Win32 API. The code inside
of these functions is platform specific, and causes a transition into kernel
mode. After parameter verification, the
actual kernel-mode service is called.
Also contained in NTDLL.dll are support functions, such as the image
loader, the heap manager, Win32 subsystem process communication functions, and
general run-time library routines, as well as the user-mode asynchronous
procedure call dispatcher and exception dispatcher. [SR1]
The
Executive is the upper layer of Kernel Mode.
It includes the following types of functions. Some functions are exported and callable from user mode. Some functions can only be called from
kernel mode and are documented in the Device Driver Kit (or DDK) or the
Installable File System (IFS) Kit.
Other functions can only be called from kernel mode but are not
documented. Some functions are defined
as global symbols but are not exported.
And finally, there are functions that are internal to a module that are
not defined as global symbols. The
Executive contains the following major components: the Configuration Manager
(implements and manages the system registry), the Process and Thread Manager
(underlying support for processes and threads), the Security Reference Monitor
(enforces security policies), the I/O Manager (implements device-independent
I/O and dispatches to the appropriate device drivers), the Plug and Play
Manager (determines what drivers are required to support a device and loads
them), the Power Manager (coordinates power events and generates power
management I/O notifications to device drivers), the WDM Windows Management
Instrumentation Routines (enables device drivers to publish performance and
configuration information, and receive commands from the user-mode WMI
service), the Cache Manager (improves performance of file-based I/O), and the
Virtual Memory Manager (implements virtual memory, and provides underlying
support for the Cache Manager). The
Executive also contains four main groups of support functions: the Object
Manager (creates, manages, and deletes executive objects and abstract data
types), the Local Procedure Call (LPC) Facility (passes messages between client
and server processes on the same computer), a broad set of run-time library
functions, and Executive support routines (system memory allocation,
interlocked memory access, as well as special types of synchronization objects:
resources and fast mutexes).
The lower layer of Kernel Mode is the Kernel. The Kernel provides fundamental mechanisms, such as thread scheduling and synchronization services, as well as low-level hardware architecture-dependent support, such as interrupt and exception dispatching. In short, it provides a low-level base of well-defined, predictable operating system primitives and mechanisms on which higher-level components build. It separates itself from the rest of the Executive by implementing mechanisms but avoiding policy making. Almost all policy decisions are left to the Executive, with the exception of thread scheduling and dispatching. [SR1]
Outside of the Kernel, the Executive represents threads and other shareable resources as objects. The Kernel does this as well, but it uses a set of simpler objects, called Kernel Objects. Most Executive objects encapsulate one or more Kernel objects. One set of Kernel objects, called Control Objects, is for controlling various operating system functions. These objects include asynchronous procedure call (APC) objects, deferred procedure call (DPC) objects, and several objects the I/O manager uses (such as interrupt objects). Another set, the Dispatcher objects, offers synchronization capabilities that affect or alter thread scheduling. These objects include kernel threads, mutexes, events, kernel event pairs, semaphores, timers, and waitable timers. The Executive uses Kernel functions to create and manipulate Kernel objects, and to construct more complex objects to provide to User Mode. [SR1]
The other major job of the Kernel is to provide an abstract interface to the different hardware architectures that Windows NT supports. It does this through the Hardware Abstraction Layer (HAL). The HAL is a loadable kernel-mode module that provides the low-level interface to the hardware platform, and is responsible for variations in functions such as interrupt handling, exception dispatching, and multiprocessor synchronization. However, the design of the HAL still tries to maximize the amount of common code. The interfaces supported are portable and semantically identical across architectures. But still, some of the interfaces are implemented very differently on different architectures. Also, some Kernel interfaces (such as spin locks) are implemented in the HAL. [SR1]
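The role of the HAL, a portable interface with platform-specific implementations behind it, can be sketched with an abstract base class. Everything in this example is hypothetical: the platforms, the single interrupt_vector operation, and the vector mappings are invented for illustration and are not taken from any real HAL.

```python
from abc import ABC, abstractmethod


class Hal(ABC):
    """Portable interface; each platform supplies its own implementation."""

    @abstractmethod
    def interrupt_vector(self, irq: int) -> int:
        """Translate a device interrupt request line to a CPU vector."""


class PlatformA(Hal):              # hypothetical platform
    def interrupt_vector(self, irq: int) -> int:
        return 0x30 + irq          # made-up mapping


class PlatformB(Hal):              # hypothetical platform
    def interrupt_vector(self, irq: int) -> int:
        return 0x100 | (irq << 2)  # made-up mapping


def connect_device(hal: Hal, irq: int) -> str:
    """Platform-independent code: it only ever talks to the Hal interface."""
    return f"device on IRQ {irq} -> vector {hal.interrupt_vector(irq):#x}"
```

The caller, connect_device, is identical on both platforms. Only the object passed in differs, which mirrors how the interfaces stay semantically identical while the implementations vary by architecture.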
Two important operating system fundamentals are examined, first for Solaris and then for Windows NT. The first is what it means to be a process and a thread in each operating system, along with the scheduling differences between them. The second is memory management and virtual memory.
A
process is an abstraction used by the operating system to represent an
application that has been loaded into memory and may be running. Under both Solaris and Windows NT processes
are used for abstraction (to abstract a program that has been loaded into
memory) and for resource management (so that all programs are fair to the
system and are treated fairly by shared system resources). Neither Solaris nor Windows NT uses the
process as a schedulable entity entitled to CPU time. For that, they use the thread abstraction. Threads of execution, or simply threads, are
abstractions of program code running (or waiting to run) on a processor.  In both Solaris and Windows NT, each process
has one or more threads associated with it.
In
Solaris, there is a system process table that maintains process structures
(proc structure), and every process in the system occupies a slot in this
table. The proc structure contains all
of the information needed by the kernel to manage and schedule the process, its
child processes, and all threads contained within the process. Space for the process table is dynamically
allocated in kernel memory as processes are created. The table is implemented as a linked list, and the maximum size
of the table is determined at boot time based on the amount of physical memory
in the system. Also determined at boot
time is the maximum number of processes per non-root user. [M1]
Solaris
has support for processes, threads, and lightweight processes (LWPs). Generally, a process represents an instance
of an application (although applications are free to spawn additional
processes). Processes are containers
and used for resource allocation. A
thread is the execution of code within a process. Threads under Solaris are implemented in user space – the
scheduling of threads is not the direct responsibility of the kernel. The schedulable kernel entity is the LWP,
which lives partially in user space and partially in kernel space. There is at least one LWP for every process,
but not necessarily one LWP for every thread (there cannot be more LWPs than
threads, however).  A thread can
be tied one-to-one to an LWP, multiple threads can be associated with a
single LWP, or multiple threads can be spread across multiple LWPs.  Lightweight processes are a virtual
execution environment for each kernel thread within a process. The LWP allows each kernel thread within a
process to make system calls independently of other kernel threads within the
same process.
Every
process is represented by a proc_t data structure, and has an address space
that is comprised of all the page mappings for the various regions that make up
the virtual address space of the process.
The proc_t structure contains a pointer to a structure for those address
space mappings, the AS structure.  The
proc_t structure also contains the vnode of the executable file that comprises
the process. User credentials for the
process are maintained in an additional structure pointed to by the
proc_t. The proc_t also contains the
list of kernel threads associated with the process. [M1]
Each kernel thread data
structure (kthread_t) maintains an associated LWP structure (klwp_t), and is
the entity that is actually put on a dispatch queue and scheduled. The kthread_t structure includes a number of
interesting bits of information.
The kthread_t structures are implemented as a linked list, with one list per
process, so the kthread_t has a pointer to the next structure in the list. It contains a pointer to the kernel stack
and the size of this stack. It includes
the CPU affinity for the thread, binding it to a processor, as well as processor
set support data.  When the thread is blocked, the
wait channel records what the thread is waiting for.  Each kthread_t structure maintains information on its current state
(running, run-able, sleeping, stopped, zombie).  Finally, the kthread_t structure has various bits of data the
kernel needs for thread management, signal information, pointers to its LWP and
controlling process, a thread ID, and more.
The kthread_t structure is not page-able. [M1]
Since the kernel thread is
the one schedulable entity, each kthread_t structure also points to its own
scheduling class structure. There are
four different scheduling classes. The
timeshare scheduling class provides behavior that follows the traditional
notion of timeshare systems – as processes run, their priority gets worse, and
as they wait, their priority gets better.
The timeshare scheduling class is designed to provide a fairly even
distribution of hardware processor resources to all threads. Another scheduling class is the realtime
scheduling class. It provides some
degree of support for realtime applications, including predictable latencies
for getting processor time and the ability to stay on the processor as long as
needed. Later Solaris added the
interactive scheduling class, which is designed to provide more responsive
behavior for desktop systems. This
class bumps the threads attached to the user’s currently focused window to the
top of the dispatch queue. The final
scheduling class is the system class, and is reserved for use by Solaris.
[M1]
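The timeshare behavior described above (priority worsens as a thread runs, improves as it waits) can be sketched as a small table-driven rule. This Python is only an illustration under stated assumptions: the names `TS_TABLE`, `on_quantum_expired`, and `on_wakeup`, and every number in the table, are hypothetical and are not taken from the actual Solaris dispatch table.

```python
# Hypothetical timeshare dispatch table. For each priority level it records
# the quantum and the priority to assign after the quantum expires or after
# the thread wakes from a wait. All numbers are invented for illustration.
TS_MAX = 59  # highest timeshare priority, matching the 0 to 59 range


def build_table():
    table = {}
    for pri in range(TS_MAX + 1):
        table[pri] = {
            "quantum_ms": 200 - 3 * pri,          # assumed: lower priority, longer quantum
            "after_expiry": max(pri - 10, 0),     # used its whole quantum: priority worsens
            "after_wait": min(pri + 10, TS_MAX),  # waited or slept: priority improves
        }
    return table


TS_TABLE = build_table()


def on_quantum_expired(pri):
    """New priority for a thread that ran for its entire quantum."""
    return TS_TABLE[pri]["after_expiry"]


def on_wakeup(pri):
    """New priority for a thread that has just finished waiting."""
    return TS_TABLE[pri]["after_wait"]
```

Keeping the policy in a table rather than in code is what makes this style of scheduler tunable: changing behavior means editing values, not the dispatcher.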
The Lightweight Process
structure (klwp_t) contains a pointer back to its corresponding kernel thread,
and a pointer to its associated process.
It also contains an embedded structure that maintains various pieces of
resource utilization information, which is updated during execution. The proc_t also contains resource
utilization information, which is the combined total of all LWP resource
utilization. Other interesting pieces
of the klwp_t structure include the Process Control Block (PCB) structure,
which contains machine state information that is saved when the LWP is switched
out of context, and is restored when the LWP is switched back in. The klwp_t contains a system call interface
– an argument pointer and errno for system call execution. Lastly it contains the current state of the
LWP, and signal and debugger information.
Unlike the kthread_t, the klwp_t is page-able. [M1]
To
create a simpler programming model for interfacing with processes, a file-like
abstraction has been created – the process file system (procfs).  Each process directory in the proc hierarchy contains a
subdirectory for each LWP in the process.
Resource utilization for the process and each LWP is available through
standard file open and read system calls.
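Because procfs exposes processes as ordinary directories and files, enumerating the LWPs of a process needs nothing more than a directory listing. The sketch below assumes the layout `<proc_root>/<pid>/lwp/<lwpid>`; the function name `list_lwp_ids` and the `proc_root` parameter are hypothetical, and the parameter exists only so the function can be pointed at a test directory.

```python
import os


def list_lwp_ids(pid, proc_root="/proc"):
    """Return the LWP ids found under <proc_root>/<pid>/lwp, in ascending order.

    Assumes one subdirectory per LWP, named by its numeric id.
    """
    lwp_dir = os.path.join(proc_root, str(pid), "lwp")
    return sorted(int(name) for name in os.listdir(lwp_dir) if name.isdigit())
```

The files inside each LWP directory hold binary structures, so reading resource utilization would additionally require decoding them; that step is left out here.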
The
Solaris scheduler framework is built upon the Unix SVR4 scheduler. The SVR4 scheduler introduced the notion of
scheduling classes, which define the policies and algorithms applied to a
process or thread. For each scheduling
class there is a table of values and parameters that the dispatcher code uses
for selecting a thread to run on a processor, in addition to setting priorities
based on wait time and how recently the thread had execution time on a
processor. Scheduling classes are
implemented as dynamically loadable kernel modules, allowing users to define
and add their own classes. Originally,
Solaris 2.x shipped with three scheduling classes: system (SYS), timesharing
(TS), and realtime (RT).  Around
Solaris 2.4, Sun added another class, the interactive (IA) scheduling
class, to provide more responsive interactive desktop performance. As stated above, the TS class is the
traditional time and resource sharing behavior, such that all processes on the
system get their fair share of execution time.
The SYS class exists for kernel threads.  The RT class provides realtime scheduling behavior, meaning
threads in this class have a higher global priority than TS and SYS
threads.  Priority alone was not enough, however;
changes to the kernel had to be made to fully support RT threads.  First, the kernel was made preempt-able,
so that even kernel code can be preempted to allow an RT thread to
run.  Also, since RT threads cannot
afford to be exposed to the latency involved in resolving a page fault, a memory
locking facility was added so developers could lock a range of pages in
physical memory.
The
Solaris scheduler also adds several features that enhance the overall usability
and flexibility of the operating system.
It provides a more intuitive priority scheme, where higher priorities
are better priorities (compared to the traditional Unix implementation where
lower priorities were better). There
are 170 priorities, which are assigned as follows: 0 to 59 TS and IA, 60 to 99
SYS, 100 to 159 RT, and 160 to 169 interrupt thread priorities. The Solaris scheduler provides realtime
support (required for the RT scheduling class). It has table-driven scheduler parameters, providing the ability
to tune the dispatcher for specific application requirements by altering the
values in the dispatch table for a particular scheduling class. Finally, the Solaris scheduler solves
priority inversion (the issue of a higher priority thread being blocked from
execution because a lower priority thread is holding a resource). [M1]
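The global priority ranges listed above can be written down as a lookup table. This is a minimal sketch; the names `RANGES` and `class_for_priority` are hypothetical, and only the numeric ranges come from the text.

```python
# Global priority ranges as listed in the text: (low, high, owner).
RANGES = [
    (0, 59, "TS/IA"),
    (60, 99, "SYS"),
    (100, 159, "RT"),
    (160, 169, "interrupt"),
]


def class_for_priority(pri):
    """Return which scheduling class (or interrupt level) owns a global priority."""
    for low, high, name in RANGES:
        if low <= pri <= high:
            return name
    raise ValueError("global priority must be between 0 and 169")


# The four ranges together account for all 170 priorities.
TOTAL_PRIORITIES = sum(high - low + 1 for low, high, _ in RANGES)
```

Because higher numbers are better, any RT thread outranks every SYS and TS thread, which is why realtime threads can displace kernel threads.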
In Windows NT, an Executive Process (EPROCESS) block represents every process. An EPROCESS block contains attributes relating to the process, as well as pointers to other related data structures, such as thread blocks (Executive Thread, or ETHREAD). Most of the EPROCESS structure lives in system space, with the exception of the Process Environment Block (PEB), which lives in the process address space (it contains information that is modifiable by user-mode code). In addition to the EPROCESS block, the Win32 subsystem maintains a parallel structure for each process that represents a Win32 application. Also, the kernel-mode part of the Win32 subsystem has a per-process data structure created the first time a thread calls a Win32 USER or GDI function that is implemented in kernel-mode. Interesting bits of the EPROCESS block are as follows. The Kernel Process (KPROCESS) Block, sometimes called the Process Control Block (PCB), contains the basic information that the kernel needs to schedule threads, such as the list of Kernel Threads (KTHREADs) corresponding to the process, a process spin lock, processor affinity, resident kernel stack count, process base priority, process state, and thread seed. The KPROCESS block also has a common dispatcher object header, pointer to the process page directory, the list of kernel thread blocks belonging to the process, the default base priority, quantum, affinity mask, and total kernel and user time for the threads in the process. Another interesting bit of the EPROCESS block is the Process Environment Block (PEB), which lives in the user process address space and contains information needed by the image loader, the heap manager, and other Win32 system DLLs – information that needs to be write-able from user mode. 
It also contains information such as the base address of the image, thread-local storage data, code page data, critical section timeout, number of heaps, heap size information, a pointer to the process heap, GDI shared handle table, operating system version number information, image version information, and image process affinity mask. In addition to these two main blocks, the EPROCESS block contains the following. The Quota Block holds the limits on nonpaged pool, paged pool, and page file usage, plus the current and peak process nonpaged and paged pool usage. Several processes can share this structure – all system processes point to a single default quota block, and all processes in the interactive session share a single quota block that Winlogon sets up. The Access Token is an Executive object that describes the security profile of the process. The Process Environment Block (PEB) contains image information (base address, version number, and module list), process heap information, and thread-local storage utilization. The Win32 subsystem process block (W32PROCESS) contains details needed by the kernel-mode component of the Win32 subsystem. [SR1]
An
Executive Thread (ETHREAD) block represents a thread of execution. The ETHREAD block and all of the structures
it points to live in system space, with the exception of the Thread Environment
Block (TEB), which lives in the process address space. Like processes, the Win32 subsystem also
keeps a parallel structure for each thread created in a Win32 process. And again, any threads that have called a
Win32 subsystem USER or GDI function have a thread object created by the
kernel-mode portion of the Win32 subsystem (a per-thread data structure called
a W32THREAD block). The key components
of an ETHREAD block are: a pointer to a KTHREAD block (Kernel Thread, explained
later), thread time information, process ID of the process the thread belongs
to, the start address, an access token and impersonation level, LPC information
(message ID that the thread is waiting for and address of the message), and I/O
information (list of pending I/O request packets). The Kernel Thread (KTHREAD) block contains the information the
kernel needs to perform thread scheduling and synchronization. Some key sections are: dispatcher header
(standard kernel dispatcher object header), execution time, pointer to kernel
stack information, pointer to system service table, scheduling information,
wait blocks, wait information, mutant list (kernel level mutexes), APC queues,
timer block, queue list, and pointer to TEB.
The Thread Environment Block (TEB), like the PEB, lives in process
address space. It contains information
such as: the exception list, stack base, stack limit, pointer to subsystem
thread information block (TIB), pointer to fiber information (“fibers” are
lightweight threads – threads that are implemented in user space and are not
known about by the kernel), thread ID, active RPC handle, pointer to PEB, “Last
Error” value, count of owned critical sections, current locale, USER32 client
information, GDI32 information, OpenGL information, and pointer to Winsock data
(Windows Sockets). [SR1]
In
Windows NT, processes are used for abstraction and resource allocation – the
thread is the schedulable entity.
Windows NT implements a priority-driven, preemptive scheduling system. The highest-priority ready thread always
runs, with the caveat that the thread chosen to run might be limited by
the processors on which the thread is allowed to run (known as processor
affinity). By default, threads can run
on any available processor, but processor affinity is alterable through the
Win32 scheduling functions. When a
thread is ready to run, it runs for an amount of time called a quantum. A quantum is the length of time a thread is
allowed to run before the kernel interrupts the thread to find out whether
another thread at the same priority is waiting to run, or whether the thread’s
priority needs to be reduced. Quantum
values can vary from thread to thread.
However, if a thread of a higher priority becomes ready to run, the
currently running thread is preempted (even before it completes its quantum)
and the higher priority thread is allowed to run. Because scheduling is at the thread granularity, no consideration
is given to what process the thread belongs to.
The
scheduling code is implemented in the kernel; however, there is no single
“scheduler” module or routine. The code
is spread throughout the kernel, and its functions are collectively known as
the kernel’s Dispatcher. The dispatcher
is triggered by the following events: a thread becomes ready to execute (it has
just been created, for example), a thread leaves the running state (quantum
ends, or the thread terminates or enters a wait state), a thread’s priority
changes, or the processor affinity of a running thread changes.
Windows
NT uses 32 priority levels (0 to 31): sixteen real-time levels (16-31), fifteen
variable levels (1-15), and one system level (0 – reserved for the zero page
thread). Priority levels are assigned
from two different perspectives: those of the Win32 API and those of the
kernel. The Win32 API first organizes
the process by priority class (Real-time, High, Above Normal, Normal, Below
Normal, and Idle), and then by relative priority for the individual threads
within the process (Time-critical, Highest, Above-Normal, Normal, Below-Normal,
Lowest, and Idle). The system then maps
the Win32 priorities to numerical kernel priorities.  So, a process has only a single priority value, but each thread
within the process has two: a base priority (the base
priority of the process adjusted by the thread’s relative priority) and a current priority.
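The mapping from a Win32 priority class and relative thread priority to a numeric kernel priority can be sketched as a table lookup. The numeric values below are not given in this paper; they are the values commonly documented for the Win32 API and should be treated as an assumption here. The names `CLASS_BASE`, `RELATIVE_OFFSET`, and `thread_base_priority` are hypothetical.

```python
# Assumed base priority for each Win32 process priority class.
CLASS_BASE = {
    "Real-time": 24,
    "High": 13,
    "Above Normal": 10,
    "Normal": 8,
    "Below Normal": 6,
    "Idle": 4,
}

# Assumed offset applied for each relative thread priority.
RELATIVE_OFFSET = {
    "Highest": 2,
    "Above-Normal": 1,
    "Normal": 0,
    "Below-Normal": -1,
    "Lowest": -2,
}


def thread_base_priority(priority_class, relative):
    """Map (process class, relative thread priority) to a kernel priority 1..31."""
    realtime = priority_class == "Real-time"
    # Time-critical and Idle saturate at the ends of the variable (1-15)
    # or real-time (16-31) range instead of applying an offset.
    if relative == "Time-critical":
        return 31 if realtime else 15
    if relative == "Idle":
        return 16 if realtime else 1
    return CLASS_BASE[priority_class] + RELATIVE_OFFSET[relative]
```

Under these assumed values, a Normal thread in a Normal process lands at kernel priority 8, in the middle of the variable range.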
To
make thread scheduling decisions, the kernel has a list of data structures,
collectively known as the Dispatcher Database.
This database keeps track of all threads waiting to execute, and which
processors are executing which threads.
The head of the database is the Dispatcher Ready Queue, a series of
queues (one for each scheduling priority) that contains threads that are in the
ready state and are waiting to execute.
Windows NT bases scheduling on thread priority, but there are a few events
that cause the running thread to give way to another.
First, a thread can voluntarily give up the CPU, by entering a wait
state, for example.  Second, a thread of a
higher priority can preempt the currently running thread.  It is interesting to note that user
level threads can preempt kernel level threads – only the thread priority is a
factor in determining preemption. The
quantum for the currently running thread could end, causing it to switch
out. And finally, the thread could
terminate. In an attempt to be fair to
all threads in the system, Windows NT may temporarily increase the priority of
a given thread. It may do this under
the following circumstances: completion of I/O operations, after waiting on
executive events or semaphores, after threads in the foreground process
complete a wait operation, when GUI threads wake because of windowing activity,
and to thwart CPU starvation (if the thread has not run in a long time). However, Windows NT never changes the
priority levels of real-time threads.
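The Dispatcher Ready Queue described above, one queue per priority level with the highest non-empty queue served first, can be sketched as follows. This is a simplified model, not the kernel's data structure: the class name `ReadyQueues` and its methods are hypothetical, and quantum accounting and priority boosts are left out.

```python
from collections import deque

NUM_PRIORITIES = 32  # priority levels 0 to 31


class ReadyQueues:
    """One FIFO queue of ready threads per priority level."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]

    def make_ready(self, thread, priority):
        """Place a thread at the tail of the queue for its priority."""
        self.queues[priority].append(thread)

    def pick_next(self):
        """Remove and return (thread, priority) from the highest non-empty queue."""
        for priority in range(NUM_PRIORITIES - 1, -1, -1):
            if self.queues[priority]:
                return self.queues[priority].popleft(), priority
        return None

    def should_preempt(self, running_priority):
        """True if some ready thread has a higher priority than the running one."""
        return any(
            self.queues[p] for p in range(running_priority + 1, NUM_PRIORITIES)
        )
```

Note that `should_preempt` looks only at priority, never at whether a thread is in user or kernel mode, which mirrors the observation that user threads can preempt kernel threads.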
Thread scheduling on
symmetric multiprocessing systems is basically the same as single processor
systems, with a few slight differences.
Threads are given an Affinity Mask that specifies the processors on which
the thread is allowed to run.  By
default all threads can run on any processor, but processor affinity can be
modified through API calls or by image-wide affinity masks. Although Windows NT will attempt to schedule
the highest priority run-able threads on all CPUs, it only guarantees that the
single highest priority thread is running at any given time. When a thread is created, it is given an
“ideal” processor. A thread also
remembers the last processor it ran on.
When a thread becomes available to run, Windows NT will attempt to run
the thread on an idle processor. If
there is more than one idle processor, first choice goes to the thread’s ideal
processor, second choice goes to its last processor, and finally to the
currently executing processor. If all
processors are busy, Windows NT looks to find a thread to preempt, again
searching the ideal processor and then the last processor. [SR1]
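The processor selection order described above can be sketched as two small functions. The names and parameters are hypothetical, affinity masks are ignored for brevity, and the fallback to the lowest-numbered idle processor is an assumption that the text does not state.

```python
def choose_idle_processor(idle, ideal, last, current):
    """Pick an idle CPU for a newly ready thread, or None if none is idle.

    idle: set of idle CPU numbers. Preference order follows the text:
    the thread's ideal processor, then its last processor, then the
    processor currently executing the scheduling code.
    """
    if not idle:
        return None
    for candidate in (ideal, last, current):
        if candidate in idle:
            return candidate
    return min(idle)  # assumption: any other idle CPU will do


def choose_preemption_target(running, ideal, last, new_priority):
    """With no idle CPU, look for a thread to preempt.

    running: dict mapping CPU number to the priority of its running thread.
    Checks the ideal processor first, then the last processor.
    """
    for candidate in (ideal, last):
        if running[candidate] < new_priority:
            return candidate
    return None  # nothing to preempt; the thread stays in the ready queue
```

Preferring the ideal and last processors is a cache-locality heuristic: a thread is more likely to find its data still cached on a processor it ran on recently.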
Although
terminology is slightly different, Solaris and Windows NT have very similar
underlying process and thread concepts.
They both have the process abstraction, and in both cases processes are
used for resource allocation and management.
Both Solaris and Windows NT have User Level threads, called Threads
under Solaris and Fibers in Windows NT.
Neither threads nor fibers are scheduled by the kernel; both are
scheduled by user mode libraries.  The
difference is that fibers are built into the Windows NT Application Programming
Interface, whereas threads under Solaris are handled by thread libraries that may or may not ship with
the operating system.  This lets Solaris programmers
pick and choose their thread library, whereas Windows NT users all use Win32 API
fibers (possibly with different configuration parameters, however).  Although the schedulable entity in both operating
systems is a kernel thread (simply called Thread in Windows NT, and analogous
to the Light Weight Process in Solaris), scheduling is very different in the
two operating systems. Solaris, like
other UNIX System V Release 4 operating systems, is dynamically configurable
with different scheduling modules, and each kernel thread gets to choose which
scheduling module to attach itself to.
Windows NT, however, only uses a single scheduling module. This module, though, is a combination of all
of the default Solaris scheduling modules, all packed into a single scheduling
scheme. Where Solaris will schedule
differently depending on scheduling module (and priority within that module),
Windows NT will schedule depending solely on priority, letting the priority
determine what type of thread it is (real-time, interactive, etc.).  In short, Solaris is more
configurable, while Windows NT covers comparable behavior with a single priority-driven scheme.
The motivation behind virtual memory is quite simple – it exists to provide the illusion of unlimited memory for all processes to execute in. Before virtual memory, programmers had to fit their programs within the physical memory limits of each target platform. With operating systems that support virtual memory, such as all major UNIX variants and modern Windows operating systems, this is no longer the case. A process’ virtual memory requirement may exceed the amount of memory available for a process’ pages, and it may even exceed the amount of physical memory installed in the system. Virtual memory also protects address spaces: one process cannot simply write over another process’ memory pages, and kernel pages are protected from all non-kernel processes. Since virtual memory provides basic modern operating system functionality, memory management is key in a successful operating system.
As of version 7, Solaris has two modes of operation – 32-bit mode or 64-bit mode. Both versions of the kernel are actually installed to disk – a configurable kernel parameter decides which version to boot. In 32-bit mode, the default integer size is 32 bits, or 4 bytes. All processes have their own, private, flat 4-gigabyte 32-bit virtual address space (since 4 gigabytes is the maximum size that can be addressed using 32 bits). 4 gigabytes is also the maximum amount of RAM installable in the system (not including clever programming techniques that, in practice, allow the machine to have more than 4 gigabytes of RAM installed). 64-bit mode does not have any of these limitations. A 64-bit virtual address space is 17,179,869,184 gigabytes, and that is also the theoretical maximum amount of RAM that is installable in the system. However, under the LP64 data model used by 64-bit Solaris, long integers and pointers double in size to 64 bits, or 8 bytes (the int type remains 32 bits). Since pointers and long integers are extremely common, this increases the size of application code and data, reducing cache hits and increasing memory demands. So, all other things being equal, there may be a small decrease in performance for normal, everyday applications when moving from 32 bits to 64 bits. Device drivers, on the other hand, generally see large performance increases, since they are able to transfer data twice as fast. [C1]
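The address space sizes quoted above follow directly from the address width. A short check of the arithmetic, using a hypothetical helper named `address_space_gib`:

```python
GIB = 2 ** 30  # bytes in one gigabyte (binary)


def address_space_gib(bits):
    """Size, in gigabytes, of a flat address space with the given address width."""
    return (2 ** bits) // GIB


print(address_space_gib(32))  # 4
print(address_space_gib(64))  # 17179869184
```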
Solaris takes an object-oriented approach to implementing virtual memory, and it is tightly integrated with the Virtual File System (VFS) architecture. All physical memory is treated as a page cache for memory objects. Every memory object cached in the page cache has a corresponding vnode that describes it. Vnodes are a file system abstraction, and provide a method of describing and managing files in the kernel, independent of the lower level specifics of the particular type of file system the file originated in. A simple example would be the caching of a file from the UNIX File System (UFS) – the memory pages that hold the contents of the file each have a corresponding vnode, and in this case the vnode maps to an inode in the UFS that describes the file. [M2]
For processes, the Virtual Memory (VM) system presents a simple linear range of memory, known as an address space. Each address space is broken into several segments that represent mappings of the executable, heap space, shared libraries, and program stack. Each segment is divided into equal-sized pieces of virtual memory, called pages. A hardware memory management unit manages the mapping of pages of virtual memory to physical memory. Much of the VM system is platform independent, except for the components that deal with physical memory management. These platform specific portions are implemented in the Hardware Address Translation (HAT) layer. Pages of memory are allocated on demand, as they are referenced. This includes pages for executables or shared libraries, lowering the memory footprint and startup time of the process. The virtual memory system uses a global paging model that implements a single global policy to manage the allocation of memory between processes. A scanning algorithm, a kernel thread called the page scanner, calculates the least-used portion of the physical memory. This page scanner also keeps track of the amount of free physical memory, raising alarms if the amount of free memory falls below a preconfigured threshold. In that case, pages that have not been used recently are stolen and placed onto the free list for use by other processes.
Most of the kernel’s memory is not pageable – it is allocated from physical memory that cannot be stolen by the page scanner. This avoids deadlocks that could occur within the kernel if a kernel memory management function caused a page fault while holding a lock for another critical resource. The kernel also implements its own memory allocation systems. The core kernel memory allocator, called the slab allocator, allocates memory for kernel data structures. It subdivides large contiguous areas of memory into smaller chunks for data structures. Allocation pools are organized such that like-sized objects are allocated from the same contiguous segments to reduce fragmentation. [MM1]
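The idea behind the slab allocator, carving a contiguous area into equal-sized slots so that like-sized objects share the same segment, can be illustrated with a toy model. This is not the Solaris implementation: the class `SlabCache`, its method names, and the 4096-byte slab size are assumptions, and objects are represented only by (slab, slot) pairs rather than real memory.

```python
class SlabCache:
    """Toy cache of equal-sized objects carved out of fixed-size slabs."""

    def __init__(self, object_size, slab_size=4096):
        self.object_size = object_size
        self.per_slab = slab_size // object_size  # slots that fit in one slab
        self.free = []   # free slots, as (slab_index, slot_index) pairs
        self.slabs = 0   # number of slabs carved so far

    def alloc(self):
        """Return a free slot, carving a new slab only when none is free."""
        if not self.free:
            new_slots = [(self.slabs, slot) for slot in range(self.per_slab)]
            new_slots.reverse()  # so that slot 0 is handed out first
            self.free = new_slots
            self.slabs += 1
        return self.free.pop()

    def release(self, slot):
        """Return a slot to the cache so the next alloc can reuse it."""
        self.free.append(slot)
```

Because every slot in a cache has the same size, a released slot always fits the next request exactly, which is how this organization limits fragmentation.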
The Virtual Memory Manager is invoked by system calls and by other kernel services which allocate virtual memory on behalf of a process (such as memory-mapped files and shared memory areas). If the requested amount of virtual memory is unavailable, the call returns failure. Otherwise, the Virtual Storage Manager subtracts the number of virtual pages requested from the count of available real memory page frames and swap slots. Pages are returned when explicitly freed by the process or when the process terminates.
The Virtual Storage Manager is invoked by the kernel when a page fault occurs. If the concerned virtual address is valid the Virtual Memory Manager determines whether the page is backed. If it is not backed, or is backed by swap space, the Virtual Memory Manager requests an available real memory page frame and uses that page to back the address. In the case of pages backed by swap, the Virtual Storage Manager then causes the data to be read into memory. Since this is a costly process, the Real Memory Manager attempts to keep pages of physical memory free by swapping out pages that have not been referenced in a “long” time. [H1]
Windows NT is a 32-bit operating system (with a 64-bit version, Windows: Codename Whistler 64, currently in the beta phase). All processes have their own, private, flat 4-gigabyte 32-bit virtual address space. It is partitioned, however, in such a way as to make processes look identical to the operating system and give protection to the operating system and the process, all the while still giving users control of some of the space. The bottom partition, 0x0 through 0xFFFF (64-KB), is set aside to help Windows NT programmers catch NULL pointer assignments and errors. Any attempt to read or write to this area causes an access violation. The next partition, 0x10000 through 0x7FFEFFFF (2-GB minus 64-KB minus 64-KB), is the process’ private address space. Program code and data reside here, as well as code for dynamically linked libraries and any private data those libraries require. The very top of this memory includes objects the operating system needs for the process, such as the Thread Environment Block and Process Environment Block (see the section on Processes, Threads, and Scheduling), and a mapped read-only page that contains information such as system time, clock tick count, and version number (it exists so that this data is directly readable without a transition into kernel-mode). The next partition, 0x7FFF0000 through 0x7FFFFFFF (64-KB), is also protected. It is reserved by Microsoft to make parameter checking on Executive functions quick and easy, and prevents threads from passing buffers that straddle the user/system space boundary. The last partition, 0x80000000 through 0xFFFFFFFF (2-GB), is reserved for the operating system. This is where Windows NT loads the Executive, the Kernel, and any device drivers needed by the process. It also contains the process page tables and page directory, the system working set list, system cache, paged pool, crash dump information, and HAL usage. The physical layout of this partition is processor specific. 
It is completely protected from user-mode code: any user-mode attempt to access this region causes an access violation. [R1] & [SR1]
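The four partitions described above can be captured in a small table that classifies any 32-bit virtual address. The address ranges come from the text; the names `PARTITIONS` and `classify_address` and the descriptive labels are hypothetical.

```python
# (first address, last address, description), as laid out in the text.
PARTITIONS = [
    (0x00000000, 0x0000FFFF, "NULL-pointer guard (no access)"),
    (0x00010000, 0x7FFEFFFF, "process private address space"),
    (0x7FFF0000, 0x7FFFFFFF, "boundary guard (no access)"),
    (0x80000000, 0xFFFFFFFF, "operating system space"),
]


def classify_address(addr):
    """Return the partition description for a 32-bit virtual address."""
    for first, last, name in PARTITIONS:
        if first <= addr <= last:
            return name
    raise ValueError("address does not fit in 32 bits")


def partition_size(index):
    """Size in bytes of the partition at the given table index."""
    first, last, _ = PARTITIONS[index]
    return last - first + 1
```

The private partition works out to 2 GB minus 128 KB, matching the "2-GB minus 64-KB minus 64-KB" figure given for it.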
The
Windows NT Memory Manager is part of the Executive. It is completely platform independent – no part of the Memory
Manager is implemented in the HAL. The
Memory Manager consists of the following components. It has a set of executive system services for allocating,
deallocating, and managing virtual memory.
Most of these services are exposed through the Win32 API or the
kernel-mode device driver interfaces.
The Memory Manager contains a “translation-not-valid” and access fault
trap handler for resolving hardware-detected memory management exceptions and
making virtual pages resident on behalf of the process. The Manager also contains several key
components that run in the context of six different kernel-mode system
threads. First is the Working Set
Manager (priority 16), which is called once per second, as well as when free
memory falls below a certain threshold, and drives the overall memory
management policies, such as working set trimming, aging, and modified page
writing. Second is the Process/Stack
Swapper (priority 23), which performs both process and kernel thread stack
inswapping and outswapping, and is awakened whenever an inswap or outswap
operation needs to take place. Next is
the Modified Page Writer (priority 17), which writes dirty pages on the
modified list back to the appropriate paging files, and is awakened when the
size of the modified list needs to be reduced.
Next is the Mapped Page Writer (priority 17), which writes dirty pages in
mapped files to disk, and is called when the size of the modified list needs to
be reduced or if pages for mapped files have been on the modified list for more
than five minutes. This second modified
page writer thread is needed because it can generate page faults that result in
requests for free pages. Lastly is the
Zero Page Thread (priority 0), which zeros out pages on the free list so that a
cache of zero pages is available to satisfy future demand-zero page
faults. Like all members of the
Executive, the Memory Manager is fully reentrant and supports simultaneous
execution on multiprocessor systems. [SR1]
Most of the services of the Memory Manager are exposed through the Win32 API. These APIs allow a process to manipulate its own virtual memory space or, with proper permissions, that of another process. Services include allocating and freeing virtual memory, sharing memory between processes, mapping files into memory, flushing virtual pages to disk, retrieving information about a range of virtual pages, changing the protection of virtual pages, and locking virtual pages into memory. The Memory Manager also provides services for manipulating physical memory, such as allocation, deallocation, and locking for direct memory access (DMA) transfers. These services are provided to other Executive components in kernel space, as well as to device drivers. [SR1]
Windows
NT aligns each region of reserved process address space to begin on an integral
boundary defined by the system “allocation granularity” parameter. Currently this value is 64 KB. This size was chosen so that if future
support were added for processors with large page sizes, the risk of requiring
changes to applications would be small.
Kernel mode components, however, are not restricted to this 64 KB
alignment. They can reserve memory on a
single-page granularity. Nonetheless,
no matter where a region of address space is reserved, Windows NT ensures that
the size of the region is a multiple of the system page size, which depends on
the processor. [SR1]
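The alignment rules above amount to simple arithmetic, sketched here in Python. The 64 KB granularity is the value given in the text; the 4 KB page size is an assumption (it is the x86 value), and the function names are hypothetical. The sketch also assumes that the base is rounded down and that the region is extended to cover the originally requested end.

```python
ALLOCATION_GRANULARITY = 64 * 1024  # value stated in the text
PAGE_SIZE = 4 * 1024                # assumption: x86 page size


def align_down(value, boundary):
    """Round value down to the nearest multiple of boundary."""
    return value - (value % boundary)


def align_up(value, boundary):
    """Round value up to the nearest multiple of boundary."""
    return -(-value // boundary) * boundary


def reserve_region(requested_base, requested_size, kernel_mode=False):
    """Return the (base, size) a reservation would have under these rules."""
    # User-mode regions start on the allocation granularity;
    # kernel-mode components may reserve on a single-page granularity.
    boundary = PAGE_SIZE if kernel_mode else ALLOCATION_GRANULARITY
    base = align_down(requested_base, boundary)
    # The end is rounded up to a page boundary, so the size is always
    # a whole number of pages and still covers the requested range.
    end = align_up(requested_base + requested_size, PAGE_SIZE)
    return base, end - base


print(reserve_region(0x12345, 100))                    # (65536, 12288)
print(reserve_region(0x12345, 100, kernel_mode=True))  # (73728, 4096)
```

A user-mode request for 100 bytes at 0x12345 therefore yields a region starting at 0x10000 that spans three pages, while the same kernel-mode request starts at 0x12000 and spans a single page.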
Windows
NT provides memory protection in four primary ways. First, all system-wide data structures and memory pools used by
kernel-mode system components can only be accessed while in kernel-mode. If user-level threads attempt to access
these memory pages, the hardware generates a fault, and the Memory Manager
reports this to the thread as an access violation. Second, each process, as stated above, has a separate, private
address space that is protected from access by threads not belonging to that
process. The only exception is if the
process explicitly shares pages with other processes or if another process has
been granted virtual memory read or write access to that process.
Each time a thread references an address, the Memory Manager and virtual
memory hardware translate the virtual address into a physical address. Thus, since Windows NT controls how virtual
addresses are translated, it can guarantee that threads of one process don’t
inappropriately access a page belonging to another process. Third, in addition to the implicit protection
that the virtual-to-physical address translation offers, all processors
supported by Windows NT provide some form of hardware-controlled memory
protection. For example, code pages in a
process's address space are marked as read-only and are thus protected, in
hardware, against modification by user threads. Code pages for loaded device drivers are likewise marked as read-only. Finally, all shared memory section objects
have standard Windows NT access-control lists that are checked whenever a
process attempts to open them, thus limiting access of shared memory to
processes with proper security permissions. [SR1]
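The four protection mechanisms can be summarized in a small Python model. This is a deliberately simplified sketch: all names are hypothetical, the access-control list is reduced to a set of process IDs, and the per-process check is written as an explicit test even though in the real system it falls out of each process having its own address translation.

```python
from dataclasses import dataclass, field


class AccessViolation(Exception):
    """Stands in for the access violation reported to the faulting thread."""


@dataclass
class Page:
    owner_pid: int                 # process whose address space maps the page
    kernel_only: bool = False      # system-wide data: kernel mode only
    read_only: bool = False        # for example, code pages
    shared_with: set = field(default_factory=set)  # explicit sharing


@dataclass
class Section:
    name: str
    acl: set = field(default_factory=set)  # simplified ACL: pids allowed to open


def access_page(page, pid, kernel_mode, write):
    """Apply the first three protections in order; raise on a violation."""
    if page.kernel_only:
        # First: system data is reachable only while in kernel mode.
        if not kernel_mode:
            raise AccessViolation("user-mode access to kernel-only page")
        return True
    # Second: private address spaces, unless pages are explicitly shared.
    if pid != page.owner_pid and pid not in page.shared_with:
        raise AccessViolation("page belongs to another process")
    # Third: hardware page protection, such as read-only code pages.
    if write and page.read_only and not kernel_mode:
        raise AccessViolation("user-mode write to read-only page")
    return True


def open_section(section, pid):
    """Fourth: the ACL is checked when a process opens the section object."""
    if pid not in section.acl:
        raise PermissionError(f"pid {pid} may not open {section.name}")
    return section
```

For instance, a user thread of process 1 may read one of its own code pages but an attempted write raises AccessViolation, and a process missing from a section's ACL cannot open that section at all.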
The
functionality of the Memory Manager is the same in both Solaris and Windows NT,
but the implementations differ somewhat. Possibly the largest difference is that the Solaris Memory
Management subsystem has platform-specific parts, whereas the Windows NT
Memory Management subsystem does not.
Unfortunately it is impossible to know precisely why this is the case,
but it probably has to do with the processors that each operating system supports. The Solaris Virtual Memory system was
developed for SunOS 4.0, long before Windows NT was conceived. The Virtual Memory (VM) system did not go
through any major changes during the transition to SunOS 5.0 and System V
Release 4 UNIX. In fact, the VM system
developed by Sun was one of the features of SunOS to be incorporated in System
V Release 4. At the time, SunOS 4.0
supported the contemporary SPARC family of processors as well as the
contemporary Intel x86 family. The Intel
i386 processor does have special support for virtual memory, but it is possible
that the SPARC processors at the time did not.
In order to keep the two platforms as similar as possible, Sun
probably implemented the upper layers of the VM system in a way that
allows slightly different hardware-specific implementations below, to
make up for the lack of VM features of the SPARC chip. All of this is speculation of course, but it
has been stated that the processors Microsoft chose to support were chosen, for
one reason, because they all had the same VM features. Since Microsoft started much later than Sun
did, they had the luxury of modern processors with VM support. Beyond that, the Windows NT Memory
Manager seems to have a number of monitoring and helper threads that the
Solaris Virtual Memory system does not have.
Solaris gives the impression of performing all of those functions
in-line, whereas Windows NT has dedicated threads. Because the Solaris kernel is completely non-blocking, neither
method is really “better” than the other; they are just different.
Although
completely different on the outside, the internals of Solaris and Windows NT
are not all that dissimilar. SunOS, the
precursor to Solaris, came long before Windows NT. The technology developed for and incorporated into SunOS was developed
by Sun engineers in response to market demands, but much of that technology
became the basis for UNIX systems yet to come.
Many of the features and techniques used in SunOS became part of the
UNIX System V Release 4 standard, before Windows NT had even begun
development. However, the kernel and
underlying system of Solaris 2.x was developed at the same time as Windows
NT. It was simply released a year earlier.
Microsoft
has stated time and again that Windows NT development will not be rushed, which
is exemplified by the number of times new releases have been pushed back. The first release was no different – it took
four years to complete. When the
original team was done, however, they had a clean operating system design,
clearly based on extensive research into previous work. Originally developed for the Mach operating
system, the microkernel architecture is an exceptionally clean and robust
design; it simply suffered from poor performance. The Windows NT team took this design and, especially after a few
iterations of the Windows NT operating system, got the performance to a level
that was commercially viable.
The
architecture of the Solaris kernel shows it to be modular and relatively well
structured. The Windows NT Executive
and kernel, however, are a cleaner design.
The members of the Windows NT Executive are like well-defined objects in
a clean object-oriented system: each member does one thing and does it
well. The Windows NT Executive and
kernel are also layered well to promote abstraction between components. Although Solaris and Windows NT both promote
portability through limited use of architecture-dependent code, the Windows NT
Hardware Abstraction Layer is more encompassing than the Solaris Hardware
Address Translation Layer and seems to make portability easier and cleaner.
Both
architectures support seamless extensibility via dynamically loadable kernel
modules, although some interfaces are documented better than others are. The primary interface to the kernel in
Windows NT is through hardware device drivers.
Microsoft has an unmatched support system for developers
of Windows-based hardware and software, and the Device Driver Kit, for the
creation of device drivers, is no exception.
With the help of some excellent third-party tools, device driver
development for Windows NT is arguably the easiest in the industry. Microsoft also provides tools for
developing third-party file systems through the Installable File Systems Kit. Although its support is not as good as Microsoft’s, Sun
also supports the development of Solaris hardware device drivers. Where Solaris really shines, however, is
in all of the other interfaces that are supported, such as installable
file systems and user-defined scheduling algorithms. The Solaris kernel is more extensible from
outside of Sun than the Windows NT kernel is from outside of Microsoft. In addition, being based on a UNIX standard,
Solaris can be instantly familiar to a third-party developer who
is familiar with other System V Release 4 UNIX systems.
With
regard to processes and threads, Solaris and Windows NT are
extraordinarily similar. The notion of
processes and their function has been around since the beginnings of UNIX, but
threads, especially kernel threads, were new at the time. With the use of the lightweight process,
Solaris is slightly more abstract than Windows NT, but the results are the
same. Both kernels support multiple
concurrent threads of execution within kernel space, and both scale to support multiple
processors. Scheduling in Solaris is
finer grained than in Windows NT, with 169 priorities vs. 32 priorities. In addition, the Solaris scheduler allows for
dynamically loadable third-party scheduling algorithms, making the Solaris
scheduler more extensible than that of Windows NT.
The
Memory Management systems of Solaris and Windows NT provide the same
functionality, but accomplish it in different ways. Solaris has small parts of the Virtual Memory system implemented
in architecture-specific code, whereas the Windows NT Memory Manager is
completely platform independent. The
Solaris VM system is also tightly integrated with the Virtual File System. Although this limits modularity somewhat, it
does provide some great features, such as vnodes and the abstraction of running
processes as files (the /proc file system, or procfs). Windows NT does not have these features, but
then again, its Memory Manager is completely independent of the File System
drivers.
The
goal of this project was to determine what, if anything, the designers and
developers of Windows NT learned from UNIX, specifically with regard to
Solaris and other System V Release 4-based UNIX systems. What was discovered is not all that surprising. The Windows NT designers did a great deal of
research, and drew not only on their own backgrounds from work in Digital’s VMS
labs but also from the work of a great number of other researchers attempting
to advance the state of the art. They
took what others had done and extended it, making Windows NT similar yet unique
in its own right. The modified
microkernel architecture has come a long way, and the Win32 Application
Programming Interface has matured into a great general-purpose API. Proven abstractions and concepts were taken
directly (such as processes and virtual file systems), while others were taken
and tweaked with a Windows NT feel (such as threads and scheduling). From an architectural point of view, Windows
NT easily holds its own in a world surrounded by UNIX.
[C1]
Cockcroft, Adrian, “Performance Q&A: When is it faster to have 64 bits?,” http://www.sunworld.com/sunworldonline/swol-11-1995/swol-11-perf_p.html,
November 1995
[DHBA1]
D.H. Brown Associates, Inc, “The Solaris 8 Operating Environment and Microsoft
Windows 2000 – Race for Control of Web Infrastructures,” http://www.sun.com/software/white-papers/wp-dhbrown00/
[F1]
Frost, Jim, “Windows NT: Finally, A Grown-Up Operating System From Microsoft,” http://world.std.com/~jimf/papers/nt-unix/nt-unix.html,
SunExpert, December 1995
[H1]
Herber, Randolph J., “UNIX System V Concepts – Memory,” http://www-cdf.fnal.gov/offline/UNIX_Concepts/concepts.memory.txt,
December 1997
[K1]
Kano, Nadine, “The Architects: First, Get the Spec Right,” http://www.microsoft.com/WINDOWS2000/news/fromms/kanoarchitect.asp
[M1]
Mauro, Jim, “Inside Solaris: Peeling back the process layers, Part 1,” http://www.sunworld.com/sunworldonline/swol-08-1998/swol-08-insidesolaris.html,
August 1998
[M1]
Mauro, Jim, “Inside Solaris: The Solaris process model, Part 2,” http://www.sunworld.com/sunworldonline/swol-09-1998/swol-09-insidesolaris.html,
September 1998
[M1]
Mauro, Jim, “Inside Solaris: The Solaris process model, Part 3,” http://www.sunworld.com/sunworldonline/swol-10-1998/swol-10-insidesolaris.html,
October 1998
[M2]
Mauro, Jim, “Inside Solaris: Swap space implementation, Part 1,” http://www.sunworld.com/swol-12-1997/swol-12-insidesolaris.html,
December 1997
[MM1]
Mauro, Jim, and McDougall, Richard, Solaris Internals: Core Kernel
Architecture, “Chapter 1: An Introduction to Solaris,” http://soldc.sun.com/booklist/bookreviews/sol_internals_intro.pdf,
October 2000
[MSDN1]
Microsoft Developer Network, “The Foundations of Microsoft Windows NT System
Architecture,” http://msdn.microsoft.com/library/backgrnd/html/msdn_ntfound.htm,
September 1997
[R1]
Richter, Jeffrey, Advanced Windows, Third Edition, Microsoft Press, 1997
[SDC1]
Solaris Developer Connection: “Better by Design – Designed to Evolve,” http://soldc.sun.com/articles/betterbydesign/evolve.html
[SR1]
Solomon, David A. and Russinovich, Mark E., Inside Windows 2000, Third
Edition, Microsoft Press, 2000
[VM1]
Viscarola, Peter G., and Mason, W. Anthony, Windows NT Device Driver
Development, Macmillan Technical Publishing, 1999