OSdata.com

Speed

OSdata.com is used in more than 300 colleges and universities around the world


summary
hardware considerations
hard drives
servers
benchmarks and tests
Windows NT Registry
hardware acceleration

summary

Speed, or performance, is based on a combination of hardware and software issues. Many mainframe and server operating systems allow a system administrator to “tune” the operating system, that is, to adjust the operating system to match the actual needs of the kinds of software running on the system. A skilled system administrator can typically change the overall speed of a system by 10-15% and under certain conditions even more.

Some systems are optimized for business applications. These systems tend to have high speed integer processing, capabilities for transaction processing, good file and disk subsystems, and I/O optimized for ordinary terminals or monitors and keyboards.

Some systems are optimized for scientific applications. These systems tend to have high speed floating point processing, possibly vector processing, and I/O optimized for high end graphics.

Most systems are balanced, being good at both science and business, but not great at either.

    “You cannot really separate the O/S, from the hardware. You must consider the total solution in product evaluation. IBM, HP, SUN, and the others all have a line of systems spanning single-user machines to systems that handle thousands of users. Factor in various high-availability hardware, binary compatibility among the families, and so on, and you have even more permutations. Add other exotics such as HP’s workstations running a different variant of HP-UX than their servers… and you have a real mess.”
    “It is a fact of life that application vendors have first, second, and third tier platforms for the software they write. The differences in bugs, performance, and features/revision levels need to be considered when choosing the HARDWARE platform. Also the same application can run quite differently on different platforms, even though they [the platforms] have the same speeds & feeds on performance. That is because optimization levels differ, and sometimes hardware vendors assist software vendors in tuning applications for their particular environments.”
    “Consider this: Comparing the performance of Windows 95 on a Compaq and a Dell, both with the same Pentium II, speeds, and disk drives.”
    “I’m now running the same O/S, and the only difference is the architecture of the motherboard. If you pick up any PC rag, then you’ll see how Excel test #1 runs 10-15 percent faster on one machine, while that same machine might run another Excel test slower. Compare that to the performance of Lotus, and you will see different results.”
    “If you just want to do a comparison in terms of O/S features, fine. Don’t however, expect to be able to determine how well applications actually RUN. Too many variables. You’ll embarrass yourself. I charge $2,000 a day just enhancing I/O performance for a given system, application, database, and disk layout. I can’t even generalize on that!” —David A. Lethe^e70

“Dysfunctional equivalent
    “Microsoft promoted Windows NT as if it was essentially technologically equivalent to Unix. In some cases, Microsoft promoted Windows NT on Intel as superior to Unix/RISC machines. If this wasn’t an embarrassing mistake, it should have been.
    “In a March 1996 interview with InfoWorld, Bill Gates said, ‘Compare the performance: Buy the most expensive Sun [Microsystems Inc.] box you can and compare its Web performance to an inexpensive Windows NT box. Let’s not joke around: Pentium Pro processors have more performance than the RISC community is putting out. I’m not talking about price/performance; I’m talking about performance in the absolute.’
    “This bravado may sound convincing to customers lacking experience with Unix/RISC platforms. But everyone else understands this to be nonsense. Surely no one in his right mind would expect a cheap Windows NT box to outperform the most expensive Sun box (currently the Sun Microsystems Enterprise 10000 Server with 64 UltraSPARC processors).
    “One need not perform a lab comparison to extrapolate the results. An InfoWorld Web server benchmark showed that a dual Pentium Pro machine running Windows NT pegged the CPU usage while a single-processor Sun SPARC workstation barely broke a sweat.
    “The problem is with both the software and the hardware. Years ago, PC Week demonstrated that OS/2 and NetWare could outperform Windows NT on a single-processor machine, even when NT was given more processors to work with. InfoWorld’s internal unpublished testing once demonstrated that Windows NT and SQL Server consistently crashed under high-stress loads while running on four Pentium Pro processors. Yet a single-processor IBM AS/400 machine running OS/400 and DB2 hummed along under the same load. (As one might expect, the single-processor AS/400 ran slower than the quad-processor NT box at lighter loads.)
    “Thus, despite the Gates bravado, Windows NT on any platform cannot yet compete with the high-end Unix/RISC machines.” —Nicholas Petreley, “The new Unix alters NT’s orbit”, NC World^w74

“Linux, FreeBSD, and BSDI Unix outperform Windows NT by a wide margin on limited hardware, and under some circumstances can perform as well or better than NT on the best hardware.” —Nicholas Petreley, “The new Unix alters NT’s orbit”, NC World^w74

hardware considerations

One of the most important determinants of the performance of a system is the underlying hardware. The ability of compilers to optimize programs for the hardware and the ability of an operating system to balance for hardware configurations are major factors in performance, as these determine how well the hardware is utilized.

The most important factors in determining hardware speed relate to the basic computer architecture. The three most important factors are clock speed, path length, and cycles per instruction. Most advertisements concentrate only on clock speed, which by itself is a meaningless number (unless comparing two versions of the exact same processor). As a comparison, clock speed can be thought of like a car engine’s RPMs. Just because a Porsche and a Honda are both running at 4,000 RPM does not mean that they are travelling at the same speed.

Clock rate or cycle time is limited mostly by chip technology. Clock speed is how many cycles occur in a second (higher is better). Cycle time is the amount of time it takes to perform one cycle (lower is better), and is the inverse of clock speed.

Path length is the number of machine instructions required to perform one operation (shorter is better). This is determined by the instruction set, addressing modes, and register set. If a processor has very few registers, extra time is required to shuffle data between memory and registers to make space for the next operation. If a processor is orthogonal (or regular), then operations can be performed in any register. If a processor is irregular, then processing is slowed down while data is continually shuffled to the correct special purpose register (such as with the Intel 80x86/Pentium family of processors). If a CISC (complex instruction set computer) has special instructions for complex tasks, then it will be able to perform those tasks in a single instruction. If a problem can be vectorized, then super computers (such as a Cray Y-MP) or processors with special vector units (such as a PowerPC with AltiVec) will improve performance. Also, the ability of a compiler to take advantage of special instructions and architecture plays an important role in determining performance, as it does not matter how good the underlying architecture is if the compiler ignores its capabilities or generates inefficient code.

Cycles per instruction is the number of clock cycles needed to run one machine instruction (lower is better). In a RISC (reduced instruction set computer), the number of cycles per instruction is low (one or fewer). In a CISC, the number of cycles per instruction is high (more than one) and variable (different instructions take different amounts of time). Modern processors use various techniques to improve the cycles per instruction. One approach is superscalar architecture. In superscalar processors, there are multiple scalar units in a single processor, and each can be working on a different task simultaneously. A common modern approach is pipelining and superpipelining. Pipelining works like an assembly line, with various parts of the processing broken up into small sequential tasks, with instructions and data moved through the pipeline in an orderly manner. Once a pipeline is filled, the number of cycles per instruction drops dramatically as many operations are done simultaneously. Another approach is vectorization, although that tends to only be useful for certain kinds of scientific processing. Other approaches that can be beneficial are distributed-memory, massively parallel, and clustered systems.
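The assembly-line effect of pipelining can be sketched with a little arithmetic. The following is a minimal illustration, not a model of any real processor: it assumes an idealized pipeline with no stalls or branch penalties, and the pipeline depth of 5 stages is a hypothetical number chosen only for the example.

```python
def cycles_unpipelined(instructions: int, stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def cycles_pipelined(instructions: int, stages: int) -> int:
    """The first instruction needs `stages` cycles to fill the pipeline;
    after that, one instruction completes per cycle (idealized: no stalls)."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


def cycles_per_instruction(total_cycles: int, instructions: int) -> float:
    """Average number of clock cycles spent per instruction."""
    return total_cycles / instructions


if __name__ == "__main__":
    stages = 5  # hypothetical pipeline depth, for illustration only
    for n in (1, 10, 1000):
        plain = cycles_per_instruction(cycles_unpipelined(n, stages), n)
        piped = cycles_per_instruction(cycles_pipelined(n, stages), n)
        print(f"{n:>5} instructions: unpipelined CPI = {plain:.3f}, "
              f"pipelined CPI = {piped:.3f}")
```

Under these assumptions a single instruction gains nothing from the pipeline, but over a long run of instructions the average approaches one cycle per instruction, which is the "filled pipeline" case described above.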

The cycles per instruction can vary greatly depending on what kind of work is being done. The most basic design decision regards integer or floating point math. For business applications, integer math should be optimized. For scientific and advanced graphics applications, floating point math should be emphasized. The fastest super computers rely on vectorization, but if a programming task can’t be vectorized, then the super computer may be slower than a mainframe or even a fast workstation.

The basic performance of a processor can be quantified as cycle time times path length times cycles per instruction (lower number is better).
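The product above can be worked through for two made-up processors to show why clock speed alone is a poor guide. This is only a sketch: the `Processor` class and every figure in it are hypothetical, chosen for illustration, and real path lengths and cycles per instruction vary with the workload.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    name: str
    clock_hz: float          # clock speed, in cycles per second
    path_length: float       # machine instructions needed per operation
    cycles_per_instr: float  # average clock cycles per machine instruction

    @property
    def cycle_time(self) -> float:
        """Seconds per cycle: the inverse of clock speed."""
        return 1.0 / self.clock_hz

    def time_per_operation(self) -> float:
        """cycle time x path length x cycles per instruction (lower is better)."""
        return self.cycle_time * self.path_length * self.cycles_per_instr


if __name__ == "__main__":
    # Hypothetical figures, not measurements of any real chip.
    fast_clock = Processor("fast clock", clock_hz=500e6, path_length=4, cycles_per_instr=3)
    slow_clock = Processor("slow clock", clock_hz=300e6, path_length=5, cycles_per_instr=1)
    for cpu in (fast_clock, slow_clock):
        nanoseconds = cpu.time_per_operation() * 1e9
        print(f"{cpu.name}: {nanoseconds:.2f} ns per operation")
```

With these invented numbers the processor with the slower clock finishes each operation sooner, because it spends fewer total cycles per operation.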

Other parts of a system can also greatly affect overall performance. Memory access times can be a major factor in system speed. Internal caches can be used to cut down access time. I/O subsystems (including external storage) can greatly affect overall performance. If an application or system is limited primarily by CPU speed, then it is called “CPU-bound”. If an application or system is limited primarily by I/O speed, then it is called “I/O-bound”. Scientific and graphics programs tend to be CPU-bound, while business programs tend to be I/O-bound.
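One rough way to tell the two cases apart is to compare the CPU time a job consumed with the wall-clock time it took. The sketch below is an assumption-laden rule of thumb rather than a standard definition: the function name `classify_workload` and the 50% threshold are hypothetical, it assumes a single-threaded job, and it treats all time not spent on the CPU as I/O wait, which is not always true.

```python
def classify_workload(cpu_seconds: float, wall_seconds: float,
                      threshold: float = 0.5) -> str:
    """Label a single-threaded job by the share of elapsed time spent on the CPU.

    `threshold` is an arbitrary cut-off chosen for illustration.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    cpu_fraction = cpu_seconds / wall_seconds
    return "CPU-bound" if cpu_fraction >= threshold else "I/O-bound"


if __name__ == "__main__":
    # Hypothetical timings: a rendering job and a database report.
    print(classify_workload(cpu_seconds=9.0, wall_seconds=10.0))
    print(classify_workload(cpu_seconds=1.0, wall_seconds=10.0))
```

The timings could come from a tool such as the Unix `time` command, which reports user, system, and elapsed time separately.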

hard drives

    “I’ve harped on the hard drive bottleneck which is stuck at a rate that hovers around a meager 25 megabytes per second (or about 200 Mbps). That’s about one-fifth the speed of a Gigabit Ethernet connection. And that’s the speed you’ll get if you are lucky. The next bottleneck comes from Microsoft Windows itself. Last I checked, the best speed achievable by the Windows NT kernel topped out at about 450 Mbps, making Gigabit Ethernet a waste of time for a small office. This amounts to a scene that just as easily can be served by 500-MHz processors running on the cheap memory you can buy at Costco.
    “RAID Can Save the Day Dept.: Last summer [2000] at Computex, Intel showed its BNU31—fashioned after the Ultra 160 specification—which the company called ‘an affordable RAID controller for servers.’ To Intel, affordable means $600. I suppose it was affordable compared with the $2,100 Mylex RAID controller from six months earlier [2000]. These controllers claim a 200-Mbps sustained data rate under optimal conditions, which of course are unlikely in the real world. But with eight drives, I suppose you get close to that.
    “Adaptec and others soon introduced sub-$500 controllers meeting the Ultra 160 specifications. Since then, HighPoint Technologies introduced its one-chip ATA/100 RAID controller for cheap IDE drives, which is now being bundled on motherboards to provide RAID 0, RAID 1, and RAID 1+0 support. This seems to be the cheapest way to enhance system performance. Large hard drives have become so cheap—$10 per gigabyte and falling—that buying multiple drives to improve performance makes some sense.” —John C. Dvorak, PC Magazine, July 2001^m5
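The unit conversions in the quote above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes 8 bits per byte, decimal megabits, and a nominal Gigabit Ethernet rate of 1,000 Mbps; it ignores protocol overhead, and the helper names are made up for the example.

```python
BITS_PER_BYTE = 8
GIGABIT_ETHERNET_MBPS = 1000  # nominal line rate, ignoring protocol overhead


def megabytes_to_megabits(mb_per_second: float) -> float:
    """Convert a rate in megabytes per second to megabits per second."""
    return mb_per_second * BITS_PER_BYTE


def megabits_to_megabytes(mbps: float) -> float:
    """Convert a rate in megabits per second to megabytes per second."""
    return mbps / BITS_PER_BYTE


if __name__ == "__main__":
    drive_mbps = megabytes_to_megabits(25)  # the hard drive rate quoted above
    print(f"drive: {drive_mbps:.0f} Mbps, "
          f"{drive_mbps / GIGABIT_ETHERNET_MBPS:.0%} of Gigabit Ethernet")
    # The quoted Windows NT kernel ceiling of 450 Mbps, expressed in bytes.
    print(f"NT kernel ceiling: {megabits_to_megabytes(450):.2f} megabytes per second")
```

On these figures a single drive fills about one-fifth of a Gigabit link, which matches the quote.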

servers

“Technology matters. You’ve got to have a fast, efficient Website that delivers what customers need in a matter of a few seconds. (If it takes longer than about eight seconds for a Web page to load, chances are that twitchy surfers will click away without waiting to see those fancy graphics.) Behind the scenes you need the right database structure and analytical tools to support e-commerce activities, along with the ability to handle customer inquiries interactively in the 24-hour-a-day, seven-day-a-week world of the Web.” —Fortune Technology Guide^m3

“UNIX is a mature, technically superior group of operating systems with a proven track record for performance, reliability, and security in a server environment. The almost thirty years of continual development, performed often by volunteers who believe in what they’re doing, has produced a group of operating systems—and extremely powerful multiprocessor server hardware tailor-made to its needs, whose performance is still unparalleled by Intel hardware—that not only meets the demands of today’s computing needs, but in many cases exceeds them.” —Microsoft Windows NT Server 4.0 versus UNIX ^w51

benchmarks and tests

As pointed out in the above quote, benchmarks and comparison tests can be very misleading. Benchmarks are artificial in nature. In addition to often being unrepresentative of real work, they are subject to politics and other forms of tampering in their creation and subject to subversion through specific “enhancements” to either software or hardware designed to artificially inflate performance on selected benchmarks. Comparison tests are also undependable. Not only can there be great variance in the quality of the implementation of the same program on different platforms, but the test can be selected with advance knowledge of the probable results, biasing the test to produce the desired answers. Notice that just about every manufacturer can tout various benchmarks and comparison tests that show they are the “fastest”.

“Windows NT has a dangerous driver model because it is willing to sacrifice stability for speed in an attempt to win benchmarks against competing operating systems.” —Nicholas Petreley, “The new Unix alters NT’s orbit”^w74

With those caveats on the flaws of benchmarking and comparison tests, the following results are available for your examination:

CPU Speed Tests: Comparing Amiga, IRIX, Macintosh, MS-DOS, ULTRIX, Windows 95, and Windows NT using PovRay to render the scene fish13.pov.

Performance information about Digital products: www.compaq.com/alphaserver/download/alphaserver_gs_benchmark_performance_v2.pdf

Windows NT Registry

Windows NT uses a central registry database for keeping track of key configuration information on the operating system and all the application programs. According to computer engineer Bob Canup (chief design engineer at two different computer manufacturers, including the design that briefly held the record for fastest microcomputer), the existence of the registry is a fundamental design flaw that compromises speed:

    “The feature for which Microsoft has received the greatest critical praise in the computer press is the registry: a built in centralized database of settings for the operating system and application programs. However, as Microsoft discovered, there is one place where a centralized database is not the correct way to do things: at the heart of an operating system.
    A Unix system consults the files in the /etc directory only when a service or program requests the information they contain. NT however must scan the registry before launching any program or service — lest the registry have some settings which are critical to the operation of the program or service. As a result Unix spends very little time doing anything having to do with the /etc directory. NT on the other hand spends more and more time searching and updating the registry as the system ages and grows.
    If you wish to see how dramatic the impact on performance the existence of the registry is install a fresh copy of NT on a system and then add Computer Associates’ Unicenter TNG Framework. Even if you never run the TNG framework every other program on the system will be slowed down because of the data added to the registry. This is not because there is anything wrong with Unicenter TNG, it is because it, quite properly, uses the registry exactly as the Microsoft specifications require. I know of no other operating system where simply adding a program to the system slows down the operation of every other program on the system, even if the added program is never run.
    An NT system starts off its fastest the day it is installed and slows down the more it is used. Adding Visual Basic 6.0 professional to my NT system (a process which expanded the registry by 5 megabytes) caused my 266 MHZ K6-II to start performing about like a 386-20.
    If performance were the only thing which the existence of the registry impacted, that would be bad enough, but there is another much more serious problem caused by the existence of the registry. In a Unix system if you happen to corrupt one of the files in the /etc directory you might lose the service to which the file refers. In an NT system if you corrupt the registry, you lose the entire operating system.
    The end result is that in real world, day to day, use NT is much slower and more fragile than a Unix system.” —Bob Canup^e87

See more information on the registry at Windows NT Registry Flaws.

hardware acceleration

“I took it upon myself to substitute the machine’s [Dual Processor PowerMac G4 500Mhz] standard ATI Rage 128 Pro AGP 16-MB unit with the optional -$100 extra- ATI AGP Radeon 32-MB card. Hooh-hah! With the Radeon card, this sucker screamed through Quake faster than Al Gore chasing Florida voters.” —“X a perfect X”, Open (a Linux e-business magazine)^m4


A web site on dozens of operating systems simply can’t be maintained by one person. This is a cooperative effort. If you spot an error in fact, grammar, syntax, or spelling, or a broken link, or have additional information, commentary, or constructive criticism, please e-mail Milo. If you have any extra copies of docs, manuals, or other materials that can assist in accuracy and completeness, please send them to Milo, PO Box 1361, Tustin, CA, USA, 92781.
