### stub section

This subchapter is a stub section. It will be filled in with instructional material later. For now it serves as a placeholder for the order of instruction.

Professors are invited to give feedback on both the proposed contents and the proposed order of this textbook. Send commentary to Milo, PO Box 1361, Tustin, California, 92781, USA.

# integer data type

The computer integer data type is based on the mathematical concept of an integer.

Most modern computers store integers as binary integers. Some early computers used decimal integers and many modern CISCs still provide limited support for binary coded decimals.

### integer type

Most programming languages have an integer type. This is a computer representation of the mathematical integers (counting numbers, zero, and negative integers).

Unlike mathematical integers, computer integers have a limited range, bounded by a maximum (largest positive) and a minimum (most negative) number.

Note that negative integers are indicated with a negative sign (such as -3), while positive integers are indicated by the lack of a sign (such as 3).

Unlike in ordinary written numbers, the commas are left out when writing numbers in a computer program. The number 1,000,000 (one million) is written 1000000. Adding the commas will confuse your compiler.

The following material is from the unclassified *Computer Programming Manual for the JOVIAL (J73) Language*, RADC-TR-81-143, Final Technical Report of June 1981.

`The kinds of values provided by JOVIAL reflect the applications`
`of the language; they are oriented toward engineering and control`
`programming rather than, for example, commercial and business`
`programming. The JOVIAL values are:`
`1. `**Integer values**, which are signed or unsigned whole
` numbers. They are used for counting. For example, an`
` integer can be used to count the number of times a loop`
` is repeated or the number of checks performed on a`
` process.`
Chapter 1 INTRODUCTION, page 2

In ALGOL 68 the integer mode is declared with the reserved word **int**.

**int** *FinalAverage*;

In C the integer type is declared with the reserved word **int**.

`int age;`

## Stanford C essentials

**Stanford CS Education Library** This [the following section until marked as end of Stanford University items] is document #101, Essential C, in the Stanford CS Education Library. This and other educational materials are available for free at http://cslibrary.stanford.edu/. This article is free to be used, reproduced, excerpted, retransmitted, or sold so long as this notice is clearly reproduced at its beginning. Copyright 1996-2003, Nick Parlante, nick.parlante@cs.stanford.edu.

### Integer Types

The “integral” types in C form a family of integer types. They all behave like integers and can be mixed together and used in similar ways. The differences are due to the different number of bits (“widths”) used to implement each type -- the wider types can store a greater range of values.

`char` - ASCII character -- at least 8 bits. Pronounced “car”. As a practical matter `char` is basically always a byte which is 8 bits which is enough to store a single ASCII character. 8 bits provides a signed range of -128..127 or an unsigned range of 0..255. `char` is also required to be the “smallest addressable unit” for the machine -- each byte in memory has its own address.

`short` - Small integer -- at least 16 bits which provides a signed range of -32768..32767. Typical size is 16 bits. Not used so much.

`int` - Default integer -- at least 16 bits, with 32 bits being typical. Defined to be the “most comfortable” size for the computer. If you do not really care about the range for an integer variable, declare it `int` since that is likely to be an appropriate size (16 or 32 bit) which works well for that machine.

`long` - Large integer -- at least 32 bits. Typical size is 32 bits which gives a signed range of about -2 billion ..+2 billion. Some compilers support “long long” for 64 bit ints.

The integer types can be preceded by the qualifier `unsigned` which disallows representing negative numbers, but doubles the largest positive number representable. For example, a 16 bit implementation of `short` can store numbers in the range -32768..32767, while `unsigned short` can store 0..65535. You can think of pointers as being a form of unsigned long on a machine with 4 byte pointers. In my opinion, it’s best to avoid using `unsigned` unless you really need to. It tends to cause more misunderstandings and problems than it is worth.

### Extra: Portability Problems

Instead of defining the exact sizes of the integer types, C defines lower bounds. This makes it easier to implement C compilers on a wide range of hardware. Unfortunately it occasionally leads to bugs where a program runs differently on a 16-bit-int machine than it runs on a 32-bit-int machine. In particular, if you are designing a function that will be implemented on several different machines, it is a good idea to use typedefs to set up types like `Int32` for 32 bit int and `Int16` for 16 bit int. That way you can prototype a function `Foo(Int32)` and be confident that the typedefs for each machine will be set so that the function really takes exactly a 32 bit int. That way the code will behave the same on all the different machines.

### int Constants

Numbers in the source code such as 234 default to type `int`. They may be followed by an ‘L’ (upper or lower case) to designate that the constant should be a `long` such as 42L. An integer constant can be written with a leading 0x to indicate that it is expressed in hexadecimal -- `0x10` is a way of expressing the number 16. Similarly, a constant may be written in octal by preceding it with “0” -- `012` is a way of expressing the number 10.

### Type Combination and Promotion

The integral types may be mixed together in arithmetic expressions since they are all basically just integers with variation in their width. For example, `char` and `int` can be combined in arithmetic expressions such as (`'b' + 5`). How does the compiler deal with the different widths present in such an expression? In such a case, the compiler “promotes” the smaller type (`char`) to be the same size as the larger type (`int`) before combining the values. Promotions are determined at compile time based purely on the **types** of the values in the expressions. Promotions do not lose information -- they always convert from a type to a compatible, larger type to avoid losing information.

### Pitfall -- int Overflow

I once had a piece of code which tried to compute the number of bytes in a buffer with the expression (`k * 1024`) where `k` was an `int` representing the number of kilobytes I wanted. Unfortunately this was on a machine where `int` happened to be 16 bits. Since `k` and `1024` were both `int`, there was no promotion. For values of `k >= 32`, the product was too big to fit in the 16 bit int resulting in an overflow. The compiler can do whatever it wants in overflow situations -- typically the high order bits just vanish. One way to fix the code was to rewrite it as (`k * 1024L`) -- the long constant forced the promotion of the `int`. This was not a fun bug to track down -- the expression sure looked reasonable in the source code. Only stepping past the key line in the debugger showed the overflow problem. “Professional Programmer’s Language.” This example also demonstrates the way that C only promotes based on the **types** in an expression. The compiler does not consider the values 32 or 1024 to realize that the operation will overflow (in general, the values don’t exist until run time anyway). The compiler just looks at the compile time types, `int` and `int` in this case, and thinks everything is fine.

### end of Stanford C essentials

In Pascal the integer type is declared with the reserved word **integer**.

**var** *Age: Integer*;

“31 Every object in the language has a type, which characterizes a set of values and a set of applicable operations. The main classes of types are elementary types (comprising enumeration, numeric, and access types) and composite types (including array and record types).” (Ada-Europe’s Ada Reference Manual, Introduction: Language Summary)

“33 Numeric types provide a means of performing exact or approximate numerical computations. Exact computations use **integer types**, which denote sets of consecutive integers. Approximate computations use either fixed point types, with absolute bounds on the error, or floating point types, with relative bounds on the error. The numeric types Integer, Float, and Duration are predefined.” (Ada-Europe’s Ada Reference Manual, Introduction: Language Summary)

Ruby has no primitive data types that stand apart from objects. Every value, including an integer, is an object, because Ruby is a thoroughly object-oriented programming language.

Ruby’s base class for numbers is `Numeric`.

Ruby’s numeric class `Fixnum` holds integers. They are stored as fixed length numbers whose bit length is the underlying native machine word minus one. (Since Ruby 2.4, `Fixnum` and `Bignum` are unified into the single class `Integer`.)

Ruby also has a class `Bignum` for storing multiple precision numbers too large for native machine representation. Numbers are automatically converted from `Fixnum` to `Bignum` whenever a result is too large for storage in `Fixnum`. The only limit on the size of a `Bignum` is the amount of memory made available by the operating system.

The following material is from the unclassified *Computer Programming Manual for the JOVIAL (J73) Language*, RADC-TR-81-143, Final Technical Report of June 1981.

`1.1.2 `**Storage**

`When a JOVIAL program is executed, each value it operates on is`
`stored as an `**item**. The item has a **name**, which is **declared** and
`then used in the program when the value of the item is fetched or`
`modified.`

`An item is declared by a JOVIAL statement called a `**declaration**
**statement**. The declaration provides the compiler with the
`information it needs to allocate and access the storage for the`
`item. Here is a statement that declares an integer item:`

` ITEM COUNT U 10;`

`This declaration says that the value of COUNT is an integer that`
`is stored without a sign in ten or more bits. The notation is`
`compact: "U" means it is an unsigned integer, "10" means it`
`requires at least 10 bits. We say "at least" ten bits because`
`the JOVIAL compiler may allocate more than ten bits. (That`
`allocation wastes a little data space, but can result in faster,`
`more compact code.)`

Chapter 1 INTRODUCTION, page 3

`JOVIAL does not require that you give the number of bits in the`
`declaration of an integer item. If you omit it, JOVIAL supplies`
`a default value that depends on which implementation of JOVIAL`
`you are using. An example is:`

` ITEM TIME S;`

`This statement declares TIME to be the name of an integer`
`variable item that is signed and has the default number of bits.`
`On one implementation of JOVIAL, this would be equivalent to the`
`declaration:`

` ITEM TIME S 15;`

`The item TIME occupies 16 bits (including the sign). On another`
`implementation, it would be equivalent to:`

` ITEM TIME S 31;`

`This and other defaults are defined in the user's manual for the`
`implementation of JOVIAL you are using.`

`In this brief introduction, we cannot consider each kind of item`
`in detail (as we just did for integer items). Instead, a list of`
`examples follow, one declaration for each kind of value.`

` ITEM SIGNAL S 2; A `**signed integer item**, which occupies
` at least three bits and accommodates`
` values from -3 to +3.`

Chapter 1 INTRODUCTION, page 4

### assembly language instructions

### number systems

**Binary** is a number system using only ones and zeros (or two states).

**Decimal** is a number system based on ten digits (including zero).

**Hexadecimal** is a number system based on sixteen digits (including zero).

**Octal** is a number system based on eight digits (including zero).

**Duodecimal** is a number system based on twelve digits (including zero).

| binary | octal | decimal | duodecimal | hexadecimal |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 1 |
| 10 | 2 | 2 | 2 | 2 |
| 11 | 3 | 3 | 3 | 3 |
| 100 | 4 | 4 | 4 | 4 |
| 101 | 5 | 5 | 5 | 5 |
| 110 | 6 | 6 | 6 | 6 |
| 111 | 7 | 7 | 7 | 7 |
| 1000 | 10 | 8 | 8 | 8 |
| 1001 | 11 | 9 | 9 | 9 |
| 1010 | 12 | 10 | A | A |
| 1011 | 13 | 11 | B | B |
| 1100 | 14 | 12 | 10 | C |
| 1101 | 15 | 13 | 11 | D |
| 1110 | 16 | 14 | 12 | E |
| 1111 | 17 | 15 | 13 | F |
| 10000 | 20 | 16 | 14 | 10 |
| 10001 | 21 | 17 | 15 | 11 |
| 10010 | 22 | 18 | 16 | 12 |
| 10011 | 23 | 19 | 17 | 13 |
| 10100 | 24 | 20 | 18 | 14 |
| 10101 | 25 | 21 | 19 | 15 |
| 10110 | 26 | 22 | 1A | 16 |
| 10111 | 27 | 23 | 1B | 17 |
| 11000 | 30 | 24 | 20 | 18 |

### integer representations

**Sign-magnitude** is the simplest method for representing signed binary numbers. One bit (by universal convention, the highest order or leftmost bit) is the sign bit, indicating positive or negative, and the remaining bits are the absolute value of the binary integer. Sign-magnitude is simple for representing binary numbers, but it has two drawbacks: there are two different zeros, and the hardware for addition, subtraction, and any binary integer operation other than negation (which only requires a sign bit change) is much more complicated and therefore slower.

In **one’s complement** representation, positive numbers are represented in the “normal” manner (same as unsigned integers with a zero sign bit), while negative numbers are represented by complementing all of the bits of the absolute value of the number. Numbers are negated by complementing all bits. Addition of two integers is performed by treating the numbers as unsigned integers (the sign bit is added like any other bit), with a carry out of the leftmost bit position being added to the least significant bit (technically, the carry bit is always added to the least significant bit, but when it is zero, the add has no effect). The ripple effect of adding the carry bit can almost double the time to do an addition. And there are still two zeros, a positive zero (all zero bits) and a negative zero (all one bits).

In **two’s complement** representation, positive numbers are represented in the “normal” manner (same as unsigned integers with a zero sign bit), while negative numbers are represented by complementing all of the bits of the absolute value of the number and adding one. Negation of any number, positive or negative, is accomplished the same way: complement all of the bits and add one. Addition is performed by adding the two numbers as unsigned integers and ignoring the carry. Two’s complement has the further advantage that there is only one zero (all zero bits). Two’s complement representation does result in one more negative number (a one in the sign bit followed by all zero bits) than positive numbers.

Two’s complement is used in the great majority of binary computers, and in essentially all modern ones. Most processors therefore have one more negative number than positive numbers. Some processors use the “extra” negative number (a one in the sign bit followed by all zero bits) as a special indicator, depicting invalid results, not a number (NaN), or other special codes.

In **unsigned** representation, only zero and positive numbers are represented. Instead of the high order bit being interpreted as the sign of the integer, the high order bit is part of the number. The largest unsigned number is about twice the largest signed number (in any representation) of the same number of bits.

| *bit pattern* | *sign-mag.* | *one’s comp.* | *two’s comp.* | *unsigned* |
|---|---|---|---|---|
| 000 | 0 | 0 | 0 | 0 |
| 001 | 1 | 1 | 1 | 1 |
| 010 | 2 | 2 | 2 | 2 |
| 011 | 3 | 3 | 3 | 3 |
| 100 | -0 | -3 | -4 | 4 |
| 101 | -1 | -2 | -3 | 5 |
| 110 | -2 | -1 | -2 | 6 |
| 111 | -3 | -0 | -1 | 7 |

See also Data Representation in Assembly Language

### accumulators

**Accumulators** are registers that can be used for arithmetic, logical, shift, rotate, or other similar operations. The first computers typically only had one accumulator. Many times there were related special purpose registers that contained the source data for an accumulator. Accumulators were replaced with data registers and general purpose registers. Accumulators reappeared in the first microprocessors.

**Intel 8086/80286:** one word (16 bit) accumulator; named AX (high order byte of the AX register is named AH and low order byte of the AX register is named AL)
**Intel 80386:** one doubleword (32 bit) accumulator; named EAX (the low order word keeps the Intel 8086 and 80286 accumulator name AX, and the high order and low order bytes of that word keep the names AH and AL)
**MIX:** one accumulator; named A-register; five bytes plus sign

### data registers

**Data registers** are used for temporary scratch storage of data, as well as for data manipulations (arithmetic, logic, etc.). In some processors, all data registers act in the same manner, while in other processors different operations are performed on specific registers.

**MIX:** one extension register; named X-register; five bytes plus sign; can be concatenated on the right hand side of the A-register (accumulator)
**Motorola 680x0, 68300:** 8 longword (32 bit) data registers; named D0, D1, D2, D3, D4, D5, D6, and D7

### general purpose registers

**General purpose registers** can be used as either data or address registers.

**DEC VAX:** 16 longword (32 bit) general purpose registers; named R0 through R15
**IBM 360/370:** 16 full word (32 bit) general purpose registers; named 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A (or 10), B (or 11), C (or 12), D (or 13), E (or 14), and F (or 15)
**Intel 8086/80286:** 8 word (16 bit) general purpose registers; named AX, BX, CX, DX, BP, SP, SI, and DI (high order bytes of the AX, BX, CX, and DX registers have the names AH, BH, CH, and DH and low order bytes of the AX, BX, CX, and DX registers have the names AL, BL, CL, and DL)
**Intel 80386:** 8 doubleword (32 bit) general purpose registers; named EAX, EBX, ECX, EDX, EBP, ESP, ESI, and EDI (low order words use the same names as the general purpose registers on the Intel 8086 and 80286 and low order and high order bytes of the low order words of four of the registers use the same names as the general purpose registers on the Intel 8086 and 80286)
**Motorola 88100:** 32 word (32 bit) general purpose registers; named r0 through r31

### constant registers

**Constant registers** are special read-only registers that store a constant. Attempts to write to a constant register are illegal or ignored. In some RISC processors, constant registers are used to store commonly used values (such as zero, one, or negative one). For example, a constant register containing zero can be used in register to register data moves, providing the equivalent of a clear instruction without adding one to the instruction set. Constant registers are also often used in floating point units to provide such values as pi or e with additional hidden bits for greater accuracy in computations.

**Motorola 88100:** r0 (general purpose register 0) contains the constant 32 bit integer zero

See also Registers

Names and logos of various OSs are trademarks of their respective owners.