music
OSdata.com: assembly language 

OSdata.com

Assembly Language

data representation

summary

    This web page examines data representation and number systems for assembly languages.

free computer programming text book project

If you like the idea of this project,
then please donate some money.

more information on donating

Google

Now building a For those with high speed connections, the very large single file summary is still on line.

data representation

    Most data structures are abstract structures and are implemented by the programmer with a series of assembly language instructions. Many cardinal data types (bits, bit strings, bit slices, binary integers, binary floating point numbers, binary encoded decimals, binary addresses, characters, etc.) are implemented directly in hardware for at least parts of the instruction set. Some processors also implement some data structures in hardware for some instructions — for example, most processors have a few instructions for directly manipulating character strings.

    An assembly language programmer has to know how the hardware implements these cardinal data types. Some examples: Two basic issues are bit ordering (big endian or little endian) and number of bits (or bytes). The assembly language programmer must also pay attention to word length and optimum (or required) addressing boundaries. Composite data types will also include details of hardware implementation, such as how many bits of mantissa, characteristic, and sign, as well as their order. In many cases there are machine specific encodings for some data types, as well as choice of character codes (such as ASCII or EBCDIC) for character and string implementations.

data size

    The basic building block is the bit, which can contain a single piece of binary data (true/false, zero/one, north/south, positive/negative, high/low, etc.).

    Bits are organized into larger groupings to store values encoded in binary bits. The most basic grouping is the byte. A byte is the smallest normally addressable quantum of main memory (which can be different than the minimum amount of memory fetched at one time). In modern computers this is almost always an eight bit byte, so much so that many skilled programmers believe that a byte is defined as being always eight bits. In the past there have been computers with seven, eight, twelve, and sixteen bits. There have also been bit slice computers where the common memory addressing approach is by single bit; in these kinds of computers the term byte actually has no meaning, although eight bits on these computers are likely to be called a byte. Throughout the rest of this discussion, assume the standard eight bit byte applies unless specifically stated otherwise.

    A nibble is half a byte, or four bits.

    A word is the default data size for a processor. The default size does not apply in all cases. The word size is chosen by the processor’s designer(s) and reflects some basic hardware issues (such as internal or external buses). The most common word sizes are 16 and 32, but words have ranged from 16 to 60 bits. Typically there will be additional data sizes that are defined relative to the size of a word: halfword, half the size of a word; longword, usually double the size of a word; doubleword, usually double the size of a word (sometimes double the size of a longword); and quadword, four times the size of a word. Whether or not there is a space between the size designation and “word” is designated by the manufacturer, and varies by processor.

    Some processors require that data be aligned. That is, two byte quantities must start on byte addresses that are multiples of two; four byte quantities must start on byte addresses that are multiples of four; etc. The general rule follows a progression of exponents of two (2, 4, 8, 16, ƒ). Some processors allow data to be unaligned, but this usually results in a slow down in performance.

endian

    Endian is the ordering of bytes in multibyte scalar data. The term comes from Jonathan Swift’s Gulliver’s Travels. For a given multibyte scalar value, big- and little-endian formats are byte-reversed mappings of each other. While processors handle endian issues invisibly when making multibyte memory accesses, knowledge of endian is vital when directly manipulating individual bytes of multibyte scalar data and when moving data across hardware platforms.

    Big endian stores scalars in their “natural order”, with most significant byte in the lowest numeric byte address. Examples of big endian processors are the IBM System 360 and 370, Motorola 680x0, Motorola 68300, and most RISC processors.

    Little endian stores scalars with the least significant byte in the lowest numeric byte address. Examples of little endian processors are the Digital VAX and Intel x86 (including Pentium).

    Bi-endian processors can run in either big endian or little endian mode under software control. An example is the Motorola/IBM PowerPC, which has two separate bits in the Machine State Register (MSR) for controlling endian: the ILE bit controls endian during interrupts and the LE bit controls endian for all other processes. Big endian is the default for the PowerPC.

number systems

    Binary is a number system using only ones and zeros (or two states).

    Decimal is a number system based on ten digits (including zero).

    Hexadecimal is a number system based on sixteen digits (including zero).

    Octal is a number system based on eight digits (including zero).

    Duodecimal is a number system based on twelve digits (including zero).

binaryoctaldecimalduodecimalhexadecimal
00000
11111
102222
113333
1004444
1015555
1106666
1117777
100010888
100111999
10101210AA
10111311BB
1100141210C
1101151311D
1110161412E
1111171513F
1000020161410
1000121171511
1001022181612
1001123191713
1010024201814
1010125211915
1011026221A16
1011127231B17
1100030242018

number representations

integer representations

    Sign-magnitude is the simplest method for representing signed binary numbers. One bit (by universal convention, the highest order or leftmost bit) is the sign bit, indicating positive or negative, and the remaining bits are the absolute value of the binary integer. Sign-magnitude is simple for representing binary numbers, but has the drawbacks of two different zeros and much more complicates (and therefore, slower) hardware for performing addition, subtraction, and any binary integer operations other than complement (which only requires a sign bit change).

    In one’s complement representation, positive numbers are represented in the “normal” manner (same as unsigned integers with a zero sign bit), while negative numbers are represented by complementing all of the bits of the absolute value of the number. Numbers are negated by complementing all bits. Addition of two integers is peformed by treating the numbers as unsigned integers (ignoring sign bit), with a carry out of the leftmost bit position being added to the least significant bit (technically, the carry bit is always added to the least significant bit, but when it is zero, the add has no effect). The ripple effect of adding the carry bit can almost double the time to do an addition. And there are still two zeros, a positive zero (all zero bits) and a negative zero (all one bits).

    In two’s complement representation, positive numbers are represented in the “normal” manner (same as unsigned integers with a zero sign bit), while negative numbers are represented by complementing all of the bits of the absolute value of the number and adding one. Negation of a negative number in two’s complement representation is accomplished by complementing all of the bits and adding one. Addition is performed by adding the two numbers as unsigned integers and ignoring the carry. Two’s complement has the further advantage that there is only one zero (all zero bits). Two’s complement representation does result in one more negative number (all one bits) than positive numbers.

    Two’s complement is used in just about every binary computer ever made. Most processors have one more negative number than positive numbers. Some processors use the “extra” neagtive number (all one bits) as a special indicator, depicting invalid results, not a number (NaN), or other special codes.

    In unsigned representation, only positive numbers are represented. Instead of the high order bit being interpretted as the sign of the integer, the high order bit is part of the number. An unsigned number has one power of two greater range than a signed number (any representation) of the same number of bits.

bit patternsign-mag.one’s comp.two’s compunsigned
0000000
0011111
0102222
0113333
100-0-3-44
101-1-2-35
110-2-1-26
111-3-0-17

floating point representations

    Floating point numbers are the computer equivalent of “scientific notation” or “engineering notation”. A floating point number consists of a fraction (binary or decimal) and an exponent (bianry or decimal). Both the fraction and the exponent each have a sign (positive or negative).

    In the past, processors tended to have proprietary floating point formats, although with the development of an IEEE standard, most modern processors use the same format. Floating point numbers are almost always binary representations, although a few early processors had (binary coded) decimal representations. Many processors (especially early mainframes and early microprocessors) did not have any hardware support for floating point numbers. Even when commonly available, it was often in an optional processing unit (such as in the IBM 360/370 series) or coprocessor (such as in the Motorola 680x0 and pre-Pentium Intel 80x86 series).

    Hardware floating point support usually consists of two sizes, called single precision (for the smaller) and double precision (for the larger). Usually the double precision format had twice as many bits as the single precision format (hence, the names single and double). Double precision floating point format offers greater range and precision, while single precision floating point format offers better space compaction and faster processing.

    F_floating format (single precision floating), DEC VAX, 32 bits, the first bit (high order bit in a register, first bit in memory) is the sign magnitude bit (one=negative, zero=positive or zero), followed by 15 bits of an excess 128 binary exponent, followed by a normalized 24-bit fraction with the redundant most significant fraction bit not represented. Zero is represented by all bits being zero (allowing the use of a longword CLR to set a F_floating number to zero). Exponent values of 1 through 255 indicate true binary exponents of -127 through 127. An exponent value of zero together with a sign of zero indicate a zero value. An exponent value of zero together with a sign bit of one is taken as reserved (which produces a reserved operand fault if used as an operand for a floating point instruction). The magnitude is an approximate range of .29*10-38 through 1.7*1038. The precision of an F_floating datum is approximately one part in 223, or approximately seven (7) decimal digits).

    32 bit floating format (single precision floating), AT&T DSP32C, 32 bits, the first bit (high order bit in a register, first bit in memory) is the sign magnitude bit (one=negative, zero=positive or zero), followed by 23 bits of a normalized two’s complement fractional part of the mantissa, followed by an eight bit exponent. The magnitude of the mantissa is always normalized to lie between 1 and 2. The floating point value with exponent equal to zero is reserved to represent the number zero (the sign and mantissa bits must also be zero; a zero exponent with a nonzero sign and/or mantissa is called a “dirty zero” and is never generated by hardware; if a dirty zero is an operand, it is treated as a zero). The range of nonzero positive floating point numbers is N = [1 * 2-127, [2-2-23] * 2127] inclusive. The range of nonzero negative floating point numbers is N = [-[1 + 2-23] * 2-127, -2 * 2127] inclusive.

    40 bit floating format (extended single precision floating), AT&T DSP32C, 40 bits, the first bit (high order bit in a register, first bit in memory) is the sign magnitude bit (one=negative, zero=positive or zero), followed by 31 bits of a normalized two’s complement fractional part of the mantissa, followed by an eight bit exponent. This is an internal format used by the floating point adder, accumulators, and certain DAU units. This format includes an additional eight guard bits to increase accuracy of intermediate results.

    D_floating format (double precision floating), DEC VAX, 64 bits, the first bit (high order bit in a register, first bit in memory) is the sign magnitude bit (one=negative, zero=positive or zero), followed by 15 bits of an excess 128 binary exponent, followed by a normalized 48-bit fraction with the redundant most significant fraction bit not represented. Zero is represented by all bits being zero (allowing the use of a quadword CLR to set a D_floating number to zero). Exponent values of 1 through 255 indicate true binary exponents of -127 through 127. An exponent value of zero together with a sign of zero indicate a zero value. An exponent value of zero together with a sign bit of one is taken as reserved (which produces a reserved operand fault if used as an operand for a floating point instruction). The magnitude is an approximate range of .29*10-38 through 1.7*1038. The precision of an D_floating datum is approximately one part in 255, or approximately 16 decimal digits).

Now building a For those with high speed connections, the very large single file summary is still on line.

free music player coding example

    Programming example: I am making heavily documented and explained open source PHP/MySQL code for a method to play music for free — almost any song, no subscription fees, no download costs, no advertisements, all completely legal. This is done by building a front-end to YouTube (which checks the copyright permissions for you).

    View music player in action: www.musicinpublic.com/.

    Create your own copy from the original source code/ (presented for learning programming). Includes how to run this from your own computer if you don’t have a web site.


OSdata.com is used in more than 300 colleges and universities around the world

Read details here.

Some or all of the material on this web page appears in the
free downloadable college text book on computer programming.


return to table of contents
free downloadable college text book

view text book
HTML file

Because I no longer have the computer and software to make PDFs, the book is available as an HTML file, which you can convert into a PDF.


    A web site on dozens of operating systems simply can’t be maintained by one person. This is a cooperative effort. If you spot an error in fact, grammar, syntax, or spelling, or a broken link, or have additional information, commentary, or constructive criticism, please e-mail Milo. If you have any extra copies of docs, manuals, or other materials that can assist in accuracy and completeness, please send them to Milo, PO Box 1361, Tustin, CA, USA, 92781.

    Click here for our privacy policy.


previous page next page
previous page next page

home page

two levels up

special topics

one level up

peer level

free computer programming text book project

Building a free downloadable text book on computer programming for university, college, community college, and high school classes in computer programming.

If you like the idea of this project,
then please donate some money.

send donations to:
Milo
PO Box 1361
Tustin, California 92781

Supporting the entire project:

    If you have a business or organization that can support the entire cost of this project, please contact Pr Ntr Kmt (my church)

more information on donating

Some or all of the material on this web page appears in the
free downloadable college text book on computer programming.


I do the news as an unpaid volunteer for KOCI 101.5 FM, Newport Beach/Costa Mesa (also available on the web)


Google

Made with Macintosh

    This web site handcrafted on Macintosh computers using Tom Bender’s Tex-Edit Plus and served using FreeBSD .

Viewable With Any Browser


    Names and logos of various OSs are trademarks of their respective owners.

    Copyright © 2000, 2001, 2002 Milo

    Created: February 14, 2001 (from asm.htm)

    Last Updated: February 15, 2002


return to table of contents
free downloadable college text book

previous page next page
previous page next page