music |
| | OSdata.com |
arrays
summary
This subchapter looks at arrays.
Arrays are a collection of data items. In older programming languages, each array element is a single elementary data item, but many modern programming languages allow individual array elements to themselves be collections of data, including being another entire array.
A classic example of the jump from a single elementary data item to an array is keeping track of a grand total (such as total calories consumed for a diet program) as a single data element and having 28, 29, 30, or 31 daily totals of calories consumed.
In this example, one could have an array of 31 possible days of a month and keep each days calories consumed stored by the day of the month. Insteaad of giving each days calory total a separate variable name (go ahead, try to come up with 31 great variable names), the programmer gives a single overall variable name to the entire array and then also indicates which particular days calory total is meant by the number of the date.
calries_consumed[day: 1..31]
no specific language
This example is a single dimension array. SIngle dimension arrays are also called vectors.
We could extend this example to also designate the month (ranging from 1 for January to 12 for December).
calries_consumed[month: 1..12, day: 1..31]
no specific language
This is an array in two dimensions. Arrays with two dimensions are also called a matrix, the plural being matrices.
While particular programming languages and compilers may have set limits, there is no limit on the number of dimensions for an array as a concept.
Another terminology you will often see is to refer to arrays by the number of dimensions: 1-, 2-, 3-, 4-, n-, where n- refers to any possible number of dimensions.
The following material is from the unclassified Computer Programming Manual for the JOVIAL (J73) Language, RADC-TR-81-143, Final Technical Report of June 1981.
The kinds of values provided by JOVIAL reflect the applications
of the language; they are oriented toward engineering and contrl
programming rather than, for example, commercial and business
programming. The JOVIAL values are:
8. Table values, which are collections of values gathered
together to form a single data object. They are used
for the constructs called "arrays" and "structures" in
other languages. For example, a table can be used to
store temperature readings taken every 10 seconds during
a given test period.
Chapter 1 INTRODUCTION, page 3
1.1.5 Built-In Functions
The JOVIAL built-in functions provide advanced, specialized
operations that are not covered by the JOVIAL operators.
LBOUND(t,d) Lower bound of d'th dimension of the table t
UBOUND(t,d) Upper bound of d'th dimension of the table t
NWSDEN(t) Number of bytes allocated to each entry of
the table t
Chapter 1 Introduction, page 9
stub section
This subchapter is a stub section. It will be filled in with instructional material later. For now it serves the purpose of a place holder for the order of instruction.
Professors are invited to give feedback on both the proposed contents and the propsed order of this text book. Send commentary to Milo, PO Box 1361, Tustin, California, 92781, USA.
arrays
This subchapter looks at arrays.
Arrays are a collection of data items. In older programming languages, each array element is a single elementary data item, but many modern programming languages allow individual array elements to themselves be collections of data, including being another entire array.
A classic example of the jump from a single elementary data item to an array is keeping track of a grand total (such as total calories consumed for a diet program) as a single data element and having 28, 29, 30, or 31 daily totals of calories consumed.
In this example, one could have an array of 31 possible days of a month and keep each days calories consumed stored by the day of the month. Insteaad of giving each days calory total a separate variable name (go ahead, try to come up with 31 great variable names), the programmer gives a single overall variable name to the entire array and then also indicates which particular days calory total is meant by the number of the date.
calries_consumed[day: 1..31]
no specific language
This example is a single dimension array. SIngle dimension arrays are also called vectors.
We could extend this example to also designate the month (ranging from 1 for January to 12 for December).
calries_consumed[month: 1..12, day: 1..31]
no specific language
This is an array in two dimensions. Arrays with two dimensions are also called a matrix, the plural being matrices.
While particular programming languages and compilers may have set limits, there is no limit on the number of dimensions for an array as a concept.
Another terminology you will often see is to refer to arrays by the number of dimensions: 1-, 2-, 3-, 4-, n-, where n- refers to any possible number of dimensions.
declaring an array
All arrays in LISP must be declared before they are used.
ARRAY(x) LISP function that takes three or more arguments of type name, type, and dimnsions, and creates an array of that name, type, and dimensions.
(ARRAY SIMPLE T 1024)
The example creates a LISP array named SIMPLE that can contain any arbitrary s-expressions. The array elements are numbered 0 to 1,023.
(ARRAY TWODIMENSIONS FIXNUM 1024 1024)
The example creates a LISP array named TWODIMENSIONS that can contain fixed point numbers. The array elements are numbered by two indices, both in the range of 0 to 1,023.
(ARRAY THREEDIMENSIONS FLONUM 1024 1024 1024)
The example creates a LISP array named THREEDIMENSIONS that can contain floating point numbers. The array elements are numbered by three indices, each in the range of 0 to 1,023.
order of single-dimensional arrays
The storage of a signle dimension array in memory is obvious: a linear sequence of memory locations (possibly with padding for instruction alignment purposes).
1 |
|
1 |
|
0x1000 |
|
1 |
3 |
|
13 |
|
0x1008 |
|
13 |
5 |
|
10,000 |
|
0x1010 |
|
10,000 |
order of multi-dimensional arrays
The storage of a signle dimension array in memory is obvious: a linear sequence of memory locations (possibly with padding for instruction alignment purposes).
When an array has two or more dimensions, it is still ultimately stored in a linear sequence of memory locations (possibly with padding for instruction alignment purposes), but the dimensions must be lined up sequentially as well.
The two major choices are either row major (rows are stored one after another) or column major (columns are stored one after another). These either group the linear representation of the array by the row or by the column.
1,1 |
|
1 |
|
0x1000 |
|
1 |
1,3 |
|
13 |
|
0x1008 |
|
13 |
2,2 |
|
10,000 |
|
0x1010 |
|
10,000 |
3,1 |
|
-25 |
|
0x1018 |
|
-25 |
3,3 |
|
1,024 |
|
0x1020 |
|
1,024 |
Ada uses a row-major order.
ALGOL 60 does not specify an order, but many ALGOL 60 compilers use row-major order.
C uses a row-major order.
C++ uses a row-major order.
COBOL uses a row-major order.
FORTRAN explicitly requires a column-major order.
Java uses a row-major order.
Matlab uses a column-major order.
Modula-2 uses a row-major order.
Pascal implicitly requires a row-major order.
PL/I uses a row-major order.
walking off the end
According to Brian Kernighan, co-creator of the AWK and AMPL programming languages and co-author of K&R C, one of the three most common errors for new programmers is walking off the start or end of an array.
If a programmer created a loop to go through each element of an array (a common programming task), it is an easy mistake to go one two many array locations. Some languages, such as C, will let you make this mistake, with potential run time crashes, while other programming languages, may have the capaibility to do range checking to prevent this error.
Regardless of language facilities, a programmer can implement his or her own checks for going past the end of an array.
Note that this mistake can also occur on a single access to one array element if the index intot he array is a variable. Again, it is wise to check to make sure your access is within the actual array before you attempt the access.
For this reason, Bjarne Stroustrup, creator of the C++ language, said, Essentially all arrays and most pointers belong deep in implementations that most programmers dont have to see. (Masterminds of Programming, Chapter One, C++, 2009)
Stanford C essentials
Stanford CS Education Library This [the following section until marked as end of Stanford University items] is document #101, Essential C, in the Stanford CS Education Library. This and other educational materials are available for free at http://cslibrary.stanford.edu/. This article is free to be used, reproduced, excerpted, retransmitted, or sold so long as this notice is clearly reproduced at its beginning. Copyright 1996-2003, Nick Parlante, nick.parlante@cs.stanford.edu.
Complex Data Types
C has the usual facilities for grouping things together to form composite types-- arrays and records (which are called structures).
Arrays
The simplest type of array in C is one which is declared and used in one place. There are more complex uses of arrays which I will address later along with pointers. The following declares an array called scores to hold 100 integers and sets the first and last elements. C arrays are always indexed from 0. So the first int in scores array is scores[0] and the last is scores[99].
int scores[100];
scores[0] = 13; // set first element
scores[99] = 42; // set last element
Its a very common error to try to refer to non-existent scores[100] element. C does not do any run time or compile time bounds checking in arrays. At run time the code will just access or mangle whatever memory it happens to hit and crash or misbehave in some unpredictable way thereafter. Professional programmers language. The convention of numbering things 0..(number of things - 1) pervades the language. To best integrate with C and other C programmers, you should use that sort of numbering in your own data structures as well.
Multidimensional Arrays
The following declares a two-dimensional 10 by 10 array of integers and sets the first and last elements to be 13.
int board [10][10];
board[0][0] = 13;
board[9][9] = 13;
The implementation of the array stores all the elements in a single contiguous block of memory. The other possible implementation would be a combination of several distinct one dimensional arrays -- thats not how C does it. In memory, the array is arranged with the elements of the rightmost index next to each other. In other words, board[1][8] comes right before board[1][9] in memory.
(highly optional efficiency point) Its typically efficient to access memory which is near other recently accessed memory. This means that the most efficient way to read through a chunk of the array is to vary the rightmost index the most frequently since that will access elements that are near each other in memory.
Array of Structs
The following declares an array named numbers which holds 1000 struct fractions.
struct fraction numbers[1000];
numbers[0].numerator = 22; /* set the 0th struct fraction */
numbers[0].denominator = 7;
Heres a general trick for unraveling C variable declarations: look at the right hand side and imagine that it is an expression. The type of that expression is the left hand side. For the above declarations, an expression which looks like the right hand side (numbers[1000], or really anything of the form numbers[...]) will be the type on the left hand side (struct fraction).
Stanford CS Education Library This [the above section] is document #101, Essential C, in the Stanford CS Education Library. This and other educational materials are available for free at http://cslibrary.stanford.edu/. This article is free to be used, reproduced, excerpted, retransmitted, or sold so long as this notice is clearly reproduced at its beginning. Copyright 1996-2003, Nick Parlante, nick.parlante@cs.stanford.edu.
end of Stanford C essentials
31 Every object in the language has a type, which characterizes a set of values and a set of applicable operations. The main classes of types are elementary types (comprising enumeration, numeric, and access types) and composite types (including array and record types). :Ada-Europes Ada Reference Manual: Introduction: Language Summary See legal information
34/2 Composite types allow definitions of structured objects with related components. The composite types in the language include arrays and records. An array is an object with indexed components of the same type. A record is an object with named components of possibly different types. Task and protected types are also forms of composite types. The array types String, Wide_String, and Wide_Wide_String are predefined. :Ada-Europes Ada Reference Manual: Introduction: Language Summary See legal information
42.1/2 The predefined standard library packages provide facilities such as string manipulation, containers of various kinds (vectors, lists, maps, etc.), mathematical functions, random number generation, and access to the execution environment. :Ada-Europes Ada Reference Manual: Introduction: Language Summary See legal information
functions
aload(x) PosScript array operator that places on the stack all of the elements of an array followed by the array itself.
ARRAY(x) LISP function that takes three or more arguments of type name, type, and dimnsions, and creates an array of that name, type, and dimensions.
DIM(x, n) Pl/I built-in function that returns the size of an array. The data item x is the name of the array to be examined. The data item n is the number of the dimension of the specififed array for which the extent is to be returned. The function is most efficient if data item n is of type FIXED BINARY (15). It is an error if the array has less than n dimensions or if n is zero or less or if the array x is not currently allocated.
LBOUND(x, n) Pl/I built-in function that returns the lower bound for the specififed dimension of an array. The data item x is the name of the array to be examined. The data item n is the number of the dimension of the specififed array for which the extent is to be returned. The function is most efficient if data item n is of type FIXED BINARY (15). It is an error if the array has less than n dimensions or if n is zero or less or if the array x is not currently allocated.
HBOUND(x, n) Pl/I built-in function that returns the upper bound for the specififed dimension of an array. The data item x is the name of the array to be examined. The data item n is the number of the dimension of the specififed array for which the extent is to be returned. The function is most efficient if data item n is of type FIXED BINARY (15). It is an error if the array has less than n dimensions or if n is zero or less or if the array x is not currently allocated.
The following code example shows how to use a DO loop to safely access all of the elements of a one dimensional array (vector) without going past the boundaries (beginning and end) of the array. This can also be used to handle the possibility that the declaration of the size of an array might change.
DO I = LBOUND(ARRAY-NAME,1) TO HBOUND(ARRAY_NAME,1);
X = ARRAY_NAME(I);
code that goes in this loop
END; /* DO LOOP */
SUM(x) Pl/I built-in function that returns the total sum of all of the values of the elements in an arithmetic array x. Note that the IBM PL/I-F compiler converts all array values to floating point to perform the operation and returns a result in floating point. This may create small rounding errors for commercial programmers needing dollar and cents accuracy.
PROD(x) Pl/I built-in function that returns the total product of all of the values of the elements in an arithmetic array x. The PROD function is always carried out in floating point arithmetic. This may create small rounding errors for commercial programmers needing dollar and cents accuracy.
Stanford Perl essentials
This [the following section until marked as end of Stanford University items] is document #108 [Essential Perl] in the Stanford CS Education Library --see http://cslibrary.stanford.edu/108/. This document is free to be used, reproduced, or sold so long as this paragraph and the copyright are clearly. Copyright 2000-2002, Nick Parlante, nick.parlante@cs.stanford.edu.
4. Arrays -- @
Array constants are specified using parenthesis ( ) and the elements are separated with commas. Perl arrays are like lists or collections in other languages since they can grow and shrink, but in Perl they are just called arrays. Array variable names begin with the at-sign (@). Unlike C, the assignment operator (=) works for arrays -- an independent copy of the array and its elements is made. Arrays may not contain other arrays as elements. Perl has sort of a 1-deep mentality. Actually, its possible to get around the 1-deep constraint using references, but its no fun. Arrays work best if they just contain scalars (strings and numbers). The elements in an array do not all need to be the same type.
@array = (1, 2, "hello"); ## a 3 element array
@empty = (); ## the array with 0 elements
$x = 1;
$y = 2;
@nums = ($x + $y, $x - $y);
## @nums is now (3, -1)
Just as in C, square brackets [ ] are used to refer to elements, so $a[6] is the element at index 6 in the array @a. As in C, array indexes start at 0. Notice that the syntax to access an element begins with $ not @ -- use @ only when referring to the whole array (remember: all scalar expressions begin with $).
@array = (1, 2, "hello", "there");
$array[0] = $array[0] + $array[1]; ## $array[0] is now 3
Perl arrays are not bounds checked. If code attempts to read an element outside the array size, undef is returned. If code writes outside the array size, the array grows automatically to be big enough. Well written code probably should not rely on either of those features.
@array = (1, 2, "hello", "there");
$sum = $array[0] + $array[27]; ## $sum is now 1, since $array[27] returned undef
$array[99] = "the end"; ## array grows to be size 100
When used in a scalar context, an array evaluates to its length. The scalar operator will force the evaluation of something in a scalar context, so you can use scalar() to get the length of an array. As an alternative to using scalar, the expression $#array is the index of the last element of the array which is always one less than the length.
@array = (1, 2, "hello", "there");
$len = @array; ## $len is now 4 (the length of @array)
$len = scalar(@array) ## same as above, since $len represented a scalar
## context anyway, but this is more explicit
@letters = ("a", "b", "c");
$i = $#letters; ## $i is now 2
That scalar(@array) is the way to refer to the length of an array is not a great moment in the history of readable code. At least I havent showed you the even more vulgar forms such as (0 + @a).
The sort operator (sort @a) returns a copy of the array sorted in ascending alphabetic order. Note that sort does not change the original array. Here are some common ways to sort
(sort @array) ## sort alphabetically, with uppercase first
(sort {$a <=> $b} @array) ## sort numerically
(sort {$b cmp $a} @array) ## sort reverse alphabetically
(sort {lc($a) cmp lc($b)} @array) ## sort alphabetically, ignoring case (somewhat inefficient)
The sort expression above pass a comparator function {...} to the sort operator, where the special variables $a and $b are the two elements to compare -- cmp is the built-in string compare, and <=> is the built-in numeric compare.
Theres a variant of array assignment that is used sometimes to assign several variables at once. If an array on the left hand side of an assignment operation contains the names of variables, the variables are assigned the corresponding values from the right hand side.
($x, $y, $z) = (1, 2, "hello", 4);
## assigns $x=1, $y=2, $z="hello", and the 4 is discarded
This type of assignment only works with scalars. If one of the values is an array, the wrong thing happens (see flattening below).
Array Add/Remove/Splice Functions
These handy operators will add or remove an element from an array. These operators change the array they operate on
- Operating at the front ($array[0]) end of the array
- shift(array) -- returns the frontmost element and removes it from the array.
Can be used in a loop to gradually remove and examine all the elements in an array left to right. The foreach operator, below, is another way to examine all the elements.
- unshift(array, elem) -- inserts an element at the front of the array. Opposite of shift.
- Operating at the back ($array[$len-1]) end of the array
- pop(array) -- returns the endmost element (right hand side) and removes it from the array.
- push(array, elem) -- adds a single element to the end of the array. Opposite of
pop.
- splice(array, index, length, array2) -- removes the section of the array defined by index and length, and replaces that section with the elements from array2. If array2 is omitted, splice() simply deletes. For example, to delete the element at index $i from an array, use splice(@array, $i, 1).
@ARGV and %ENV
The built-in array @ARGV contains the command line arguments for a Perl program. The following run of the Perl program critic.pl will have the ARGV array ("-poetry", "poem.txt").
unix% perl critic.pl -poetry poem.txt
%ENV contains the environment variables of the context that launched the Perl program. @ARGV and %ENV make the most sense in a Unix environment.
Array Iteration Ñ foreach
The foreach construct is a handy way to iterate a variable over all the elements in an array. Because of foreach, you rarely need to write a traditional for or while loop to index into an array. Foreach is also likely to be implemented efficiently. (Its a shame Java does not include a compact iteration syntax in the language. It would make Java a better language at the cost of some design elegance.)
foreach $var (@array) {
stmt; ## use $var in here
stmt;
}
Any array expression may be used in the foreach. The array expression is evaluated once before the loop starts. The iterating variable, such as $var, is actually a pointer to each element in the array, so assigning to $var will actually change the elements in the array.
This [the above section] is document #108 [Essential Perl] in the Stanford CS Education Library --see http://cslibrary.stanford.edu/108/. This document is free to be used, reproduced, or sold so long as this paragraph and the copyright are clearly. Copyright 2000-2002, Nick Parlante, nick.parlante@cs.stanford.edu.
end of Stanford Perl essentials
other
Ah, variable names.ÊLength is not a virtue in a name; clarity of expression is.ÊA global variable rarely used may deserve a long name, maxphysaddr say.ÊAn array index used on every line of a loop neednt be named any more elaborately than i.Ê Saying index or elementnumber is more to type (or calls upon your text editor) and obscures the details of the computation.ÊWhen the variable names are huge, its harder to see whats going on.ÊThis is partly a typographic issue; consider
for(i=0 to 100)
array[i]=0 ;
vs.
for(elementnumber=0 to 100)
array[elementnumber]=0;
The problem gets worse fast with real examples.ÊIndices are just notation, so treat them as such. Rob Pike, Notes on Programming in C, February 21, 1989
chapter contents
free music player coding example
Coding example: I am making heavily documented and explained open source code for a method to play music for free almost any song, no subscription fees, no download costs, no advertisements, all completely legal. This is done by building a front-end to YouTube (which checks the copyright permissions for you).
View music player in action: www.musicinpublic.com/.
Create your own copy from the original source code/ (presented for learning programming).
Because I no longer have the computer and software to make PDFs, the book is available as an HTML file, which you can convert into a PDF.
Names and logos of various OSs are trademarks of their respective owners.