3. Operations Needed To Support Dynamic Allocation And Linked Structures

Abstractly, in order for the user to define and use dynamically allocated linked data structures, the following are necessary:

Getting Memory

We require a procedure for getting one block of memory - of a particular type, or appropriate for storing an object of a particular type. Typically, this procedure will return the address of the cell that has been allocated to the user. This procedure has different names in different systems, I will call it by the generic name GET_MEMORY. In C it is called malloc. In the textbook it is called new.

Pointer Variables

We require a type of variable that can store an address. For example, when GET_MEMORY is called it returns an address and the user needs to be able to store that address in a variable in order to make use of it later on.

In our example, each student entry contains the address of the next student entry; e.g. JOE's entry contains the address of MARYJANE's entry. Therefore, a student entry must be a structure/record whose first component is capable of storing an address. Variables that hold addresses are called pointers.

In C, to declare a variable that can hold a real value, you say:

        float X ;   (in the textbook:   var X: real)

To declare a variable that can hold the address of a real value, say:

        float *P ;  (in the textbook:   var P: PointerTo real)

Invalid Address

We require a special value that can be stored in a pointer, that means `not a valid address'. The textbook calls it NIL; in C it is called NULL.

P = NULL; means that P is defined, but contains no address. We can test this: if (P == NULL) ... Typically, we use NULL to indicate either then end of a chain of links or an uninitialized address.

Returning Memory

We require a procedure for returning/freeing one block of memory. The user will pass to this procedure a pointer (variable containing an address) and the procedure will return the corresponding block of memory to the global pool.

This procedure works on one block at a time. To free up all the space consumed by the student entries for one course, the user would have to call the procedure once for each student entry. Calling it just on the entry for JOE would not also free up the following entries. The entry for MARYJANE would still occupy memory.

Before freeing JOE:

After freeing JOE:

Unfortunately, now we'd have no way to free MARYJANE because we forgot to retrieve its address from JOE's entry before we freed it. This is a very common mistake! Look out for it in your own code.

The procedure for returning memory to the global pool has different names in different systems, I will call it by the generic name RETURN_MEMORY. In C it is called free. In the textbook it is called dispose.

Accessing Objects Through Pointers

We require methods for accessing, i.e. looking at and changing, the value (or more generally, the object) stored at a particular address. This is done by dereferencing a pointer, or rather its value.

For example, in C, if P is a pointer and contains a valid address (i.e. not NULL), we can obtain the value stored at this address by means of the * prefix operator.

X = *P;: Looks up the value P is pointing at and stores it in X.
*P = 13.1;: Stores the value 13.1 at the location P points at (assuming P is of type real*, that is `pointer to real').

For example, suppose P is a pointer to a real value: this means P itself contains an address, and at that address is stored a real value. Suppose memory consists of the following sequence of cells:

X, an ordinary variable, is memory cell 7. It contains a real value, 3.9. P, a pointer variable, is memory cell 2. It contains an address, 5. (*P) is the memory cell whose address is stored in P, i.e. memory cell 5. It contains a real value, 4.8.

*P = X;: Copies the value in cell named X (at address 7) into the cell whose address is in P (cell 5).

Note that P itself has not changed.

The textbook uses a different symbol for the dereferencing operator. The symbol is ^ and it is placed after the pointer variable, not before it. So the textbook writes (P^) instead of (*P).

Obtaining The Address Of An Object

Finally, it is sometimes useful to have a procedure that takes a symbolic variable name, like X, and returns the address corresponding to the name. In the above example, X is a symbolic name for the address 7. In C, this operation is done using the symbol & - (&X) is the address corresponding to X. The text has no notation for this.

3.1. Summary

get_memory (malloc,new): Procedure for dynamically allocating memory.
return_memory (free,dispose): Procedure for returning memory that was dynamically allocated.
pointer: A variable or expression whose value is an address.
NIL, NULL: A special value for pointers, meaning `not a valid address'.
dereferencing (*,^): Following a pointer; accessing the value in the address represented by the pointer.
&: Obtain the address of the object denoted by an expression. E.g. the address of a variable.

In many languages all these things are ``built-in'', i.e. provided as primitives in the language. But in some languages, FORTRAN and COBOL for example, they are not. But that does not mean you cannot use linked structures in these languages. What it means is that in order to use linked structures you will have to build your own implementations of these basic memory-management primitives. This is not hard at all, the textbook discusses how to do it using arrays in section 4.2.3.