A structure is a heterogeneous container object, i.e., it is an object with elements whose values do not have to be of the same data type. The elements or fields of a structure are named, and one accesses a particular field of the structure via the field name. This should be contrasted with an array whose values are of the same type, and whose elements are accessed via array indices.
A user-defined data type is a structure with a fixed set of fields defined by the user.
The struct
keyword is used to define a structure. The syntax
for this operation is:
struct {field-name-1, field-name-2, ... field-name-N};
This creates and returns a stucture with N fields whose names
are specified by field-name-1, field-name-2, ...,
field-name-N. When a structure is created, all its fields are
initialized to NULL
.
For example,
variable t = struct { city_name, population, next };
defines creates a structure with three fields and assigns it to the
variable t
.
Like arrays, structures are passed around via a references. Thus,
in the above example, the value of t
is a reference to the
structure. This means that after execution of
variable u = t;
both t
and u
refer to the same structure,
since only the reference was used in the assignment. To actually
create a new copy of the structure, use the dereference
operator, e.g.,
variable u = @t;
The dot (.
) operator is used to specifiy the a particular
field of structure. If s
is a structure and field_name
is a field of the structure, then s.field_name
specifies
that field of s
. This specification can be used in
expressions just like ordinary variables. Again, consider
variable t = struct { city_name, population, next };
described in the last section. Then,
t.city_name = "New York";
t.population = 13000000;
if (t.population > 200) t = t.next;
are all valid statments involving the fields of t
.
One of the most important uses of structures is to create a dynamic data structure such as a linked-list. A linked-list is simply a chain of structures that are linked together such that one structure in the chain is the value of a field of the previous structure in the chain. To be concrete, consider the structure discussed earlier:
variable t = struct { city_name, population, next };
and suppose that we desire to create a list of such structures.
The purpose of the next
field is to provide the link to the
next structure in the chain. Suppose that there exists a function,
read_next_city
, that reads city names and populations from a
file. Then we can create the list via:
define create_population_list ()
{
variable city_name, population, list_root, list_tail;
variable next;
list_root = NULL;
while (read_next_city (&city_name, &population))
{
next = struct {city_name, population, next };
next.city_name = city_name;
next.population = population;
next.next = NULL;
if (list_root == NULL)
list_root = next;
else
list_tail.next = next;
list_tail = next;
}
return list_root;
}
In this function, the variables list_root
and list_tail
represent the beginning and end of the list, respectively. As long
as read_next_city
returns a non-zero value, a new structure is
created, initialized, and then appended to the list via the
next
field of the list_tail
structure. On the first
time through the loop, the list is created via the assignment to the
list_root
variable.
This function may be used as follows:
variable Population_List = create_population_list ();
if (Population_List == NULL) error ("List is empty");
We can create other functions that manipulate the list. An example is
a function that finds the city with the largest population:
define get_largest_city (list)
{
variable largest;
largest = list;
while (list != NULL)
{
if (list.population > largest.population)
largest = list;
list = list.next;
}
return largest.city_name;
}
vmessage ("%s is the largest city in the list",
get_largest_city (Population_List)));
The get_largest_city
is a typical example of how one traverses
a linear linked-list by starting at the head of the list and
successively moves to the next element of the list via the
next
field.
Now consider a function that sorts the list according to population. To illustrate the technique, a bubble-sort will be used, not because it is efficient, it is not, but because it is simple and intuitive.
define sort_population_list (list)
{
variable changed;
variable node, next_node, last_node;
do
{
changed = 0;
node = list;
next_node = node.next;
last_node = NULL;
while (next_node != NULL)
{
if (node.population < next_node.population)
{
% swap node and next_node
node.next = next_node.next;
next_node.next = node;
if (last_node != NULL)
last_node.next = next_node;
if (list == node) list = next_node;
node = next_node;
next_node = node.next;
changed++;
}
last_node = node;
node = next_node;
next_node = next_node.next;
}
}
while (changed);
return list;
}
Note the test for equality between list
and node
, i.e.,
if (list == node) list = next_node;
It is important to appreciate the fact that the values of these
variables are references to structures, and that the
comparison only compares the references and not the actual
structures they reference. If it were not for this, the algorihm
would fail.
A user-defined data type may be defined using the typedef
keyword. In the current implementation, a user-defined data type
is essentually a structure with a user-defined set of fields. For
example, in the previous section a structure was used to represent
a city/population pair. We can define a data type called
Population_Type
to represent the same information:
typedef struct
{
city_name,
population
} Population_Type;
This data type can be used like all other data types. For example,
an array of Population_Type types can be created,
variable a = Population_Type[10];
and `populated' via expressions such as
a[0].city_name = "Boston";
a[0].population = 2500000;
The new type Population_Type
may also be used with the
typeof
function:
if (Population_Type = typeof (a)) city = a.city_name;
The dereference @
may be used to create an instance of the
new type:
a = @Population_Type;
a.city_name = "Calcutta";
a.population = 13000000;