2: The Structure of the Model


2.1 Introduction

The model for analysing the flow of students through the higher education system can be broken into two parts. The first part concerns the modelling of the system as an input-output model. It allows for the estimation of the transition probabilities among the various stages of the system. Given the current state of the system and the projection of intake into its various stages, the model can be used to make projections of the total number in the system, the number dropping out and the number completing. In the second part a simple model is developed for making projections of intake into various stages.

In Section 2.2 we outline the input-output model in the context of the higher education system using matrix formulation. The matrix notation allows for a more compact presentation. In Section 2.3 the model for making projections of intake into the system is outlined. Minimum data requirements for estimating the model and for making projections are discussed in Section 2.4. A simplified numerical example is presented for illustrative purposes in the last section.

2.2 The Input-Output Model

An input-output model for demographic accounting assumes that the system can be divided into a finite number of mutually exclusive and exhaustive states in terms of some criterion of interest. In general, the states can be divided into two classes, transient and absorbing. Elements of the population cannot remain in a transient state permanently. On the other hand, an element reaching an absorbing state remains in it permanently.

In the context of the higher education sector the primary classification of the transient states is by age of the student. The secondary classification of these states is by the student's year of enrolment in the course. For example, a student in his/her second year of enrolment is not necessarily doing the second year level subjects of the course because he/she may be repeating some first year subjects which he/she has failed in the previous year. Students often do combinations of subjects from different year levels in a given year. A typical state of the system is being a 20-year-old and in second year of enrolment in the course. There are two absorbing states in the system, dropout and completion. The dropout state refers to those students who withdraw from the course before completing.

There are a number of ways of analysing movements of a population; see Bartholomew et al. (1991). A broad division can be drawn between models in which movement from state i to state j depends largely on the places available in state j and those in which it depends on the numbers in state i. A model looking at human resources planning and career structure of employees in an organisation would be of the first type because there may be a ceiling put on the number of employees who can be of a particular class, for example, supervisors. The problem being looked at in this report is of the second type since it is assumed no restrictions are placed on the movement from one transient state to another once a student is in the system.

Suppose $n(t)$ is a vector of student numbers in each of the transient states at time $t$. Then

$$n(t) = F(t)\,c + d(t) \tag{2.1}$$

and

$$n(t+1) = F(t)'\,c + b(t+1), \tag{2.2}$$

where $F(t)$ is a square matrix whose elements $f_{ij}(t)$ represent the number of students who were in state $i$ at time $t$ but are in state $j$ at time $t+1$, $d(t)$ is a vector of student numbers leaving the system from various states, while $b(t+1)$ is a vector of the number of students commencing the course in the time period $t$ to $t+1$, and $c$ is a unit vector (a vector of ones). A prime denotes transposition. Throughout this study the time period will be one year.

Equation (2.1) tells us that the number of students at the beginning of a year is made up of those who will survive into the next year plus those who will leave the system during the year. On the other hand equation (2.2) says that the number of students at the beginning of the next year is made up of those who commence in the new year and those who survived from the previous year.
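The two accounting identities can be checked against the hypothetical flows of Table 2.1 (Section 2.5). The sketch below is illustrative only: the array names `F`, `D` and `b_1994` are chosen here, and it assumes that (2.1) and (2.2) take the row-sum and column-sum forms described in the paragraph above.

```python
import numpy as np

# State order: 18:A, 19:A, 19:B, 20:A, 20:B (hypothetical data from Table 2.1).
# F[i, j] = students in state i in 1993 who are in state j in 1994.
F = np.zeros((5, 5))
F[0, 2] = 250  # 18:A -> 19:B
F[1, 4] = 150  # 19:A -> 20:B
F[2, 4] = 20   # 19:B -> 20:B
F[3, 4] = 70   # 20:A -> 20:B
F[4, 4] = 10   # 20:B -> 20:B

# D[i] = (dropouts, completions) leaving from state i during 1993.
D = np.array([[50, 30], [30, 10], [10, 180], [10, 10], [10, 20]], dtype=float)

# Commencements in 1994; students enter only in Class A.
b_1994 = np.array([350, 200, 0, 100, 0], dtype=float)

d = D @ np.ones(2)                  # total leavers from each state
n_1993 = F @ np.ones(5) + d         # (2.1): survivors plus leavers
n_1994 = F.T @ np.ones(5) + b_1994  # (2.2): survivors plus commencers
```

Under these assumptions `n_1993` reproduces the row totals of Table 2.1 and `n_1994` its column totals for the transient states.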

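Since there are two absorbing states, dropout and completion, the vector $d(t)$ can be written as

$$d(t) = D(t)\,c, \tag{2.3}$$

where $D(t)$ is a matrix of departures with one column for each absorbing state. Now suppose we define

$$Q(t) = \hat{n}(t)^{-1} F(t), \tag{2.4}$$

where $\hat{n}(t)$ is a diagonal matrix with $\hat{n}_{ii}(t) = n_i(t)$. Then the element $q_{ij}(t)$ of $Q(t)$ represents the proportion of students in state $i$ at time $t$ who are in state $j$ at time $t+1$. $Q(t)$ is known as the matrix of transition proportions.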

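Rearranging (2.4) we obtain

$$F(t) = \hat{n}(t)\,Q(t). \tag{2.5}$$

Substituting (2.5) into (2.2) we get

$$n(t+1) = Q(t)'\,\hat{n}(t)\,c + b(t+1) \tag{2.6}$$

or, since $\hat{n}(t)\,c = n(t)$,

$$n(t+1) = Q(t)'\,n(t) + b(t+1). \tag{2.7}$$

If it can be assumed that $Q(t)$ is constant over time, then

$$n(t+1) = Q'\,n(t) + b(t+1) \tag{2.8}$$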

can be used to make projections of student numbers in each state in future periods. Obviously it is assumed that estimates, or forecasts, of vector b for future periods are available. A model for projecting b is developed in the next section.
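A minimal sketch of this projection step, using the hypothetical flows of Table 2.1 and the made-up intakes of Table 2.4 (Section 2.5). The helper `project` and the variable names are illustrative, and the code assumes the recursion "next year's enrolment equals the transposed proportions matrix applied to this year's enrolment, plus the new intake".

```python
import numpy as np

# State order: 18:A, 19:A, 19:B, 20:A, 20:B (hypothetical data from Table 2.1).
F = np.zeros((5, 5))
F[0, 2], F[1, 4], F[2, 4], F[3, 4], F[4, 4] = 250, 150, 20, 70, 10
n_1993 = np.array([330, 190, 210, 90, 40], dtype=float)

# Transition proportions: each row of F divided by the 1993 state total.
Q = F / n_1993[:, None]

def project(n_start, intakes, Q):
    """Return enrolments for the start year and each later year, one row per year."""
    path = [n_start]
    for b in intakes:
        path.append(Q.T @ path[-1] + b)
    return np.array(path)

n_1994 = np.array([350, 200, 250, 100, 250], dtype=float)
intakes = np.array([
    [360, 210, 0, 120, 0],  # 1995 (Table 2.4)
    [340, 180, 0, 100, 0],  # 1996
    [320, 190, 0, 90, 0],   # 1997
], dtype=float)

enrolments = project(n_1994, intakes, Q)  # rows: 1994, 1995, 1996, 1997
```

Rounded to whole students, the last row gives the 1997 enrolments shown in Table 2.5. The proportions are kept unrounded between years; rounding at each step could shift individual cells by one.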

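If we define

$$R(t) = \hat{n}(t)^{-1} D(t), \tag{2.9}$$

where $\hat{n}(t)$ is the diagonal matrix of state totals used in (2.4), then the elements $r_{ij}(t)$ of $R(t)$ represent the proportion of students who are in state $i$ at time $t$ who depart into absorbing state $j$ at time $t+1$. Thus, if it can be assumed that $R(t)$ is time invariant, then

$$D(t)'\,c = R'\,n(t) \tag{2.10}$$

can be used to make projections of the number of students who drop out and of those who complete the course.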
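As an illustration, the departures for 1994 in the hypothetical example of Section 2.5 can be sketched as follows. The names are illustrative, and the code assumes that the departure proportions are the rows of the departures matrix divided by the state totals, applied to enrolments through the transpose.

```python
import numpy as np

# State order: 18:A, 19:A, 19:B, 20:A, 20:B (hypothetical data from Table 2.1).
# D[i] = (dropouts, completions) leaving from state i during 1993.
D = np.array([[50, 30], [30, 10], [10, 180], [10, 10], [10, 20]], dtype=float)
n_1993 = np.array([330, 190, 210, 90, 40], dtype=float)

# Departure proportions: each row of D divided by the 1993 state total.
R = D / n_1993[:, None]

# Apply the 1993 proportions to the actual 1994 enrolments.
n_1994 = np.array([350, 200, 250, 100, 250], dtype=float)
dropouts_1994, completions_1994 = R.T @ n_1994
```

Rounded to whole students, this gives the 170 dropouts and 393 completions reported for 1994 in Table 2.5.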

Other useful information can be gleaned from the above formulation if $Q$ and $R$ are regarded as probability matrices and not merely matrices of proportions. Under the assumption of constant $Q$ and $R$, for example, the element $n_{ij}$ of the matrix

$$N = (I - Q)^{-1} \tag{2.11}$$

can be interpreted as the mean time (in years) that a student starting in state $i$ is in state $j$ before departing the system. The matrix $N$ is known as the fundamental matrix of an absorbing Markov chain. Many other theorems applicable to such chains can be applied to it.

The variance of the time a student starting in state $i$ is in state $j$ before departing the system is given by the $(i, j)$ element of the matrix

$$N_2 = N(2N_{dg} - I) - N_{sq}, \tag{2.12}$$

where $N_{dg}$ is a diagonal matrix whose leading diagonal is identical to that of matrix $N$ and the matrix $N_{sq}$ is formed by squaring each element of matrix $N$.

Furthermore, it is possible to calculate the mean and the variance of the time to depart from the system given an initial starting state. These are given by the elements of the matrices

$$T = N\,c \tag{2.13}$$

and

$$T_2 = (2N - I)\,T - T_{sq}, \tag{2.14}$$

respectively, where $T_{sq}$ is formed by squaring each element of $T$. It is also possible to obtain the probability of a student either dropping out or completing a course given that he/she started in state $i$. These probabilities are given in the $i$th row of the matrix

$$B = N R. \tag{2.15}$$
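These quantities can be computed directly once the proportions are known. The sketch below uses the hypothetical data of Table 2.1 and assumes the standard absorbing Markov chain formulas: the fundamental matrix as the inverse of the identity minus the transition matrix, expected time in the system as its row sums, and absorption probabilities as its product with the departure proportions. The names `N2` and `T2` for the variance terms are illustrative.

```python
import numpy as np

# State order: 18:A, 19:A, 19:B, 20:A, 20:B (hypothetical data from Table 2.1).
F = np.zeros((5, 5))
F[0, 2], F[1, 4], F[2, 4], F[3, 4], F[4, 4] = 250, 150, 20, 70, 10
D = np.array([[50, 30], [30, 10], [10, 180], [10, 10], [10, 20]], dtype=float)
n_1993 = np.array([330, 190, 210, 90, 40], dtype=float)

Q = F / n_1993[:, None]  # transition proportions
R = D / n_1993[:, None]  # departure proportions (dropout, completion)

I = np.eye(5)
N = np.linalg.inv(I - Q)         # mean years spent in each state
N_dg = np.diag(np.diag(N))       # diagonal of N as a diagonal matrix
N2 = N @ (2 * N_dg - I) - N**2   # variance of years spent in each state
T = N @ np.ones(5)               # mean years in the system
T2 = (2 * N - I) @ T - T**2      # variance of years in the system
B = N @ R                        # probabilities of dropping out and completing
```

Rounded to two decimals, `N`, `T` and `B` agree with Table 2.3. Each row of `B` sums to one, since every student eventually drops out or completes.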

The information on the expected time, and its variance, to enter a particular absorbing state given an initial starting state can be obtained by considering the reduced form of the system. The reduced system is formed by replacing $D(t)$ and $n(t)$ in the above formulation by $D^*(t)$ and $n^*(t)$, respectively. $D^*(t)$ consists of just one column of $D(t)$, that representing the absorbing state of interest. The element $n_i^*(t)$ of $n^*(t)$ is given by

$$n_i^*(t) = \sum_j f_{ij}(t) + d_{ik}(t), \tag{2.16}$$

where the absorbing state of interest is in column $k$ of $D(t)$. A new $Q$ and $N$ can now be defined and equations (2.13) and (2.14) used to complete the analysis.

2.3 A Model for Projecting Student Intake

The framework that we use to model student intake is conceptually similar to that used by Sloan et al. (1990). Different models are used for undergraduate and postgraduate students. The models are partly driven by demographic changes and partly by the assumption of continuity of the structure of the system as it exists currently.

Undergraduates

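The model aims to project the vector $b(t)$ for future periods. The element in the $(k, j)$th row of this vector, $b_{kj}(t)$, represents the number of $k$-year-old students in their $j$th year of enrolment at time $t$. Also suppose $C(k,t)$ is the number of $k$-year-old students who commence a course at time $t$. Since by definition students are only allowed to enter the system in their first year of enrolment, it necessarily follows that

$$b_{kj}(t) = \begin{cases} C(k,t) & \text{if } j = 1, \\ 0 & \text{otherwise.} \end{cases}$$

It is assumed that $k$-year-old entrants to undergraduate courses at time $t$ are made up of:

(i) school-leavers, $S(k,t)$;
(ii) non-school-leavers, $M(k,t)$; and
(iii) overseas fee-paying students, $O(k,t)$.

Hence, we can decompose the number of $k$-year-olds entering a course at time $t$ as

$$C(k,t) = S(k,t) + M(k,t) + O(k,t). \tag{2.17}$$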

If it can be assumed that the number of school-leavers of a given age who gain entry into a course is a fixed proportion of the number of Year 12 students of the same age in the previous year, then the number of $k$-year-old school-leaver entrants at time $t$, $S(k,t)$, is

$$S(k,t) = a(k)\,Y_{12}(k,t-1), \tag{2.18}$$

where $a(k)$ is the fixed proportion and $Y_{12}(k,t)$ is the number of $k$-year-old students in Year 12 at time $t$. It should be noted that the above formulation requires the age of Year 12 students and of school-leavers to be defined on the same date. The assumption that school-leaver entrants remain a fixed proportion of the Year 12 student population is arbitrary, and alternative assumptions can be investigated. Data for the last five years show that the proportion fell sharply in 1992 and rose in 1994 and 1995, but remained below its 1991 level.

The projection of the number of Year 12 students is achieved by using grade progression ratios for each age group. Thus, the number of Year 12 students in period t is estimated by

,

(2.19)

where is the number of k-year-old Year 11 students and

(2.20)

is the grade progression ratio for k-year-olds from Year 11 to Year 12. The number of Year 12 students in period is then estimated by

.

(2.21)

It should be emphasised that the grade progression ratios are estimates of the net flow of pupils from one grade to another.

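The non-school-leaver component is projected on the basis of constant age-entry, that is, the rate at which $k$-year-olds in the population commence a course is held fixed. If $P(k,t)$ represents the projection of the number of $k$-year-olds in the population at time $t$, then the projection of the number of non-school-leavers commencing a course at time $t$, $M(k,t)$, is given by

$$M(k,t) = \frac{M(k,t_0)}{P(k,t_0)}\,P(k,t), \tag{2.22}$$

where $t_0$ is the base year from which the age-specific entry rate is taken.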
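A minimal sketch of a constant age-entry projection, assuming that "constant age-entry" means each age group's entry rate (entrants divided by population) is held at its base-year value. The function name and all figures below are hypothetical and are not drawn from the report's data.

```python
def project_non_school_leavers(base_entrants, base_population, projected_population):
    """Hold each age group's base-year entry rate fixed and apply it
    to the projected population of the same age."""
    projections = {}
    for age, entrants in base_entrants.items():
        entry_rate = entrants / base_population[age]
        projections[age] = entry_rate * projected_population[age]
    return projections

# Hypothetical figures, for illustration only.
base_entrants = {20: 100, 21: 60}                # non-school-leaver entrants, base year
base_population = {20: 50_000, 21: 48_000}       # population by age, base year
projected_population = {20: 52_000, 21: 47_000}  # population projection, target year

projected_entrants = project_non_school_leavers(
    base_entrants, base_population, projected_population
)
```

With these figures the projected intake rises for 20-year-olds and falls for 21-year-olds, purely because the population projections move in those directions.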

The projection of the number of overseas fee-paying students allows a uniform growth across all age groups of:

These assumptions are based on the view that the recent rapid growth in their numbers will not be sustained. One reason for this is the uncertainty in the likely number coming from Malaysia and Hong Kong, the major sources of overseas students. These countries are rapidly building up their own higher education sectors.

Postgraduates

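The projection of the number of commencing postgraduate students at various course levels is on the basis of constant age-entry too. Therefore, an equation identical in form to (2.22) can be used to generate the projections. Thus, writing $G(k,t)$ for the number of $k$-year-old postgraduates commencing at a given course level at time $t$,

$$G(k,t) = \frac{G(k,t_0)}{P(k,t_0)}\,P(k,t), \tag{2.23}$$

where $P(k,t)$ is the projected number of $k$-year-olds in the population at time $t$ and $t_0$ is the base year.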

2.4 Data Requirements

Cohort analysis is one way of acquiring the information on student flows that the input-output model needs, but it depends on tracking the same group of students over a number of years. Besides, it may not adequately represent current behaviour. The way we have defined the states of the system means that it is possible to estimate the input-output model with stock data on student enrolment in two successive years, together with the number of completions in the first of these years.

Flow statistics from one transient state to another can be inferred if data are available on each student's age and year of enrolment. This is a consequence of the way a transient state is defined. Moreover, since there are only two absorbing states and information on the flow into one of them, namely the completion state, is available, the flow into the other can be inferred.
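The inference can be sketched as follows, assuming single-year ages and single years of enrolment, so that a continuing student aged k in year j of enrolment can only appear next year as aged k+1 in year j+1. The function name and the figures are hypothetical. Open-ended states, such as "20 years or older, second or higher year", receive students from several states and would need unit-record data to separate the flows.

```python
def infer_flows(stock_t, stock_t1, completions_t):
    """Infer survivors and dropouts from two successive enrolment stocks.

    Keys are (age, year_of_enrolment). Dropouts are the residual:
    stock minus survivors minus completions.
    """
    flows = {}
    for (age, year), enrolled in stock_t.items():
        survivors = stock_t1.get((age + 1, year + 1), 0)
        completions = completions_t.get((age, year), 0)
        flows[(age, year)] = {
            "survivors": survivors,
            "completions": completions,
            "dropouts": enrolled - survivors - completions,
        }
    return flows

# Hypothetical stocks, for illustration only.
stock_1993 = {(18, 1): 330, (19, 2): 210}
stock_1994 = {(18, 1): 350, (19, 2): 250, (20, 3): 20}
completions_1993 = {(18, 1): 30, (19, 2): 180}

flows = infer_flows(stock_1993, stock_1994, completions_1993)
```

A negative residual would signal an inconsistency between the enrolment and completions data, for example ages measured on different dates.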

Furthermore, if data are available on each student's sex, mode of study (full-time, part-time or external), level of course and field of study, and on whether the student is overseas fee-paying and whether the student is new to higher education, then differences in the behaviour of different groups of students can be investigated.

The model for projecting the number of commencing students requires data on school enrolment and population projections at the appropriate level of disaggregation. It is possible to estimate the school grade progression ratios and the expected number of Year 12 students coming through if school enrolment data by age, grade and gender are available for at least the current year and one year prior to that.

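Therefore, minimum data requirements are:

(i) higher education enrolment data for two successive years, recording each student's age and year of enrolment, together with completions for the first of these years;
(ii) school enrolment data by age, grade and gender for the current year and the year before; and
(iii) population projections by age.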

If data over a longer period are available, then it may be possible to estimate the parameters of the model with a higher degree of reliability. Finally, it is important to ensure consistency between and within all three data sets. For example, age has to be defined identically in all three bodies of data. This can be achieved by having a common date with respect to which age is calculated. Furthermore, there has to be consistency between the enrolment and completions data so that students flowing from one state in one year are all accounted for in the following year.

2.5 A Numerical Illustration

By way of a summary, we present a numerical example small enough for the development of the main results from the input-output matrix to be followed. The illustration is hypothetical and relates to a system consisting of three age groups - 18, 19 and 20 - and two classes of enrolment - A and B. Class A relates to the first year of enrolment and Class B to the second or a higher year of enrolment. No one enters Class B at age 18, so the system has five transient states: 18:A, 19:A, 19:B, 20:A and 20:B. A typical state is being an 18-year-old in Class A (or first) year of enrolment. The state relating to 20-year-olds in Class B is a special case and is to be interpreted as being 20 years or older and in Class B (or second or higher) year of enrolment. There are two absorbing states, dropout and completion.

From hypothetical data on enrolment in 1993 and 1994 and completions in 1993, the input-output matrix as shown in Table 2.1 is constructed. For example, it shows that of the 330 18-year-olds who commenced the course in 1993, 250 progressed on to being 19 years of age and in Class B (the second year of enrolment), 50 dropped out of the course and 30 completed the course. There are 650 commencing students in 1994 of whom 350 are 18-year-olds, 200 are 19-year-olds and 100 20-year-olds.

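Table 2.1 Input-Output Matrix Showing the Flow of Students (Hypothetical Example)

Rows give age and class of enrolment in 1993; columns give age and class of enrolment in 1994, or the absorbing state entered.

| Enrolment in 1993 | 18:A | 19:A | 19:B | 20:A | 20:B | Dropouts | Completions | Total |
|---|---|---|---|---|---|---|---|---|
| 18:A | | | 250 | | | 50 | 30 | 330 |
| 19:A | | | | | 150 | 30 | 10 | 190 |
| 19:B | | | | | 20 | 10 | 180 | 210 |
| 20:A | | | | | 70 | 10 | 10 | 90 |
| 20:B | | | | | 10 | 10 | 20 | 40 |
| Commencements 1994 | 350 | 200 | | 100 | | | | 650 |
| Total | 350 | 200 | 250 | 100 | 250 | 110 | 250 | |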

Table 2.2 shows the matrix of transition proportions, Q (defined in equation (2.4)) and matrix R (defined in equation (2.9)) representing the proportions moving to absorbing states. The proportions are calculated by dividing entries in each row by the corresponding row total in Table 2.1.

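Table 2.2 Matrix of Transition Proportions (Hypothetical Example)

Columns 18:A to 20:B form matrix Q; the Dropout and Completion columns form matrix R.

| Age: Year of Enrolment | 18:A | 19:A | 19:B | 20:A | 20:B | Dropout | Completion |
|---|---|---|---|---|---|---|---|
| 18:A | 0.00 | 0.00 | 0.76 | 0.00 | 0.00 | 0.15 | 0.09 |
| 19:A | 0.00 | 0.00 | 0.00 | 0.00 | 0.79 | 0.16 | 0.05 |
| 19:B | 0.00 | 0.00 | 0.00 | 0.00 | 0.10 | 0.05 | 0.86 |
| 20:A | 0.00 | 0.00 | 0.00 | 0.00 | 0.78 | 0.11 | 0.11 |
| 20:B | 0.00 | 0.00 | 0.00 | 0.00 | 0.25 | 0.25 | 0.50 |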

Assuming that the model can reasonably be approximated as a Markov chain, we can use equation (2.11) to calculate the fundamental matrix, N. This is given in Table 2.3. The table also includes matrix T and B, calculated using equations (2.13) and (2.15). As an example we shall interpret the entries in the first row of Table 2.3. Given a student enters the system as an 18-year-old, he/she spends, on average, one year as an 18-year-old in Class A enrolment, 0.76 years as a 19-year-old in Class B enrolment and 0.10 years as a 20-year-old in Class B enrolment. The sum of all entries in row one gives us the average time he/she spends in the system, which in this case is 1.85 years. The probability of an 18-year-old completing the course is 0.79.
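The first-row entries can be traced by hand. The working below is a check rather than part of the original derivation; it assumes $N = (I - Q)^{-1}$ as in equation (2.11) and uses the unrounded proportions from Table 2.1. An 18-year-old entrant can reach 20:B only through 19:B, and 20:B is the only state in which a student can remain for more than one year.

```latex
\begin{align*}
n_{18:A,\,19:B} &= q_{18:A,\,19:B} = \tfrac{250}{330} \approx 0.7576, \\
n_{18:A,\,20:B} &= q_{18:A,\,19:B}\; q_{19:B,\,20:B}\; \frac{1}{1 - q_{20:B,\,20:B}}
                 = \tfrac{250}{330} \times \tfrac{20}{210} \times \frac{1}{1 - 0.25} \approx 0.0962, \\
t_{18:A} &= 1 + 0.7576 + 0.0962 \approx 1.85 .
\end{align*}
```

The factor $1/(1 - 0.25)$ is the expected number of years spent in 20:B once it is reached, since a quarter of its students remain there each year.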

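Table 2.3 Expected Time in System and Probability of Absorption (Hypothetical Example)

Columns 18:A to 20:B form matrix N (expected time in state, in years); T is the expected time in the system; the last two columns form matrix B.

| Initial State | 18:A | 19:A | 19:B | 20:A | 20:B | T: Expected Time in System | B: Probability of Dropping Out | B: Probability of Completing |
|---|---|---|---|---|---|---|---|---|
| 18:A | 1.00 | 0.00 | 0.76 | 0.00 | 0.10 | 1.85 | 0.21 | 0.79 |
| 19:A | 0.00 | 1.00 | 0.00 | 0.00 | 1.05 | 2.05 | 0.42 | 0.58 |
| 19:B | 0.00 | 0.00 | 1.00 | 0.00 | 0.13 | 1.13 | 0.08 | 0.92 |
| 20:A | 0.00 | 0.00 | 0.00 | 1.00 | 1.04 | 2.04 | 0.37 | 0.63 |
| 20:B | 0.00 | 0.00 | 0.00 | 0.00 | 1.33 | 1.33 | 0.33 | 0.67 |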

Table 2.4 shows the projected number of commencing students for the years 1995 to 2000. In this illustrative example the numbers are not obtained from any model, but are simply made up. The projections of the number in the system and of the total numbers of dropouts and completions are given in Table 2.5. These were derived using equations (2.8) and (2.10). The table includes the number in each state of the system in each year from 1994 to 2000. For example, in 1997 there are 258 19-year-olds in Class B of enrolment. The total number in the system in each year is also given. For example, 1195 students are enrolled in 1997. Finally, the table shows the number of students who complete or drop out of the course each year. For example, 184 students drop out and 436 complete the course in 1997.

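Table 2.4 Projected Intake of Students, 1995 to 2000 (Hypothetical Example)

| Age: Year of Enrolment | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 |
|---|---|---|---|---|---|---|
| 18:A | 360 | 340 | 320 | 300 | 330 | 350 |
| 19:A | 210 | 180 | 190 | 180 | 210 | 220 |
| 19:B | | | | | | |
| 20:A | 120 | 100 | 90 | 90 | 100 | 130 |
| 20:B | | | | | | |
| Total | 690 | 620 | 600 | 570 | 640 | 700 |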

Table 2.5 Projections of Total Enrolment, Dropouts and Completions, 1994 to 2000 (Hypothetical Example)

| Age: Year of Enrolment | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 |
|---|---|---|---|---|---|---|---|
| 18:A | 350 | 360 | 340 | 320 | 300 | 330 | 350 |
| 19:A | 200 | 210 | 180 | 190 | 180 | 210 | 220 |
| 19:B | 250 | 265 | 273 | 258 | 242 | 227 | 250 |
| 20:A | 100 | 120 | 100 | 90 | 90 | 100 | 130 |
| 20:B | 250 | 322 | 365 | 337 | 329 | 317 | 345 |
| Total | 1150 | 1277 | 1258 | 1195 | 1141 | 1185 | 1295 |
| Dropouts | 170 | 177 | 191 | 184 | 177 | 184 | 200 |
| Completions | 393 | 422 | 459 | 436 | 418 | 406 | 444 |

Note: The figures for 1994 are actual numbers.
