1. Introduction
The motivation for this work arose from a recent article in the Wall Street Journal [1], as well as personal experience in traveling to many cities, both within the USA and abroad, over the years. The article, which has the subtitle “Southwest Airlines is studying ways to squeeze more flights per plane—a big focus is on passenger boarding bottlenecks”, includes a quote from the airline that reads, “If you can collect up enough of these minutes in each turn, then you can start to squeeze out some more flying”. Further, the article mentions that boarding times are key to enabling an aircraft to make more trips. This is probably because boarding varies far more than other aspects, such as cleaning the aircraft before boarding the passengers and the flying time.
A number of articles and research works on this topic have mentioned the bottlenecks involved in the boarding process. For example, in [2], the author mentions that American Airlines uses a single aircraft to make about six or seven trips in a single day. Further, the author lists the activities that occur during the turnaround time, which is the time between the aircraft pulling into the gate with the passengers onboard and the aircraft taking off with a new set of passengers. According to [3], the most significant delays could occur in the boarding process, and the article points out that the boarding time has increased by more than 30% since the 1970s. Moreover, the distribution of the boarding time tends to have a long tail due to events such as late arrivals of passengers, passengers needing assistance, and other last-minute occurrences. Thus, a clear understanding of the tail probabilities, namely percentiles, will help the system manager to provide the needed resources.
A number of publications (see, e.g., [4,5,6,7,8,9,10,11,12,13,14]) discuss boarding strategies, such as front-to-back, back-to-front, window seats first (followed by middles and aisles), outside-in, reverse pyramid, first-come-first-served, random, or methods in which boarding is performed by calling each passenger group individually, such as the Steffen method [12] and its variants. Such strategies may provide insights for determining a strategy that will speed up the boarding process. We refer the reader to [11] for the different boarding methods adopted by a few airlines. In recent times, most airlines have adopted a strategy that ensures that their most loyal customers are treated better than others. Loyal customers are those who travel frequently and thus accumulate enough award miles to achieve premier status, such as silver, gold, or platinum (the category names depend on the airline). Whatever strategy an airline adopts, there is always randomness involved in the actual boarding process.
Thus, airlines can reduce the time involved in carrying passengers from one city to another by considering various boarding strategies, as well as the boarding times. While there is research (as pointed out earlier) on boarding strategies, to our knowledge, there is no literature on the use of stochastic models to study the effect of a reduction in the average boarding time on the overall performance of the boarding process. The study of such stochastic models is very timely in view of the recent Wall Street Journal article on Southwest Airlines [1]. By building such stochastic models, in this paper, we try to answer questions such as “how much will a reduction in the boarding time result in an increase in the average number of trips made by an aircraft?”. With such quantitative descriptions of the reduction in the percentiles of the boarding times, management can adopt an appropriate boarding strategy to arrive at a reduction in the average boarding time.
Hence, our aim in this paper is to model the de-boarding–cleaning–boarding–flying sequence, from point A to point B to point A (for circular travel, as seen in many airlines in both local and international flights), using phase type (PH) distributions and to show the impact of eliminating one minute or more from the average boarding time on key measures involving percentiles. We also seek to quantify the guaranteed boarding time with a certain level of confidence, and we point out how such modeling can be generalized to a travel path consisting of more than two cities, which does not always need to be circular. Overall, airline companies are interested in scheduling their flights to maximize the number of trips that an aircraft can make (under ideal conditions), and having a stochastic model to analyze the time will contribute to such planning.
PH-distributions were introduced by Neuts [15] and have been extensively studied in the literature (see, e.g., [16,17,18,19,20]). Recall that a PH-distribution is obtained as the time until absorption in a finite continuous-time Markov chain (CTMC) with m transient states and one absorbing state. The generator of this CTMC has a block form in which the matrix D of dimension m governs the transitions among the transient states, a column vector of dimension m governs the rates of absorption into the absorbing state, and the row corresponding to the absorbing state is zero. Because every row of a generator sums to zero, adding this column vector to the vector of row sums of D yields the zero vector. Suppose that the CTMC starts in one of the m transient states according to an initial probability vector of dimension m. If X denotes the time until absorption in the CTMC, then the probability distribution of X is said to be a PH-distribution of dimension m, with a representation given by the initial probability vector and the matrix D.
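For concreteness, the construction described above can be written out as follows. Only D, m, and X are taken from the text; the symbols $Q$ (generator), $\boldsymbol{\alpha}$ (initial probability vector, assumed to satisfy $\boldsymbol{\alpha}\boldsymbol{e} = 1$), $\boldsymbol{d}$ (absorption-rate column vector), and $\boldsymbol{e}$ (column vector of 1s) are labels assumed here for illustration and may differ from the notation of the original displays.

```latex
Q = \begin{pmatrix} D & \boldsymbol{d} \\ \boldsymbol{0} & 0 \end{pmatrix},
\qquad
D\boldsymbol{e} + \boldsymbol{d} = \boldsymbol{0},
\qquad
P(X \le x) = 1 - \boldsymbol{\alpha}\, e^{Dx}\, \boldsymbol{e}, \quad x \ge 0 .
```

The middle relation is the zero-row-sum property of a generator, and the last expression is the standard distribution function of the time until absorption.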
The use of PH-distributions in stochastic modeling has been amply demonstrated in numerous publications since the seminal paper by Neuts. Very briefly, PH-distributions are obtained as the time until absorption in a finite Markov chain with one absorbing state. These distributions are defined in both discrete and continuous time, and, here, our focus is on the continuous-time version. To completely describe a PH-distribution, one needs an initial probability vector and a finite-dimensional matrix that governs the transitions among the transient states. For more details, including properties, examples, and computational aspects, we refer the reader to the above-mentioned references. In particular, we refer to the recent book by Chakravarthy [18] for detailed descriptions with a number of illustrative examples.
This paper is organized as follows. In Section 2, the basic model under study is described, along with its analysis. The analysis of the model, in both the steady state and the transient state, is presented in Section 3. Illustrative numerical examples of the basic model are discussed in Section 4, and concluding remarks are presented in Section 5.
2. Model Description
In this paper, we study boarding times by looking at only two cities, such that the same aircraft, with a capacity to carry N passengers, shuttles back and forth between them. If one is interested in extending this to multiple-city trips that pass through three or more cities, or to circular paths that return to the starting city, the model studied here can easily be generalized, at the cost of more states to describe the system; the details are left to the reader.
Generally speaking, the process involved in flying a commercial aircraft is as follows. After landing in a city, the aircraft pulls into the gate and the passengers de-board. The cleaning of the aircraft occurs before a new set of passengers boards the plane. Once the gate closes, the flight is ready to take off, and, after landing in a new city, the process continues. It should be pointed out that while the de-boarding, cleaning, and boarding times depend on the number of passengers, the time from the gate closing to landing in another city does not depend on the number of passengers.
We assume that the number of passengers boarding in each of the two cities follows its own probability mass function (pmf), stored as a vector. Thus, the jth entry of the vector for a given city is the probability that the aircraft, with a capacity of N, leaves that city with j passengers onboard. In this paper, we place no restriction on the nature of these two pmfs. They can be of any discrete form (e.g., binomial, truncated geometric, or truncated Poisson) whose support is compatible with the capacity N.
We use PH-distributions to model the eight sets of random variables. Generally, the de-boarding, cleaning, and boarding times depend on the number of passengers on the plane; hence, we model these dependencies by enabling the underlying PH-distribution to include another set of parameters. Our analysis here can be modified to model the dependencies under a more general setup using different PH-distributions. However, this will increase the number of input distributions significantly. To this end, we define the following (column) vectors of rates.
Once again, we place no restriction on the nature of the rate vector, for our modeling purposes. The following sets of random variables are needed to study the model.
Define the random variables, , for de-boarding in city and assume that of dimension .
Define the random variables, , for the cleaning of the aircraft while in city and assume that of dimension .
Define the random variables, , for boarding in city and assume that of dimension .
Define the random variable, , to represent the time required for the aircraft to leave the gate in city and arrive at the gate for de-boarding in city . Let of dimension .
Define the random variables, , for de-boarding in city and assume that of dimension .
Define the random variables, , for the cleaning of the aircraft while in city and assume that of dimension .
Define the random variables, , for boarding in city and assume that of dimension .
Define the random variable, , to represent the time taken for the aircraft to leave the gate in city and arrive at the gate for de-boarding in city . Let of dimension .
In the following, we need the terms below.
is a column vector of 1s with appropriate dimensions, which should be clear from the context. Where clarity is needed, the dimensions will be displayed.
I is an identity matrix of appropriate dimensions. Again, when clarity is needed, the dimensions will be displayed.
Suppose that a is a vector such that . Then, denotes a diagonal matrix of dimension n whose ith diagonal element is given by . The inverse, when it exists, of this diagonal matrix will be denoted as . In other words, .
The symbols ⊗ and ⊕ denote, respectively, the Kronecker product and the Kronecker sum of matrices. A few key works on these can be found in [21,22,23].
We define a column vector
of dimension
to be such that
For use in this work, we need the mean and the variance of a given PH-distribution, as well as the invariant vector of the PH-renewal process associated with it. These quantities are given by standard expressions (see, e.g., [18,20]).
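For reference, the standard closed forms for a PH-distribution with matrix D of dimension m are sketched below. The symbols are assumptions made here for illustration and need not match the original displays: $\boldsymbol{\alpha}$ is the initial probability vector, $\boldsymbol{d} = -D\boldsymbol{e}$ is the absorption-rate vector, $\mu$ and $\sigma^{2}$ are the mean and the variance, and $\boldsymbol{\pi}$ is the invariant vector.

```latex
\mu = \boldsymbol{\alpha}(-D)^{-1}\boldsymbol{e},
\qquad
\sigma^{2} = 2\,\boldsymbol{\alpha}D^{-2}\boldsymbol{e} - \mu^{2},
\qquad
\boldsymbol{\pi} = \mu^{-1}\,\boldsymbol{\alpha}(-D)^{-1},
\quad\text{where}\quad
\boldsymbol{\pi}\,(D + \boldsymbol{d}\boldsymbol{\alpha}) = \boldsymbol{0},
\;\; \boldsymbol{\pi}\boldsymbol{e} = 1 .
```

The last two relations state that $\boldsymbol{\pi}$ is the stationary probability vector of the generator $D + \boldsymbol{d}\boldsymbol{\alpha}$ of the PH-renewal process; substituting $\boldsymbol{d} = -D\boldsymbol{e}$ and $\boldsymbol{\alpha}\boldsymbol{e} = 1$ confirms that the closed form satisfies them.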
4. Illustrative Numerical Examples
In this section, we illustrate the key concepts with two sets of numerical examples. The time units, unless otherwise specified, are hours. The input parameters for the illustrative examples are chosen as follows. In the airline industry, the passenger load factor [24] is defined as the ratio of the number of actual passengers to the number of available seats. We denote this fraction as p in the following. Based on the data provided in [24], this fraction (for domestic flights) ranges from approximately 0.55 to 0.85. Thus, we conduct our analysis by taking p to be in this range. However, here, we discuss the examples only for p fixed at 0.55 and at 0.85.
It would be ideal to perform analyses with all practical data for the model studied here. However, except for the passenger load factor, to our knowledge, there are no data on boarding times available to use here. It is possible that these data are protected by each airline and are not available to the public. When these data are made available, or when an airline wishes to explore the use of the model proposed here, one can fit the data to a PH-distribution (see, e.g., [25,26,27,28,29,30,31,32]). For the pmf of the number of passengers, we consider truncated binomial, truncated (reverse) geometric, and truncated Poisson forms. In particular, we take the pmf in each of the two cities to be one of the following three discrete distributions.
Truncated binomial (B): This is a binomial distribution that is truncated so that all of the mass lies within the admissible range of passenger counts.
Truncated geometric (G): This is a (reversed) geometric distribution that is truncated in the same way.
Truncated Poisson (P): This is a Poisson distribution that is truncated in the same way.
In each case, the pmf is rescaled by a normalizing constant c to ensure a legitimate pmf.
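As a rough illustration of the three truncated forms, the following Python sketch builds each pmf by restricting the parent distribution to a range of passenger counts and renormalizing. The support `low..n` (with `low=1` by default), the function names, and in particular the exact "reversed" geometric form (weights proportional to `q**(n - j)`, so that the mass is largest at full capacity) are assumptions made for illustration; they are not taken from the paper's formulas.

```python
import math

def _normalize(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def _from_logs(logs):
    """Exponentiate log-weights relative to their maximum, then normalize."""
    peak = max(logs)
    return _normalize([math.exp(v - peak) for v in logs])

def truncated_binomial(n, p, low=1):
    """Binomial(n, p) restricted to low..n (assumed support) and renormalized."""
    logs = [math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(p) + (n - j) * math.log(1.0 - p)
            for j in range(low, n + 1)]
    return _from_logs(logs)

def truncated_poisson(n, lam, low=1):
    """Poisson(lam) restricted to low..n (assumed support) and renormalized."""
    logs = [j * math.log(lam) - math.lgamma(j + 1) for j in range(low, n + 1)]
    return _from_logs(logs)

def truncated_reverse_geometric(n, q, low=1):
    """Assumed reversed-geometric form: weight q**(n - j) on j = low..n."""
    return _normalize([q ** (n - j) for j in range(low, n + 1)])

def pmf_mean(pmf, low=1):
    """Mean of a pmf whose first entry corresponds to j = low."""
    return sum((low + i) * w for i, w in enumerate(pmf))

if __name__ == "__main__":
    b = truncated_binomial(100, 0.55)
    print("truncated binomial mean:", pmf_mean(b))
```

Working in log space and normalizing at the end means the normalizing constant never has to be computed explicitly, which also sidesteps overflow for large n.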
In order to compare the various scenarios (when varying the type of pmf), we have to choose the parameters of the above-mentioned pmfs in such a way that the mean number of passengers will always be the same. For example, if N = 100 and p = 0.55, then the mean number of passengers onboard will be 55. Thus, the parameters of the truncated binomial have to be chosen to arrive at this mean. It is worth mentioning that, due to the size of N and the values of p considered for the illustrative examples, the truncated binomial effectively reduces to the binomial one, since the mass removed by the truncation is negligible.
It should be pointed out that when computing the binomial probabilities, one can encounter overflow or underflow issues, especially when N is large. To avoid this, one should first locate the mode of the binomial distribution and then compute the remaining probabilities recursively, moving outward from the mode. To facilitate such computation, we provide the mode values for the binomial case. For the set of parameters considered in this section, Table 1 lists the corresponding parameter values.
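The mode-based computation can be sketched as follows. The text does not spell out the recursion, so this uses the standard ratio between neighboring binomial probabilities; starting from an unnormalized value of 1 at the mode and normalizing at the end is a convenience assumed here, and the function name is hypothetical.

```python
def binomial_pmf_from_mode(n, p):
    """Binomial(n, p) probabilities for j = 0..n, built outward from the mode.

    Requires 0 < p < 1. Every intermediate value is at most 1, so nothing
    overflows; far-tail values may underflow harmlessly to 0.0.
    """
    mode = min(n, int((n + 1) * p))      # floor((n + 1) p) is a mode
    pmf = [0.0] * (n + 1)
    pmf[mode] = 1.0                      # unnormalized starting value
    ratio = p / (1.0 - p)
    # Upward: P(j + 1) = P(j) * (n - j) / (j + 1) * p / (1 - p)
    for j in range(mode, n):
        pmf[j + 1] = pmf[j] * (n - j) / (j + 1) * ratio
    # Downward: P(j - 1) = P(j) * j / (n - j + 1) * (1 - p) / p
    for j in range(mode, 0, -1):
        pmf[j - 1] = pmf[j] * j / (n - j + 1) / ratio
    total = sum(pmf)
    return [v / total for v in pmf]

if __name__ == "__main__":
    probs = binomial_pmf_from_mode(400, 0.85)
    print("mean passengers:", sum(j * v for j, v in enumerate(probs)))
```

Because the values decrease monotonically away from the mode, the largest term is computed first and the rounding error in the tails has little effect on the normalized result.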
For the truncated binomial and the truncated Poisson, the values chosen for N and p make the probability of having fewer than, e.g., 10 passengers onboard insignificant (i.e., close to zero). This is not the case with the truncated geometric, because of the value of the geometric parameter required to guarantee a given mean. However, it is easy to remedy this by ensuring that the geometric distribution places positive mass only beyond a specific number, such as 10.
For the rate vector, given the starting value (at 1) and the ending value (at N), we consider four possible scenarios: (a) linearly decreasing rates; (b) quadratically decreasing rates; (c) rates decreasing in a square root manner; and (d) rates decreasing in a logarithmic manner. The starting and ending values are, respectively, the rates when one passenger and N passengers are onboard. Naturally, we require the starting value to exceed the ending value. In the expressions for these four scenarios, the suffix r is suppressed.
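To make the four scenarios concrete, here is a minimal Python sketch of one way to generate such rate vectors. The interpolation formulas below are assumptions chosen only so that each vector starts at the given value for one passenger, ends at the given value for N passengers, and decreases in the stated manner; the paper's own expressions may differ. The names `rate_vector` and `SHAPES` are hypothetical.

```python
import math

# Each shape maps passenger count j in 1..n to a fraction in [0, 1] that is
# 0 at j = 1 and 1 at j = n (assumed forms, not the paper's formulas).
SHAPES = {
    "linear": lambda j, n: (j - 1) / (n - 1),
    "quadratic": lambda j, n: ((j - 1) / (n - 1)) ** 2,
    "sqrt": lambda j, n: math.sqrt((j - 1) / (n - 1)),
    "log": lambda j, n: math.log(j) / math.log(n),
}

def rate_vector(n, start, end, kind="linear"):
    """Rates for j = 1..n that fall from `start` (j = 1) to `end` (j = n)."""
    if n < 2 or not start > end > 0:
        raise ValueError("need n >= 2 and start > end > 0")
    shape = SHAPES[kind]
    return [start - (start - end) * shape(j, n) for j in range(1, n + 1)]

if __name__ == "__main__":
    for kind in SHAPES:
        rates = rate_vector(50, 2.0, 1.0, kind)
        print(kind, round(rates[0], 4), round(rates[24], 4), round(rates[-1], 4))
```

Under these assumed forms, the quadratic shape keeps the rate high for most passenger counts and drops late, while the square root and logarithmic shapes drop early and then flatten.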
Since the main goal of this work is to determine the impact of reducing the average boarding time on the other measures, we fix the average values of the times spent in various events, such as de-boarding, cleaning, boarding, and flying from one city to the other. In order to do this, we need to fix the means of the eight PH-distributions accordingly. To this end, we use Equation (16), which relates these means to the parameters of the distributions. Thus, given a specific probability vector, the rate vector, and the target mean, we can find the parameter value that yields this mean.
For the following two examples, we fix the input parameters as follows. The unit is the number of hours, unless otherwise indicated.
We take the probability vectors
and
to be identical, but we vary the common one to be one of the three, namely
and
P as listed above. Moreover, for the rate vectors, we take
and
. The parameter values of
for these two sets, namely for
and
, will be
and
. However, we vary the type of decreasing to be one of the four listed above. In
Figure 1, we display a sample plot of the values of
under the four scenarios:
and
.
The mean boarding times are varied from 25 min to 30 min in increments of 1 min, i.e., in increments of 1/60 h. The capacity of the aircraft, N, is varied from 50 to 400 in increments of 50. The plots in Figure 2, Figure 3 and Figure 4 show how, under the values chosen, the mean of the underlying random variable (and hence the others), which can be controlled by the system providing the needed resources, behaves as we vary N, the type of probability vector, and the type of rate vector. Before we consider these (spider) figures, a few explanatory details are in order. The two-tuple values displayed at the perimeter of the outermost circle correspond to N and the mean time (in minutes). Thus, the two-tuple value 50 25 corresponds to N = 50 and a mean boarding time of 25 min. The legend containing B, G, and P indicates the type of distribution used to model the pmf of the number of passengers.
One can clearly notice the patterns in these plots, indicating significant changes to the mean as N and are varied, as well as the type of probability vector (either truncated binomial or truncated geometric). However, we do not see a significant difference between the truncated binomial and truncated Poisson. As is to be expected, the mean increases as the average boarding time is increased under both values of considered. This behavior indicates that an increase in the rate calls for additional resources or additional strategies to quicken the process of boarding. For example, this can be achieved by increasing the number of gate attendants (based on the value of , which should be known ahead of time). Among the four types of rate vector considered, it appears that a quadratically decreasing rate gives the smallest mean (for and ), indicating that the system can dynamically provide gate attendants to help the passengers to board the aircraft.
Illustrative Example 1: In this example, we use Erlang distributions for all underlying random variables. Recall that an Erlang distribution of order m with parameter $\lambda$ has the pdf given by
$$f(x) = \frac{\lambda^{m} x^{m-1} e^{-\lambda x}}{(m-1)!}, \quad x \ge 0.$$
Note that the mean and the variance are, respectively, given by $m/\lambda$ and $m/\lambda^{2}$. One advantage of using this distribution is that, by choosing the order to be a large positive integer, we can model a random variable that has very little variation.
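As a small illustration of how percentiles of an Erlang time can be obtained, the sketch below evaluates the Erlang distribution function in closed form and inverts it by bisection. The order (5) and the mean values used in the demonstration are hypothetical and are not the parameter values of this example; the function names are likewise assumptions.

```python
import math

def erlang_cdf(x, m, lam):
    """P(X <= x) for an Erlang of order m and rate lam (mean m / lam)."""
    if x <= 0:
        return 0.0
    term = math.exp(-lam * x)            # k = 0 term of the Poisson sum
    total = term
    for k in range(1, m):
        term *= lam * x / k              # builds exp(-lam x) (lam x)^k / k!
        total += term
    return 1.0 - total

def erlang_percentile(q, m, lam, tol=1e-10):
    """Smallest x with CDF >= q (0 < q < 1), found by bisection."""
    lo, hi = 0.0, m / lam
    while erlang_cdf(hi, m, lam) < q:    # expand until the target is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, m, lam) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    order = 5                            # hypothetical order
    for mean_min in (30, 25):            # hypothetical mean times in minutes
        lam = order / (mean_min / 60.0)  # rate per hour giving this mean
        p95 = erlang_percentile(0.95, order, lam) * 60.0
        print(mean_min, "min mean -> 95th percentile (min):", round(p95, 2))
```

For a fixed order, the Erlang family is a pure scale family, so every percentile is proportional to the mean; the coefficient of variation is $1/\sqrt{m}$, which is why a large order models very little variation.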
Using the notation
the order and the parameter values for this example are as follows.
The parameters
N and
are varied, respectively, from 50 to 400 and from 25 to 30 min in increments of 1 min. In
Figure 5 and
Figure 6, respectively, we display the key measures for probability vectors labeled
B and
G under a linearly decreasing (
) rate vector. Since we saw similar behavior for the other types of theta vectors, namely
and
, we display the results here only for the
case.
It is clear by looking at these figures (as well as the ones not provided here due to the similarity of the plots) that the following occurs.
Only the type of probability vector (whether it is truncated binomial or truncated geometric) and the type of theta vector ( through ) appear to have an impact on the percentiles.
Comparing the truncated binomial probability (B) and the truncated geometric (G) schemes, we notice that (a) scheme G gives small values for the 50th percentile for both values of ; (b) for the 75th percentile, while scheme G gives small values when , the values are similar for both schemes when ; (c) scheme B gives small values for the 95th percentile for both values of . This indicates that scheme G starts with small values for the percentiles (as compared to scheme B) and then yields progressively larger values for higher percentiles.
Finally, we look at the percentage reduction in the percentiles when decreasing the average boarding time. We denote by
the value of the
ith percentile when the average boarding time is
j. Thus,
stands for the 50th percentile when the average boarding time is 25 min. The reduction percentage, for a given
ith percentile and a given average boarding time
j, is calculated as
The reduction percentages do not appear to be significant when the probability vectors, the rate vectors, or
are changed. Hence, in
Figure 7, we display the reduction percentages for the case of a truncated binomial and
rate.
It is clear from this figure that the reduction percentages are almost identical for all three percentiles. Further, a 5 min reduction in the average boarding time results in a more than 16% reduction in the percentile value. In terms of guarantees, for at least 95% of the time, the boarding time will not exceed a value that lies between 45 and 69 min, depending on the value of N. The average 95% guarantee time across the board is about 50 min. If one were to give a guarantee at a 50% level, the boarding time would not exceed a value between 21 and 26 min, depending on the value of N. The average 50% guarantee time across the board is about 25 min.
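The size of the reported reduction can be checked with a short calculation. The sketch below assumes that the reduction percentage is measured relative to the percentile at the largest average boarding time considered (30 min); this baseline is an assumption, as is the function name. If a percentile scales in proportion to the mean, a reduction from 30 to 25 min gives 100 × 5/30, which is consistent with the "more than 16%" figure.

```python
def reduction_percentage(baseline_value, reduced_value):
    """Percentage drop from a baseline percentile to a reduced one."""
    return 100.0 * (baseline_value - reduced_value) / baseline_value

if __name__ == "__main__":
    # Hypothetical percentile values that scale with the mean (30 -> 25 min).
    print(round(reduction_percentage(30.0, 25.0), 2))
```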
Illustrative Example 2: In this example, we use Erlang as well as hyperexponential distributions for the underlying random variables. Recall that a hyperexponential distribution of order m with parameters $\lambda_1, \ldots, \lambda_m$ and corresponding mixing probabilities $q_1, \ldots, q_m$ has the pdf given by
$$f(x) = \sum_{i=1}^{m} q_i \lambda_i e^{-\lambda_i x}, \quad x \ge 0.$$
Note that the mean and the variance are, respectively, given by
$$\sum_{i=1}^{m} \frac{q_i}{\lambda_i}$$
and
$$2\sum_{i=1}^{m} \frac{q_i}{\lambda_i^{2}} - \left(\sum_{i=1}^{m} \frac{q_i}{\lambda_i}\right)^{2}.$$
This distribution can be used when there is large variability in the underlying random variable. The order and the parameter values for this example are as follows.
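The moments and percentiles of a hyperexponential time can be sketched in a few lines of Python. The mixing probabilities and rates used in the demonstration are hypothetical and are not the parameter values of this example; the function names are assumptions.

```python
import math

def hyperexp_mean_var(probs, rates):
    """Mean and variance of a mixture of exponentials."""
    m1 = sum(q / r for q, r in zip(probs, rates))
    m2 = 2.0 * sum(q / r ** 2 for q, r in zip(probs, rates))
    return m1, m2 - m1 ** 2

def hyperexp_cdf(x, probs, rates):
    """P(X <= x) for the mixture."""
    if x <= 0:
        return 0.0
    return 1.0 - sum(q * math.exp(-r * x) for q, r in zip(probs, rates))

def hyperexp_percentile(level, probs, rates, tol=1e-10):
    """Smallest x with CDF >= level (0 < level < 1), found by bisection."""
    lo, hi = 0.0, 1.0 / min(rates)
    while hyperexp_cdf(hi, probs, rates) < level:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hyperexp_cdf(mid, probs, rates) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    probs, rates = [0.9, 0.1], [10.0, 1.0]   # hypothetical parameters
    mean, var = hyperexp_mean_var(probs, rates)
    print("mean:", mean, "squared coefficient of variation:", var / mean ** 2)
    print("median:", hyperexp_percentile(0.50, probs, rates))
    print("95th percentile:", hyperexp_percentile(0.95, probs, rates))
```

With parameters of this kind, the median falls below the mean while the upper percentiles lie far above it, which is the long-tail behavior that distinguishes this example from the Erlang case.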
As in the previous example, we vary the parameters N and , respectively, from 50 to 400 and from 25 to 30 min in increments of 1 min.
In
Figure 8 and
Figure 9, respectively, we display the key measures for probability vectors labeled
B and
G under a linearly decreasing (
) rate vector.
It is clear by looking at these figures (as well as the ones not provided here due to the similarity of the plots) that the following occur.
Only the type of probability vector (whether it is truncated binomial or truncated geometric) and the type of theta vector ( through ) appear to have an impact on the percentiles. This observation is similar to the one seen in the previous example.
Comparing the truncated binomial probability (B) and the truncated geometric (G) schemes, we notice that (a) scheme G gives small values for the 50th percentile for both values of ; (b) for the 75th and the 95th percentiles, while scheme G gives small values when , the values are rather similar for both schemes when . This indicates that the large variability in the boarding times appears to nullify any significant differences in the value, especially when the percentiles increase.
The plot of the reduction percentage for this example is almost identical to the one seen in the previous example, and hence the figure is not displayed here. However, the guarantee times differ, and the details are as follows. A 5 min reduction in the average boarding time results in a more than 16% reduction in the percentile value. In terms of guarantees, for at least 95% of the time, the boarding time will not exceed a value that lies between 73 and 79 min, depending on the value of N. The average 95% guarantee time across the board is about 75 min. If one were to provide a guarantee at a 50% level, then the boarding time would not exceed a value between 9 and 11 min, depending on the value of N. The average 50% guarantee time across the board is about 10 min.
Comparing the two illustrative examples, it is worth pointing out that, while the guarantee times at the 50% level are much smaller for the boarding times with large variability, the guarantee times at the 95% level are smaller for the boarding times with small variability. This is probably due to the long tail of the probability distribution of the boarding time in the large-variability case.
Illustrative Example 3: Here, we provide a brief discussion of the
of the time to travel by looking at the case when
, with the truncated binomial probabilities and
rates for the above-mentioned two illustrative examples. In
Figure 10, we plot the representative
of the time to travel from city
to city
and then back to city
; moreover, the
of the travel time from city
to city
under three scenarios is plotted in
Figure 11.
It is evident from the above plots that, for all Erlang cases (see Illustrative Example 1), the of the total travel time from city to city and back is a bell-shaped curve. Thus, one can try to fit a normal with mean h and standard deviation 0.488975 h. For the of the total travel time from city to city in the Erlang case (see Illustrative Example 1), the normal curves for the three schemes, and P, have the same mean of h but the standard deviations are, respectively, 0.343545 h, 0.430563 h, and 0.374347 h. This normal fit approximation will be helpful to managers seeking a quick solution using Excel or Excel-type worksheets in workplaces. Hence, we point out the possibility to implement the model proposed here.
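The kind of quick calculation that a normal approximation enables can be sketched as follows, using Python's standard library in place of a worksheet. The three standard deviations are the ones quoted above for the one-way travel time; the mean of 3.0 h is a hypothetical placeholder used only to make the sketch run, and the assignment of the values to the labels B, G, and P follows the order in which the text lists them.

```python
from statistics import NormalDist

# Standard deviations (hours) quoted in the text, in the order listed there.
STD_HOURS = {"B": 0.343545, "G": 0.430563, "P": 0.374347}

def normal_percentile(mean_hours, std_hours, level):
    """Percentile of the fitted normal approximation, in hours."""
    return NormalDist(mu=mean_hours, sigma=std_hours).inv_cdf(level)

if __name__ == "__main__":
    assumed_mean = 3.0   # hypothetical placeholder for the common mean (h)
    for label, sd in STD_HOURS.items():
        p95 = normal_percentile(assumed_mean, sd, 0.95)
        print(label, "approximate 95th percentile (h):", round(p95, 3))
```

Since the three schemes share the same mean, differences in their upper percentiles under this approximation come entirely from the standard deviations.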
In the case of Illustrative Example 2, wherein we used a hyperexponential distribution to model the boarding times, we notice a long-tailed distribution for the . One can try to fit a three-parameter gamma, a three-parameter Weibull, or even a three-parameter lognormal distribution to approximate the when dealing with a long-tailed distribution such as the one seen in the plots. For example, we fitted a three-parameter gamma distribution with the shape, scale, and threshold parameters, respectively, of 29.0, 0.153, and 1.5 for the total travel time from city and back to city .