Travel demand estimation models are based on behavioral data obtained by surveying a sample of the population that is expected to use a proposed transportation system. Such data can be based on revealed or stated preference. When the goal is to estimate demand for improving an existing system, or for building a new system similar to one that already exists in some form in the area, the behavior of the users of the existing system can be observed. In that case, data on the revealed preference of potential users can be collected. This is not possible when estimating demand for a system that does not yet exist, because the revealed preference of users cannot be observed. In such cases, data collection has to be based on stated preference methods.
4.2. Stated Preference Survey
The population targeted for the survey used in this research was the BSU community, consisting of students, staff, and faculty. Email addresses of 5000 students and 3681 faculty and staff were obtained from the BSU Office of Institutional Research. The student list was a random sample of the student population; the faculty and staff list constituted the entire population of faculty and staff. The survey questionnaire was sent to 8681 students, faculty, and staff in April 2018. A total of 1821 people completed the survey, resulting in a response rate of 21%.
4.2.1. Survey Questionnaire
The primary objective of the survey was to assess the willingness of the respondents to use an aerial tramway between the campus and downtown Boise based on specific attributes of the hypothetical tramway relative to the current mode used by the respondents to travel between these locations. To keep the choice task simple, only two attributes were used to describe the non-existent mode, the aerial tramway. The attributes were cost and convenience. The two attributes were thought to capture multiple characteristics of a mode that a traveler considers either implicitly or explicitly when deciding between travel modes. Such characteristics include travel time, travel cost, comfort, and parking availability.
The survey started with an introductory statement that described the purpose of the survey. It then asked the respondent about their affiliation with the university. The choices for affiliation were: lower division undergraduate student (freshman/sophomore), upper division undergraduate student (junior/senior), graduate student, staff, and faculty. The next question was whether they made a trip downtown in the last 30 days. For respondents who had not made the downtown trip, the survey ended with one last question regarding the choice for the aerial tramway under various combinations of cost and convenience, as described in Section 4.2.2.
Respondents who did make a trip downtown in the last 30 days were then asked about the mode they used to make the trip. The mode choices were car, shuttle, bicycle, and walking. The questionnaire then branched out to one of four blocks of questions based on the four modes. Respondents had to answer only the questions related to the block to which they were directed.
Questions in each of the blocks were almost identical. Respondents were asked about the travel time of their trip and how they rated their trip in comparison to other potential modes in terms of cost and convenience. The final question respondents had to answer was whether they would choose the aerial tramway over their chosen mode under various combinations of cost and convenience. The questionnaire used in the online survey is shown in Appendix A.
4.2.2. The Choice Task
Three levels of each attribute were used in the choice task. The levels chosen were whether the cost and convenience of the proposed mode (aerial tramway) were better, the same, or worse than the current mode used by the respondent. A full factorial experiment design was used. With two attributes each with three levels, the total number of choices is nine, which was not considered to be too onerous for respondents.
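The full factorial design described above can be enumerated directly. The following is a minimal sketch, assuming the level labels "better", "same", and "worse" from the text; the variable and function names are illustrative and are not taken from the survey instrument.

```python
from itertools import product

# Attribute names and levels follow the text; identifiers are illustrative.
ATTRIBUTES = ("cost", "convenience")
LEVELS = ("better", "same", "worse")


def full_factorial(attributes, levels):
    """Return every combination of levels across the given attributes."""
    combos = product(levels, repeat=len(attributes))
    return [dict(zip(attributes, combo)) for combo in combos]


scenarios = full_factorial(ATTRIBUTES, LEVELS)

# Two attributes with three levels each give 3 * 3 = 9 choice scenarios.
for number, scenario in enumerate(scenarios, start=1):
    print(number, scenario)
```

Each dictionary corresponds to one scenario in which the tramway is compared with the respondent's current mode.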
Respondents who had made a recent trip downtown would be aware of the cost and convenience of the mode they used for that trip. Respondents who had not made this trip in the last 30 days and who responded to this question would have had to rely on their recollection of the cost and convenience of the mode they used when they last made the trip. The survey instrument used in the online survey, including the choice task question, is available in Appendix A.
4.3. Theoretical Background
In a logistic regression model, the log odds of a binary outcome variable are expressed as a linear function of user-selected predictor variables. Let $Y$ be a binary outcome variable denoting failure/success by 0/1 and $p$ be the probability that $Y = 1$, or $p = P(Y = 1)$. Let $x_1, x_2, \ldots, x_k$ be a set of predictor variables. In a logistic regression of $Y$ on the set of predictor variables, the probability of success, $p$, can be related to a function of the predictor variables through the logit transformation of $p$, as shown in Equation (1).

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \tag{1}$$

The probability of success, $p$, can then be expressed as shown in Equation (2).

$$p = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k}} \tag{2}$$

Estimates for the parameters, $\beta_0, \beta_1, \ldots, \beta_k$, can be obtained via the maximum likelihood method applied to Equation (1). The first parameter, $\beta_0$, corresponds to the constant term in the equation. Further details about logistic regression are available in references such as [23,24].
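The relationship between the logit transformation and the probability of success can be illustrated with a short sketch. The intercept, coefficients, and predictor values below are hypothetical and are not estimates from this study.

```python
import math


def logit(p):
    """Log odds of a probability p, valid for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Probability implied by a linear predictor eta.

    1 / (1 + exp(-eta)) is algebraically equal to exp(eta) / (1 + exp(eta)).
    """
    return 1.0 / (1.0 + math.exp(-eta))


def success_probability(intercept, coefficients, predictors):
    """Evaluate the linear predictor and map it to a probability."""
    eta = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return inverse_logit(eta)


# Hypothetical values, for illustration only.
p = success_probability(-0.5, [0.8, -0.3], [1, 0])
print(p, logit(p))  # logit(p) recovers the linear predictor
```

Applying `logit` to the returned probability recovers the linear predictor, which is the sense in which the two expressions are inverses of each other.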
As the change in log odds, expressed by the logit function, due to a unit change in a predictor variable is difficult to interpret, a different measure is used for this purpose. The alternative measure is the odds ratio. The odds ratio (OR) is the ratio of the odds of the outcome among respondents with $x_j = 1$ to the odds of the outcome among respondents with $x_j = 0$. This interpretation applies when the predictor variable is a dummy variable with possible values of 0 or 1. For a continuous predictor variable, the OR needs to be interpreted in terms of a unit change in the predictor variable. The OR can be shown to be equal to the exponential of the coefficient for $x_j$: $\mathrm{OR} = e^{\beta_j}$.
When the predictor variable is a dichotomous variable, using the OR to assess the effect of the variable on the probability of the response is meaningful. However, when the predictor variable is a non-dichotomous variable, the effect of a one-unit change on the probability of the response may not be meaningful, depending on the units used to measure the variable. For example, if a travel time variable is used as a predictor variable and it is measured in minutes, then a change of 1 min of travel time may not be meaningful. We may be more interested in finding the impact of a change in travel time of 5 min or 10 min. Fortunately, the OR can still be used to assess the impact of a change of any multiple of the basic unit in the predictor variable. For example, if the effect of a change of $c$ times the basic unit of variable $x_j$ is desired, the OR for that change can be expressed as being equal to $e^{c\beta_j}$ [23].
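The scaling of the OR for a change of several units can be sketched as follows. The travel time coefficient used here is a hypothetical value chosen for illustration, not an estimate from this study.

```python
import math


def odds_ratio(beta, c=1.0):
    """Odds ratio for a change of c units in a predictor with coefficient beta."""
    return math.exp(c * beta)


# Hypothetical coefficient for travel time measured in minutes.
beta_time = -0.04

or_1_min = odds_ratio(beta_time)
or_5_min = odds_ratio(beta_time, c=5)
or_10_min = odds_ratio(beta_time, c=10)

print(or_1_min, or_5_min, or_10_min)
```

Because the exponent is linear in the size of the change, the OR for a change of five units equals the one-unit OR raised to the fifth power.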
Exponentiating the estimated coefficient of a predictor variable gives the OR of the outcome for a unit change in that variable only when the model contains no interaction terms involving the variable. When interaction terms are present, the effect of one variable depends on the level of the variable with which it interacts. A more involved interpretation of the coefficients is needed in such a case, as explained below.
Consider the case of two nominal variables, Affiliation and Mode, with multiple levels each. As shown in Table 1, Affiliation has five levels, and there are four modes for the Mode variable. If these variables are included in the model, they will be represented by four and three dummy variables, respectively. One of the levels in each of the nominal variables will be treated as the base level, for which all of the corresponding dummy variables take the value of zero. Each dummy variable is assigned a value of one when the respondent belongs to the corresponding level and zero otherwise. The abbreviations UGLD and UGUD in Table 1 denote Undergraduate Lower Division and Undergraduate Upper Division, respectively.
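The dummy coding described above can be sketched in a few lines. The choice of "Faculty" and "Walking" as base levels is an assumption made only for this illustration, since the base levels are not stated in this section; the level labels other than UGLD and UGUD are also illustrative.

```python
# Level labels are illustrative; only UGLD and UGUD are abbreviations from the text.
AFFILIATION_LEVELS = ["UGLD", "UGUD", "Graduate", "Staff", "Faculty"]
MODE_LEVELS = ["Car", "Shuttle", "Bicycle", "Walking"]


def dummy_code(value, levels, base, prefix):
    """Return 0/1 dummy variables for every level except the base level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {f"{prefix}{level}": int(value == level)
            for level in levels if level != base}


# Base levels below are assumptions for this sketch.
role = dummy_code("UGLD", AFFILIATION_LEVELS, base="Faculty", prefix="Role")
mode = dummy_code("Car", MODE_LEVELS, base="Walking", prefix="Mode")
base_role = dummy_code("Faculty", AFFILIATION_LEVELS, base="Faculty", prefix="Role")

print(role)       # four dummies, RoleUGLD set to 1
print(mode)       # three dummies, ModeCar set to 1
print(base_role)  # four dummies, all 0 for the base level
```

A variable with five levels yields four dummy variables and a variable with four levels yields three, with the base level identified by all of its dummies being zero.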
Consider an example in which the affiliation of the respondent is UGLD, represented by the dummy variable RoleUGLD, and the respondent uses the car mode to make the trip. The car mode is represented by the dummy variable ModeCar. The logistic regression model for this example is represented by Equation (3):

$$\operatorname{logit}(p) = \beta_0 + \beta_1\,\mathrm{RoleUGLD} + \beta_2\,\mathrm{ModeCar} + \beta_3\,(\mathrm{RoleUGLD} \times \mathrm{ModeCar}) \tag{3}$$

Note that the other three dummy variables for Affiliation and the other two dummy variables for Mode are omitted from Equation (3) to simplify the discussion. Because of the interaction term in the model, the effect of either dummy variable depends on the value of the other dummy variable. To assess the effect of UGLD, two cases have to be considered: one when the mode used is not a car and the other when the mode is a car. For the former, the effect of UGLD is accounted for by the value of $\beta_1$ alone, so the OR is $e^{\beta_1}$. For the latter, $(\beta_1 + \beta_3)$ has to be exponentiated to find the OR for a respondent whose Role is UGLD, relative to the base Role, when the Mode used is a car.
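The two cases for the effect of UGLD can be checked numerically. The sketch below assumes a model with an intercept, a RoleUGLD term, a ModeCar term, and their interaction; all coefficient values are hypothetical and the function names are illustrative.

```python
import math


def log_odds(b0, b_role, b_mode, b_inter, role_ugld, mode_car):
    """Linear predictor with main effects and a RoleUGLD x ModeCar interaction."""
    return (b0 + b_role * role_ugld + b_mode * mode_car
            + b_inter * role_ugld * mode_car)


def or_ugld(b_role, b_inter, mode_car):
    """OR of UGLD versus the base role, conditional on the mode dummy."""
    return math.exp(b_role + b_inter * mode_car)


# Hypothetical coefficients, for illustration only.
b0, b_role, b_mode, b_inter = -1.0, 0.6, -0.4, 0.3

or_not_car = or_ugld(b_role, b_inter, mode_car=0)  # exp(b_role)
or_car = or_ugld(b_role, b_inter, mode_car=1)      # exp(b_role + b_inter)

# The same ORs obtained from differences in log odds.
check_car = math.exp(log_odds(b0, b_role, b_mode, b_inter, 1, 1)
                     - log_odds(b0, b_role, b_mode, b_inter, 0, 1))
print(or_not_car, or_car, check_car)
```

The intercept and the ModeCar main effect cancel when the two log odds are subtracted, which is why only the RoleUGLD coefficient and the interaction coefficient appear in the OR.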