The choice of methods was based on the purpose of the research and the availability of information. In Brazil, there is no long-term time series information about air links, and jet fuel data are available only for domestic air companies. The panel data approach proved adequate because the data exhibit cross-section and period dependences with specific characteristics. Initially, tests were performed considering only one stage; however, the possibility of endogeneity in the variables was observed, and thus the two-stage method with an instrumental variable was adopted. In addition, the statistical tests recommended the adoption of fixed effects, both in the cross-section and in the period. The choice of variables resulted from a search that extended to other sectors of activity [3], which provided information related to the company’s operating activities.
This research was developed following the steps shown in Figure 1. The first step was the motivation for this research, and the second was to find a key performance indicator to measure jet fuel efficiency. In the third step, available data that could be linked to jet fuel efficiency were searched for, collected, and processed, and the need for their preparation was verified. The fourth step was the effective preparation of the data. In the fifth step, the unbalanced panel data model was analyzed to understand which factors within the collected data could influence jet fuel efficiency. The sixth, seventh, and eighth steps involved the choice of the analytical methodology to be adopted and its results. As the final target, this research demonstrated that a reduction in the idle capacity of air transport can significantly improve environmental results.
2.1. Analytical Methodology
Panel data, also known as longitudinal data, is a structure recommended when the explanatory variables are time-dependent; it represents repeated observations of a set of cross-section units. That is, the predictive and explanatory variables of interest are measured on different occasions, generally over time, for each individual or element (in this study, air routes). In longitudinal studies, the observations of an individual over time are correlated and thus demand statistical techniques that take account of that dependence [18]. Longitudinal data offer several advantages over purely cross-section or purely time series data. The benefits include being able to study dynamic relations over time and to model differences among individuals [19]. Econometric approaches to panel data analysis have evolved over the years, and experts have developed several methodologies to accommodate specific characteristics of the observed data [20]. The literature recommends experimenting with various approaches in order to select the most appropriate model, and statistical tests have been developed to assist in selecting among approaches.
The analytical model proposed in this paper endeavored to explain the relationship between the partial productivity of fuel and idle capacity. As the study focused on average air transport efficiency per route per year, no distinction was made between airlines, and the aspects of competition among airlines on the routes were not addressed. It is important to point out that the model aimed to explain a variable related to the operational cost of the route that is strongly linked to fuel consumption. Thus, the productivity of the fuel on a specific route in a given year can be determined by the operational characteristics of the airline companies operating that route in that year. There is an expectation that, over the years, the performance of airline companies in fuel usage improves, either through the refinement of operational procedures or through continuous technological improvement in the industry. For this reason, the hypothesis of a fixed or random annual effect was tested in the choice of the model. Another important aspect is that each route has characteristics that cannot be explained solely by the distance between two airports. It is therefore necessary to consider an effect for each route, which may be fixed or random according to the appropriate statistical tests. To confirm the suitable type of model, we performed the following tests: redundant fixed effects or Chow test (likelihood ratio), omitted random effects (Lagrange multiplier), and correlated random effects (Hausman test) [19].
Once the panel data approach is defined, it is necessary to consider the possibility of endogeneity in the model, which can call for a two-stage least squares panel approach using an instrumental variable. Therefore, the model included a variable that represents the usage level of the airplanes (capacity), an operational variable characterizing the airline companies on the route, and an instrumental variable that could mitigate endogeneity problems in the model. Other variables, such as waiting time and aircraft taxi data, also affect the partial fuel productivity, but no data were available to estimate their impact. For a thorough discussion of the two-stage least squares panel approach, see Hsiao [20], and for software implementation, see EViews 11 [21]. The dependent variable was the tonne∙km transported per liter of fuel. The variable linked to the level of usage was the idle capacity, which was determined as the airplane’s total capacity minus the annual average load factor. The airline companies’ operational decision regarding the route was represented by the annual average payload offered on that route. The model’s instrumental variable was the average weekly frequency of flights on a specific route per year. The effects of each route and of each period were defined (as fixed or random) in accordance with the results of the statistical tests carried out. In the period considered, a uniform pattern of fleet usage was observed for the routes considered. Since the analysis was performed for each route per year, the specifications and technical characteristics of the airplanes were not included in the formulation of the models. The regressions were performed with the econometric software EViews 11 [21]. The model does not require much computational effort and can be processed on a personal computer; in this research, an Intel Core i7 computer with 16 GB of RAM was used.
The equations’ notation follows EViews 11 [21], adapted to the study variables. All variables reflect the annual mean on each route (from city $i$ to city $j$) for each year $t$. The corresponding model used to estimate the regression parameters is shown in Equation (1).
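$$\ln(FP_{ijt}) = c + \alpha \ln(SIZE_{ijt}) + \beta \ln(IC_{ijt}) + \delta_{ij} + \gamma_{t} + \varepsilon_{ijt} \quad (1)$$
where
$c$: constant;
$\ln$: the natural logarithm of the variables;
$\delta_{ij}$: estimated cross-section fixed effect coefficients;
$\gamma_{t}$: estimated period fixed effect;
$\alpha$ and $\beta$: estimated regression model coefficients;
$FP_{ijt}$: mean fuel productivity on route $i$ to $j$ in year $t$ (tonne∙km/L);
$SIZE_{ijt}$: mean aircraft size on route $i$ to $j$ in year $t$ (kg);
$IC_{ijt}$: mean idle capacity on route $i$ to $j$ in year $t$ (ratio);
$\varepsilon_{ijt}$: regression error.
The idle capacity term $\ln(IC_{ijt})$ is instrumented, as given in Equation (2):
$$\ln(IC_{ijt}) = c + a \ln(FREQ_{ijt}) + b \ln(SIZE_{ijt}) + u_{ijt} \quad (2)$$
where
$FREQ_{ijt}$: mean weekly frequency on route $i$ to $j$ in year $t$;
$u_{ijt}$: regression error;
$c$, $a$, and $b$: estimated regression model coefficients.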
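The two-stage logic behind Equations (1) and (2) can be illustrated with a minimal sketch on simulated data. It is not the EViews procedure used in this study: it assumes a balanced toy panel (the study's panel is unbalanced), removes route and year fixed effects by two-way demeaning, and uses hypothetical coefficient values and variable names throughout.

```python
import random

random.seed(42)

N_ROUTES, N_YEARS = 100, 10   # balanced toy panel (assumption; the study's panel is unbalanced)
ALPHA, BETA = 0.4, -0.3       # hypothetical "true" elasticities used only to simulate data


def two_way_demean(panel):
    """Within transformation: subtract route and year means, add back the grand mean."""
    n, t = len(panel), len(panel[0])
    grand = sum(sum(row) for row in panel) / (n * t)
    route_mean = [sum(row) / t for row in panel]
    year_mean = [sum(panel[r][y] for r in range(n)) / n for y in range(t)]
    return [panel[r][y] - route_mean[r] - year_mean[y] + grand
            for r in range(n) for y in range(t)]


def ols_two_regressors(y, x1, x2):
    """Least squares of y on x1 and x2 without intercept (inputs are already demeaned)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * v for a, v in zip(x1, y))
    s2y = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulate log variables; "shock" is unobserved and makes idle capacity endogenous.
ln_fp, ln_size, ln_idle, ln_freq = [], [], [], []
for r in range(N_ROUTES):
    route_effect = random.gauss(0.0, 0.3)
    fp_row, size_row, idle_row, freq_row = [], [], [], []
    for y in range(N_YEARS):
        year_effect = 0.01 * y
        size = random.gauss(9.5, 0.3)
        freq = random.gauss(3.0, 0.5)
        shock = random.gauss(0.0, 0.2)
        idle = -1.0 - 0.6 * freq + 0.1 * size + shock + random.gauss(0.0, 0.1)
        fp = (0.5 + ALPHA * size + BETA * idle + route_effect + year_effect
              - 0.5 * shock + random.gauss(0.0, 0.05))
        fp_row.append(fp)
        size_row.append(size)
        idle_row.append(idle)
        freq_row.append(freq)
    ln_fp.append(fp_row)
    ln_size.append(size_row)
    ln_idle.append(idle_row)
    ln_freq.append(freq_row)

y_w = two_way_demean(ln_fp)
size_w = two_way_demean(ln_size)
idle_w = two_way_demean(ln_idle)
freq_w = two_way_demean(ln_freq)

# Stage 1: regress idle capacity on the instrument (frequency) and the exogenous size.
a_hat, b_hat = ols_two_regressors(idle_w, freq_w, size_w)
idle_fit = [a_hat * f + b_hat * s for f, s in zip(freq_w, size_w)]

# Stage 2: regress fuel productivity on fitted idle capacity and size.
beta_2sls, alpha_2sls = ols_two_regressors(y_w, idle_fit, size_w)

# Single-stage estimate for comparison; it absorbs the unobserved shock.
beta_ols, _ = ols_two_regressors(y_w, idle_w, size_w)

print(f"2SLS beta: {beta_2sls:.3f}  OLS beta: {beta_ols:.3f}  true beta: {BETA}")
```

The sketch reports point estimates only. Standard errors computed naively from the second-stage residuals would not be valid, which is one reason a dedicated econometric package is used in practice.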
2.2. Data
The data set was formatted as unbalanced panel data for domestic air routes in Brazil from 2007 to 2016. Although the database contains information from 2000 onward, 2007 was chosen as the starting year by considering the period in which the four airline companies operating the domestic Brazilian air routes reached 90% of the market share (see Table 1). From 2000 until 2006, Brazil experienced a process of consolidation and bankruptcy of airline companies, which made the market unstable in terms of its operational conditions. From 2007 on, two airline companies, TAM and GOL Airlines, dominated the Brazilian domestic air market. After that year, AVIANCA Airlines, which entered the market in 2003, began to hold a significant share; in 2008, AZUL Airlines entered the market and very quickly increased its share. In 2016, these four companies represented 99% of the market share for commercial domestic routes. The evolution of each company’s share of revenue passenger-kilometers (RPKs) in the Brazilian total is shown in Table 1. As the structure of Brazilian domestic air transport was almost the same in 2019, and there were no substantial changes in the aircraft fleet, we believe that the research results are still valid.
The annual information was organized using the available Brazilian National Civil Aviation Agency (ANAC) database. The main reason the data set was unbalanced is the variation in Brazil’s air transport network during the study period, especially on the regional routes. In the case of missing data, the unbalanced panel retained all available information for the remaining years rather than excluding the entire cross-section. Additionally, to avoid the presence of outlier data, two conditions were established for the sample. The first required that the route have at least an average of one round trip per week; this filter selected the regular routes and removed outliers from the base. The second required that the load factor be higher than 10%. These two conditions limited the sample to regular operations throughout the year, avoiding seasonal or sporadic operations.
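The two sample conditions can be sketched as a simple filter over route-year records. The record layout, the placeholder airport codes, and the reading of "one round trip per week" as at least one take-off per week in each direction (over 52 weeks) are assumptions made for illustration, not details taken from the ANAC database.

```python
WEEKS_PER_YEAR = 52


def is_regular_operation(takeoffs_per_year, load_factor,
                         min_weekly_takeoffs=1.0, min_load_factor=0.10):
    """Return True when a route-year meets both sample conditions."""
    weekly_takeoffs = takeoffs_per_year / WEEKS_PER_YEAR
    return weekly_takeoffs >= min_weekly_takeoffs and load_factor > min_load_factor


# Hypothetical route-year records (directional routes, placeholder airport codes).
records = [
    {"route": ("AAA", "BBB"), "year": 2010, "takeoffs": 730, "load_factor": 0.72},
    {"route": ("AAA", "CCC"), "year": 2010, "takeoffs": 20, "load_factor": 0.65},   # sporadic
    {"route": ("BBB", "CCC"), "year": 2010, "takeoffs": 104, "load_factor": 0.08},  # nearly empty
]

sample = [r for r in records
          if is_regular_operation(r["takeoffs"], r["load_factor"])]

print([r["route"] for r in sample])
```

Only the first record satisfies both conditions; the second fails the frequency condition and the third fails the load factor condition.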
Since the amount of fuel an airplane needs at take-off is related to the total weight of the airplane, the embarked weight, and the destination airport, one approach to measuring the partial productivity of fuel is the transported weight per unit of fuel, because this ratio reveals the specific average performance of fuel usage, which is an important input for airline companies. Thus, in this study, we adopted the work load unit (WLU) as an indicator of the weight being transported. Fuel is the most relevant item in the growth of airline companies’ operational costs in Brazil; estimates indicate that fuel consumption accounts for 40% of total operational costs.
Fuel efficiency was chosen as the dependent variable because it displays characteristics that are important both to airline performance and to monitoring the use of this resource and the related environmental impacts. Airlines reduce their operating costs by increasing the productivity of this important item on their cost spreadsheets. Meanwhile, society benefits from the efficient use of a resource for which there is, as yet, no alternative, in an activity that is essential to economic and social development but that results in adverse environmental impacts. Another no less important aspect is that productivity is fundamental to economic and social development, and no measure should neglect this variable, as it is an important item in the decision-making process for both air transport policy and airline operations planning. At present, one prominent policy aspect is the restriction on pollutant gas emission levels. The fuel productivity variable is expressed by Equation (3):
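$$FP_{ijt} = \frac{RTK_{ijt}}{FUEL_{ijt}} \quad (3)$$
where
$RTK_{ijt}$: total revenue tonne-kilometers on route $i$ to $j$ in year $t$;
$FUEL_{ijt}$: total fuel burn on route $i$ to $j$ in year $t$ (liter).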
A revenue tonne-kilometre (RTK) is generated when a metric tonne of revenue load is carried one kilometer. Where that load includes passengers, the number of passengers is converted into a weight load, usually by multiplying this number by 90 kg (to include baggage). This figure is based on Doganis [22], who reports that most airlines use 90 kg to represent a passenger and baggage, so about 11.1 passengers are equivalent to 1 tonne.
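The conversion from passengers to revenue tonne-kilometers, and the resulting fuel productivity ratio, can be sketched as follows. The function names and the flight figures (150 passengers, 2000 kg of cargo, 1000 km, 5000 L of fuel) are hypothetical and serve only to show the arithmetic.

```python
KG_PER_PASSENGER = 90  # passenger plus baggage, per the convention cited in the text


def revenue_tonne_km(passengers, cargo_kg, distance_km):
    """Revenue load in tonnes multiplied by the distance flown in kilometers."""
    revenue_load_tonnes = (passengers * KG_PER_PASSENGER + cargo_kg) / 1000
    return revenue_load_tonnes * distance_km


def fuel_productivity(rtk, fuel_litres):
    """Tonne-kilometers transported per liter of fuel."""
    return rtk / fuel_litres


# Hypothetical single flight used only to illustrate the calculation.
rtk = revenue_tonne_km(passengers=150, cargo_kg=2000, distance_km=1000)
fp = fuel_productivity(rtk, fuel_litres=5000)
passengers_per_tonne = 1000 / KG_PER_PASSENGER

print(f"RTK = {rtk:.0f}, fuel productivity = {fp:.2f} tonne-km/L")
print(f"passengers per tonne = {passengers_per_tonne:.1f}")
```

In this example the revenue load is 15.5 tonnes, which gives 15,500 RTK over 1000 km and 3.1 tonne∙km per liter.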
The independent variable of the model, the idle capacity ($IC_{ijt}$), represents the percentage of the capacity offered that is not used by the market. The load factor variable ($LF_{ijt}$) is estimated as the ratio of $RTK_{ijt}$ to $ATK_{ijt}$. The available tonne-kilometers ($ATK_{ijt}$) is the volume of tonne-kilometers offered, that is, the sum of the product of payload (the total available load weight per aircraft available for transporting passengers, freight, and mail) and the route distance. A high load factor means that the flight is being more fully utilized, and accordingly, it is expected that fuel burn productivity will be higher. This is one of the main indicators of air transport performance and will be significantly related to the partial productivity of fuel. The variable $IC_{ijt}$ is expressed using Equation (4):
$$IC_{ijt} = 1 - LF_{ijt} = 1 - \frac{RTK_{ijt}}{ATK_{ijt}} \quad (4)$$
where
$RTK_{ijt}$: total revenue tonne-kilometers on route $i$ to $j$ in year $t$;
$ATK_{ijt}$: total available tonne-kilometers (supplied) on route $i$ to $j$ in year $t$.
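As a minimal illustration of the idle capacity definition, the sketch below computes the load factor and its complement from route-year totals. The function names and the input values are assumptions made for the example.

```python
def load_factor(rtk, atk):
    """Share of the supplied tonne-kilometers that carried revenue load."""
    if atk <= 0:
        raise ValueError("available tonne-kilometers must be positive")
    return rtk / atk


def idle_capacity(rtk, atk):
    """Share of the supplied capacity that was not used: one minus the load factor."""
    return 1.0 - load_factor(rtk, atk)


# Hypothetical route-year totals.
lf = load_factor(rtk=15500, atk=20000)
ic = idle_capacity(rtk=15500, atk=20000)

print(f"load factor = {lf:.3f}, idle capacity = {ic:.3f}")
```

With these totals, 77.5% of the capacity offered is used and 22.5% is idle.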
The aircraft size variable is represented by the mean payload supplied on the route in a given year. The variable $SIZE_{ijt}$ (in kg) is expressed using Equation (5):
$$SIZE_{ijt} = \frac{PAYLOAD_{ijt}}{TO_{ijt}} \quad (5)$$
where
$PAYLOAD_{ijt}$: total payload supplied on route $i$ to $j$ in year $t$ (kg);
$TO_{ijt}$: total take-offs on route $i$ to $j$ in year $t$.
This is a decisive variable through which the airline determines how much transport capacity to offer on the market. Although cases of over-supply may exist for reasons of competition, such cases are distributed across all operations on the route in the year, thus reducing the bias that such cases can cause in assessing the variable. As the study worked with a very large data set, it is to be expected that such distortions will be minimized.
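The aircraft size measure, and the logarithm in which it would enter the regression, can be sketched as below. The function name and the totals are hypothetical; averaging over all take-offs in the year is what spreads isolated over-supply cases across the whole operation.

```python
import math


def mean_payload_kg(total_payload_kg, takeoffs):
    """Mean payload supplied per take-off on a route in a given year."""
    if takeoffs <= 0:
        raise ValueError("take-offs must be positive")
    return total_payload_kg / takeoffs


# Hypothetical route-year totals: 730 take-offs supplying 9,490,000 kg of payload.
size = mean_payload_kg(total_payload_kg=9_490_000, takeoffs=730)
ln_size = math.log(size)  # the model uses the natural logarithm of the variable

print(f"mean payload = {size:.0f} kg, ln(size) = {ln_size:.3f}")
```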
As the presented model suggests the possibility of endogeneity, it was necessary to include a decision variable related to the airline companies in order to mitigate this problem. This was done using a two-stage least squares panel model estimation. The chosen variable was the average weekly frequency of take-offs observed in a specific year for that route. This variable is defined according to Equation (6):
where