**1. Introduction**

The world's population is rapidly increasing and, according to the most recent projections, it is expected to reach 9.8 billion in 2050 and 11.2 billion in 2100 [1]. The planet should therefore be ready to cope with this rapid population growth. Producing and delivering adequate, high-quality food will be one of the most important challenges for humanity in the next century [2]. The evolution of technology has led to the intensification of agricultural production, increasing productivity and (in most cases) the quality of agriproducts as well. However, this intensification has significantly increased the environmental footprint of agriculture, leading to a number of environmental impacts associated with the extensive use of fertilizers, pesticides, and water, changes in land use, etc. [3]. The environmental issues related to agriculture have drawn the attention of the scientific community, which is now turning towards exploring the definition of agricultural sustainability without having yet reached consensus [4,5].

Undoubtedly, defining agricultural sustainability, as with every other sustainability concept, is a challenging task. Nevertheless, it is commonly agreed that agricultural sustainability should at least address the three basic pillars of sustainable development by simultaneously appraising environmental, economic, and social issues related to agricultural practices [6]. However, the sustainability assessment of agricultural practices can, in general, be very challenging, since it involves many case-specific variables that must be taken into consideration.

Figure 1 presents various processes, inputs, and outputs involved in agricultural production, demonstrating the difficulty and complexity in generalizing the sustainability assessment process. There are general cultivation guidelines and corresponding operational stages for almost all crops (e.g., seeding, irrigation, and harvesting). However, the agronomic practice, the machinery types, the technology level, as well as the quantities and type of materials used may vary, depending on the type of crop, the implementation practice, the country (even the region of the cultivation), and the prevailing climatic conditions. All of the aforementioned parameters affect the cultivation process and the respective inflows and outflows.

Standardizing agricultural sustainability assessment is therefore a challenging task. Considering the growing interest in assessing the sustainability issues related to agriculture, several tools and methodologies have been developed [7,8]. Some of these tools have gained greater acceptance and are widely used by practitioners worldwide, such as life cycle assessment (LCA), which is standardized by ISO in ISO 14040:2006 and ISO 14044:2006 [9]. In addition, many indicator-based methods have been developed for the sustainability assessment of agricultural practices. These methods use different approaches with regard to the overall objective, the intended users, and the definition of agricultural sustainability they employ [4].

**Figure 1.** Variables involved in agricultural sustainability assessment.

Considering the above, and given that there is not yet an established standardized methodology, it is very important for anyone attempting to assess agricultural sustainability to have an overview of the available and most commonly used methodologies and tools for that purpose. As a result, there is a need for a methodological framework that will help practitioners evaluate the existing tools and methods in order to select the appropriate one for each specific task.

To that end, the present paper has a two-fold objective:


framework is applied to 38 Agricultural Sustainability studies published in peer-reviewed journals in the last decade (2009–2018).

#### **2. Materials and Methods**

#### *2.1. Methodological Framework for the Systematic Review of Agricultural Sustainability Studies*

#### 2.1.1. Research Design

The evaluation process implemented to assess and select the criteria needed for the methodological framework of the systematic review on agricultural sustainability studies is presented in Figure 2. Initially, scientific literature published in Science Direct and Scopus was searched using specific keywords and Boolean operators (AND/OR). The keywords were selected with respect to the integrated concept of "sustainability assessment", as well as the individual processes it consists of, namely, "environmental assessment", "economic assessment", and "societal assessment" (or "social assessment"). These were combined with the keywords agriculture/farming using the Boolean operator AND to exclude results that are not relevant to the field under examination. The concept of "agricultural sustainability" was also included in the search.
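The keyword combination described above can be sketched as a small query builder. This is only an illustration: the exact query strings submitted to Science Direct and Scopus are not given in the text, so the grouping, the quoting, and the treatment of "agricultural sustainability" as a standalone OR clause are assumptions, and `build_query` is a hypothetical helper name.

```python
# Keywords named in the text; how they were grouped in the real search is assumed.
ASSESSMENT_TERMS = [
    "sustainability assessment",
    "environmental assessment",
    "economic assessment",
    "societal assessment",
    "social assessment",
]
FIELD_TERMS = ["agriculture", "farming"]
STANDALONE_TERMS = ["agricultural sustainability"]  # assumed to be OR-ed on its own


def quote(term):
    """Wrap a keyword in double quotes so it is searched as an exact phrase."""
    return f'"{term}"'


def build_query(assessment_terms, field_terms, standalone_terms=()):
    """Combine assessment keywords (OR) with field keywords (OR) using AND."""
    assessment = " OR ".join(quote(t) for t in assessment_terms)
    field = " OR ".join(quote(t) for t in field_terms)
    query = f"({assessment}) AND ({field})"
    # Each standalone concept widens the search with an extra OR clause.
    for term in standalone_terms:
        query = f"({query}) OR {quote(term)}"
    return query


query = build_query(ASSESSMENT_TERMS, FIELD_TERMS, STANDALONE_TERMS)
```

The AND between the two groups is what removes results outside agriculture, while the OR inside each group keeps synonyms such as "societal" and "social" from narrowing the result set.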

The first sample of scientific papers that resulted from the initial search included 55 papers from peer-reviewed scientific journals. These papers were put through a screening process based on the exclusion criteria presented in Figure 2. Specifically, studies that were not related to agriculture, or that focused on alternative agricultural processes, were excluded. As a result, papers exclusively focused on aquaculture or organic farming, on biofuels and biorefinery, as well as reviews of studies comparing agronomic protocols, were excluded from the present assessment. Additionally, review studies regarding soil quality, land management, and food processing systems, and discussions that did not specifically define the methods of the review conducted, were excluded. It should also be noted that, in the context of agricultural sustainability studies, livestock farming was included in the search.

**Figure 2.** Evaluation process to introduce a methodological framework for the systematic review of agricultural sustainability studies.

The final paper collection comprises 16 review papers or studies that assess agricultural sustainability. It should be noted that, compared with other scientific fields (for example, the secondary production of goods), the literature is relatively scarce regarding studies that consider all three dimensions of sustainability. For that reason, the sample includes studies considering the environmental aspect of agricultural sustainability, which is the most often studied. The sample was then assessed in two ways, systematically and critically [10]. The systematic way concerns the listing of the papers based on specifically defined criteria [11]. The initial listing criteria in the case of the presented framework include the title and author of the paper, the year of publication, the spatial coverage of the study (Global or Regional), and the type of review (Critical or Systematic).

Critical reviews are thorough literature works that attempt to evaluate and assess the basic aspects or inputs and document the differences in methodology and implementation of scientific studies on a specific field [11]. In this case, the critical evaluation of the sample concerns the individual analysis of the selected studies with the purpose of extracting the individual evaluation criteria used in each study. The individual criteria with similar context were aggregated in a general table of criteria. Then, each paper was systematically reviewed as to whether each criterion was included in the review.

The resulting table is a comprehensive overview of the issues most frequently examined in a review study. The most frequently used criteria are those that should be integrated in the methodological framework for the systematic review of agricultural sustainability studies. The rule followed in the present paper was to exclude criteria that were used in fewer than four papers. The following sections present the sample, the criteria frequency table, and a critical assessment of the sample used for the evaluation.
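The frequency rule described above can be illustrated with a short sketch. The review names and criteria below are hypothetical toy data, not the sixteen reviews of this study. The cut-off is kept as a parameter and is assumed here to mean "retain a criterion used in at least four papers".

```python
from collections import Counter


def criteria_frequency(reviews):
    """Count in how many reviews each criterion appears (once per review)."""
    counts = Counter()
    for criteria in reviews.values():
        counts.update(set(criteria))  # set() avoids double counting within a review
    return counts


def retained_criteria(counts, min_papers=4):
    """Keep criteria used in at least `min_papers` reviews (assumed cut-off)."""
    return sorted(name for name, n in counts.items() if n >= min_papers)


# Hypothetical toy sample: review label -> criteria used in that review.
toy_reviews = {
    "review A": ["country", "indicator type", "target user"],
    "review B": ["country", "indicator type"],
    "review C": ["country", "indicator type", "functional unit"],
    "review D": ["country", "target user"],
    "review E": ["country", "indicator type", "indicator type"],
}

toy_counts = criteria_frequency(toy_reviews)
toy_retained = retained_criteria(toy_counts)
```

In this toy sample, "country" appears in five reviews and "indicator type" in four, so both are retained, whereas "target user" (two reviews) and "functional unit" (one review) fall below the cut-off.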

#### 2.1.2. Systematic Approach

The 16 review papers extracted through the first steps of the methodology described in the previous section are presented in Table 1, along with their classification with respect to their type and spatial coverage.


**Table 1.** Review studies examined.

As presented in Figure 3, the number of review papers increased during 2016–2017, indicating a growing interest in the sustainability of agricultural practices. Payraudeau et al. (2005) first analyzed and systematically reviewed six (6) agricultural sustainability methods employed in eleven (11) case studies, indicating the variety of objectives, target groups, and methodologies used [20]. Bockstaller et al. (2008) followed by presenting a typology of indicators and the evolution of the methods used for their advancement [19] and, in 2009, critically evaluated four (4) comparative studies to analyze the methods of comparison, highlighting their main results [23]. Also focusing on indicators, Binder et al. (2010) presented an evaluation review framework that was used to review agricultural sustainability methods [4]. The framework assessed the normative, systematic and procedural aspects of the methods under evaluation.

**Figure 3.** Sample presentation (review papers): (**a**) Number of papers per year and (**b**) type of paper.

Regarding the types of review papers and their classification as systematic or critical according to the definitions presented in the previous section [11], it is observed that, in principle, both categories are equally preferred by researchers. However, in some cases, the distinction is not clear, or a systematic and a critical review are performed at the same time. One such example is the work of De Luca et al. (2017), where the authors performed a critical and systematic review to determine, among other issues, which Multi Criteria Decision Analysis (MCDA) and participatory methods have been used along with LCA tools and the type of integration used in each case [10]. Also, Baldini et al. (2017) critically reviewed forty-four (44) LCA studies on milk production and systematically compared their methods and results to highlight issues requiring further discussion and investigation [15].

Considering the selected samples, it can be stated that in most cases systematic reviews are used in order to compare methodologies and results regarding a specific field of agricultural application. Towards this objective, Peter et al. (2017) performed a systematic evaluation of eighteen (18) carbon footprint calculators used in energy crop cultivations [12]. Cerutti et al. (2011) systematically reviewed twenty-two (22) fruit production sustainability assessment studies [7], whereas De Vries et al. (2015) systematically reviewed LCA studies on beef production [21]. Additionally, McAuliffe et al. (2015) conducted a chronological review of LCA studies in pig production, attempting to demonstrate how LCA has captured technological advancements in the field as well as the methodological issues observed [17].

In contrast, the majority of the reviews that were characterized as critical deal with the evaluation of indicator-based methods or the classification of agricultural sustainability indicators. One example is the work of Acosta-Alba et al. (2011), who reviewed eight (8) agricultural sustainability frameworks that use reference values for their indicators, analyzed the methods for establishing the reference values, and investigated ways to improve them [14]. Latruffe et al. (2016) provided a review of the available agricultural sustainability indicators, highlighting the relatively high increase of environmental indicators as compared with the smaller interest in economic and social indicators [18]. Finally, Lebacq et al. (2013) reviewed the types of sustainability indicators and proposed indicative ground rules for the selection of agricultural sustainability indicators [22].

With respect to the spatial coverage of the reviews (Figure 4), the majority deal with studies from all around the world. Nevertheless, there are reviews assessing studies in specific countries or regions. For example, Roy et al. (2012), based on a systematic review and synthesis, present a set of indicators that could be used to assess agricultural sustainability in Bangladesh, highlighting the need for integrated approaches and participatory processes during agricultural sustainability assessment [13]. Additionally, Morais et al. (2016) systematically reviewed twenty-two (22) agri-food-dedicated LCA studies in Portugal, revealing the challenges faced and the lack of a systematic regional approach in the country that could safeguard the accuracy and comparability of the results [16]. Lastly, Yan et al. (2011) reviewed thirteen (13) LCA studies on European milk production, indicating that direct comparison is challenging due to inconsistency in the methodologies used [9].

**Figure 4.** Spatial coverage of the papers reviewed.

#### 2.1.3. Critical Approach

The selected sample, which was thoroughly described in the previous section, was screened to extract the individual evaluation criteria used during each review. Criteria that had the same objective or context were grouped accordingly. Some studies further divided the criteria into various subcriteria, but this is outside the scope of this paper, since it depends on the level of scrutiny each author aims to achieve and on the scope of the corresponding review.

Figure 5 presents the criteria identified during the screening process and the frequency of their occurrence. A total of forty-four (44) different criteria were used in the sixteen (16) studies reviewed. The review criteria frequency table is presented in detail in Appendix A (Table A1). The first six criteria (beginning from the top of Figure 5) were common in most of the reviews examined and include the name and description of the assessment method or tool, the field of application, the country of application, and the year of issuing. The literature typology concerns the type of the document reviewed. For example, De Luca et al. (2017) classified the selected publications into three categories (Journal Article, Book Chapter, and Conference Proceedings paper) [10]. Baldini et al. (2017), on the other hand, refer to publication types, classifying the sample according to whether the literature is an original article, a review, a research direction, or a scenario analysis [15].

**Figure 5.** Criteria frequency use.

Regarding the level of assessment and the system boundaries, many different approaches were identified. Payraudeau et al. (2005), Roy et al. (2012), and De Luca et al. (2017) examined the level of assessment of the studies reviewed in terms of spatial coverage (global, national, regional, local, and farm level) [10,13,20]. Lebacq et al. (2013) distinguished studies by whether they concern a farm or a region, and Bockstaller et al. (2009) examined the level of assessment giving two different scale options (field scale or farm scale) [22,23]. Cerutti et al. (2011) and Baldini et al. (2017) identified the system boundaries of the studies assessed [7,15], following a cradle-to-gate or cradle-to-market approach, whereas De Vries et al. (2015) reviewed studies at least from cradle-to-farm gate [21]. Lastly, Peter et al. (2017) examined both the level of assessment (global, regional, etc.) and the system boundaries (farm–gate or farm–gate–grave) of the studies they reviewed [12].

The intended user of a method or tool is considered in several of the studies reviewed. Binder et al. (2010) identified the target group of the examined methodologies [4], whereas De Luca et al. (2017) referred to this criterion as the actors involved in the assessment process (i.e., local experts, scientists, workers, etc.) [10]. Bockstaller et al. (2008) classified the reviewed works according to the target user of the method reviewed, i.e., decision-maker, researcher, technician, or farmer [19]. Considering the type and accessibility of data criteria, Baldini et al. (2017) divide the data into experimental and model data [15]. The accessibility of data (or availability, as expressed by Roy et al. (2012) [13]) is examined by Bockstaller et al. (2008) for three user groups: farmers, advisors, and administration [19].

With reference to the name and type of the indicators reviewed, many approaches were identified during the screening process. Lebacq et al. (2013) and Latruffe et al. (2016) categorized indicators based on their representation of the three pillars of sustainability (environmental, economic, and social) [18,22]. Lebacq et al. (2013), however, extended the research scope by identifying means-based, system state, emission, and effect-based indicators [22]. Latruffe et al. (2016), Lebacq et al. (2013), and Bockstaller et al. (2008) distinguish indicators based on the method used for their calculation, i.e., single variables, emission factors, combination of variables, operational models, mechanistic models, etc. [18,19,22]. De Luca et al. (2017) categorize LCA indicators based on the impact categories they address [10].

Based on the rule set in the methodology section, the red line in Figure 5 presents the criteria exclusion threshold. Only criteria identified more than four times in the sample reviewed are included in the methodological approach for the systematic review of agricultural sustainability studies. A total of eighteen (18) criteria surpassed the exclusion threshold. These criteria are classified in groups with respect to their context and are presented in the subsequent section.

#### 2.1.4. Methodological Framework Presentation

Following the criteria determination process described in the previous sections, Figure 6 presents the critical synthesis to systematically review agricultural sustainability related studies. The proposed methodological framework is based on a series of criteria and divided into five (5) underlying categories. The first two categories refer to the initial screening stage. During this preliminary stage, the studies are assessed to determine if the study will be included in the sample on the basis of the case-specific exclusion criteria determined with regards to the scope of the review.

The initial screening stage includes two categories of criteria (i.e., "method identification" and "general information") concerning the basic description of each study. The general information of a study concerns the year of publication, the type of literature (journal article, conference proceedings paper, book chapter, technical report, etc.), and the country in which the study was conducted. The method identification category includes criteria that deal with the assessment method developed or employed. The criterion "description of the assessment tool" characterizes the method or tool presented according to whether it is a new methodology, the application of an existing method or tool, or a combination, namely a new methodology that is implemented with an application example. The last criterion is the level of the assessment performed, i.e., global, national, regional, or farm level, according to the approach introduced by Gomez-Limon et al. (2010) [24]. After the initial assessment phase is completed and the sample to be reviewed is finalized, the in-depth review stage follows. For this stage, three (3) categories of criteria have been defined.

**Figure 6.** Methodological framework for the systematic review of agricultural sustainability studies.

The first category of criteria assesses the scope of the studies reviewed. The first criterion is the identification of the goal (or objective) of each assessment, so that it is feasible to perform comparative reviews among studies with the same objective. For that purpose, following the definition of Gaviglio et al. (2017), the papers are classified according to whether a method is "goal prescribing" or "system describing" [25]. Other criteria proposed concern the determination of the target user, as well as the functional unit and the time dimension of the assessment.

The second category refers to the identification of impacts starting with the definition of the sustainability dimension examined in each study, continuing with the documentation of the impacts considered during the assessment expressed in indicators (name and type). The last category concerns the data and the calculation methods used for the assessment. The criterion type of data examines whether the data used are model or experimental. Furthermore, to examine the accessibility of data, the present study refers to the definition of Angevin et al. (2017) [26]. Therefore, depending on the data used, the assessment can be characterized as ex ante (indicating expectation and uncertainty) when focusing on assessing a new scenario or as ex post (indicating processing actual field data) when examining a current situation [26]. Additionally, for each study reviewed, the validation and aggregation methods should be examined too.
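The five categories described above can be summarized as a simple checklist structure. The sketch below transcribes only the criteria named in the prose, so it may not cover every criterion shown in Figure 6. The dictionary layout, the short category labels, and the helper `missing_criteria` are illustrative assumptions rather than part of the published framework.

```python
# Stages, categories, and criteria as named in the framework description.
# Only criteria mentioned in the prose are listed; the layout is an assumption.
FRAMEWORK = {
    "initial screening": {
        "general information": [
            "year of publication",
            "type of literature",
            "country",
        ],
        "method identification": [
            "description of the assessment tool",
            "level of assessment",
        ],
    },
    "in-depth review": {
        "scope": ["goal", "target user", "functional unit", "time dimension"],
        "impacts": ["sustainability dimension", "indicator name", "indicator type"],
        "data and calculation methods": [
            "type of data",
            "accessibility of data",
            "validation method",
            "aggregation method",
        ],
    },
}


def missing_criteria(record, stage):
    """Return the criteria of one stage that a study record leaves unfilled."""
    wanted = [c for criteria in FRAMEWORK[stage].values() for c in criteria]
    return [c for c in wanted if not record.get(c)]


# Hypothetical study record, used only to exercise the helper.
example_record = {
    "year of publication": 2015,
    "type of literature": "journal article",
    "country": "Country X",
    "level of assessment": "farm",
}
gaps = missing_criteria(example_record, "initial screening")
```

Recording each reviewed study against the same keys makes it easy to see which criteria a study does not report, which is the kind of gap a systematic comparison needs to surface.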

The proposed methodological framework aims at facilitating the comparison among studies in order to capture the research advancements and current practices in the field under examination. This is an issue of particular importance since the assessment of agricultural sustainability is not a standardized process and entails a plethora of different methods, tools, and frameworks that assess a large number of different indicators that represent an analogously large number of different impacts. Prior to designing any assessment model, an exhaustive review is mandatory to safeguard consistency and relevance with other works. Also, the systematic documentation of the advancements in the field is the only way to begin constructing a unified, commonly accepted methodology for agricultural sustainability assessment.

#### **3. Crop Production Sustainability at Farm Level**

#### *3.1. Search Scheme*

The methodology presented above was used to investigate the available and most commonly used methodologies to assess the sustainability of crop cultivations at the farm level. The review begins with the collection of the initial sample of papers by searching the most widely acknowledged databases, specifically Scopus and Science Direct. The search scheme is based on specific keywords and their combinations, as presented in Table 2, and on the use of Boolean operators (OR and AND) to increase the efficiency of the search. The initial search resulted in 959 papers containing the keywords searched. The initial sample was then screened based on the inclusion/exclusion criteria of Table 2. This secondary assessment resulted in 387 papers, which were then reviewed against the initial screening criteria (Figure 6).



As the purpose of this review is to examine studies assessing crop agricultural sustainability at the farm level, the 387-paper sample was filtered to select the peer-reviewed journal articles that fulfilled the following criteria: (a) they examine all three pillars of sustainability (environmental, economic, and social); (b) they examine production at the farm level; and (c) they examine only crop cultivation. The final collection of journal articles consists of 38 papers, as presented in Table 3 and Appendix A.
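The three selection criteria above amount to a conjunctive filter over the screened sample. The following sketch shows that logic on hypothetical records. The `Paper` fields and the example entries are assumptions made for illustration and do not reproduce the actual 387-paper sample.

```python
from dataclasses import dataclass

REQUIRED_PILLARS = frozenset({"environmental", "economic", "social"})


@dataclass(frozen=True)
class Paper:
    """Hypothetical record of a screened paper (field names are assumptions)."""
    title: str
    peer_reviewed_journal: bool
    pillars: frozenset   # sustainability pillars the paper examines
    level: str           # e.g. "farm", "regional", "national"
    systems: frozenset   # production systems covered, e.g. {"crop"}


def meets_criteria(paper):
    """Apply criteria (a) to (c) to a peer-reviewed journal article."""
    return (
        paper.peer_reviewed_journal
        and REQUIRED_PILLARS <= paper.pillars      # (a) all three pillars
        and paper.level == "farm"                  # (b) farm-level production
        and paper.systems == frozenset({"crop"})   # (c) crop cultivation only
    )


# Hypothetical examples, one passing and two failing the filter.
sample = [
    Paper("P1", True, REQUIRED_PILLARS, "farm", frozenset({"crop"})),
    Paper("P2", True, frozenset({"environmental"}), "farm", frozenset({"crop"})),
    Paper("P3", True, REQUIRED_PILLARS, "farm", frozenset({"crop", "livestock"})),
]
selected = [p.title for p in sample if meets_criteria(p)]
```

Because the criteria are combined with a logical AND, a paper that covers only the environmental pillar, or that mixes crops with livestock, is dropped even if it satisfies the other conditions.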
