*Article*

# **Interlaboratory Empirical Reproducibility Study Based on a GD&T Benchmark**

**Ali Aidibe \*, Souheil Antoine Tahan and Mojtaba Kamali Nejad**

Mechanical Engineering Department, École de Technologie Supérieure (ÉTS), Montreal, QC H3C 1K3, Canada; antoine.tahan@etsmtl.ca (S.A.T.); mojtaba.kamalinejad@etsmtl.ca (M.K.N.)

**\*** Correspondence: ali.aidibe@etsmtl.ca

Received: 2 June 2020; Accepted: 6 July 2020; Published: 8 July 2020

**Abstract:** The ASME Y14.5 geometric dimensioning and tolerancing (GD&T) and ISO-GPS (geometrical product specifications) standards define tolerances that can be added to components to achieve the necessary functionality and performance. Each tolerance defines the zone within which a feature must lie. Measurement processes, including planning, programming, data collection (with or without contact), and data processing, check the compliance of the part with these specifications (tolerances). Over the last two decades, a great deal of work has been carried out by the metrology community to investigate the accuracy, the measuring methods, and, specifically, the measurement errors of fixed and portable coordinate measuring machines (CMMs). A review of the literature showed the progression of CMMs in terms of accuracy and repeatability. However, discrepancies were observed between measurements made using different CMMs or by different operators. This paper proposed a GD&T-based benchmark for evaluating the performance of different CMM operators in computer-aided inspection (CAI), considering different criteria related to dimensional and geometrical features. An artifact was designed using basic geometries (cylinder and plane) and free-form surfaces. The results obtained from the interlaboratory comparison study showed significant performance variability for complex GD&T, such as composite profile and location tolerances. This, in turn, emphasized the importance of GD&T training and certification in order to ensure a uniform understanding among different operators, combined with a fully automated inspection code generator for GD&T purposes.

**Keywords:** measurement system analysis; coordinate measuring machine; reproducibility; GD&T; quality; metrology; measurement uncertainty

## **1. Introduction**

Geometric dimensioning and tolerancing (GD&T) (or geometrical product specifications (GPS)) is a language of symbols widely used in engineering drawings and computer-generated models to describe, communicate, and determine the permissible deviations of feature geometry. GD&T is an efficient and unambiguous way of communicating the measurement conditions and specifications of a part. This language accompanies the entire process chain and helps communicate the part's intent and function through design, manufacture, and inspection. It also provides a more precise depiction of part features and focuses on feature-to-feature relationships.

Standards such as ASME Y14.5-2009 [1] and ISO-GPS [2–8] comprise a library of symbols, definitions, rules, and conventions that describe a part in terms of tolerances based on size, form, orientation, and location. The main steps needed to derive GD&T results begin with nominal information that describes a specific feature. The manufactured part is inspected using a measurement device (such as a coordinate measuring machine (CMM)) and compared with the nominal definition (e.g., a computer-aided design (CAD) file) in order to verify the dimensional and geometric feature specifications (the actual size and tolerance). The deviations are then computed and displayed. Figure 1 presents the inspection process definition model.

**Figure 1.** The inspection process definition activity model.

## *1.1. Measurement Uncertainties—Overview*

In metrology science, the true values (ideal quantities) may never be known, and all measurements could potentially have some degree of uncertainty, which is often a function of several variables (sources). The difference between the true and measured values is known as an error. Uncertainty can, as defined by the Guide to the Expression of Uncertainty in Measurement (GUM) [9], be considered as a *'parameter, associated with the result of a measurement that characterizes the dispersion of the values that could reasonably be attributed to the measurand'.* Thus, the estimated value *y* of the measurand *Y* is generally calculated using the relationship presented in Equation (1):

$$y = f(x_1, x_2, \dots, x_n) \tag{1}$$

where *xi* is the estimate of each input variable *Xi* that could potentially have a significant influence on the measurement result *y*. In some cases, the function *f* is known and explicit; in others, the measurement function is unknown or very complex, and no analytic expression is available.

If the function *f* is explicit and the input quantities are not correlated, the law of propagation of uncertainties given in [9] generally represents the combined standard uncertainty on the estimated value *u*(*y*) by:

$$u(y) = \sqrt{\sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i}\right)^2 u^2(x_i)} \tag{2}$$

where *u*(*xi*) is the standard uncertainty of each input variable *xi*. In practice, the expanded uncertainty *U*(*y*) corresponds to the combined standard uncertainty multiplied by a coverage factor *k*, where *k* is chosen, for a prior confidence interval (1 − α), to be the *t*<sub>1−α/2,ν</sub> critical value from the *t*-table, with ν degrees of freedom (Section 6, [9]).
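The propagation law in Equation (2) can be sketched numerically. The following minimal illustration (not from this study) approximates the sensitivity coefficients by central finite differences; the rectangle-area measurand and its input uncertainties are assumed purely for the example:

```python
import math

def combined_standard_uncertainty(f, x, u_x, h=1e-6):
    """Combined standard uncertainty u(y) per Equation (2), assuming
    uncorrelated inputs; the partial derivatives df/dx_i are approximated
    by central finite differences with step h."""
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        dfdx = (f(xp) - f(xm)) / (2 * h)  # sensitivity coefficient
        total += (dfdx * u_x[i]) ** 2
    return math.sqrt(total)

# Hypothetical measurand: rectangle area y = x1 * x2 (values in mm)
area = lambda x: x[0] * x[1]
u_y = combined_standard_uncertainty(area, [100.0, 50.0], [0.02, 0.03])
# Analytically: sqrt((50*0.02)^2 + (100*0.03)^2) = sqrt(10) ≈ 3.162 mm^2
```

For an explicit function such as this one, the finite-difference result matches the analytic propagation to well within rounding error.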

Monte Carlo simulations are typically used to approximate the statistical behavior of the measured value in situations where the measurement function cannot be found directly. To determine the output, the input variables are generated randomly for each simulation within their respective uncertainty ranges. The output probability density function (PDF) is then used for evaluating the uncertainty [10]. Finally, in cases where the measurement function is very complex (or unknown), an empirical estimation can be established using certain assumptions and simplification hypotheses, as proposed in the Measurement System Analysis (MSA) guide from the Automotive Industry Action Group [11].

The MSA consists of a specifically designed experiment aimed at determining the components of variation in the measurement (e.g., the reproducibility, repeatability, bias, etc.). Indeed, the process of obtaining measurements (and defect level estimation) may have variations and produce uncertainty. The analysis tools proposed by the MSA (e.g., the gage repeatability and reproducibility (R&R)) evaluate the uncertainty on a direct measure (*f*(*x*) = *x*), such as the thickness measurement from a micrometer. The aim of the whole process is to guarantee the integrity of the data used for quality analysis and to consider the consequences of a measurement error for decisions taken on the product. The reader is referred to [11] for more details.

## *1.2. Measurement Uncertainties Associated with Dimensional and Geometric Measurement Using CMM*

During the last three decades, the coordinate measuring machine (CMM) has seen progress in terms of accuracy and repeatability, which, in turn, has resulted in productivity improvements. Currently, the CMM plays a major role in applying GD&T standards, such as [1–8], as a crucial piece of measuring equipment for manufacturing quality control [12]. Notwithstanding such improvements, however, uncertainty can be induced not only by the equipment used, but also by the algorithmic choices and the measurement methodology adopted [13–16].

Measurement uncertainty evaluation (quantification) is a crucial step in characterizing and certifying the consistency of the inspection results [17,18]. Measurement uncertainty evaluation must be carried out to ensure advances in measurement science. CMM measurement uncertainty evaluation has become a key focus area for research by many institutions around the world. The Physikalisch-Technische Bundesanstalt (PTB) in Germany, for instance, suggested an expert system scheme for CMM uncertainty evaluation and investigated the impact of the measurement strategy on the overall CMM uncertainty [19].

The National Physical Laboratory (NPL) in the UK standardized the measurement strategies for CMM in order to ensure that the measurement results are reliable [20]. A few authors have employed the design of experiment techniques to estimate the CMM measurement uncertainty. The factorial design of experiments was applied by Feng et al. [21] to study the measurement uncertainty of the position of a hole measured by CMM. They analyzed the effect of variables and their interactions on the uncertainty, while complying with the fundamental rules of the Guide to the Expression of Uncertainty (GUM) [9].

Kritikos et al. [22] designed and implemented a random factorial design of experiments in order to analyze and quantify the influence of different factors (stylus diameter, step width, and speed) and their interactions on the uncertainty of CMM measurements of parallelism, angularity, roundness, diameter, and distance. Other authors, such as Kruth et al. [23] and Sladek et al. [24], proposed methods to determine uncertainties using the Monte Carlo method for feature measurements on CMMs. Hongli et al. [25] proposed the Simplified Virtual Coordinate Measuring Machine (SVCMM) method, which makes full use of the CMM acceptance or reinspection report and the Monte Carlo simulation method.

For dimensional metrology with CMM measurements, a task-specific uncertainty estimation was suggested by Haitjema [26]; it can be extended beyond linear dimensions to other measurement types, such as forms (flatness, cylindricity, etc.) and roughness. Beaman and Morse [27] performed an experimental evaluation of the software estimation of the task-specific measurement uncertainty for CMMs. Jakubiec et al. [28] addressed this topic and proposed an evaluation of CMM uncertainty, not by studying each axis of the machine, but by proceeding directly from key specifications expressed in the GD&T standard. Jbira et al. [29] suggested a benchmark including several geometrical and dimensional features for comparing the algorithm efficiency of different computer-aided inspection (CAI) software applications. A comprehensive review of different methods, techniques, and various artifacts for monitoring CMM performance can be found in the research work conducted by [30–33].

In coordinate metrology, Weckenmann et al. [34] identified the main contributors to uncertainty, which they subdivided into the following groups: the measuring device, the environment, the workpiece, the software, the operator, and the measurement strategy. A great deal of work has been carried out by the metrology community in terms of investigating the measuring device, environment, and workpiece components. Although no common understanding of software validation procedures currently exists, the reader is referred to [35], as well as to the European Metrology Research Project (EMRP) under the denomination 'Traceability for computationally-intensive metrology (TraCIM)' [36–38], for research performed on software validation in the field of metrology.

In this paper, we aimed to analyze the measurement uncertainty from an empirical (experimental) perspective. A review of the literature on the subject showed that the collective impact of the operator (training, skills, certification, GD&T decoding and interpretation, etc.), the measurement strategy (amount of data, samples, number of measurements, etc.), and the software employed (algorithms used, filtering or removal of outliers, optimization of the stability of the algorithm, layout handling, etc.) has been surprisingly overlooked. We proposed a new GD&T-based benchmark (test artifact) for evaluating (comparing) the performance of measurement systems in different measurement organizations (e.g., industry, schools, and metrology service companies) by considering the uncertainty that can be induced by the operator, the measurement strategy, and the software used.

Under the conditions proposed by the equipment manufacturer, current hardware is accurate enough to perform "good" measurements, i.e., to capture the actual position of a measuring point in 3D space. In other words, the uncertainty induced by the measuring device is significantly less than that induced by the operator's choices, the software options, and the measurement strategies. This means that the performance of a measurement system represents an estimation of the combined variation of the measurement errors (systematic and random), which include equipment (hardware) errors, algorithmic (software) errors, and operator errors. In this paper, software and operator errors were combined into one, as they can be strongly correlated. According to the MSA approach, this was strictly a reproducibility study [11].

The basic concepts of metrology and related terms that conform to the International Vocabulary of Metrology (VIM) [39] were employed in the present work. According to the VIM, reproducibility is the *'closeness of the agreement between the results of measurements of the same quantity, where the individual measurements are made: by different methods, with different measuring instruments, by different observers, in different laboratories, after intervals of time quite long compared with the duration of the single measurement, under different normal conditions of use of the instruments employed'* [39]. According to the Automotive Industry Action Group (AIAG) Measurement System Analysis (MSA) reference manual [11], reproducibility is traditionally referred to as the 'between appraisers' variability. Typically, the term is defined as the variation in the average of measurements made by different appraisers using the same measuring instrument when measuring the identical characteristic of the same part.

For the remainder of this paper, MSA terminology will be used [11]. EV stands for Equipment Variation, which is the variation due to repeatability, and AV stands for Appraiser Variation, which is the variation due to reproducibility.
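As a rough illustration of how EV and AV are separated, the AIAG "average and range" method can be sketched as follows. This is a simplified sketch, not the analysis pipeline of this study: the K1/K2 constants must be looked up in the MSA tables for the chosen numbers of trials and appraisers, and the nested data layout is an assumption made for the example.

```python
import math
from statistics import mean

def ev_av(data, k1, k2):
    """Sketch of the AIAG MSA average-and-range method for splitting
    measurement variation into EV (repeatability) and AV (reproducibility).
    data[a][p] holds the repeated readings of appraiser a on part p;
    k1 and k2 are the bias-correction constants tabulated in the MSA
    manual for the given numbers of trials and appraisers."""
    n_parts = len(data[0])
    n_trials = len(data[0][0])
    # EV: average within-cell range scaled by K1
    r_bar = mean(max(cell) - min(cell) for appraiser in data for cell in appraiser)
    ev = r_bar * k1
    # AV: spread of the appraiser averages, corrected for the EV share
    appraiser_means = [mean(x for cell in appraiser for x in cell) for appraiser in data]
    x_diff = max(appraiser_means) - min(appraiser_means)
    av_sq = (x_diff * k2) ** 2 - ev ** 2 / (n_parts * n_trials)
    return ev, math.sqrt(av_sq) if av_sq > 0 else 0.0

# Toy data: 2 appraisers x 2 parts x 2 trials (values purely illustrative)
data = [[[10.0, 10.2], [10.1, 10.3]],
        [[10.4, 10.6], [10.5, 10.7]]]
ev, av = ev_av(data, k1=0.8862, k2=0.7071)  # K1, K2 for 2 trials / 2 appraisers
```

When the corrected AV term goes negative, the MSA convention is to report AV = 0, which the sketch reproduces.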

To allow validation of this hypothesis, a GD&T-based artifact was designed using common geometric features (plane, cylinder, etc.) and free-form surfaces. A total of five parts were created, one without any intentional defect (part #1), and four others with a predefined number of intentional dimensional and geometrical defects (parts #2 to #5). The artifacts were intended for use in assessing the performance of many measurement institutes (interlaboratory comparison) in accordance with the dimensional and geometrical tolerance criteria.

The remainder of this paper is structured as follows: In Section 2, we outline the proposed test artifact model, followed by an experimental procedure. A comprehensive metrological and statistical analysis, followed by a general discussion of the results, is presented in Sections 3 and 4. Finally, a summary is provided and future works are described.

## **2. Materials and Methods**

A new GD&T-based test artifact is presented in this section. The model is designed for interlaboratory comparisons of CMMs. Figure 2 provides a visual representation and a description of the proposed artifact, as well as its sub-elements. To ensure the measurement of different shapes and geometrical tolerances, the artifact included basic geometric features (primitives), such as rectangular, planar, cylindrical, and conical surfaces; bore and hole patterns; and free-form surfaces.


**Figure 2.** Description of the proposed geometric dimensioning and tolerancing (GD&T)-based artifact.

As shown in Figure 2 and Table 1, a total of ten different features (items) were selected in the artifact, and five main categories related to GD&T were proposed to be characterized and controlled based on ASME Y14.5 (2009) [1]:



**Table 1.** Predefined computer-aided design (CAD) geometrical defects (all dimensions are in mm).

The overall dimensions of the artifacts were 138 × 90 × 50 mm. They were conveniently transportable and could fit into small CMM metrology systems. The artifacts were made from aluminum: part #1 had no intentional defects, while parts #2 to #5 had predefined, intentional geometrical imperfections, introduced in accordance with the procedure described in Table 2.


**Table 2.** Creation of the defects (all dimensions are in mm).

Table 1 presents the predefined defects, which were considered as the reference values (nominal defects). Their respective amplitudes were approximately of the same order of magnitude as the tolerances. The final real geometry of the part 'as manufactured' was unknown, and the actual values were calculated from the measurement points.

The artifact parts were manufactured on three-axis CNC milling machines at the École de technologie supérieure's Products, Processes, and Systems Engineering Laboratory (P2SEL). Figure 3 presents one of the five manufactured artifacts.

**Figure 3.** The proposed GD&T-based artifacts.

A total of 15 fixed and portable CMMs from different and independent industrial and academic collaborators in North America (Canada and the USA) were included in this investigation, and are presented in Figure 4. The CMMs used were named according to ISO 10360 [40]. The accuracy of the CMMs used in this study (equipment variation) typically ranged between 0.7 and 45 μm (±2σ level). All the induced defects for artifacts #2 to #5 were significantly higher than the aforementioned accuracies (Table 1).

**Figure 4.** The uncertainty caused by bias and linearity *uE* (equipment variation) of the participating fixed and portable coordinate measuring machines (CMMs) (in mm).

The measurements were performed from November 2013 to December 2017. Different institutes and industrial partners (eight industries, three schools, and three companies in the field of dimensional metrology) were asked to measure artifacts #2 to #5 without any particular focus. The aim was to analyze the ordinary measurement performance of each institute. Each artifact received a unique code for each partner (only the coordinator maintained the part-operator-equipment traceability). The circulation of the artifacts was arranged in a circular path, with the evaluation kit forwarded to the next participant and the results sent to the coordinator. Each partner carried out measurements with their own CMM system, which included calibrated equipment, specific software, and an appraiser.

The data were collected through a dynamic PDF form. In this form, each inspection item was listed, and the operator was asked to (1) accept or reject the item and (2) record the measured value. An online database was connected to this form for fully automatic and secure data collection.

Based on [9,18], the general mathematical model for determining the CMM task-oriented uncertainty is presented in Equation (3):

$$
U_c \cong \pm k \sqrt{u_E^2 + u_{EV}^2 + u_{AV}^2} \tag{3}
$$

where *uE* is the uncertainty caused by bias and linearity (the equipment variation, as provided by the manufacturers); *uEV* is the uncertainty caused by repeatability, as defined in the MSA [11]; *uAV* is the uncertainty caused by reproducibility, as defined in the MSA [11] (this includes the software used and the measurement strategy); and *Uc* is the expanded combined uncertainty, with a coverage factor *k* (obtained from Student's critical value table), which represents the total error of the inspection process.

Some assumptions were made:


(Figure 4). Given the preceding assumptions, Equation (3) can be simplified, and the measurement variation in this paper can be considered equal to *Uc* ≈ *AV* = ±*k uAV*.
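A short numeric sketch of Equation (3) and of this simplification, with purely illustrative component values (not measured data from this study):

```python
import math

# Illustrative (not measured) component uncertainties in mm, with k = 2
u_E, u_EV, u_AV = 0.002, 0.005, 0.020  # equipment, repeatability, reproducibility
k = 2.0
U_c = k * math.sqrt(u_E**2 + u_EV**2 + u_AV**2)  # Equation (3)
# Because u_AV dominates, U_c (≈ 0.0414) stays close to k * u_AV (= 0.040),
# which motivates the simplification Uc ≈ AV = ±k * u_AV
```

The quadratic sum means any component an order of magnitude below the largest one contributes almost nothing to the total, which is what justifies neglecting *uE* and *uEV* here.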


## **3. Results**

Table 3 presents the [minimum, median, maximum] geometric and dimensional deviations for items #1 to #10.2. As illustrated in Figure 5, the results of the investigation are presented on individual value plots with error (interval) bars. For geometrical tolerances with zero target values, the measurements for each part (#2–5) are shown directly. For dimensional tolerances (in this case, the target is the nominal value of the CAD), the deviations between the digitized parts (measurement) and the nominal part (CAD) are presented. Figure 6 presents the plots for size tolerances; Figure 7, the plots for form tolerances (items #1 and #4.2) and orientation tolerances (items #3 and #6); Figure 8, the plots for location tolerances (items #5.1, #5.2, #7.1, #7.2, #8.1, and #10.2); and Figure 9, the plots for profile tolerances (items #9 and #9.1).


**Table 3.** Results (all dimensions are in mm).



**Figure 5.** General representation of the results: (**a**) geometric and (**b**) dimensional tolerances.

**Figure 6.** Results of the size tolerance items (**a**) #5 and (**b**) #8.

**Figure 7.** Results of the form tolerance items (**a**) #1 and (**b**) #4.2 and angularity tolerance items (**c**) #3 and (**d**) #6.

*Appl. Sci.* **2020**, *10*, 4704

**Figure 8.** Results of the location tolerance items (**a**) #5.1, (**b**) #5.2, (**c**) #7.1, (**d**) #7.2, (**e**) #8.1, and (**f**) #10.2.

**Figure 9.** Results of the profile tolerance items (**a**) #9 and (**b**) #9.1.

## **4. Discussion**

This investigation revealed the presence of varying degrees of uncertainty in measurement reproducibility while operating CMMs in different laboratories and institutions. Differing amounts of appraiser variation (*AV*) were present when identical parts were measured by different operators on different (but similar) CMMs of approximately similar design. Based on the results of the different analyses:


**Figure 10.** Boxplots of the variation amplitude (|measured − nominal|) for different GD&T categories: (**a**) form, location, orientation, and size tolerances; (**b**) profile tolerances.

Overall, items without induced defects presented low variability (measurement uncertainty), while those with complex GD&T (e.g., composite features), as well as those recently added to the standard, presented high variability. The combination of different factors, such as the logistics and measurement strategy, the operator type, the set-up type, the size of the point clouds, the choice of the inspection algorithm, etc., appeared to be the source of this overall high measurement uncertainty.

These experimental findings may be applied to technical industrial practice to ensure the quality of the measurement results. They may also serve as an inspiration for proposing solutions to reduce the measurement uncertainty. These solutions may include GD&T training and certification to recognize proficiency in the application and understanding of the GD&T principles expressed in the standards. This would ensure a uniform understanding of the drawings prepared using the GD&T language by different operators, as well as a uniform selection and application of geometric controls to drawings. Another such solution could be an innovative combination of applied methods, such as a fully automated inspection code generator for GD&T purposes.

**Author Contributions:** Conceptualization, S.A.T.; methodology, A.A., S.A.T. and M.K.N.; formal analysis, A.A., S.A.T. and M.K.N.; investigation, S.A.T. and M.K.N.; data curation, A.A.; writing—original draft preparation, A.A.; writing—review and editing, A.A., S.A.T. and M.K.N.; visualization, A.A. and S.A.T.; supervision, S.A.T.; project administration, S.A.T.; funding acquisition, S.A.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), grant number RGPIN-2015-05995.

**Acknowledgments:** The authors would like to thank École de technologie supérieure (Montreal, QC, Canada), the Natural Sciences and Engineering Research Council of Canada (NSERC), as well as all industrial and academic participants in North America for their support and contributions.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
