3. Normal vs. Non-Normal Statistics
Automating the matching task inevitably produces a high number of false correspondences. These errors must be filtered in successive steps to determine the orientation, but even when a good orientation solution is achieved, this does not imply that the subsequent Digital Surface Model (DSM) is free from blunders, since some correspondences may belong not to the object but to alien elements such as vegetation, urban artefacts, traffic, people, birds, etc.
On the other hand, when comparing the DSM obtained by photogrammetry with the DSM obtained by laser scanning, it is important to bear in mind that each system has its own sources of error and its own sensitivity to outliers. In addition, the points of the two DSMs are not the same, so a strategy to find the best correspondences between them must be applied. A further source of gross error is that the two sensors are not positioned identically, and thus not all object surfaces are equally captured in both data sets.
For the reasons mentioned above, the percentage of gross errors in the data sets can be expected to increase, so that methods based on normal (Gaussian) statistics do not perform well. To address this, the following steps are applied: firstly, the normality assumption is checked using statistical graphics (QQ-plots) and numerical methods (skewness and kurtosis indices); secondly, the accuracy measures of the discrepancy data set are tested for the normal distribution, based on the mean error (μ), the standard deviation (σ) and their corresponding confidence intervals (CI); and thirdly, a robust model based on non-parametric estimation is applied, using sample quantiles as reference and adding the median (m) and the biweight midvariance (BWMV) as robust counterparts of the mean and the standard deviation, respectively.
Therefore, we propose a break with the “deeply rooted custom” of treating the zero-mean normal distribution of errors as the appropriate standard measure of accuracy in photogrammetric and laser scanning studies (e.g., as adopted by the National Standard for Spatial Data Accuracy—NSSDA), incorporating and establishing a comparison between parametric and non-parametric statistical methods.
Although the sensitivity of normality tests to non-normal data could seem an efficient alternative, it should be remarked that these tests do not work properly with large datasets, since the central limit theorem comes into play [23]; therefore, normality tests were only applied in those cases with a reduced number of observations. For large datasets, a better diagnostic for checking a deviation from the normal distribution is the quantile-quantile plot (QQ-plot), in which the quantiles of the empirical distribution function are plotted against the theoretical quantiles of the normal distribution. If the distribution follows a Gaussian function, the QQ-plot should be a diagonal straight line. The skewness parameter (Equation (5)) provides an indication of the departure from symmetry of a distribution (asymmetry around the mean value), whereas the kurtosis parameter (Equation (6)) measures whether the data are peaked or flat relative to a normal distribution. If the distribution is perfectly normal, skewness and kurtosis values of zero are obtained:
$\text{Skewness}=\dfrac{\sum_{i=1}^{n}\left(x_i-\mu\right)^{3}}{n\,\sigma^{3}}$  (5)

$\text{Kurtosis}=\dfrac{\sum_{i=1}^{n}\left(x_i-\mu\right)^{4}}{n\,\sigma^{4}}-3$  (6)

where μ is the mean, σ is the standard deviation, n is the number of data points and x_i are the individual discrepancies.
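As a purely illustrative sketch (not part of the processing chain described here, and assuming a one-dimensional array of discrepancies loaded from a hypothetical file), the two indices of Equations (5) and (6) and the QQ-plot can be obtained with standard numerical tools:

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

def skewness(x):
    """Skewness as in Equation (5): third central moment over n*sigma^3."""
    mu, sigma = np.mean(x), np.std(x)
    return np.sum((x - mu) ** 3) / (len(x) * sigma ** 3)

def excess_kurtosis(x):
    """Kurtosis as in Equation (6): fourth central moment over n*sigma^4, minus 3."""
    mu, sigma = np.mean(x), np.std(x)
    return np.sum((x - mu) ** 4) / (len(x) * sigma ** 4) - 3.0

errors = np.loadtxt("discrepancies_z.txt")   # hypothetical discrepancy sample
print("skewness:", skewness(errors))
print("excess kurtosis:", excess_kurtosis(errors))

# QQ-plot: empirical quantiles against theoretical normal quantiles
stats.probplot(errors, dist="norm", plot=plt)
plt.show()
```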
When the data sets follow a normal distribution, the classical accuracy measures, mean error (μ) and standard deviation (σ), are considered. Likewise, confidence intervals (CI) are provided for the parameters based on the theory of errors, considering the interval (x ± 1.96σ), where x is the parameter and σ its standard deviation, for a confidence level of 95%. In those cases where a few outliers could remain, the parametric approaches establish the 3σ rule to remove outliers which can corrupt the true statistical distribution of errors. If, after applying the 3σ rule, the error data sets still follow a non-normal distribution, robust and non-parametric methods for the derivation of accuracy measures should be applied. For a small Gaussian sample size, the 3σ rule can be replaced by Chauvenet's criterion [24], which rejects those errors whose probability of occurrence is lower than that corresponding to a proportion of 1 − 1/(4n) of the sample, n being the sample size [25]. For those cases in which the distribution of the data is not known, other approaches for deriving accuracy measures need to be applied. The interquartile range is an unbiased estimator of the standard deviation [26], whereas the BWMV (Equation (7)) is a robust estimator of statistical dispersion for heavy-tailed distributions [27]:
$\text{BWMV}=\dfrac{n\sum_{i=1}^{n}a_i\left(x_i-m\right)^{2}\left(1-U_i^{2}\right)^{4}}{\left(\sum_{i=1}^{n}a_i\left(1-U_i^{2}\right)\left(1-5U_i^{2}\right)\right)^{2}}$  (7)

where m is the median, n is the number of points, a_i is a weight that takes the value 0 or 1 depending on U_i, and U_i is the parameter obtained from Equation (8):

$U_i=\dfrac{x_i-m}{9\cdot\text{MAD}}$  (8)

where MAD is the median absolute deviation, MAD = m(|x_i − m_x|), m denoting the median operator and m_x the median of the data. Finally, the value of the parameter a_i in Equation (7) is 0 or 1 depending on the value of U_i: if −1 ≤ U_i ≤ 1, then a_i is 1; otherwise a_i is 0.
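The following sketch, again merely illustrative and relying on the same hypothetical errors array as above, computes the median, the MAD and the BWMV of Equations (7) and (8), and includes a simple implementation of Chauvenet's criterion as described earlier:

```python
import numpy as np
from scipy.stats import norm

def mad(x):
    """Median absolute deviation: median of |x_i - median(x)|."""
    return np.median(np.abs(x - np.median(x)))

def biweight_midvariance(x):
    """Biweight midvariance following Equations (7) and (8)."""
    x = np.asarray(x, dtype=float)
    n, m = len(x), np.median(x)
    u = (x - m) / (9.0 * mad(x))                    # Equation (8)
    a = (np.abs(u) <= 1.0).astype(float)            # a_i = 1 if -1 <= U_i <= 1, else 0
    num = n * np.sum(a * (x - m) ** 2 * (1.0 - u ** 2) ** 4)
    den = np.sum(a * (1.0 - u ** 2) * (1.0 - 5.0 * u ** 2)) ** 2
    return num / den

def chauvenet(x):
    """Keep observations inside the normal quantile corresponding to 1 - 1/(4n)."""
    x = np.asarray(x, dtype=float)
    mu, sigma = np.mean(x), np.std(x)
    z_max = norm.ppf(1.0 - 1.0 / (4.0 * len(x)))    # rejection threshold in sigma units
    return x[np.abs(x - mu) <= z_max * sigma]

errors = np.loadtxt("discrepancies_z.txt")          # hypothetical discrepancy sample
print("median:", np.median(errors))
print("MAD:", mad(errors))
print("BWMV:", biweight_midvariance(errors))
print("kept after Chauvenet:", len(chauvenet(errors)))
```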
4. Methodology
Two similar sites were chosen to undertake the experiment: the portals of two Romanesque churches—San Pedro and San Segundo—located in the city of Ávila (Spain) (see
Figure 1). For these two sites, two data sets were acquired with two different instruments, a digital camera and a terrestrial laser scanner (Table 1):
A double data set by means of convergent photogrammetry, using two different reflex cameras (Canon EOS 500D, Nikon D80).
A double data set by means of two different types of terrestrial laser scanner, Faro Photon 80 and Trimble GX (
Table 1).
In addition, when processing the photogrammetric data sets, two different tools were used: the commercial software (CS) Agisoft Photoscan and the in-house software Photogrammetry Workbench (PW).
This gives us a total of six DSMs for each of the portals: two laser scanner data sets processed once each, and two image data sets processed by two methods each. When comparing photogrammetry vs. laser scanning, all the photogrammetric DSMs (four per portal) were compared with all the laser scanning DSMs (two per portal), which gives a total of eight comparisons per portal. The camera positions are depicted in Figure 2, along with images of both portals and some examples of the point clouds. To guarantee a common reference frame, some control points were accurately observed by means of a total station (Topcon IS Imaging Station).
Both cameras were chosen for their medium-range performance together with their affordable cost. They provide a 4.7–6 μm pixel size (see Table 1) that guarantees small enough Ground Sample Distances (GSDs): about 3 mm for a focal length of 17–18 mm and a shooting distance of about 10 m. These values, which concern the a priori accuracy, as well as those that express the a posteriori accuracy, are collected in Table 2 and commented on below that table.
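As an indicative check, taking a mid-range pixel size of about 5.4 μm (an assumed value within the 4.7–6 μm range above), an 18 mm focal length and a 10 m shooting distance gives

$\mathrm{GSD}\approx\dfrac{p\,D}{f}=\dfrac{5.4\ \mu\mathrm{m}\times 10\ \mathrm{m}}{18\ \mathrm{mm}}\approx 3\ \mathrm{mm},$

which is consistent with the figure quoted above; the a priori accuracy along the depth direction would then be this GSD multiplied by the D/B factor discussed below in relation to Table 2.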
The laser scanners are different in their performance: different scanning speed and different vertical field of view (see
Table 1). The two sensors also differ in their accuracy: 1.4 mm at 50 m and 2 mm at 25 m (see Table 1). For the specific case of the data acquisition at the two portals, the a priori accuracy values (as well as the a posteriori values) are collected in Table 2 and commented on immediately below.
The total station was chosen for its performance and high accuracy. In any case, this last issue is not very important since its role was to provide a unique coordinate frame (see
Figure 2) for all the data sets so that they could be compared to each other. In other words, it is the relative (and not the absolute) accuracy which is relevant here.
The software used to process the laser scanner point clouds was Trimble Realworks for the time-of-flight laser scanner and Faro Scene for the phase-shift sensor, whereas the georeferencing of the point clouds according to the total station coordinates was performed with Helios, an in-house software tool.
Finally, it should be remarked that both portals exhibit a priori good radiometric behaviour. The surface materials are wood or stone with rich texture patterns, so that the matching algorithms can perform well.
Figure 1 below shows examples of both sets of images, whereas
Figure 2 shows examples of some of the Digital Surface Models obtained.
Table 2 shows values of the
a priori and
a posteriori accuracy of the datasets. For the
a priori photogrammetric accuracy, the Ground Sample Distance (GSD) is assumed for planimetry (
XZ plane) while
GSD*D/B is assumed for the relief direction (
Y axis), where
D is the average distance along the
Y axis between camera stations and object and
B is the maximum distance between camera stations. For the
a priori laser scanning accuracy, the Reshetyuk equation [
28] is used. For the a posteriori photogrammetric accuracy, the sigma naught of the bundle adjustment, “projected” onto the object (by means of the quotient between the GSD and the pixel size), is used. For the a posteriori laser scanning accuracy, the root mean square error (RMSE) of the geo-referencing of the point cloud with the Ground Control Points (GCP) is used. In order to provide a good assessment of the degree of agreement between the DSMs from photogrammetry and laser scanning, the following procedure was adopted:
- (1)
To match the points of the laser scanning DSM to the points of the photogrammetric DSM, a minimum distance approach was applied.
- (2)
Once a pair of points was set, the difference in each of the three coordinate components was evaluated. This threefold strategy is due to the fact that a 2.5D photogrammetric configuration must be assumed rather than a 3D one, so that a different behaviour must be expected for the fundamental XZ plane and for the relief direction (Y). X is the width dimension of the portal, Z its height and Y its depth.
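Only as an illustrative sketch of this procedure (the array and file names are hypothetical, and both clouds are assumed to be already expressed in the common total station frame), the minimum-distance matching of step (1) and the per-coordinate differencing of step (2) could be implemented as follows:

```python
import numpy as np
from scipy.spatial import cKDTree

laser_xyz = np.loadtxt("laser_dsm.xyz")    # hypothetical (N, 3) array: X, Y, Z
photo_xyz = np.loadtxt("photo_dsm.xyz")    # hypothetical (M, 3) array: X, Y, Z

# Step (1): for every laser point, find the closest photogrammetric point
tree = cKDTree(photo_xyz)
dist, idx = tree.query(laser_xyz, k=1)

# Step (2): discrepancies of the three coordinate components for each matched pair
diff = laser_xyz - photo_xyz[idx]
dX, dY, dZ = diff[:, 0], diff[:, 1], diff[:, 2]   # width, depth, height discrepancies
```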
5. Experimental Results
As a preliminary step, and in order to assess the agreement between the photogrammetric and the laser scanner results, non-parametric correlation was computed by means of Spearman's correlation coefficient. Table 3 collects the results of the Spearman coefficient between data sets for the three coordinates. They show a very high and consistent agreement for the X and Z coordinates (higher for Z) and a less high and less consistent agreement for the Y coordinate. These results confirm that the photogrammetric and the laser scanning derived point clouds are very largely equivalent for the planimetric dimensions (XZ), but that along the depth dimension (Y) the agreement must be examined much more carefully.
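As a sketch only, and reusing the hypothetical matched arrays from the example in the previous section, the per-coordinate Spearman coefficients could be computed as:

```python
from scipy.stats import spearmanr

# laser_xyz, photo_xyz and idx come from the matching sketch in Section 4
for axis, name in enumerate(("X", "Y", "Z")):
    rho, p_value = spearmanr(laser_xyz[:, axis], photo_xyz[idx, axis])
    print(f"Spearman rho for {name}: {rho:.3f} (p = {p_value:.3g})")
```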
After this, and in order to assess how well the Gaussian and non-Gaussian parameters discussed in Section 3 describe the differences between pairs of data sets, for both portals, the items that appear in Table 4 (first column) were computed for the three coordinate components (X, Y, Z) for each of the comparisons of the laser scanners (Faro Photon 80, Trimble GX) against the photogrammetric approaches (PW, CS) using digital cameras (Canon 500D, Nikon D80). This means that, as described in the previous section, for each pair of data sets (one from the laser scanner and the other from photogrammetry) and for each of the three coordinates, a new data set was computed consisting of the differences between matching points of both original data sets. Once this “discrepancy set” was obtained, the items of the first column were computed by means of the statistical tools of the Matlab software as well as our own statistical software (STAR). Certainly, not all these parameters have the same significance, but they are shown here for illustration purposes.
Table 4 is an example showing the results obtained for the
Z component.
The last five rows of
Table 4 show the behaviour of gross errors according to the following criteria:
Percentile 0.01 and Percentile 0.99 collect the observed values corresponding to those percentiles, whereas ±2.326σ gives the bounds located at ±2.326 standard deviations. This threshold is chosen to agree with the 0.01–0.99 percentile criterion, that is, it includes 98% of the sample (or leaves out 2% of it). The last two rows show the percentage of observations whose value is smaller than −2.326σ or larger than +2.326σ, respectively (according to strict Gaussian theory each should always be 1%).
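For illustration, and again assuming the hypothetical dZ discrepancy array from the matching sketch, these last five rows could be reproduced as follows:

```python
import numpy as np

def blunder_summary(errors, z_thr=2.326):
    """Percentiles 0.01/0.99 and percentage of observations beyond -/+ z_thr*sigma."""
    sigma = np.std(errors)
    p01, p99 = np.percentile(errors, [1, 99])
    pct_low = 100.0 * np.mean(errors < -z_thr * sigma)    # left-tail blunders
    pct_high = 100.0 * np.mean(errors > +z_thr * sigma)   # right-tail blunders
    return p01, p99, pct_low, pct_high

p01, p99, pct_low, pct_high = blunder_summary(dZ)          # dZ from the matching sketch
print(f"P01 = {p01:.2f}, P99 = {p99:.2f}, "
      f"low blunders = {pct_low:.2f}%, high blunders = {pct_high:.2f}%")
```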
Besides this, the QQ-plots of all 48 comparisons were obtained.
Figure 3 shows two examples. From the visualization of each of these plots it can be concluded that all the samples show a large departure from the normal distribution due to the presence of gross errors.
In order to assess and discuss the results, the following
Tables 5 and
6 show a synthesis of all the comparisons for each of the portals. In each cell, the average and the [minimum; maximum] interval are presented for the following parameters: Mean (mm), Standard Deviation (mm), Median (mm), Interquartile range (Q25–Q75) (mm), Median Absolute Deviation (MAD) (mm), Biweight Midvariance (BWMV) (mm), Kurtosis (dimensionless), Skewness (dimensionless) and percentage of left (lower) and right (upper) blunders (values beyond ±2.326σ). From these tables it is possible to highlight the following issues:
The sample mean should reveal any overall disagreement between the data sets. For the
X coordinate in San Segundo, the values are slightly smaller than 1 mm and for the
Z, the values are even smaller. All the results are very consistent, with small deviations within them and always with the same sign for
X and
Z. In the case of San Pedro, a systematic effect occurs between the data sets that involve the Faro or the Trimble laser scanner (Table 7). The total displacement between the two laser samples is around 2 mm; it affects the width dimension (X) and does not appear in the height dimension (Z). This effect also appears in the median values (see below) and could be due to an X shift when referencing the two laser data sets at San Pedro.
Regarding the depth dimension (Y coordinate), in the case of San Pedro the values are very similar to those of the planimetric dimensions, but their range of variation is slightly larger (and thus worse) than the planimetric one. In the case of San Segundo, there is a difference of 1–2 mm between photogrammetry and laser scanning for five of the cases, and of 5, 6 and 7 mm for the other three cases, which always involve the Trimble laser scanner.
Consequently, although there seems to be a discrepancy between photogrammetry and laser scanning, it is too small to be declared significant. In any case, if any systematic trend is present, it affects the comparative performance of the two scanners rather than that of laser scanning versus photogrammetry (see
Table 7).
Regarding the values of the median, it must be said, first of all, that they always confirm the behaviour stated above concerning the values of the mean. The values of the median are sometimes smaller than the values of the mean, sometimes equivalent and sometimes larger, but the differences are always very small and always affect the three coordinate components X, Y and Z in a very consistent way.
In addition, since the sample median is a more robust estimator than the mean, its values would reveal any disagreement between the data sets that blunders might be hiding or exaggerating in the mean; however, no such disagreement is apparent: the median consistently shows the same small discrepancies as the mean. Thus, it can be concluded that the existence of blunders, or of observations that do not conform to the normal distribution, has no influence on the disagreement between the data sets. But, as has been said before, the discrepancies are too small to be regarded as significant.
Concerning the values of the standard deviation, it can be seen that for X and Z, in the case of San Segundo, consistent values of around 5 mm are obtained, whereas less consistent values of around 8 mm are obtained for Y. This confirms what is predicted by theory: the precision along the depth direction (Y) is worse than for the planimetric coordinates (X, Z). For these cases (San Segundo), the maximum values (up to 17 mm along the Y direction) are obtained when using the Nikon camera and the commercial photogrammetric software (CS).
In San Pedro, the standard deviation results are worse than those of San Segundo for the X and Z coordinates: around 11 mm for the former and around 7 mm for the latter. Also, the consistency is not as high as in the San Segundo case: for both dimensions, the results for the Canon camera are about twice as large (worse) as those for the Nikon camera. The Y results are also worse than those of San Segundo, about 11 mm, and also show the same differential performance between the two cameras.
In any case, the values of the standard deviation are always worse than the a priori values that should be expected from theory. They also show that the accuracy of the Y dimension is worse than that of the X and Z dimensions, as should be expected. This fact is not as clear for San Pedro as it is for San Segundo.
When analysing the percentage of blunders, the first result to remark is that this percentage is 1.6% on average, that is, slightly above what should be expected (1%); this is not an excessive number of outliers. Much more relevant than this small raw number, however, is the large variety of results found: from sets that show only 0.05% of blunders to sets where the number of gross errors reaches 5.4%. Furthermore, there is almost always a significant lack of symmetry between the number of gross errors at the left (lower) and at the right (upper) tails of the frequency distribution. What should be highlighted, therefore, is not a high number of blunders but their asymmetric distribution.
Confirming what has just been said, all the values obtained for the quantiles as well as for the skewness show that the data sets do not conform to the symmetry of the normal curve. It should be remembered that this result is also apparent from the QQ-plots. It should be added that the values of the quantiles (Q25 and Q75) and the median are always consistent with the values of the skewness, showing a lack of symmetry to the left or to the right. The consistency is also high across all the comparisons within the same dimension, X, Y or Z.
Finally, when examining the values of the Median Absolute Deviation (MAD) and the Biweight Midvariance (BWMV), very consistent results are achieved for X and Z, slightly better for Z than for X. This behaviour (Z better than X) can be due to the fact that both portals are less complex (in their shapes) along height (Z) direction than along width (X) direction. It must also be noted that the X-Z values are significantly better in the case of San Segundo than in the case of San Pedro. This could be related to the same explanation: San Pedro surfaces are more articulated than San Segundo surfaces. In any case, the values of the Biweight Midvariance for X and Z lie between 2 and 4 mm and these values certainly meet the a priori expectations. It should be remembered that, on the contrary, the standard deviation values do not meet what the theory predicts.
The MAD and BWMV results for the depth dimension (Y coordinate) are also very consistent among themselves, and very similar between the two portals. These values also meet the a priori expectations and, therefore, also show the relation with the planimetric accuracy values that theory states. In addition, this relation between the accuracy of the X-Z dimensions and the Y dimension, expressed through the MAD and the BWMV, is much clearer than when it is expressed through the standard deviation.
6. Conclusions
The DSMs obtained from photogrammetry are largely equivalent to the DSMs obtained from the laser scanner. Some very small inconsistencies have arisen, but these affect the comparative performance of the laser scanners or the comparative performance of the cameras rather than the comparative performance of photogrammetry and laser scanning.
All sets show a large lack of symmetry that leads to the conclusion that the standard Normal parameters are not adequate to assess this type of data. The Normal distribution fails to appropriately describe the data for the cases that have been examined. In particular, this is especially the case when assessing accuracy through the standard deviation, since this parameter fails to provide a good estimation of the results.
Use of non-Normal statistics gives a more appropriate description of the data and yields results that meet what may be expected concerning the assessment of accuracy. The results obtained for the Median Absolute Deviation and for the Biweight Midvariance agree with the values predicted by the theory.
This can be extended to what usually happens in photogrammetry in the 2.5D case. The planimetric dimensions show better results than the relief (depth) dimension, roughly according to the factor D/B (the quotient between the object distance and the camera base). Some results appear to be related to the shape of the objects themselves, but they are not clear enough to be considered a straightforward conclusion.
Regarding future work, and in order to extend the validity of these results to the wider field of imaging, this type of experiment could be applied to other cases analysing other shapes, depth variations, image settings or network designs (including real 3D cases). Also, other materials (metal or uniformly painted walls) that should not be as favourable as the ones tested here (wood and stone) should be considered. Of course, more hardware and software should be tested to extend the validity of the conclusions presented here.