A Highly Accurate Classification of TM Data through Correction of Atmospheric Effects

Elmahboub, Widad; Scarpace, Frank; Smith, Bill

doi:10.3390/rs1030278


by Widad Elmahboub ¹,*, Frank Scarpace ² and Bill Smith ³

¹ Mathematics Department, Hampton University, Hampton, VA 23668, USA
² University of Wisconsin-Madison, Madison, WI 53706, USA
³ NASA Langley Research Center for Atmospheric Science, Hampton, VA 23681, USA
* Author to whom correspondence should be addressed.

Remote Sens. 2009, 1(3), 278-299; https://doi.org/10.3390/rs1030278

Submission received: 11 May 2009 / Revised: 29 June 2009 / Accepted: 8 July 2009 / Published: 15 July 2009


Abstract: The impact of atmospheric correction on the accuracy of satellite image-based land cover classification is a growing concern among scientists. In this study, the principal objective was to enhance classification accuracy by minimizing contamination effects from aerosol scattering in Landsat TM images due to the variation in solar zenith angle corresponding to cloud-free earth targets. We derived a mathematical model for aerosols to compute and subtract the aerosol scattering noise per pixel of different vegetation classes from TM images of Nicolet in north-eastern Wisconsin. An iterative algorithm was developed in C++ to simulate, model, and correct for the influence of the solar zenith angle on scattering. Results from a supervised classification with corrected TM images showed increased class accuracy for land cover types over uncorrected images. The overall accuracy of the supervised classification improved substantially (by between 13% and 18%). The Z-score shows a significant difference between the corrected data and the raw data (between 4.0 and 12.0). Therefore, the atmospheric correction was essential for enhancing the image classification.

Keywords: improvement of supervised classification accuracy; remote sensing interpretation; simulation; modeling

1. Introduction

While technical advances have addressed some of the inherent data integrity problems associated with satellite-based remote sensing operations, inaccurate classification of remote sensing data remains an area of growing concern in the scientific community. This inaccuracy is due to a number of effects, among which are atmospheric and environmental effects [1]. This paper will focus primarily on a spectral approach to enhance scene classification accuracy through correction of atmospheric aerosol scattering. Much of the literature has addressed the correction of atmospheric effects using a dark object subtraction technique (DOS) requiring that a constant value representing a dark object pixel be subtracted from each pixel count throughout the image. The DOS approach is predicated on the hypothesis that measured reflectance values should ideally be close to zero over dark objects such as a shadow or a clear lake. When the measured radiance over such objects is markedly greater than zero, then these values must be due to atmospheric effects and can be subtracted from all other pixels in the image to reduce the hazy appearance of the image data caused by molecular and aerosol scattering. This method might improve the contrast of the image, but it has little effect on classification accuracy. Although the mean values of the classes change by the same amount, the covariance matrix remains the same. In certain circumstances, radiance calibration of the image data is necessary prior to classification using multitemporal images [2]. The atmospheric effect can prevent the proper interpretation of images if it is not corrected [3]. For many other applications involving image classification and change detection, atmospheric correction is unnecessary for a single date image [4]. As long as the training data and images to be classified are on the same relative scale, atmospheric correction has little effect on classification accuracy [5,6,7]. 
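The claim that dark object subtraction shifts class means without changing the covariance matrix can be checked numerically. The following is a minimal sketch in Python, not part of the original study; the pixel values and dark-object constants are hypothetical and chosen only for illustration.

```python
from statistics import mean

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def dark_object_subtract(band, dark_value):
    """Subtract one constant (the dark-object value) from every pixel."""
    return [p - dark_value for p in band]

# Hypothetical pixel values for one class in two bands (illustrative only).
band1 = [52.0, 55.0, 61.0, 58.0, 49.0]
band2 = [40.0, 44.0, 47.0, 45.0, 38.0]

corr1 = dark_object_subtract(band1, 12.0)
corr2 = dark_object_subtract(band2, 9.0)

mean_shift = mean(band1) - mean(corr1)   # equals the subtracted constant
cov_before = covariance(band1, band2)
cov_after = covariance(corr1, corr2)     # unchanged by a constant shift
```

Because a constant offset cancels in every deviation from the mean, a classifier that relies on class covariance sees the same class shapes before and after DOS.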
Past investigators have used relative atmospheric correction techniques, which proceed on the assumption that a linear relationship exists between the measurements from a targeted area over time. This methodology is used for scaling historic image data from a surface target when change detection trends are required. Correlations are performed on time series data to derive the proper scaling parameters for the normalization of image data [7]. A few studies investigating atmospheric effects have employed techniques that have resulted in the production of higher classification accuracy. However, these improvements in accuracy were not significant. A study by Karathanassi [8] introduces the Minnaert constant which is affected by the atmospheric diffusion of light. This is based on the optimum value of the Minnaert constant, which leads to a higher classification accuracy.

A number of radiative transfer codes based on radiative transfer theory have been developed to correct satellite images for atmospheric effects. Although radiative transfer codes can accurately correct the satellite measurements at the time of image acquisition, they are not frequently available. This makes the routine atmospheric correction of images difficult [9]. Many applications of remote sensing have therefore relied on image-based algorithms, which use information derived from the image itself to correct for atmospheric effects. However, these algorithms are still limited to deriving optical properties from a single dark object and may not produce significant atmospheric corrections, resulting in only modest improvements in classification accuracy.

We hypothesize that the accuracy of image interpretation and classification can largely be determined by the amount of aerosol scattering, which is directly proportional to the size of the solar zenith angle. As a result of atmospheric scattering due to aerosols, different pixels from different sites can be viewed as being spectrally inseparable and therefore would be inaccurately classified. In our approach, we investigated Landsat TM scenes located at high latitudes in the northern hemisphere to obtain and analyze data corresponding to large solar zenith angles and sizable target areas. This method of inquiry is based on the fact that the size of the solar zenith angle increases with latitude and in the northern hemisphere remains larger than about 60° throughout the year [10]. Significant scattering can exist for a solar zenith angle larger than 60° [11], which was a desirable experimental configuration since we wanted to test the robustness of our methodology under extreme conditions. The methods used to derive and test the correction algorithm are presented in three sections. Section 2.2 discusses the theoretical background and the mathematical derivation of the atmospheric correction components. Section 2.3 discusses the training set data development and the testing methods. Section 2.4 discusses the algorithm and the atmospheric correction method.

2. Methodology

An iterative algorithm was written in C++ to correct for atmospheric scattering due to solar zenith angle variation and variation of the aerosol optical properties in different wavelengths. This method demonstrates a greater potential of being able to substantially reduce the number of misclassified pixels and enhance the classification accuracy. This was done by using randomly selected training sets of the vegetation classes across the scene. Atmospheric components such as scattering efficiency, scattering coefficient, and path radiance per pixel were derived mathematically. The scattering efficiency was derived using Mie particle theory. We introduce a mathematical model which consists of two concepts to derive the scattering coefficient and the path radiance per pixel. This was followed by the implementation of the algorithm and atmospheric correction method. The methodology was divided into the following sub-sections.

2.1. Study Area

The study area was represented by four scenes of Landsat TM data acquired on different dates for two geographic locations in three bands (λ = 0.5–0.7 µm). The Milwaukee scene, in southeastern Wisconsin, USA, is located at 43°–43°7.5’ N latitude and 88°–88°5’ W longitude. The Nicolet scene, in northeastern Wisconsin, USA, is located at 45°–46°15’ N latitude and 88°15’–89°15’ W longitude. Milwaukee scene #1: ID Y5013916033x07; date 7/18/1984; 5,965 rows and 706 columns were extracted. Milwaukee scene #2: ID Mil2330-92; date 5/5/1992; 3,001 rows and 300 columns were extracted (starting at the top left corner). Nicolet scene #1: ID TM930810; date 8/10/1993. Nicolet scene #2: ID t 2428-9210; date 10/3/1992; 3,000 rows and 600 columns were extracted (starting at the top left corner). The Milwaukee roads and highways and the Nicolet vegetation cover were used, together with GPS ground truth data reference files.

2.2. Theoretical Background and Derivation of Atmospheric Correction Components

Radiative Transfer Theory is the theoretical basis for deriving the equations used to develop the atmospheric correction algorithm. At a satellite detector, the received electromagnetic radiation reflected from the earth and scattered through the atmosphere (multiple scattering) is composed of the three components in the following equation:

L = L_{o} + L_{s} + L_{d}

(1)

where L_o is the radiation that is scattered through the atmosphere and arrives at the satellite detector without being reflected by the earth’s surface. The L_o component is independent of surface reflectance and causes a loss of contrast in the image spectral signatures. The L_s component is the radiation that is reflected by the surface, transmitted through the atmosphere, and intercepted by the satellite detector. The L_d component is the radiation that is scattered through the atmosphere from the neighboring pixels and transmitted to the sensor.

The scattering path radiance is defined as the sum of two components [12], the radiation of sunlight that scattered before reaching the surface (L_o); and the radiation that is scattered through the atmosphere by the neighboring pixels (L_d).

To derive the atmospheric model for aerosol scattering, we have to understand the physics of scattering in a realistic medium. Consider a cylinder of a unit volume, with cross sectional area dσ = 1, and length ds, to be embedded in the medium of the atmosphere. A pencil of monochromatic radiation with intensity I_V(Ω) traverses the cylinder and emerges with an intensity of I_V(Ω) + dI_V(Ω) after being attenuated by scattering in the medium. Two types of radiation emerge from the volume. The first one is the incident beam from direction Ω that undergoes the change of intensity -K_sca I_V ds on its passage, but continues in the same direction. The value of K_sca, the scattering coefficient, is the fraction of change in intensity I_V per unit length [13]. The second type is the radiation field emerging from the volume arising from scattering of radiation incident on the volume from all directions Ω’ except direction Ω (diffused radiation); this is called path radiance [13]. The sum of the two radiation types gives the following equation of transfer:

d I (Ω) = - K_{sca} I_{0} \, d s + \frac{K_{sca}}{4 \pi} \, d Z \int_{4 \pi} P (Ω, Ω^{'}) I (Ω^{'}) \, d Ω^{'}

(2)

I^{*} = \frac{K_{sca}}{4 \pi} \int_{0}^{2 \pi} \int_{- 1}^{1} I (Ω^{'}) P (Ω, Ω^{'}) \, d θ^{'} d φ^{'}

(3)

where dI (Ω) is the change in intensity in the forward direction, I* is the diffuse multiple scattered radiation, Ω is the direction of propagation, Ω’ is the direction of the diffuse radiation, P is the phase function, Z is the altitude, θ’ and ϕ’ are the viewing zenith and azimuth coordinate angles, respectively [13].

We derived the path radiance (error per pixel) that is intercepted by the field of view of the satellite’s sensor by considering two radiative transfer equations as follows:

The basic equation of radiative transfer for direct scattering along the line of sight,
The equation of radiative transfer of diffuse scattered radiation.

The necessary assumptions include the following:

A plane parallel atmosphere in which radiation is transmitted up or down
A clear atmosphere (cloud free).
Attenuation due to absorption is relatively insignificant compared with scattering

The plane parallel atmosphere is related to the path length dS = dZ / cosθ (where θ is the solar zenith angle) and to the component of path radiance that scattered in the atmosphere before reaching the surface and subsequently was transmitted to a satellite detector L_o. The path length dZ is related to the component of radiation that has been scattered by aerosols, from the neighboring pixels L_d, when surface reflectance is directly reflected and transmitted to a satellite detector oriented such that it takes the observation at nadir [1]. The basic Radiative Transfer Equation (RTE) for the two components (L_o–L_d) (see Figure 1) from Equation 1 [13] is given by the following expression:

d I = K_{s c a} I (d S + d Z)

(4)

where dI is the change in intensity due to atmospheric scattering. Equation 4 is integrated as follows:

\int_{I_{0}}^{I} \frac{d I}{I} = \int_{0}^{z} K_{s c a} d Z / \cos θ + \int_{0}^{z} K_{s c a} d Z

(5)

The absolute intensity can be written in the following equation:

I_{0} = I \exp \left( Z \times K_{sca} \left( \frac{1 + \cos θ}{\cos θ} \right) \right)

(6)

To derive the error per pixel due to scattering, we assume a fraction µ of the scattered radiance is transmitted to the satellite field of view and is related to the path radiance and intensity in the following equation:

d W = μ d I

(7)

W = μ (K_{s c a} \times Z \times I_{o} \times X \times (\frac{1 + \cos θ}{\cos θ}))

(8)

where I = I_{o} \times X, X = \exp (- Z \times K_{sca} Y), Y = \frac{1 + \cos θ}{\cos θ}, W is the path radiance, and Z is the altitude in kilometers. Rearranging Equation 4 through Equation 8, µ can be written in the following form:

μ = W / (K_{s c a} \times Z \times I_{o} \times X \times Y)

(9)

Since µ is also a fraction of the radiation scattered in the forward direction from the target subtracted from the diffuse multiple scattered radiation (L–L_s) (see Figure 1) from Equation 1, W can be written as follows:

W = μ [(A \times I^{*} \times Z) - (A \times K_{s c a} \times I_{o} \times X \times Y)]

(10)

where A is the pixel area and I* is given by Equation 3. The term I₀ is the absolute intensity of the pixel value (free of error), which can be represented by the following expression:

I_{o} = I^{p i x e l} - W_{i j}

(11)

where W_ij represents the error (scattering effect) at row i and line j across the scene and I^pixel is the pixel value. By substituting Equation 9, and Equation 11, in Equation 10, the path radiance per pixel due to the three components of the scattering L_o, L_s and L_d that are transmitted to a satellite FOV is given by the following equation:

W_{i j} = ((1 + A) \times K_{s c a} \times I^{p i x e l} \times X \times Y \times Z) / ((1 + A) \times K_{s c a} \times X \times Y \times Z + 1)

(12)

where K_sca is the total scattering coefficient for aerosol particles, whose derivation follows.
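Equations 11 and 12 can be evaluated per pixel once K_sca, Z, and θ are known. The sketch below is an illustrative Python transcription of those two equations, not the authors' C++ code; the function names, the default pixel area of 1.0, and the sample inputs mentioned afterwards are assumptions made for the example.

```python
import math

def path_radiance(pixel, k_sca, z_km, theta_deg, area=1.0):
    """Per-pixel scattering error W_ij, following the form of Equation 12.

    pixel     : pixel value I^pixel (same units as the returned error)
    k_sca     : scattering coefficient (assumed already computed)
    z_km      : altitude Z in kilometers
    theta_deg : solar zenith angle in degrees
    area      : pixel area A (default of 1.0 is an assumption)
    """
    cos_t = math.cos(math.radians(theta_deg))
    y = (1.0 + cos_t) / cos_t            # Y = (1 + cos θ) / cos θ
    x = math.exp(-z_km * k_sca * y)      # X = exp(-Z K_sca Y)
    g = (1.0 + area) * k_sca * x * y * z_km
    return g * pixel / (g + 1.0)

def corrected_pixel(pixel, k_sca, z_km, theta_deg, area=1.0):
    """Equation 11: I_o = I^pixel - W_ij."""
    return pixel - path_radiance(pixel, k_sca, z_km, theta_deg, area)
```

With the hypothetical inputs pixel = 100, K_sca = 0.1, Z = 3 km, and θ = 60°, the computed error lies strictly between zero and the pixel value, which follows from the ratio g / (g + 1) in Equation 12.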

The basic equation of the scattering coefficient is in the following:

K_{sca} = \int_{0}^{\infty} \pi r^{2} n (r) Q_{sca} \, d r

(13)

where, r is particle size, n(r) is the size distribution, and Q_sca is the scattering efficiency per single particle. The scattering efficiency per single particle Q_sca was derived by using expansions of Bessel’s function and Legendre Polynomial for the scattered field surrounding the spherical particle. The scattered field is represented by an infinite series of vector spherical harmonics [12]. The vector spherical harmonics are the electrical and magnetic modes of the scattered field surrounding the single particle. The coefficients of the expansions are given by the following:

a_{n} = [m ψ_{n} (m x) ψ_{n}^{'} (x) - ψ_{n} (x) ψ_{n}^{'} (m x)] / [m ψ_{n} (m x) ξ_{n}^{'} (x) - ξ_{n} (x) ψ_{n}^{'} (m x)]

(14)

b_{n} = [ψ_{n} (m x) ψ_{n}^{'} (x) - m ψ_{n} (x) ψ_{n}^{'} (m x)] / [ψ_{n} (m x) ξ_{n}^{'} (x) - m ξ_{n} (x) ψ_{n}^{'} (m x)]

(15)

where ψ and ψ’ are the wave functions; ξ and ξ’ are Bessel’s functions; x is the size parameter (x = \frac{2 \pi r}{λ}); and m is the complex index of refraction. According to Mie Particle Theory, the coefficients of the expansions are accurate to x⁶ [12]. The coefficients are expanded and shown by the following equations:

a_{1} = \frac{i 2 x^{3} (m^{2} - 1)}{3 (m^{2} + 2)} - \frac{i 12 x^{5} (m^{2} - 2) (m^{2} - 1)}{{(m^{2} + 2)}^{2}} + \frac{4 x^{6} (m^{2} - 1)}{(m^{2} + 2)} + O (x^{7})

(16)

b_{1} = - \frac{i x^{5}}{45} (m^{2} - 1) + O (x^{7})

(17)

a_{2} = \frac{- i x^{5} (m^{2} - 1)}{15 (2 m^{2} + 3)} + O (x^{7})

(18)

b_{2} = O (x^{7})

(19)

From Equations 16 through 19, the scattering efficiency per particle is as follows:

Q_{sca} = \frac{8}{3} x^{4} {\left[ \frac{m^{2} - 1}{m^{2} + 2} \right]}^{2}

(20)

The complex indices of refraction m and the particle sizes are arbitrary (0.1–0.5 µm for band 1; 0.1–0.6 µm for band 2; 0.1–0.7 µm for band 3) [12].
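As a quick illustration of Equation 20, the sketch below evaluates the scattering efficiency for a single particle in Python. It is not the FORTRAN routine used in the study. The function names are hypothetical, and taking the squared modulus of the bracketed term (so that a complex m yields a real efficiency) is an assumption of this example.

```python
import math

def size_parameter(radius_um, wavelength_um):
    """Size parameter x = 2*pi*r / wavelength (both lengths in micrometers)."""
    return 2.0 * math.pi * radius_um / wavelength_um

def q_sca_small_particle(radius_um, wavelength_um, m):
    """Equation 20: Q_sca = (8/3) x^4 |(m^2 - 1) / (m^2 + 2)|^2.

    m may be real or complex; this is the leading-order term in x.
    """
    x = size_parameter(radius_um, wavelength_um)
    ratio = (m * m - 1.0) / (m * m + 2.0)
    return (8.0 / 3.0) * x ** 4 * abs(ratio) ** 2
```

Because Q_sca scales with x⁴, doubling the particle radius at a fixed wavelength raises the efficiency sixteen-fold, and a particle with m = 1 (no refractive contrast) does not scatter at all.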

The scattering coefficient K_sca is also given in terms of the size distribution n(r) [14] as in the following equation:

n (r) = C \times {(r / r_{\min})}^{- 4.5} \quad \text{for } r_{\min} \leq r \leq r_{\max}

(21)

where r is particle size, r_min = 0.1 µm, r_max = 10 µm, and C is a constant representing the highest concentration of particles in a unit volume [14]. Since the particles are distributed along the path length and our assumption included a plane parallel atmosphere, the path length that is derived from particle sizes dr in a unit volume is given in terms of the solar zenith angle by the following [13]:

d r = (\sec θ) d r_{m}

(22)

where θ is the solar zenith angle, dr_m is the path length in the vertical direction. The scattering coefficient decreases exponentially with respect to the altitude Z up to three kilometers [15,16,17], and from Equations (13), (20), (21), and (22) we have the following equation:

K_{sca} = \int_{0}^{r} \left( C \times \frac{{(r_{m})}^{\frac{5}{3}}}{{(r_{\min})}^{1.5}} {(r_{m} / r_{\min})}^{- 1.5} \right) \times Q_{sca} \exp (- Z / H) \times \sec θ \, d r_{m}

(23)

From Equations (20), (21) and (22) we have the following equation:

K_{sca} = \frac{8}{3} {\left( \frac{2 \pi}{λ} \right)}^{4} \left( C \times {(r)}^{\frac{8}{3}} / {(r_{\min})}^{1.5} \right) \times \left[ \frac{m^{2} - 1}{m^{2} + 1} \right] \times \exp (- Z / H) \times \sec θ

(24)

where H is the scale height, which equals 0.8 in a clear sky condition [6]. Equation 24 represents the derived scattering coefficient in terms of the particle efficiencies and solar zenith angle. It is substituted into Equation 12, which represents the error in pixel value W_ij due to the scattering effect, and W_ij is converted to radiance units using the conversion formula.
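To show how the pieces of this section fit together, the sketch below integrates Equation 13 numerically with the efficiency of Equation 20 and the size distribution of Equation 21, then applies the exp(-Z/H) and sec θ factors. This is a Python illustration under stated assumptions, not the closed form of Equation 24 and not the authors' C++ code. The midpoint rule, the function names, the interpretation of H as kilometers, and the default upper radius of 0.7 µm (the band 3 limit) are all assumptions, and physical units are not tracked.

```python
import math

def junge_n(r_um, c, r_min=0.1, exponent=4.5):
    """Equation 21: n(r) = C * (r / r_min) ** -4.5."""
    return c * (r_um / r_min) ** (-exponent)

def q_sca(r_um, wavelength_um, m):
    """Equation 20 with x = 2*pi*r / wavelength."""
    x = 2.0 * math.pi * r_um / wavelength_um
    return (8.0 / 3.0) * x ** 4 * abs((m * m - 1.0) / (m * m + 2.0)) ** 2

def k_sca(c, wavelength_um, m, theta_deg, z_km, h_km=0.8,
          r_min=0.1, r_max=0.7, steps=2000):
    """Midpoint-rule estimate of Equation 13, scaled by exp(-Z/H) * sec(theta)."""
    dr = (r_max - r_min) / steps
    total = 0.0
    for i in range(steps):
        r = r_min + (i + 0.5) * dr
        total += (math.pi * r * r * junge_n(r, c, r_min)
                  * q_sca(r, wavelength_um, m) * dr)
    sec_t = 1.0 / math.cos(math.radians(theta_deg))
    return total * math.exp(-z_km / h_km) * sec_t
```

In this sketch the result is linear in the size distribution constant C and grows with sec θ, so a pixel at a 60° solar zenith angle receives twice the coefficient of a pixel with the sun overhead.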

2.3. Training Set Data Development and Testing

The overall objective of the training set selection process is to assemble a set of statistics that describes the spectral response pattern for each land cover type. The actual classification of multispectral image data is a highly automated process which uses the training data for classification [18]. We obtained samples of the necessary training sets within the image by using the training set signature editor in ENVI 4.4 (commercial software) where a reference cursor on the screen was used to manually delineate training area polygons in the displayed image. The pixel values within the polygons were used in the software to develop a statistical description file for each training area. The training set development incorporates GPS ground truth reference data since it requires substantial reference data and a thorough knowledge of the geographical area [19]. Neural network is an automated classifier in ENVI 4.4 which uses the statistical description file and the ground truth reference data file for the classification process [19].

A set of GPS ground truth reference data was collected from the study area of Nicolet forestland vegetation in northern Wisconsin (wetland shrub swamp, wetland sedge meadow, and upland opening shrub) using the GPS single point positioning technique. The single point positioning technique utilizes pseudo-range measurements to determine the positions of vegetation classes accurately in UTM coordinates. The UTM coordinates file of the different vegetation types was converted to POLYGRID file format, then to an ERDAS GIS file. The GIS file of the ground truth reference data was imported to ERDAS IMAGINE as an image file. This file was used in the processes of the training set development, classification, and accuracy assessment (it contains 100% accurate positions of vegetation) [20].

The accuracy improvements are the result of iterative parameterizations of the atmospheric correction using the classification accuracy as the goal. In the pre-processing stage, a portion of the training data were randomly selected and used in the iterative process to simulate the aerosol size distribution constant; to compute the scattering efficiencies, and the scattering coefficients; and to correct for the atmospheric scattering throughout the image. In the post-processing stage, another portion of the training data (independent training sets) was employed to compare the class accuracy between corrected and uncorrected images through sequence of classifications and accuracy assessments.
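The two-stage use of the training data can be pictured as a random partition into a tuning subset and an independent evaluation subset. The following Python sketch is illustrative only. The 50/50 split, the seed, and the function name are assumptions, since the text says only that "a portion" of the data was used in each stage.

```python
import random

def split_training_sets(samples, fraction=0.5, seed=0):
    """Randomly partition samples into (pre-processing, post-processing) sets.

    fraction : share of samples used for iterative tuning (assumed value)
    seed     : fixed seed so the split is reproducible
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]
```

Keeping the second subset out of the iterative stage is what allows the post-processing accuracy figures to be read as an independent check rather than a fit to the tuning data.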

In both stages, the accuracies of classifications were assessed using the error/covariance matrices (classification results). This is done by comparing the accuracy criteria of the error/covariance matrices of the corrected image with those of the original image (image prior to correction) and incorporating GPS reference points. The accuracy criteria include the overall accuracy, \hat{K} (Kappa/KHAT statistic), and Z-score. The overall accuracy is computed by dividing the total number of correctly classified pixels (i.e., the sum of the elements along the major diagonal of the error matrix) by the total number of reference pixels. The \hat{K} statistic is a measure of the difference between the actual agreement between reference data and an automated classifier, and the chance agreement between the reference data and an automated classifier [1], as in the following:

\hat{K} = \frac{\text{Observed accuracy} - \text{Chance agreement}}{1 - \text{Chance agreement}}

This statistic serves as an indicator of the extent to which the percentage of correct values in an error matrix is due to true agreement versus chance agreement. The value for \hat{K} varies between 0 (poor classification) and 1 (ideal classification). A higher \hat{K} indicates that the data are more separable. The Z-score is a statistical test, which follows a Gaussian distribution and is used to determine whether the accuracies of the two classifications are significantly different. If the Z-score exceeds 1.96, then the difference between accuracies of the two classifications of the corrected and the original image is considered significant [21].
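The accuracy criteria described above can be computed directly from an error matrix. The Python sketch below is an illustration, not the ENVI or ERDAS implementation. The example matrix is hypothetical, and the pairwise Z-score takes the two \hat{K} variances as inputs because the text does not state how they were estimated; the form |K₁ − K₂| / √(var₁ + var₂) is the commonly used one and is assumed here.

```python
def overall_accuracy(matrix):
    """Sum of the major diagonal divided by the total number of pixels."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def kappa(matrix):
    """K-hat = (observed accuracy - chance agreement) / (1 - chance agreement)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    # Chance agreement: sum over classes of (row total * column total) / N^2.
    chance = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / (total * total)
    return (observed - chance) / (1.0 - chance)

def z_score(kappa_a, var_a, kappa_b, var_b):
    """Pairwise test statistic for two independent K-hat values (assumed form)."""
    return abs(kappa_a - kappa_b) / (var_a + var_b) ** 0.5

# Hypothetical two-class error matrix (illustrative only).
example = [[45, 5],
           [10, 40]]
```

For the hypothetical matrix, 85 of 100 pixels lie on the diagonal, the chance agreement is 0.5, and \hat{K} is therefore 0.7.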

2.4. Algorithm and Atmospheric Correction Method

In our approach, we used a large image from the Northern Hemisphere. This is based on the fact that the solar zenith angle in the Northern Hemisphere remains larger than about 60° throughout the year [10]. Significant scattering can exist for the solar zenith angle larger than 60° (between 75° and 77° for this aerosol model) [11]. Since the size of the solar zenith angle depends on the latitude, the variation in the solar zenith angle occurs when the scene is large with respect to the latitude direction. Equations 11, 12, 16-21, and 24 were used in the algorithm (in C++) to correct for atmospheric scattering. The variable parameters across the scene are the solar zenith angle and the scattering coefficient. The path length of the solar radiation relies on the scattering coefficients. The scattering coefficient can be computed by using the particles’ efficiencies and size distribution [22]. While the optical properties of single particles such as shape, size, and complex index of refraction are arbitrary according to Mie Particle Theory, the particles’ efficiencies can be computed using sequences of iterations. The size distribution of aerosol particles can be determined by using particles sizes that are less or comparable to the image wavelengths (Mie Particle Theory) and by determining the constant of the size distribution (see Equation 21). The constant of the size distribution of aerosols particles can be determined by starting at an initial value and applying sequences of iterations throughout the correction process. The optimum correction can potentially enhance the classification accuracy of training data substantially resulting in a reduction in the number of misclassified pixels. The steps of the algorithm are outlined as follows:

Step 1 Computation of Atmospheric Input Parameters

Arbitrary initial values for the indices of refraction and the full range of aerosol particle sizes between 0.1 microns and the sensor’s bandwidths (0.5–0.7 microns) were used as input values to compute the scattering efficiencies (Equation 20).
Using latitude at the upper left corner of the image, the solar zenith angles were computed for the entire image using the hour angle and the declination of sun.

Step 2 Input Measurements

GPS ground truth data were imported as an image file using ERDAS IMAGINE 8.7 and overlaid on the TM image.
Two categories of training sets were used: the first category was a group of randomly selected training sets for the pre-processing stage and the second category was a group of independent training sets for the post-processing stage. The randomly selected training sets in the pre-processing stage were used for the iterative parameterization of the correction throughout the image. The independent training sets in the post-processing stage were used for the sequences of classifications and accuracy assessments.

Step 3 Pre-processing Stage of Iterations and Simulation

The pixel values throughout the image were converted to radiances prior to the correction process.
The size distribution was computed (Equation 21) using initial value for the size distribution constant.
The scattering coefficient was computed for each pixel throughout the image (Equation 24) using scattering efficiencies (Equation 20), size distribution (Equation 21), and solar zenith angle.
The aerosol scattering effect (error per pixel) was computed for each pixel (Equation 12).
The atmospheric errors were subtracted from all pixels throughout the image.
Using the randomly selected training set, the image was classified with a neural network classifier at each iteration stage.
The classification accuracy of the corrected image was compared with the classification accuracy of the original image (prior to correction) using the error/covariance matrices incorporating the ground truth reference points.
If the accuracy was not improved, the algorithm would be repeated.
The iterations continued until reaching the optimum accuracy relative to the aerosol model.
The histogram and scatter plot of the image were checked at each iteration stage to avoid image distortion.

Step 4 Post-processing stage

Using the independent training sets, a sequence of classifications and accuracy assessments was implemented.
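Step 1 computes solar zenith angles from latitude, hour angle, and solar declination. The text does not give the formula, so the Python sketch below uses the standard spherical-astronomy relation cos θ = sin φ sin δ + cos φ cos δ cos h as an assumption about what was computed; the function and parameter names are hypothetical.

```python
import math

def solar_zenith_deg(lat_deg, declination_deg, hour_angle_deg):
    """Solar zenith angle in degrees from latitude, declination, hour angle.

    Uses cos(theta) = sin(lat) sin(dec) + cos(lat) cos(dec) cos(hour angle).
    """
    lat = math.radians(lat_deg)
    dec = math.radians(declination_deg)
    ha = math.radians(hour_angle_deg)
    cos_z = (math.sin(lat) * math.sin(dec)
             + math.cos(lat) * math.cos(dec) * math.cos(ha))
    cos_z = max(-1.0, min(1.0, cos_z))   # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_z))
```

At local solar noon (hour angle 0) with zero declination, the zenith angle equals the latitude, so it grows from the southern to the northern edge of a scene, which is the across-scene variation the correction relies on.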

The learning algorithm in FORTRAN (Spherical Particle) was used to compute the scattering efficiencies using the arbitrary initial values for the particle sizes (between 0.1 microns and the image bandwidth of 0.5–0.7 microns), Mie Particle Theory, and the complex indices of refraction [22]. Sequences of iterations were used to simulate the size distribution constant C. The C++ algorithm uses the output values of the scattering efficiencies as input values to compute the scattering coefficients and the scattering effect W_ij (error per pixel, see Appendix A). We employed training samples of the vegetation classes (Nicolet vegetation cover) within the image to be classified at each iteration stage using a neural network classifier. The iterations and the process of subtraction of W_ij from each pixel continued until the number of misclassified off-diagonal pixels of the error matrices (classification results) was reduced and optimized relative to the aerosol model. This process is based on comparing the criteria of the classification accuracies of the two images (corrected and original-uncorrected) and incorporating GPS reference points. In the post-processing step, sequences of classifications and accuracy assessments were implemented using independent training samples (different from those that were used for the iteration steps). Throughout the iterations, extreme correction and subtraction of the error from each pixel value (in radiance units) should be avoided so that image distortion does not occur [23,24,25]. This is done by checking the histogram and scatter plot of the image at each iteration stage. The iterations should stop at the optimum correction that leads to substantial enhancement in classification accuracy. If the enhancement of classification accuracy was achieved, the training data would be more separable in the error/covariance matrices. A higher \hat{K} indicates that the data are more separable.
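The iterative search for the size distribution constant C can be outlined as a loop that corrects the image, classifies it, and keeps the value giving the best accuracy. The Python sketch below is one possible reading of that procedure, not the authors' algorithm. The grid of candidate values, the callables `correct_image` and `classify_and_score`, and the toy scoring function are all hypothetical, and the histogram and scatter-plot check against image distortion is left out.

```python
import math

def tune_size_constant(candidates, correct_image, classify_and_score,
                       baseline_score):
    """Return (best_c, best_score) over candidate size-distribution constants.

    correct_image(c)          -> corrected image for constant c (hypothetical)
    classify_and_score(image) -> classification accuracy in [0, 1] (hypothetical)
    baseline_score            -> accuracy of the uncorrected image
    """
    best_c, best_score = None, baseline_score
    for c in candidates:
        corrected = correct_image(c)
        score = classify_and_score(corrected)
        if score > best_score:           # keep only corrections that improve accuracy
            best_c, best_score = c, score
    return best_c, best_score

# Toy stand-ins: the "image" is just the constant, and accuracy peaks at 1e16.
candidates = [10.0 ** k for k in range(6, 20)]
toy_correct = lambda c: c
toy_score = lambda image: 0.9 - 0.01 * abs(math.log10(image) - 16.0)

best_c, best_score = tune_size_constant(candidates, toy_correct, toy_score, 0.70)
```

If no candidate beats the uncorrected baseline, the function returns `None` for the constant, which corresponds to leaving the image uncorrected.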

3. Results

This section describes selected results for the atmospheric correction and the classification accuracy stages. The correction scheme was applied to each individual band of each image. These gray scale images were merged into one multi-band image after each iteration stage. Sequences of classifications and accuracy assessments were applied using the independent training sets for the corrected and the uncorrected images (in the post-processing stage). The classification results in the error matrices were used to compute the accuracy criteria, which were the overall accuracy, \hat{K}, and the Z-score. These accuracy criteria were used to assess the accuracies of the classifications.

Figure 1. Scattering coefficient versus solar zenith angle for band 1 for the atmospheric model.

Figure 2. Scattering coefficient versus solar zenith angle for band 2.

Figure 3. Scattering coefficient versus solar zenith angle for band 3.

Our hypothesis stated that accuracy of image interpretation and classification can largely be determined by the amount of aerosol scattering, which is directly proportional to the size of the solar zenith angle. Accordingly, in the aerosol model, the solar zenith angle was plotted versus the scattering coefficient and is shown in Figure 1, Figure 2, and Figure 3.

The scattering coefficient increases as the solar zenith angle increases across the scene in bands 1, 2, and 3 (see discussion in Section 2.2). This indicates that there is significant scattering as the solar zenith angle exceeds 60 degrees.

The constant of the size distribution was determined through iterative parameterization of the atmospheric correction. The constant of the size distribution was found to be 10¹⁶ for band 1, 10¹² for band 2, and 10⁹ for band 3 for the Nicolet image; 10¹⁹ for band 1, 10¹⁵ for band 2, and 10¹¹ for band 3 for the Milwaukee image. The corrected and the uncorrected images were compared. The correction can be identified visually as color and contrast show substantial enhancement (see Figure 4 and Figure 6). Figure 4 shows two images (corrected on the left and uncorrected on the right) for Milwaukee urban areas such as roads and highways. The image on the right shows marked randomly selected training samples on the ground truth locations. These randomly selected training sets were used for the iterative parameterization of the atmospheric correction. Figure 5 represents the independent training samples for the vegetation species on the ground truth locations for Nicolet forest land. Figure 6 represents two images (corrected on the left and uncorrected on the right) for the vegetation species, which include wetland-shrub swamp, wetland-sedge meadow, and upland opening-shrub. Figure 5 was overlaid on the image of Figure 6 for the classification and accuracy assessments.

Figure 4. Marked samples of training sets for Milwaukee highways and roads in bands 1, 2, and 3 (scene ID #: Y5013916033x07). The left image is the corrected one; the right image is the uncorrected one. The marked training samples were classified, and the results are shown in Table 1A, Table 1B, and Table 1C and in Table 2A, Table 2B, and Table 2C.

Figure 5. Independent training set data acquired by GPS in a GIS (vector) file format and converted to image format for the Nicolet vegetation cover classes; this image was overlaid on the images in Figure 6. Purple represents wetland-shrub swamp, blue represents wetland-sedge meadow, and green represents upland opening-shrub.

Figure 6. Nicolet vegetation cover image on which the GPS training set image in Figure 5 was overlaid. The scene ID # is TM930810. The left image is the corrected one; the right image is the uncorrected one.

The RMSE versus the number of iterations of the neural network classification was computed for a sample of the Nicolet vegetation species training sets in Figure 6. The corrected image showed a declining RMSE with iterations compared with the uncorrected image, as shown in Figure 7 and Figure 8.

Figure 7. The RMSE vs. iterations for NN classification of the corrected image for a sample of Nicolet vegetation species training sets.

Figure 8. The RMSE vs. iterations for NN classification of the uncorrected image for a sample of Nicolet vegetation species training sets.

The results of the classification are presented in the error matrices in Table 1A, Table 1B, Table 2A, Table 2B, Table 3A, Table 3B, Table 4A, and Table 4B. The training samples in Table 1A, Table 1B, and Table 1C and in Table 2A, Table 2B, and Table 2C are the marked points shown in Figure 4. With the correction scheme, the number of off-diagonal (misclassified) pixels in the error matrices was minimized relative to the aerosol model. The smaller the off-diagonal values, the higher the overall accuracy, which is represented by the diagonal elements of the error matrix. The classification of the corrected image was therefore enhanced compared with that of the uncorrected image: the diagonal totals of the error matrices in Table 1A, Table 2A, Table 3A, and Table 4A (Figure 4 and Figure 6, left) exceed those of the error matrices for the uncorrected image in Table 1B, Table 2B, Table 3B, and Table 4B (Figure 4 and Figure 6, right).

The accuracy assessment criteria, including overall accuracy, $\hat{K}$, $\sigma_k^2$ (variance of $\hat{K}$), and Z-score, are shown in Table 1C, Table 2C, Table 3C, and Table 4C (see Section 2.2). Samples of covariance matrices for the Milwaukee scene (urban areas) and the Nicolet scene (vegetation cover) are also presented to assess the separability between classes.

**Table 1A.** Sample classification error matrix for the corrected image (image 1, Milwaukee urban areas; scene ID #: Y5013916033x07).
	Reference points
Cover class	Set 1	Set 2	Set 3	Set 4	Row Total
Set 1	16	1	0	0	17
Set 2	0	10	5	2	17
Set 3	2	0	8	5	15
Set 4	0	0	0	10	10
Column Total	18	11	13	17	59

**Table 1B.** Sample classification error matrix for the uncorrected image (image 2, Milwaukee urban areas).
	Reference points
Cover class	Set 1	Set 2	Set 3	Set 4	Row Total
Set 1	13	0	1	0	14
Set 2	1	9	6	3	19
Set 3	4	2	6	2	14
Set 4	0	0	0	12	12
Column total	18	11	13	17	59

**Table 1C.** Accuracy assessment for the classification results in Table 1A and Table 1B.
Image #	Overall Accuracy	$\hat{K}$	σ_k²
1	74.57%	0.673	0.00510
2	67.79%	0.574	0.00619
Z-score = 4.40
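As a check on how the overall accuracy and $\hat{K}$ follow from an error matrix, the sketch below recomputes both for Table 1B. It assumes the standard Cohen's kappa formula, (p_o − p_e)/(1 − p_e); the formulas of Section 2.2 are not reproduced here, and the function names are hypothetical.

```python
def overall_accuracy(matrix):
    """Fraction of samples that fall on the diagonal of a square error matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's kappa, (p_o - p_e) / (1 - p_e); assumed to match the paper's K-hat."""
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[r][c] for r in range(size)) for c in range(size)]
    p_o = overall_accuracy(matrix)
    # Chance agreement from the products of matching row and column totals.
    p_e = sum(row_totals[i] * col_totals[i] for i in range(size)) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)


# Table 1B (uncorrected image): rows are classified sets, columns are reference points.
TABLE_1B = [
    [13, 0, 1, 0],
    [1, 9, 6, 3],
    [4, 2, 6, 2],
    [0, 0, 0, 12],
]

print(f"overall accuracy = {overall_accuracy(TABLE_1B):.2%}")
print(f"kappa = {kappa(TABLE_1B):.3f}")
```

The diagonal holds 40 of the 59 samples, so the overall accuracy is about 67.8% and the kappa value is about 0.574, in line with the image 2 row of Table 1C.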

**Table 2A.** Another sample classification error matrix for the corrected image (image 1, Milwaukee urban areas).
	Reference points
Cover class	Set 5	Set 6	Set 7	Set 8	Set 9	Row total
Set 5	5	0	0	0	0	5
Set 6	0	9	0	0	0	9
Set 7	0	0	18	0	0	18
Set 8	0	0	1	11	0	12
Set 9	1	0	0	2	8	11
Column Total	6	9	19	13	8	55

**Table 2B.** Another sample of classification error matrix of the uncorrected image (image 2, Milwaukee urban areas).
	Reference points
Cover Class	Set 5	Set 6	Set 7	Set 8	Set 9	Row Total
Set 5	6	0	0	0	1	7
Set 6	0	9	1	0	0	10
Set 7	0	0	12	2	0	14
Set 8	0	0	3	10	3	16
Set 9	0	0	3	1	4	8
Column total	6	9	19	13	8	55

**Table 2C.** Accuracy assessment for the classification results in Table 2A and Table 2B.
Image #	Overall Accuracy	$\hat{K}$	σ_k²
1	92.73%	0.906	0.002364
2	74.55%	0.677	0.00710
Z-score = 11.91

**Table 3A.** The error matrix of the corrected image (image 1) for Nicolet (northern Wisconsin; scene ID #: TM930810).
	Reference points
Cover class	Wetland-shrub swamp	Wetland-sedge meadow	Upland opening-shrub	Row Total
Wetland-shrub swamp	9	0	1	10
Wetland-sedge meadow	3	12	3	18
Upland opening-shrub	2	0	11	13
Column total	14	12	15	41

**Table 3B.** The error matrix of the uncorrected image (image 2) for Nicolet vegetation cover.
	Reference points
Cover class	Wetland-shrub swamp	Wetland-sedge meadow	Upland opening-shrub	Row Total
Wetland-shrub swamp	9	2	1	12
Wetland-sedge meadow	1	6	2	9
Upland opening-shrub	4	4	12	20
Column total	14	12	15	41

**Table 3C.** Accuracy assessments for the classification results in Table 3A and Table 3B.
Image #	Overall Accuracy	$\hat{K}$	σ_k²
1	78.58%	0.674	0.00896
2	65.85%	0.481	0.0128
Z-score = 5.55

**Table 4A.** Another sample of classification results of the corrected image (image 1) for Nicolet vegetation cover.
	Reference points
Cover class	Wetland-shrub swamp	Wetland-sedge meadow	Upland opening-shrub	Row Total
Wetland-shrub swamp	10	1	4	15
Wetland-sedge meadow	0	9	1	10
Upland opening-shrub	0	0	6	6
Column total	10	10	11	31

**Table 4B.** Another sample of classification results in the error matrix of the uncorrected image (image 2) for Nicolet vegetation cover.
	Reference data
Cover class	Wetland-shrub swamp	Wetland-sedge meadow	Upland opening-shrub	Row Total
Wetland-shrub swamp	9	6	2	17
Wetland-sedge meadow	1	4	1	6
Upland opening-shrub	0	0	8	8
Column total	10	10	11	31

**Table 4C.** Accuracy assessments for the classification results in Table 4A and Table 4B.
Image #	Overall Accuracy	$\hat{K}$	σ_k²
1	80.64%	0.655	0.00825
2	67.74%	0.518	0.0158
Z-score = 5.70

The covariance matrices for the corrected and uncorrected images are given below. In a covariance matrix, the larger the diagonal elements relative to the off-diagonal elements, the better the separability between classes. The diagonal elements of COV(1) and COV(3) for the corrected images generally exceed those of COV(2) and COV(4) for the uncorrected images, which indicates that the classification accuracy was enhanced through the correction scheme.

$$\mathrm{COV}(1) = \begin{pmatrix} 61.6 & -20.8 & -10.6 & -14.2 \\ -20.8 & 18.9 & -4.58 & -7.50 \\ -10.6 & -4.58 & 12.3 & 4.16 \\ -14.2 & -7.50 & 4.16 & 25.0 \end{pmatrix}$$

$$\mathrm{COV}(2) = \begin{pmatrix} 40.3 & -15.8 & -3.00 & -14.0 \\ -15.8 & 12.3 & -0.833 & -7.00 \\ -3.00 & -0.833 & 3.67 & -6.00 \\ -14.0 & -7.00 & -6.00 & 36.0 \end{pmatrix}$$

$$\mathrm{COV}(3) = \begin{pmatrix} 32.3 & -17.3 & -1.33 \\ -17.3 & 9.33 & -1.33 \\ -1.33 & -1.33 & 9.33 \end{pmatrix}$$

$$\mathrm{COV}(4) = \begin{pmatrix} 8.33 & -6.67 & -0.833 \\ -6.67 & 9.33 & -2.33 \\ -0.833 & -2.33 & 2.33 \end{pmatrix}$$
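The diagonal comparison described above can be sketched for the 4 × 4 pair, COV(1) and COV(2). The values are copied from the matrices as printed. Using the trace (the sum of the diagonal) as a one-number summary is an assumption made here for illustration, not a criterion stated in the text, and the helper names are hypothetical.

```python
# Covariance matrices as printed in the text (4 x 4 pair).
COV_1 = [  # corrected image
    [61.6, -20.8, -10.6, -14.2],
    [-20.8, 18.9, -4.58, -7.50],
    [-10.6, -4.58, 12.3, 4.16],
    [-14.2, -7.50, 4.16, 25.0],
]
COV_2 = [  # uncorrected image
    [40.3, -15.8, -3.00, -14.0],
    [-15.8, 12.3, -0.833, -7.00],
    [-3.00, -0.833, 3.67, -6.00],
    [-14.0, -7.00, -6.00, 36.0],
]


def diagonal(matrix):
    """Return the diagonal elements of a square matrix."""
    return [matrix[i][i] for i in range(len(matrix))]


def trace(matrix):
    """Sum of the diagonal elements (illustrative summary, an assumption)."""
    return sum(diagonal(matrix))


# Compare the diagonals element by element, then by their sums.
for i, (a, b) in enumerate(zip(diagonal(COV_1), diagonal(COV_2)), start=1):
    print(f"element {i}: corrected {a:g} vs uncorrected {b:g}")
print(f"trace: corrected {trace(COV_1):.2f} vs uncorrected {trace(COV_2):.2f}")
```

The element-wise printout makes it easy to see which diagonal entries drive the difference between the two images.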

4. Discussion and Conclusion

In comparing the corrected and uncorrected images, we conclude that the correction method substantially enhances image qualities such as color and contrast, which can be identified visually in Figure 4 and Figure 6. The amount of scattering increases with the solar zenith angle, as shown in Figure 1, Figure 2, and Figure 3. Therefore, we conclude that the hypothesis is verified. The spectral class separability of the classification results was evaluated through the confusion matrix (error matrix) approach. The statistical criteria derived from the confusion matrix as measures of separability between the classes are the overall accuracy, $\hat{K}$ (Kappa), and the z-score. The overall accuracy of the classification is assessed with reference to the ground truth data. For the Milwaukee urban areas (highways and roads), the overall accuracy of the corrected image (74.57% for image 1) is higher than that of the uncorrected image (67.79% for image 2). The value of $\hat{K}$ for the corrected image (0.673) exceeds that of the uncorrected image (0.574). The overall accuracy and Kappa values indicate that the spectral separability between the classes is higher for the corrected image than for the uncorrected image. In addition, the increase in separability between the vegetation classes was confirmed by plotting the root mean square error (RMSE) of the neural network classification for the corrected and uncorrected images. The corrected images showed a declining RMSE, which indicates higher separability between the vegetation classes than in the uncorrected images, as shown in Figure 7 and Figure 8.

Since the Z-score for the two images exceeds 1.96 (see Section 3.4) and the overall accuracy of the corrected image is higher than that of the uncorrected image, the classification accuracy of the corrected image is substantially improved over that of the uncorrected image. For the Nicolet vegetation cover, the overall accuracy of the corrected image (78.58% for image 1) exceeds that of the uncorrected image (65.85% for image 2). The value of $\hat{K}$ for the corrected image (0.674 for image 1) exceeds that of the uncorrected image (0.481 for image 2), which indicates a higher spectral separability. Since the Z-score for the two images (5.55) exceeds 1.96, the classification accuracy of the corrected image is substantially improved over that of the uncorrected image.

The vegetation classes show less improvement (up to 13%) than the brighter pixels of the road classes (up to 18%). This is because the mean values of the vegetation classes vary spectrally with atmospheric factors as well as with other factors, such as environmental ones. Therefore, we conclude that the classification of images in urban areas achieves better accuracy than in vegetation areas.
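The improvement figures quoted above can be traced to the overall accuracies reported in Table 1C, Table 2C, Table 3C, and Table 4C. The short sketch below computes the gain for each sample in percentage points; the dictionary labels and the `gain` helper are hypothetical names introduced for illustration.

```python
# Overall accuracies (%) as reported in Tables 1C-4C: (corrected, uncorrected).
REPORTED = {
    "Milwaukee roads, sample 1 (Table 1C)": (74.57, 67.79),
    "Milwaukee roads, sample 2 (Table 2C)": (92.73, 74.55),
    "Nicolet vegetation, sample 1 (Table 3C)": (78.58, 65.85),
    "Nicolet vegetation, sample 2 (Table 4C)": (80.64, 67.74),
}


def gain(corrected, uncorrected):
    """Improvement in overall accuracy, in percentage points."""
    return round(corrected - uncorrected, 2)


gains = {name: gain(*pair) for name, pair in REPORTED.items()}
for name, value in gains.items():
    print(f"{name}: +{value:.2f} points")
```

The largest gain is about 18 points for the road classes (Table 2C) and about 13 points for the vegetation classes (Table 4C), which corresponds to the "up to" figures in the text.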

Another approach for evaluating spectral separability is the use of the covariance matrix. Sample covariance matrices of the classification results were introduced in equations 25–28 to show the separability of the data before and after the correction scheme. For example, the values along the diagonal of the covariance matrix for the corrected image generally exceed those for the uncorrected image. Therefore, the covariance matrix shows that the data are more separable for the corrected image than for the uncorrected image, for both the Milwaukee urban areas and the Nicolet vegetation cover. In addition, the mean values of the randomly selected training sets are more separable for the corrected image than for the uncorrected image, as shown in Figure 7 and Figure 8.

A large number of samples of different vegetation classes across the scene, from images of different dates, have been tested using the classification accuracy improvement criteria. All the data tested show a significant improvement in overall accuracy, 13%–18%, using the correction of atmospheric noise. In this paper, we showed results for samples of nine road classes in urban areas and three vegetation classes, for different images on different dates. The results are consistent when the same vegetation classes are used for a different image on a different date, with an improvement of up to approximately 13%.

In this paper, we incorporated GPS reference data in the training set development and in comparing the classification results because they are more reliable than conventional training sets, which are usually developed by visual human recognition of spectral information within the image. The aerosol model dealt with atmospheric variability across the scene in terms of solar zenith angle, aerosol optical properties, and size distribution. Large scenes in the northern hemisphere (Nicolet) were used, with computed solar zenith angles exceeding 60°; the optical properties of aerosols and the size distribution were simulated in the aerosol model using iterations. The scattering effect was computed for each pixel, and the images were corrected accordingly, which improved the classification accuracy considerably (by 13–18%). This method shows considerable enhancement of classification accuracy compared with methods that subtract a constant representing a dark object pixel value, or that extract aerosol optical properties from a single dark object across the scene, in the process of atmospheric correction [4].

Acknowledgments

We acknowledge NASA for the MUI grant award (NASA grant #SOS 2003) and for support during the adaptation of this research investigation. Special thanks go to Dr. Philip Sakimoto (former grant officer) and Dr. Larry Cooper (grant officer) at the Space Science Education and Public Outreach Program at NASA Headquarters.

References and Notes

1. Lillesand, T.; Kiefer, R. Remote Sensing and Image Interpretation, 3rd ed.; John Wiley & Sons: New York, NY, USA, 1994.
2. Duggin, M.J.; Robinove, C.J. Assumptions Implicit in Remote Sensing Data Acquisition and Analysis. Int. J. Rem. Sens. 1990, 11, 1669–1694.
3. Verstraete, W.; Vandevivere, P. New and Broader Applications of Anaerobic Digestion. Crit. Rev. Environ. Sci. Tech. 1999, 29, 151–173.
4. Song, C.; Woodcock, E.; Seto, K.; Lenney, M.; Macomber, S. Classification and Change detection using TM data: When and How to Correct Atmospheric Effects. Rem. Sens. Environ. 2001, 75, 230–244.
5. Potter, J.F. Haze and Sun Effects on Automatic Classification of Satellite Data-Simulation and Correction, in Scanners and Imagery Systems for Earth Observation. Proc. SPIE 1974, 51, 73–83.
6. Fraser, R.S.; Baheth, O.P.; Al-Abbas, A.H. The Effect of the Atmosphere on the Classification of Satellite Observation to Identify Surface Features. Rem. Sens. Environ. 1977, 6, 229–249.
7. Ohtant, Y.A.; Kusaka, T.; Ueno, S. Classification Accuracy for MOS-1 MESSR Data Before and after the Atmospheric Correction. IEEE Trans. Geosci. Rem. Sens. 1990, 28, 755–760.
8. Karathanassi, V.; Andronis, V.; Rokos, D. The Radiative Impact of Aerosols Emanating from Biomass Burning, through the Minnaert Constant. Int. J. Rem. Sens. 2003, 24, 5135–5146.
9. Haan, J.F.; Hoventer, J.W.; Kokke, J.M.; Stokkom, H.T.C. Removal of Atmospheric Influences on Satellite-Borne Imagery: A Radiative Transfer Approach. Rem. Sens. Environ. 1991, 37, 1–21.
10. Singh, S.M. Lowest Order Correction for Solar Zenith Angle to Global Vegetation Index (GVI) Data. Int. J. Rem. Sens. 1988, 9, 237–248.
11. Kaufman, Y.J.; Sendra, C. Algorithm for Automatic Atmospheric Corrections to Visible and Near-IR Satellite Imagery. Int. J. Rem. Sens. 1988, 9, 1357–1381.
12. Crane, A.I. An Example Based, Post Processing Classification Images with Neural Networks. Master Thesis, Univ. Wisconsin-Madison, Madison, WI, USA, 1992.
13. Bolstad, P.V.; Lillesand, T.M. Automated GIS Integration in Land Cover Classification. ASPRS/ASCM 1991, 3, 23–32.
14. Milovich, J.A.; Gagliardini, D.A. Environmental Contribution to the Atmospheric Correction for Landsat-MSS Images. Int. J. Rem. Sens. 1995, 16, 83–87.
15. Kaufman, Y.J. Aerosol Optical Thickness and Atmospheric Path Radiance. J. Geophys. Res. 1993, 98, 2677–2692.
16. Fraser, R.S. Computed Atmospheric Corrections for Satellite data. Proc. SPIE 1974, 51, 64–72.
17. Hadjimitsis, D.G.; Clayton, C.R.I.; Hope, V.S. An Assessment of the Effectiveness of Atmospheric Correction Algorithms through the Remote Sensing of Some Reservoirs. Int. J. Rem. Sens. 2004, 25, 3651–3674.
18. Bohren, C.; Huffman, D. Absorption and Scattering of Light by Small Particles, 2nd ed.; Wiley Interscience: New York, NY, USA, 1983.
19. Coulson, K. Polarization and Intensity of Light in the Atmosphere; Deepak Publishing: Hampton, VA, USA, 1988.
20. Elmahboub, W.M.; Scarpace, F.; Smith, B. An Integrated Methodology to Improve Classification Accuracy of Remote Sensing Data. In IEEE International Geosciences and Remote Sensing Symposium Proceedings, Toulouse, France, July 21-25, 2003; IV, pp. 2161–2163.
21. Sader, S.A.; Douglas, A.; Liou, W. Accuracy of Landsat-TM and GIS Rule-Based Method for Forest Wetland Classification in Maine. Rem. Sens. Environ. 1995, 53, 133–144.
22. Running, S.; Thomas, R.L.; Loveland, L.; Pierece, R.; Nemani, R.; Hunt, E.R. A Remote Sensing Based Vegetation Classification Logic for Global Land Cover Analysis. Rem. Sens. Environ. 1995, 51, 39–48.
23. McClatchey, R.A.; Penn, R.W.; Selby, J.E.A.; Volz, F.E.; Garing, J.S. Optical Properties of the Atmosphere, 3rd ed.; Rep. No. AFCRL-72-8497; Air Force Cambridge Research Laboratories: Cambridge, MA, USA, 1972.
24. Smith, W.L.; Revecomb, H.E.; Laporte, D.D.; Bujis, H.; Murcary, D.G.; Murcary, F.J.; Sormovsky, L.A. High-Altitude Aircraft Measurements of Upwelling IR Radiance- Prelude to FTIR from Geosynchronous Satellite. Mikrochim. Acta 1988, 2, 421–427.
25. USGS. Landsat Data Users Handbook; Geological Survey: Washington, DC, USA, 1979.
26. Shabanov, N.V.; Lo, K.; Gopal, S.; Myneni, R.B. Subpixel Burn Detection in Moderate Resolution Imaging Spectroradiometer 500-m Data with ARTMAP Neural Networks. J. Geophys. Res. 2005, 110, D03111:1–D03111:17.

© 2009 by the authors; licensee Molecular Diversity Preservation International, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
