The accuracy of object detection, segmentation, and matching in images is directly influenced by the image processing workflow and the selection of processing methods. This section starts by showcasing the workflow for acquiring object motion information. It then delves into the relationship between object motion patterns and earthquake intensity. Finally, two examples are provided to demonstrate the entire intensity assessment process.
4.1. Obtaining Information on Object Movement
The current study designated areas such as the ground, walls, and ceiling as the background, while considering all other objects as the foreground. By analyzing the movement patterns of the foreground objects, the earthquake intensity was calculated. The detailed steps are visualized in Figure 9.
After the video covering the seismic period was retrieved from the monitoring system, the frame preceding the earthquake was extracted as the pre-earthquake image and the final frame was taken as the post-earthquake image, so that comprehensive pre- and post-earthquake information was available. Brightness compensation was performed using adaptive histogram equalization. Bilateral filtering and local saliency detection were applied to blur the background and enhance the foreground. To select an appropriate image segmentation and matching algorithm, the image power spectral density was employed as a measure of image complexity. The images were categorized as low-complexity or high-complexity, with an overlapping range between the two treated as mixed-complexity.
In Equation (7), PSD represents the image power spectral density, H(u, v) is the frequency-domain filter function, I is the input signal, and F and F⁻¹ represent the two-dimensional fast Fourier transform (FFT) and its inverse (IFFT), respectively. H and W represent the height and width of the image, respectively. In this study, we selected 109 sets of earthquake images from Section 3.2. The calculated power spectral density indicated a low-complexity image when it fell in (0, 1540), a high-complexity image when it fell in (1950, 64,516), and an image in the overlapping region when it fell in [1540, 1950].
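The three ranges above can be expressed as a small classification rule. The following is a minimal sketch that assumes the power spectral density has already been computed according to Equation (7); the function name and the string labels are hypothetical and were chosen only for illustration.

```python
# Range boundaries taken from the text: (0, 1540) low, [1540, 1950] overlap,
# (1950, 64516) high. Names and labels below are illustrative assumptions.
LOW_UPPER = 1540.0
HIGH_LOWER = 1950.0


def classify_complexity(psd):
    """Map a power spectral density value to a complexity class."""
    if psd < LOW_UPPER:
        return "low"
    if psd > HIGH_LOWER:
        return "high"
    return "mixed"  # overlapping interval [1540, 1950]
```

For instance, a value of 1430 falls in the low-complexity range, while both boundary values 1540 and 1950 fall in the overlapping range.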
4.2. Low-Complexity Image Processing
Images with a power spectral density of less than 1540 are low-complexity images. They usually show scenes such as bedrooms, living rooms, offices, and corridors, and their texture distribution is not strongly concentrated. An image segmentation algorithm based on edge information can be used to segment such images [29], and an image matching algorithm based on stable feature regions can detect the state of moving objects. Therefore, edge-based image segmentation was combined with the information obtained from stable feature points to judge the movement of objects.
4.2.1. Adaptive Multi-Scale Canny Edge Detection
To obtain information on the movement of the objects in the image, the foreground objects had to be segmented accurately. The adaptive multi-scale Canny algorithm is stable and can detect most object edges. By running the detection at different scales, it can also pick up relatively blurred edges that are otherwise difficult to detect. The adaptive Canny algorithm works as follows. First, Gaussian filtering is applied to the image, which replaces each pixel with a weighted average of its neighborhood and reduces the impact of noise. A Sobel operator is then used to calculate the gradient magnitude and direction of each pixel in the filtered image. Non-maximum suppression is performed along the gradient direction, so that only the pixel with the largest value in that direction is retained and the other pixels are suppressed. A double threshold is set, and the pixels are divided into three categories: strong edge, weak edge, and non-edge. Strong edges are marked as true edges, weak edges are kept as candidate edges, and non-edge pixels are excluded. A scale pyramid is constructed, adaptive Canny edge detection is performed on the images at the different scales, and the blurred edges picked up in this way are also recorded as weak edges. Finally, the connectivity between weak and strong edges is analyzed, and weak edges connected to a strong edge are linked to it to form the complete edge.
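The double-threshold and edge-connection steps described above can be sketched as follows. This is an illustrative sketch only: it assumes that a gradient magnitude image is already available after non-maximum suppression, it does not reproduce the scale pyramid, and the function name and threshold values are hypothetical.

```python
from collections import deque

import numpy as np


def link_edges(magnitude, low, high):
    """Keep strong edges plus the weak edges that are 8-connected to them."""
    mag = np.asarray(magnitude, dtype=float)
    strong = mag >= high
    weak = (mag >= low) & ~strong
    edges = strong.copy()
    rows, cols = mag.shape
    # Grow outward from every strong pixel through neighbouring weak pixels.
    queue = deque(zip(*np.nonzero(strong)))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and weak[nr, nc] and not edges[nr, nc]:
                    edges[nr, nc] = True
                    queue.append((nr, nc))
    return edges
```

Weak pixels that never become connected to a strong pixel stay unmarked, which corresponds to discarding isolated candidate edges.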
4.2.2. Region-Adaptive Median Filtering
After edge segmentation with the adaptive multi-scale Canny algorithm, the image contains a large amount of salt-and-pepper noise. This noise appears as multiple isolated white points in the image and can be eliminated with the region-adaptive median filtering algorithm. The algorithm first divides the image into nine regions of the same size and calculates the edge density of each region. Regions with denser edges contain more, and more densely packed, salt-and-pepper noise. Therefore, a smaller filter kernel is used for regions with a higher edge density, and a larger filter kernel is used for regions with a lower edge density. Within each filter window, the pixels are sorted, the median gray value is determined, and the filtered pixel is replaced by that median, which removes the abrupt white noise points.
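The region-adaptive filtering described above might be sketched as follows. This is a minimal sketch under stated assumptions: the density split of 0.2 and the kernel sizes of 3 and 5 are hypothetical values that are not given in the text, each region is filtered independently with replicated borders, and the standard median filter (which replaces the center pixel of each window) is used.

```python
import numpy as np


def median_filter(img, k):
    """Plain k x k median filter (k odd) with edge-replicated borders."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            out[r, c] = np.median(padded[r:r + k, c:c + k])
    return out


def region_adaptive_median(edge_img, density_split=0.2, small_k=3, large_k=5):
    """Filter each cell of a 3 x 3 grid with a kernel chosen from its edge density."""
    img = np.asarray(edge_img)
    out = img.copy()
    row_bounds = np.linspace(0, img.shape[0], 4).astype(int)
    col_bounds = np.linspace(0, img.shape[1], 4).astype(int)
    for i in range(3):
        for j in range(3):
            r0, r1 = row_bounds[i], row_bounds[i + 1]
            c0, c1 = col_bounds[j], col_bounds[j + 1]
            cell = img[r0:r1, c0:c1]
            if cell.size == 0:
                continue
            # Fraction of non-zero (edge) pixels in this region.
            density = np.count_nonzero(cell) / cell.size
            # Dense regions get the small kernel, sparse regions the large one.
            k = small_k if density >= density_split else large_k
            out[r0:r1, c0:c1] = median_filter(cell, k)
    return out
```

Filtering the regions independently introduces small border effects along the region boundaries; a fuller implementation would pad each region with its true neighbors.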
4.2.3. Morphology Operation
After adaptive median filtering, the image still contained a large number of broken edges and some abnormal holes, which would affect the accuracy of subsequent object detection. In morphological processing, the edges are first dilated to connect broken edges and fill some of the holes, and then thinned by erosion so that the dilation does not change the size of the objects. However, using a single dilation and erosion kernel causes problems. A kernel that is too large makes some small objects disappear after dilation, and a kernel that is too small cannot connect large edge breaks. To solve this problem, nested multi-kernel processing was used: dilation and erosion were applied step by step with filter kernels ordered from small to large, and the result of every step was retained. The image difference and the image sum were then computed for the images processed with the large and small kernels, so as to obtain the most suitable processed image and preserve the integrity of the main features. Dilation and erosion both slide a filter kernel over the image; dilation grows each pixel region outward by the kernel size, and erosion shrinks it inward by the kernel size.
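The trade-off between small and large kernels can be illustrated with a single dilation followed by erosion (a morphological closing). The sketch below assumes square kernels and binary edge images; it does not reproduce the nested multi-kernel combination, and all function names are hypothetical.

```python
import numpy as np


def dilate(mask, k):
    """Binary dilation with a k x k square kernel (k odd)."""
    pad = k // 2
    h, w = mask.shape
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dr in range(k):
        for dc in range(k):
            out |= padded[dr:dr + h, dc:dc + w]
    return out


def erode(mask, k):
    """Binary erosion with a k x k square kernel (k odd)."""
    pad = k // 2
    h, w = mask.shape
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    out = np.ones_like(mask)
    for dr in range(k):
        for dc in range(k):
            out &= padded[dr:dr + h, dc:dc + w]
    return out


def close_edges(mask, k):
    """Dilate then erode; a margin of k pixels avoids border artefacts."""
    padded = np.pad(np.asarray(mask, dtype=bool), k,
                    mode="constant", constant_values=False)
    closed = erode(dilate(padded, k), k)
    return closed[k:-k, k:-k]
```

With two edge segments separated by a three-pixel break, a 3 x 3 kernel leaves the break open, whereas a 5 x 5 kernel connects the segments without lengthening them.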
After morphological processing, the threshold was set. The stable feature region detection algorithm applied in Section 2 found that the minimum object length was 17 pixels and the longest line width was 13 pixels. Adaptive threshold binarization was used to detect the connection between the current pixel and the pixels in its eight-neighborhood. The equivalent labeling algorithm was used to separate the different connected domains and to extract their areas and center point coordinates. When matching objects, the candidate with the most similar area was given priority; if the areas were similar, the distance between the feature points was also taken into account. The number of connected domains, i.e., the number of objects identified, was counted. For each connected domain, the number of pixels was calculated to approximate the size of the object, and the type of the object was determined from its size and the depth-of-field area in which it was located. For each connected domain, the coordinates of the center point were calculated to match the objects in the images taken before and after the earthquake. The displacement of the center point between the two matched objects is the displacement of the object. In this way, the size and number of moving and non-moving objects and the displacement of moving objects were obtained from the connected domain labeling information [30].
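The connected domain labeling and center point matching described above can be sketched as follows. This sketch assumes binary images given as nested lists and uses a breadth-first search rather than the equivalent labeling algorithm; the matching is a simple greedy rule (closest area first, then nearest center point) and does not enforce one-to-one assignment. All function names are hypothetical.

```python
import math
from collections import deque


def label_components(binary):
    """Return (area, (row_center, col_center)) for each 8-connected region."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            area = len(pixels)
            center = (sum(p[0] for p in pixels) / area,
                      sum(p[1] for p in pixels) / area)
            components.append((area, center))
    return components


def displacements(before, after):
    """Match each pre-earthquake region to a post-earthquake region."""
    result = []
    for area_b, (yb, xb) in before:
        best = min(after, key=lambda comp: (abs(comp[0] - area_b),
                                            math.hypot(comp[1][0] - yb,
                                                       comp[1][1] - xb)))
        result.append(math.hypot(best[1][0] - yb, best[1][1] - xb))
    return result
```

The length of the list returned by `label_components` is the number of identified objects, and each area approximates the size of the corresponding object in pixels.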
4.3. High-Complexity Image Processing
Images with a power spectral density of greater than 1950 are high-complexity images, which are mostly taken in shopping malls, kitchens, and similar scenes. Such images typically contain a large number of shelves holding many small objects of similar size and appearance. The edge information is extremely complex, and the feature information is difficult to obtain. The analysis of this kind of image focuses on the movement of objects on the shelves at low intensity and the fall of objects at high intensity. First, cluster segmentation based on edge information [31] was performed on the images taken before and after the earthquake, and the images were divided into the background, shelves, and fallen objects. When no fallen object was present, image differencing was used to detect whether any object had moved.
The key principle of edge-density-based clustering segmentation is to use the edge density to quantify how edge pixels are distributed and to group pixels with similar edge densities into one class by clustering, which gathers the edge pixels together and forms an effective segmentation region.
The analysis of a high-complexity image mainly detects whether objects fell between the images taken before and after the earthquake, so processing follows one of two paths, depending on whether objects fell. First, the Canny algorithm is used to detect the edges of the image, and for each pixel of the edge image the number of edge pixels in its eight-neighborhood is counted, which yields an edge density image. Each pixel of this image represents the edge density around it. The edge density image is segmented by K-means clustering, and the whole image is divided into three categories according to the edge density. Shelf areas have a high edge density, background areas have zero edge density, and fallen objects have a low edge density. Binarization was performed on the image after clustering segmentation with a threshold gray value of 57; the background was set to 255, and the fallen-object and shelf areas were set to 0. When the images taken before and after the earthquake were differenced, only fallen objects remained in the difference image, but the result contained considerable noise and many holes. In this case, hole filling and morphological erosion and dilation are used to remove the noise and fill the holes, and the number and area of fallen objects are then obtained by labeling the connected domains of the image. The number and size of fallen objects in the high-complexity image were thus obtained. If no fallen object was detected in the difference image, cluster segmentation was not performed for this set of images, and the images taken before and after the earthquake were differenced directly after local saliency detection. In this case, only slight movement of objects occurred on the shelf, and the change in brightness was small. Therefore, the difference image was binarized with a threshold gray value of 25; the moving objects were set to 0 and all other areas to 255. The binarized image was then eroded. Finally, the number and size of the moving objects alone were obtained by connected domain labeling and area screening.
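The edge density image and its three-class clustering can be sketched as follows. This is a minimal sketch that assumes a binary edge map is already available (the study obtains it with the Canny algorithm) and uses a small one-dimensional K-means whose centers are initialized evenly between the minimum and maximum density; that initialization, and the function names, are assumptions made for illustration.

```python
import numpy as np


def edge_density(edges):
    """Count the edge pixels among the eight neighbours of every pixel."""
    e = np.pad(np.asarray(edges, dtype=int), 1)  # zero border
    h, w = e.shape[0] - 2, e.shape[1] - 2
    total = np.zeros((h, w), dtype=int)
    for dr in range(3):
        for dc in range(3):
            if (dr, dc) != (1, 1):  # skip the pixel itself
                total += e[dr:dr + h, dc:dc + w]
    return total


def kmeans_1d(values, k=3, iters=20):
    """Cluster scalar values; label 0 has the lowest center, k - 1 the highest."""
    v = np.asarray(values, dtype=float).ravel()
    centers = np.linspace(v.min(), v.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(v[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = v[labels == j].mean()
    # Final assignment against the updated centers.
    labels = np.argmin(np.abs(v[:, None] - centers[None, :]), axis=1)
    return labels, centers
```

Under these assumptions, reshaping the labels returned by `kmeans_1d(edge_density(edges))` to the image shape gives a map in which label 0 corresponds to the background, label 1 to fallen objects, and label 2 to shelf areas.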
After the connected domains were labeled, some of them had a very small area; these were residual noise regions that the earlier processing had not removed completely. Therefore, connected domains with a small area were screened out before the object movement statistics were calculated, which removed most of the noise.
Images with a power spectral density of greater than 1540 and less than 1950 are mixed-complexity images, which are mostly taken in utility rooms, factories, kitchens, and similar scenes. Local areas of such images are complex, while the remaining areas are relatively simple. For such images, region segmentation based on the power spectral density was performed first, and the images were divided into high-complexity regions and low-complexity regions. The movement of objects in each region was then detected with the method corresponding to its complexity, and the number, size, and displacement of objects in the mixed-complexity images were finally obtained.
At this point, the object movement information had been obtained for images of all complexity classes, and the earthquake intensity was evaluated according to this information.
4.4. Earthquake Intensity Assessment
The description of object reactions in the seismic intensity table is shown in Table 2.
In Table 2, object reactions occur at intensity levels 3 to 9, while object movement takes place at intensity levels 5 to 9. The objective of this study was to assess the object movement in indoor environments at intensity levels 5 to 9. The quantitative descriptors used in the seismic intensity scale are defined as follows: (a) “few” refers to below 10%; (b) “some” refers to 10% to 45%; (c) “many” refers to 40% to 70%; (d) “most” refers to 60% to 90%; (e) “nearly all” refers to above 80%.
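Because adjacent descriptor ranges overlap, a given percentage of moved objects can satisfy more than one descriptor. The following sketch lists the descriptors that apply to a count of moved objects; treating the stated range endpoints as inclusive is an assumption, and the function name is hypothetical.

```python
def matching_descriptors(moved, total):
    """Return every quantity descriptor whose range contains the moved share."""
    if total <= 0:
        raise ValueError("total must be positive")
    pct = 100.0 * moved / total
    terms = []
    if pct < 10:
        terms.append("few")          # below 10%
    if 10 <= pct <= 45:
        terms.append("some")         # 10% to 45%
    if 40 <= pct <= 70:
        terms.append("many")         # 40% to 70%
    if 60 <= pct <= 90:
        terms.append("most")         # 60% to 90%
    if pct > 80:
        terms.append("nearly all")   # above 80%
    return terms
```

For example, 5 moved objects out of 12 (about 42%) fall in the overlap between “some” and “many”, so both descriptors are returned.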
Using the seismic intensity scale descriptors and the defined ranges of the quantity terms, an earthquake intensity assessment table based on the percentage of object movement was obtained. In this table, N1, N2, N3, and N4 represent the numbers of moving small objects, light furniture, furniture, and heavy furniture, respectively, and N5, N6, N7, and N8 denote the total numbers of small objects, light furniture, furniture, and heavy furniture, respectively.
By incorporating the object movement information obtained in Section 4.1 into Table 3, the corresponding condition was determined, and the appropriate seismic intensity was assigned. After the initial assessment, the seismic intensity estimate had to be adjusted based on the object displacement and camera shaking. Minor or no camera shaking in the interval between the images captured before and after the seismic event was considered the normal condition. Therefore, the following correction methods were applied in this study:
- (1) For seismic intensities of below 7 degrees, check if there is any vertical displacement of objects exceeding three times their length. If such displacement is detected, the objects are considered to have fallen, and the seismic intensity should be adjusted to 7 degrees.
- (2) If the camera is visibly shaking, increase the seismic intensity by 1 degree.
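The two correction rules can be written as a short function. This is a sketch under stated assumptions: the rules are applied in the listed order, visible camera shaking is supplied as a Boolean flag, and the function and parameter names are hypothetical.

```python
def corrected_intensity(initial, vertical_displacements, lengths, camera_shaking):
    """Apply the fall rule (1) and then the camera shaking rule (2)."""
    intensity = initial
    if intensity < 7:
        # Rule (1): an object counts as fallen if it dropped by more than
        # three times its own length.
        fallen = any(d > 3 * length
                     for d, length in zip(vertical_displacements, lengths))
        if fallen:
            intensity = 7
    if camera_shaking:
        # Rule (2): visible camera shaking raises the intensity by 1 degree.
        intensity += 1
    return intensity
```

With an initial intensity of 6 degrees and one object whose vertical displacement is five times its length, the corrected intensity is 7 degrees.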
4.5. Example of Earthquake Intensity Assessment
In this section, two seismic examples are used to demonstrate the overall process of indoor intensity assessment for low-complexity images and high-complexity images. As shown in Figure 10, the power spectral density of the images taken before and after the earthquake is 1430. The process of acquiring object movement information in low-complexity images showed that some small objects moved, light furniture moved, and objects fell in the images. The initial evaluation of the seismic intensity was 7 degrees.
The final count of connected components was 12, with areas ranging from 110 to 2317 pixels. All the fallen objects were small in size. Therefore, the assessed seismic intensity was 7 degrees, which is consistent with the seismic intensity evaluation results of the team.
Figure 11 illustrates the process of obtaining object movement information in low-complexity images. Some small objects were observed to move, objects fell, and light furniture moved as well. The seismic intensity was assessed to be 7 degrees.
After removing excessively large background areas and abnormally small outliers, the object movement information obtained is presented in Table 4.
By substituting the object movement information from Table 4 into Table 2, it was determined that the object movement information satisfied the condition k7, corresponding to a seismic intensity of 6 degrees. However, upon further investigation, it was identified that there were three small objects with vertical displacements exceeding three times their length, leading to object falls. As a result, the seismic intensity was adjusted to 7 degrees, which concurs with the manually assessed seismic intensity results.