1. Introduction
Production and inspection are two important parts of the tile industry. Although many facilities can now automate tile production, the assessment of tile surface quality is still conducted manually [1]. Defects can occur anywhere on the tile surface, and detecting subtle defects in large-format tile images is difficult, especially on complex textured tiles. Complex textured tiles contain intrusive background information; some of it closely resembles defects, and some defects closely resemble one another, so they are easily confused during defect detection. Therefore, a stable, accurate, and efficient method for surface defect detection is crucial to the development of the tile industry.
Vision-based inspection methods have developed rapidly in recent years, and many scholars have applied deep learning models to industrial defect detection, such as fabric defect detection [2], electrical insulator defect detection [3], and bamboo surface sliver detection [4]. Tile defect detection remains difficult due to the special characteristics of tile datasets, the high resolution of tile images, and the many types and tiny sizes of defects. Many scholars have investigated the problem using machine vision inspection methods. Rahaman et al. [5] use a first-order derivative edge detector (Sobel) for tile defect detection, but the method is applied only to solid-colored tiles and is not tested on complex textured tiles. Karimi et al. [6] examine a variety of defect detection methods, such as histograms, neural networks, morphological techniques, Gabor filters, and wavelet transforms. Each of these methods has its own pros and cons, and none is comprehensive. Hanzaei et al. [7] integrate defect classification using a multi-class support vector machine with defect edge detection using a rotation invariant measure of local variance (RIMLV) operator. This approach detects tile defects more accurately, but only on solid-colored tiles, and has not been extended to complex textured tiles. Deep learning models perform better on defect detection in complex scenes [8], but traditional deep learning models still face unresolved challenges in complex textured tile defect detection [9]. Specifically, they do not easily take into account the connection between regional features, which makes it difficult for the model to extract the differences between regions and exclude interfering information, and it is therefore hard to identify some tile block defects.
This paper summarizes the characteristics of different methods for tile block defect detection, as shown in Table 1. As the table shows, only traditional deep learning models have been applied to complex textured tiles, and they are limited in how they connect regional features. To address this challenge, this paper draws on Gessert et al. [10] to relate tile block defect characteristics to regional features. Then, considering the specificity of the tile dataset, this study applies this method to the attention mechanism, which can effectively extract the regional features of tile block defects. As a result, this paper proposes a new attention module, CPAM, which focuses on different patches. This attention module can be easily added to an end-to-end model and effectively helps the model accomplish tile block defect detection. Tiles with three different backgrounds and four defect types are studied, and a series of experiments is conducted on a dataset built from them. The contributions of this paper can be summarized as follows:
(1) This paper combines the characteristics of tile block defects with the regional representation of patches and coalesces the feature information into simple one-dimensional information, in which each element represents one patch and carries that patch's distinct information, thus associating patches with tile block defects;
(2) Two linear information alignment methods in different spatial directions are used to fully correlate adjacent patches; combining the two establishes stable linear one-dimensional information, enhances its expansion, and reduces the bias caused by specific spatial locations;
(3) The above two methods are combined and applied to the attention mechanism, and a new attention mechanism, CPAM, is proposed to help the model effectively complete tile block defect detection by highlighting the importance of different patches;
(4) CPAM is plugged into several end-to-end models and a series of experiments is conducted on the constructed dataset; the results illustrate the gain that CPAM brings to tile block defect detection. In addition, CPAM is compared with attention mechanisms commonly used in tile block defect detection, and the results demonstrate that the patches extracted by CPAM give a better gain for detecting tile block defects.
The rest of this paper is organized as follows. Section 2 details deep learning and attention mechanisms in the field of tile block defect detection. Section 3 describes the proposed method for tile block defect detection. Section 4 presents the experimental results. Finally, Section 5 discusses the work and Section 6 concludes it.
3. The Proposed Method
In this paper, a new attention module, CPAM, is proposed, which can be easily plugged into a model. The structure of CPAM is shown in Figure 1; it is divided into a pooling part, a convolution part, and a weighting part. First, the input feature mapping F is globally pooled to obtain the compressed feature mapping f. Here, f is the one-dimensional feature map into which the whole input feature map is compressed, X is the tensor of the input feature map, and H and W are the height and width of the feature mapping, respectively. Then, the feature mapping f is pooled into patches, and each weight in the pooled feature mapping represents one pooled patch. Here, p is the square root of the patch value.
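The pooling part can be illustrated with a minimal NumPy sketch. It rests on assumptions that the text does not confirm: the global pooling is taken here as a mean over channels (giving a single-channel H × W map), the patch pooling is taken as average pooling onto a p × p grid, and H and W are assumed divisible by p. The function names `compress_channels` and `pool_into_patches` are hypothetical.

```python
import numpy as np


def compress_channels(x: np.ndarray) -> np.ndarray:
    """Collapse a (C, H, W) feature map into a single (H, W) map.

    Assumption: a mean over channels stands in for the global pooling step.
    """
    return x.mean(axis=0)


def pool_into_patches(f: np.ndarray, p: int) -> np.ndarray:
    """Average-pool an (H, W) map onto a (p, p) grid, one value per patch.

    Assumption: H and W are divisible by p.
    """
    h, w = f.shape
    if h % p or w % p:
        raise ValueError("H and W must be divisible by p in this sketch")
    # Split rows and columns into p blocks each, then average inside each block.
    return f.reshape(p, h // p, p, w // p).mean(axis=(1, 3))


x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)  # toy (C, H, W) input
f = compress_channels(x)              # shape (4, 4)
patches = pool_into_patches(f, p=2)   # shape (2, 2): p * p = 4 patches
```

With p = 2 the toy 4 × 4 map is reduced to four values, one per patch, so each entry of `patches` summarizes one region of the input.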
The information in the feature mapping F carries different spatial location information, and the arrangement of this spatial location information affects the patch correlation. Therefore, three strategies, no two with the same directionality, are set up in this study, as shown in Figure 2. Considering two spatial orientations simultaneously can increase stability, which is discussed in detail in the experimental section. Strategy (c) performs best in the experiments and is therefore chosen: the pooled feature mapping is transposed to obtain a second feature mapping, which gives arrangements of the pooled feature mapping in the X and Y directions. Each of the two feature mappings is joined into a one-dimensional vector, and the two vectors are then concatenated into a single one-dimensional vector.
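The two-direction arrangement can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes that the X direction corresponds to reading the patch grid row by row, that the Y direction corresponds to reading the transposed grid row by row, and that the two vectors are joined end to end. The name `arrange_two_directions` is hypothetical.

```python
import numpy as np


def arrange_two_directions(patches: np.ndarray) -> np.ndarray:
    """Flatten a (p, p) patch grid in two directions and concatenate.

    Assumption: "X direction" means row-by-row order and "Y direction"
    means column-by-column order (row-by-row order of the transpose).
    """
    x_dir = patches.reshape(-1)     # row by row
    y_dir = patches.T.reshape(-1)   # column by column
    return np.concatenate([x_dir, y_dir])


patches = np.array([[1.0, 2.0],
                    [3.0, 4.0]])
v = arrange_two_directions(patches)  # length 2 * p * p = 8
```

In the row-by-row order, patches that are horizontal neighbours sit next to each other; in the column-by-column order, vertical neighbours do. Keeping both orders lets a one-dimensional operation see adjacency in both spatial directions.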
The feature mapping obtained from the convolution part distinguishes the importance of different patches. After the convolution part, this feature mapping needs to be weighted into the input feature map, so it must first be converted into a weightable size. In the weighting part, it is separated into two-dimensional feature mappings, which are resized by a pooling operation and then passed through the sigmoid function to obtain the weights. Finally, the weights are applied to the input feature map to obtain the weighted feature mapping, which is the output of CPAM.
Functionally, the most important role of CPAM is to coalesce feature information into a combination of multiple patches, which yields weights that contain a large amount of information. CPAM then employs two alignment methods with different spatial orientations to construct stable one-dimensional information and uses a one-dimensional convolution to help the model quickly distinguish the importance of different patches. Structurally, CPAM is very simple: it is divided into a pooling part, a convolution part, and a weighting part, and the whole structure contains only one convolutional layer, so it is efficient. Additionally, unlike other attention mechanisms, CPAM does not distinguish between the importance of different channels and does not focus on the importance of each individual spatial position. Ultimately, CPAM combines the characteristics of tile block defects with the regional representation of patches and applies patch processing to the attention mechanism, which helps the model complete tile block defect detection efficiently.
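To make the three parts concrete, the following NumPy sketch chains pooling, a single one-dimensional convolution, and weighting into one forward pass. It is a hedged illustration of the general idea, not the published implementation. Everything the text leaves open is an assumption here: the channel mean and average patch pooling, the fixed (not learned) kernel, the averaging of the two directional maps before the sigmoid, the nearest-neighbour upsampling back to H × W, and the element-wise multiplication. The function name `cpam_like_attention` is hypothetical.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def cpam_like_attention(x: np.ndarray, p: int, kernel: np.ndarray) -> np.ndarray:
    """Patch-level attention sketch for a (C, H, W) feature map.

    Assumptions: H and W are divisible by p; `kernel` is a fixed 1D kernel of
    odd length no longer than 2 * p * p (in a trained model it would be learned).
    """
    c, h, w = x.shape
    if h % p or w % p:
        raise ValueError("H and W must be divisible by p in this sketch")
    if len(kernel) % 2 == 0 or len(kernel) > 2 * p * p:
        raise ValueError("kernel must have odd length <= 2 * p * p")

    # Pooling part: compress channels, then average-pool onto a (p, p) grid.
    f = x.mean(axis=0)
    patches = f.reshape(p, h // p, p, w // p).mean(axis=(1, 3))

    # Two-direction arrangement: row-wise and column-wise orders, concatenated.
    seq = np.concatenate([patches.reshape(-1), patches.T.reshape(-1)])

    # Convolution part: one 1D convolution with zero padding ("same" length).
    # np.convolve flips the kernel; for a symmetric kernel this makes no difference.
    seq = np.convolve(seq, kernel, mode="same")

    # Weighting part: split back into two (p, p) maps, undo the transpose,
    # average them (assumption), and squash to (0, 1) with the sigmoid.
    x_map = seq[: p * p].reshape(p, p)
    y_map = seq[p * p:].reshape(p, p).T
    weights = sigmoid(0.5 * (x_map + y_map))

    # Expand each patch weight over its region and weight every channel.
    weights_full = np.repeat(np.repeat(weights, h // p, axis=0), w // p, axis=1)
    return x * weights_full


x = np.ones((3, 4, 4))
out = cpam_like_attention(x, p=2, kernel=np.array([0.25, 0.5, 0.25]))
```

The output has the same shape as the input, so a module of this kind can be placed between existing layers without changing the surrounding architecture. Because every position inside a patch shares one weight, the attention acts at the level of regions rather than channels or individual pixels.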
5. Discussion
Tile defect detection has developed rapidly in recent years, and vision-based detection methods have become a research hotspot; image processing methods, machine vision methods, and deep learning methods have all achieved good results. Early tile defect detection research focused mainly on solid-colored tiles; as technology advanced, the focus gradually shifted to complex textured tiles, which are harder to inspect. Building on previous research, this paper explores complex textured tile block defect detection and provides an improvement over the traditional deep learning model.
5.1. Discussions of Complex Texture Tile Block Defect Detection
Compared with solid-colored tiles, complex textured tiles have more complex backgrounds and textures, which makes defects in them much more difficult to detect. Many scholars have studied solid-colored tiles [5,6,7], but they have not extended their work to complex textured tiles. For complex textured tile defect detection, Wan et al. [9] provided a deep learning method that is widely applicable to this task. However, they did not consider the correlation between tile block defects and regional characteristics and therefore did not study tile block defects further.
5.2. Discussions of CPAM with Tile Block Defect Detection
Some defects are difficult to detect in complex textured tile block defect detection: block defects that are highly similar to the tile background, such as the pour glaze bad defect, and pairs of defects that are highly similar to each other, such as dirty and impurity defects. Based on previous studies and the applicability of object detection models to complex textured tile defect detection, YOLOv7 is chosen as the base model for this study. To address the difficulty of detecting some block defects, this paper combines tile block defects with regional features and proposes a new attention mechanism, CPAM. CPAM differs from other attention mechanisms in that it does not extract only channel information or only spatial information; instead, it extracts spatial information from different regions to obtain a complete regional feature. The patch information extracted by CPAM contains the regional characteristics of tile block defects, so the tile block defect features are expressed regionally. In addition, CPAM employs two linear information alignment methods with different spatial directions and an efficient method for connecting them. Finally, CPAM establishes the connection between different patches and helps the model accomplish complex textured tile block defect detection by identifying the differences both between defective and non-defective patches and between different defective patches. After CPAM is added to different end-to-end deep learning models, the experimental results demonstrate the gain that CPAM brings to complex textured tile block defect detection, which indicates its strong applicability to this task.
5.3. Discussions of Limitations with CPAM
CPAM has high applicability for complex textured tile block defect detection, but there are some limitations: (1) the way CPAM establishes associations between different patches is simple and efficient, but there is still room for improvement; (2) CPAM has so far been shown to perform well only for block defect detection, and no experiments have been conducted on line defects or defects with more complex shapes; (3) the transferability of CPAM may be limited, and no tests have been conducted on other defect datasets.
6. Conclusions
In this paper, a new attention module, CPAM, is proposed, which associates the regional representation of tile block defects with patches and coalesces the feature information into simple one-dimensional information as a way to represent the information contained in patches. CPAM uses two spatially oriented linear alignment methods to reduce region-specific bias. CPAM helps the model effectively distinguish defective patches from non-defective patches by highlighting the importance of different patches. CPAM can be easily plugged into an end-to-end model, and the extensive experiments conducted in this paper demonstrate that CPAM provides a good gain for tile block defect detection.
In addition, future work will revolve around the following goals: (1) the way CPAM establishes associations between different patches will be further explored to make it more efficient; (2) the characteristics of CPAM are not yet sufficiently matched to those of tile linear defects, and the potential of CPAM on tile linear defects will be further explored; (3) other defects differ considerably from tile block defects, and the applicability of CPAM to other defects will be further verified.