Article
Peer-Review Record

Recognition of Damage Types of Chinese Gray-Brick Ancient Buildings Based on Machine Learning—Taking the Macau World Heritage Buffer Zone as an Example

Atmosphere 2023, 14(2), 346; https://doi.org/10.3390/atmos14020346
by Xiaohong Yang, Liang Zheng †, Yile Chen †, Jingzhao Feng and Jianyi Zheng *
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 17 January 2023 / Revised: 5 February 2023 / Accepted: 8 February 2023 / Published: 9 February 2023
(This article belongs to the Special Issue Microclimate of the Heritage Buildings)

Round 1

Reviewer 1 Report

The paper discusses a topic worth investigating. Machine learning is used to differentiate between typical damage patterns for a specific type of brick. However, the validation is unclear. The paper does not provide a methodology and quantity to assess the validity of the model. How is it decided that the model performs well? There are no clear examples of the manual evaluation and the model output. The limitations of the model are not well described. It is unclear what guidelines are required for the pictures.

Given that the paper does not offer the code, it becomes very important that the model is validated. Right now the paper only states that there is a model, but it can by no means be concluded that the model performs adequately.

In order to be considered for publication, the paper should be clearer on the methodology, the validation process, and the validation metric, and it should offer examples and discuss the limitations.

Detailed remarks

In the abstract it is stated that the algorithm is used to detect 5 types of damage, whereas throughout the rest of the paper 6 types are mentioned.

Figure 1: it is not customary to just copy and paste screenshots from websites into a journal paper. Please ensure that there is no copyright issue. Furthermore, in Figure 1(c) there is a sliding bar at the bottom. Please edit your graphs appropriately.

1.2 Perhaps it would be interesting to provide some context for people not familiar with traditional Chinese culture, by indicating the centuries that are considered here: era of Qin Bricks, Ming Dynasty.

Please define “green bricks”

Figure 2: please ensure that you have the publication right to the screenshots

Figure 3: please ensure that you have the publication right. It is clear that the 3 pictures show the three categories mentioned in the paragraph above, but no explicit reference is made to the pictures. Please ensure that the caption of the graph clarifies what can be seen on the 3 pictures.

Figure 4: please ensure that you have the publication right to the screenshots. The section seems to indicate that the bricks have a curved top and bottom to allow for a hidden bed mortar. Is this the case? Please state this more clearly. Please clarify what the interior side of the wall consists of. This is relevant to better understand damage mechanisms.

Section 2.2, line 154: bricks “attacked” by plants and microbes?

Caption figure 5 “Image source: drawn by the author”. First of all, how do you “draw” pictures? Secondly, if the authors hold the IP right, I presume it is not required to mention this.

Table 1, bricks with stains: it seems only one type of staining is evaluated. The “stain” in Figure 5 is certainly not caused by the degradation phenomenon described in Table 1. “When the temperature is high, the pollutants in the air will be more easily deposited”. Is it not the other way around? In the morning there are lower temperatures, and due to long-wave radiation the wall surface is subjected to undercooling, in turn leading to surface condensation. At that moment particulate matter in the air adheres to the moist surface. Throughout the day the moisture evaporates, leaving the particulate matter attached to the wall.

Table 2 is of little added value for the scope of the paper.

Section 3.1: please clarify “LOSS value”

Section 3.1: in Table 1 different damage degrees are indicated, but this is not mentioned in the discussion of the labelling. It is unclear whether this has been done for the data. Furthermore, it would be interesting to elaborate a bit more on the pictures. Are pictures all taken from about the same distance to the wall? Are corners also included? All the pictures seem to be very homogeneous: no other materials, no windows, doors, gutters, roof, floor, …

Figure 5, top: based on the single picture, a whole range of different damage issues are identified. But one would assume that this does not align with the level of manual assessment that was done.

Figure 9: it is unclear whether loss has a physical meaning. Is a value of 100 or 5 “good”? How should this be interpreted? It seems that a training set of 50 would suffice. However, even after 5, the loss is very good, although it seems unreasonable to assume that the model would perform reasonably after 5 training pictures (for 5 damage types).

It is unclear what weight files are. Please elaborate, what is 0.5 confidence? Please provide detailed examples of the data labelling to understand how the model is trained. What is Max Epoch?

Section 4.2: first 1000 reference pictures are used. Then 200 are used to train the model. Then the model is tested on a different data set. Why not first test it on the other 800 pictures?

There seems to be no validation: a comparison between observed damage types and the output of the model (for pictures that were not used as training data).

Author Response

 

The paper discusses a topic worth investigating. Machine learning is used to differentiate between typical damage patterns for a specific type of brick. However, the validation is unclear. The paper does not provide a methodology and quantity to assess the validity of the model. How is it decided that the model performs well? There are no clear examples of the manual evaluation and the model output. The limitations of the model are not well described. It is unclear what guidelines are required for the pictures.

Given that the paper does not offer the code, it becomes very important that the model is validated. Right now the paper only states that there is a model, but it can by no means be concluded that the model performs adequately.

In order to be considered for publication, the paper should be clearer on the methodology, the validation process, and the validation metric, and it should offer examples and discuss the limitations.

Thank you very much for your suggestion. The original code of the program cannot be released yet because it is being used in other research. The Supplementary Material (training set for machine learning) for this article can be found online at: https://data.mendeley.com/datasets/rtf5d2v9rm/1 (accessed on February 1, 2023). Model testing and inspection instructions are given in Sections 4.1 and 4.2. They may not have been explained clearly before, and we have now completed this part of the text.

 

Comments: In the abstract it is stated that the algorithm is used to detect 5 types of damage, whereas throughout the rest of the paper 6 types are mentioned.

Response: Thank you very much for your suggestion. There are indeed five types. We have double-checked that this is stated consistently throughout the text.

 

Comments: Figure 1: it is not customary to just copy and paste screenshots from websites into a journal paper. Please ensure that there is no copyright issue. Furthermore, in Figure 1(c) there is a sliding bar at the bottom. Please edit your graphs appropriately.

Response: Thank you very much for your suggestion. Our climate analysis charts have been redrawn with the Ladybug software.

 

Comments: 1.2 Perhaps it would be interesting to provide some context for people not familiar with traditional Chinese culture, by indicating the centuries that are considered here: era of Qin Bricks, Ming Dynasty.

Response: To express this more clearly, we have added the corresponding internationally used years after each Chinese dynasty.

 

Comments: Please define “green bricks”.

Response: Thank you very much for your suggestion. During the period of revising the paper, we contacted professionals for verification and decided to use the expression "Chinese Gray-brick" uniformly. At the same time, explanations are added where necessary in the text.

 

Comments: Figure 2: please ensure that you have the publication right to the screenshots

Response: Thank you very much for your suggestion. We are confident that we do. We drew on some professional vocabulary from the Chinese references, but we have changed the pictures ourselves.

 

Comments: Figure 3: please ensure that you have the publication right. It is clear that the 3 pictures show the three categories mentioned in the paragraph above, but no explicit reference is made to the pictures. Please ensure that the caption of the graph clarifies what can be seen on the 3 pictures.

Response: Yes, our analysis in that part was relatively brief. After completing the full text, we found the description of the picture relatively weak, so we have removed this image.

 

Comments: Figure 4: please ensure that you have the publication right to the screenshots. The section seems to indicate that the bricks have a curved top and bottom to allow for a hidden bed mortar. Is this the case? Please state this more clearly. Please clarify what the interior side of the wall consists of. This is relevant to better understand damage mechanisms.

Response: Yes. Thanks for your advice. We have supplemented it in the article.

 

Comments: Section 2.2, line 154: bricks “attacked” by plants and microbes?

Response: Yes. Because bricks are affected by the local climate, many problems arise, such as groundwater seeping upwards. Over time, some bricks are colonized by organisms such as moss, which further attack the brick surface and cause the bricks to break. Please refer to the picture labeled "P&M" in Figure 5.

 

Comments: Caption figure 5 “Image source: drawn by the author”. First of all, how do you “draw” pictures? Secondly, if the authors hold the IP right, I presume it is not required to mention this.

Response: Thank you very much for your suggestion. What we originally meant was that we took the pictures ourselves and used Photoshop to arrange them into a single figure and add annotations. To avoid ambiguity, we have now refined the wording.

 

Comments: Table 1, bricks with stains: it seems only one type of staining is evaluated. The “stain” in Figure 5 is certainly not caused by the degradation phenomenon described in Table 1. “When the temperature is high, the pollutants in the air will be more easily deposited”. Is it not the other way around? In the morning there are lower temperatures, and due to long-wave radiation the wall surface is subjected to undercooling, in turn leading to surface condensation. At that moment particulate matter in the air adheres to the moist surface. Throughout the day the moisture evaporates, leaving the particulate matter attached to the wall.

Response: Yes. Thanks for your advice. We have supplemented it in the article.

 

Comments: Table 2 is of little added value for the scope of the paper.

Response: Thank you very much for your suggestion. Table 2 lists the repair methods for damaged bricks, which are of great significance for the study of gray-brick damage types. On the one hand, the repair methods help researchers better understand the types of brick damage and distinguish between the damage conditions of gray bricks, so as to better confirm the damage type and the detection target. On the other hand, the repair methods for different kinds of gray-brick damage differ economically: repair costs vary considerably with the damage condition, so the conditions need to be distinguished specifically. Being able to evaluate the repair cost more quickly also reflects the value of setting multiple detection labels in this study.

 

Comments: Section 3.1: please clarify “LOSS value”

Response: Thank you very much for your suggestion. The loss value is an indicator used in machine learning to measure the gap between the predicted results of the model and the actual results, and it can reflect the degree of fitting of the model. Its function is to evaluate the performance of the model, and it can be used to guide the model's training. Through continuous iterative training, the loss value becomes smaller and smaller, so as to achieve the purpose of improving the performance of the model. The corresponding explanation has been supplemented in the article.

 

Comments: Section 3.1: in table 1 different damage degrees are indicated, but this is not mentioned in the discussion of the labelling. It is unclear whether this has been done for the data. Furthermore, it would be interesting to elaborate a bit more on the pictures. Are pictures all taken from about the same distance to the wall? Are corners also included? All the pictures seem to be very homogeneous: no other materials, no windows, doors, gutters, roof, floor, …

Response: Thank you very much for your suggestion. Table 1 evaluates only stain types affected by climate and has been modified with your comments. Data collection is mainly reflected in the researchers' collection, sorting, slicing, and statistics of gray brick photos. In the study, the main part collected is the most easily accessible part of the historic building, that is, the most important external wall envelope. For example, the indoor kitchen and other parts of the roof are not entirely made of gray bricks, so they are not within the scope of this research. The corresponding explanation has been supplemented in the article.

 

Comments: Figure 5, top: based on the single picture, a whole range of different damage issues are identified. But one would assume that this does not align with the level of manual assessment that was done.

Response: Thank you very much for your suggestion. As mentioned above, we collected a large number of photos and assigned labels to 1000 photos; the labels were decided jointly by professional architects and researchers. The large number of samples serves only to improve learning ability and model accuracy during machine learning training. Figure 5 only shows examples of damage types, and it is difficult for us to show more cases (the remaining 900-plus photos) in the text. The photos are as shown in Figure 6. In addition, we have released the training set for machine learning, which can be downloaded. The Supplementary Material (training set for machine learning) for this article can be found online at: https://data.mendeley.com/datasets/rtf5d2v9rm/1 (accessed on February 1, 2023). This is also described in the data availability statement at the end of the article.

 

Comments: Figure 9: it is unclear whether loss has a physical meaning. Is a value of 100 or 5 “good”? How should this be interpreted? It seems that a training set of 50 would suffice. However, even after 5, the loss is very good, although it seems unreasonable to assume that the model would perform reasonably after 5 training pictures (for 5 damage types).

Response: The loss value is an important concept in machine learning. It is the cost incurred by the model during training, and it can loosely be thought of as an error measure. The loss value can be used to measure the quality of the model: the smaller the loss value, the better the model, and the larger the loss value, the worse it is. The loss value is calculated from the difference between the model's predicted value and the actual value, generally using a loss function such as mean squared error (MSE) or cross-entropy. The physical meaning of the loss depends on the context of the training process. In general, lower values are better than higher values, so a value of 5 is better than a value of 100. However, without knowing the specific context, it is difficult to interpret the exact meaning of the loss value.

Figure 9 shows the Loss values at different stages of training. For Figure 10, the weight files corresponding to representative points in Figure 9 (minimum Loss value, Val_Loss value, and maximum number of iterations) were extracted for model testing. It can be seen that the model corresponding to the minimum Loss value (the 138th epoch of model training) gives the best detection results. With only 5 epochs of training, the best detection results would not be achieved.
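As a rough illustration of the two loss functions named in the response, the following sketch computes a mean squared error and a cross-entropy value by hand. It is not the authors' code (which has not been released); the function names and the example probabilities are assumptions chosen only to show why a lower loss means predictions closer to the labels.

```python
import math

def mean_squared_error(predicted, actual):
    """Average squared difference between predictions and targets."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def cross_entropy(predicted_probs, true_index):
    """Negative log of the probability the model assigns to the correct class."""
    return -math.log(predicted_probs[true_index])

# Hypothetical class probabilities for one sample whose true class is index 0.
confident = cross_entropy([0.9, 0.05, 0.05], 0)  # model is fairly sure and right
unsure = cross_entropy([0.4, 0.3, 0.3], 0)       # model is right but hesitant

print(confident < unsure)  # the more confident correct prediction has lower loss
```

A loss value on its own has no fixed scale: it depends on which loss function is used and how it is summed or averaged, which is why an absolute value such as 5 or 100 cannot be read as "good" without that context.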

 

Comments: It is unclear what weight files are. Please elaborate, what is 0.5 confidence? Please provide detailed examples of the data labelling to understand how the model is trained. What is Max Epoch?

Response: A weights file is a set of variables in a machine learning algorithm that describes the parameters of the model (such as the slope in linear regression). It can be calculated by the training algorithm and is the result of model learning. The weight file defines the structure of the model, and each variable represents a parameter of the model, which can be used to control the behavior of the model.

The confidence level of 0.5 refers to the confidence of the machine learning algorithm in a certain result, that is, the algorithm considers the probability of that result to be 0.5, i.e. 50%. The three columns on the right side of Figure 10 show the impact of setting different confidence levels on the detection results of gray-bricks.
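In object detection, a confidence value is commonly used as a cut-off: detections scoring below the threshold are discarded. The sketch below assumes that this is how the 0.5 setting is used here; the detection records, labels, and the `filter_by_confidence` helper are hypothetical and are not taken from the paper.

```python
# Hypothetical detector output: one record per predicted box.
detections = [
    {"label": "crack", "confidence": 0.91},
    {"label": "stain", "confidence": 0.48},
    {"label": "missing", "confidence": 0.50},
]

def filter_by_confidence(items, threshold=0.5):
    """Keep only detections whose confidence reaches the threshold."""
    return [d for d in items if d["confidence"] >= threshold]

kept = filter_by_confidence(detections)
print([d["label"] for d in kept])
```

Raising the threshold leaves fewer but more certain boxes, and lowering it leaves more boxes with more false alarms, which is the trade-off one would expect to see when comparing columns produced at different confidence levels.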

The maximum number of iterations refers to the maximum number of training iterations (epochs) run in one training process. The number of iterations is increased or decreased according to the needs of the experiment (how well the model converges). In this study, the number of iterations is set to 200, so the maximum number of iterations is the 200th epoch (Max Epoch).
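The relationship between the epoch budget and the "best" weight file can be sketched as a loop that runs up to the maximum epoch and remembers the epoch with the lowest loss. The loss curve below is made up for illustration and is not measured data; it is shaped so that its minimum falls at epoch 138 only to mirror the epoch mentioned in the response.

```python
MAX_EPOCH = 200  # epoch budget stated in the response

def toy_validation_loss(epoch):
    """Made-up loss curve (not measured data) with its minimum at epoch 138."""
    return 1.0 + ((epoch - 138) ** 2) / 1000.0

best_epoch, best_loss = None, float("inf")
for epoch in range(1, MAX_EPOCH + 1):
    loss = toy_validation_loss(epoch)
    if loss < best_loss:
        # A real training pipeline would save the weight file at this point.
        best_epoch, best_loss = epoch, loss

print(best_epoch, best_loss)
```

This also shows why the weight file from the final epoch is not necessarily the one with the lowest loss.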

 

Comments: Section 4.2: first 1000 reference pictures are used. Then 200 are used to train the model. Then the model is tested on a different data set. Why not first test it on the other 800 pictures?

Response: Section 4.2 of this study concerns the application of the model, for which unprocessed 6000 × 4000 pixel gray brick photos were selected as test images. In addition, the research materials of this study include 1,000 photos of gray bricks, all of which were used for model training, and the training process was run for 200 epochs.

 

Comments: There seems to be no validation: a comparison between observed damage types and the output of the model (for pictures that were not used as training data).

Response: In the picture output by the model, the type of brick damage is selected with different color label boxes. There are relevant legends on the lower side of Figures 10 to 13. In the picture output by the model, you can observe the type of brick damage according to the color, and judge whether the result of the model detection is reasonable. 

Reviewer 2 Report

1. Line 53: It should be explained what it means that “local climate conditions are more complicated (Figure 1)”. Placing drawings without a word of comment is not enough.

2. Lines 101-112: First sentences from the section Problem statement and Objectives about the monsoon climate should be moved to the end of the Research background section.

3. Line 145: In Figure 3 name of each of three categories of brick walls should be placed under each picture.

4. Line 148: The caption under Figure 4 should be changed – figure 4 shows drawings of different parts of the wall (floor plan, elevation, etc.), it is not an analysis.

5. Line 162: There should be a sentence informing about the content of the Table 1.

6. Line 171: There is a sentence in Table 1 that “If the temperature is too low, the brick wall will be frozen” whereas in Line 162 can be found that in Macau brick walls are not affected by frost. This inconsistency should be removed.

7. Line 171, Table 1: Climatic factors usually do not affect the brick wall separately, but simultaneously. Therefore, it is not the best solution to describe separately the impact of each of them on wall damage. For example, frost itself will not cause cracks and defects in bricks, if the brick is not previously damp. 

8. Line 183, Table 2: What is the difference between repair methods: "Brick repair", "Splicing bricks", "Brick removal" and "Replacing bricks"? While reading the description of each of them, no difference can be seen.

9. Line 212: There are too many photos in Figure 6. 12 to 16 pictures would be enough.

Author Response

1. Line 53: It should be explained what it means that “local climate conditions are more complicated (Figure 1)”. Placing drawings without a word of comment is not enough.

Response: Thank you very much for your suggestion. In our original submission, Section 1.3 mentioned the complex and changeable climate. We have now moved this discussion to Section 1.1 and supplemented it there, so that it corresponds better to our illustrations.

 

2. Comments: Lines 101-112: First sentences from the section Problem statement and Objectives about the monsoon climate should be moved to the end of the Research background section.

Response: Yes, we have made revisions and additions based on your comments.

 

3. Comments: Line 145: In Figure 3 name of each of three categories of brick walls should be placed under each picture.

Response: Yes. We have perfected it.

 

4. Comments: Line 148: The caption under Figure 4 should be changed – figure 4 shows drawings of different parts of the wall (floor plan, elevation, etc.), it is not an analysis.

Response: To avoid ambiguity, we considered deleting this picture.

 

5. Comments: Line 162: There should be a sentence informing about the content of the Table 1.

Response: Yes, we have made revisions and additions based on your comments.

 

6. Comments: Line 171: There is a sentence in Table 1 that “If the temperature is too low, the brick wall will be frozen” whereas in Line 162 can be found that in Macau brick walls are not affected by frost. This inconsistency should be removed.

Response: Thank you very much for your suggestion. We have made further refinements.

 

7. Comments: Line 171, Table 1: Climatic factors usually do not affect the brick wall separately, but simultaneously. Therefore, it is not the best solution to describe separately the impact of each of them on wall damage. For example, frost itself will not cause cracks and defects in bricks, if the brick is not previously damp.

Response: Thank you very much for your suggestion. We have made further refinements.

 

8. Comments: Line 183, Table 2: What is the difference between repair methods: "Brick repair", "Splicing bricks", "Brick removal" and "Replacing bricks"? While reading the description of each of them, no difference can be seen.

Response: Yes, they are actually pretty much the same. There are only slight differences in the actual operation process. To avoid ambiguity, further refinements have been made.

 

9. Comments: Line 212: There are too many photos in Figure 6. 12 to 16 pictures would be enough.

Response: Thank you very much for your suggestion. In fact, although they are all gray bricks, these photos cover our shooting from different building parts. So is it possible to show such a number of photos? However, we have adjusted this picture before inserting it to avoid a bad reading experience.

Reviewer 3 Report

A topic of great interest, which could also be implemented on different building materials. For this reason the method of analysis could be more detailed, so that it can be extended.

I suggest increasing the number of references.

Author Response

Response: Thank you very much for your suggestion and for your affirmation of our research. We have also added corresponding references to the background research and the literature review in the article.

Round 2

Reviewer 1 Report

I thank the authors for considering the remarks, and as well, for publishing the training data.

There is still one issue I want to raise. The main goal is to develop a system to automatically recognize damage types. The paper presents such a model. However, it fails to say anything about the reliability. If there are 1000 pictures, it would make sense to use 700 or 800, and then use the model to classify the remaining pictures. Then at least you could say that the model can identify which type of damage is visible in xx% of the pictures. There could be false positives and false negatives as well (not every piece of wall would necessarily contain one out of five damage types). In the rebuttal the authors elaborate on the LOSS coefficient, but to what extent the model might be applicable in practice remains unclear because the paper fails to provide any input on the reliability. Even for a very poor model the LOSS value can be optimized, but it may still remain a very poor model. 

Hence, I would really urge the authors to try to add some kind of practical assessment of the tool. No need to pretend that a model is 100% correct, models always entail limitations, but now the reader is still really in the dark on interpreting the validity and reliability.
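The hold-out check suggested above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the helper names, the 80/20 fraction, the fixed seed, and the toy labels are all hypothetical.

```python
import random

def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle reproducibly, then cut the list into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(true_labels, predicted_labels):
    """Fraction of test pictures whose predicted label matches the manual label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

photo_ids = range(1000)  # stand-in identifiers for the 1000 labelled photos
train_ids, test_ids = train_test_split(photo_ids)
print(len(train_ids), len(test_ids))

# Toy example of scoring held-out predictions against manual labels.
print(accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"]))
```

The point of the split is that the reported percentage comes only from pictures the model never saw during training.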

Author Response

Dear Reviewer,

Thank you very much for pointing out the problem. We added the section "4.3. Manual Validation of Models". At the same time, the previously released dataset has been updated with the results of testing the machine learning model on 1000 samples (https://data.mendeley.com/datasets/rtf5d2v9rm/2). To verify the reliability of the model, 1000 gray brick samples were used. The inspectors were a cultural relic expert and two architecture scholars. They manually checked the model's test results one by one and found that, among the 1000 test results, there were 857 valid test samples, 136 wrong test samples, and 7 missed test samples (specific numbering details are in Appendix B). The results of this experiment show that the detection effect of the model is good: 85.7% of the samples are detected effectively, but 13.6% of the samples are still detected incorrectly, and 0.7% of the samples are not detected (Figure 13).
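The percentages quoted in this response follow directly from the three counts. A short check, using only the numbers stated above (the dictionary keys are illustrative names):

```python
# Counts reported in the response for the 1000 manually checked results.
counts = {"valid": 857, "wrong": 136, "missed": 7}

total = sum(counts.values())
rates = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total)
print(rates)
```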

For details, please see the updated section in the revised manuscript.
Thank you very much again!
