Built-In Functional Testing of Analog In-Memory Accelerators for Deep Neural Networks
Round 1
Reviewer 1 Report
In this manuscript, the authors present their online built-in functional testing of NVM-based memory accelerators for DNNs, which validates the correct operation of the DNN under test.
I congratulate the authors: the article is well written, and all the doubts and comments I noted while reading were answered in later sections of the manuscript. Since my expertise is not in electronics but rather in ML/DL methods and validation, I appreciate how well the authors explain each step.
I have only one doubt, regarding the architectures chosen. You tested three quite old, simple models, and then AlexNet and ResNet18, which, despite being old, are more complex architectures. Did you choose them in order to validate your approach on more complex systems? Have you considered more recent models? I suppose state-of-the-art models would be extremely complex to validate because they are such large designs.
The classification of color images is a lot more complex than that of binary images, as you mention in the conclusions section. While reading, I doubted whether your approach would keep, on this harder task, the high fault coverage you obtained on binary images, but the numbers remained high. I was wondering whether you kept the full color images or preprocessed them to use gray-scale information. This is just curiosity, not something to modify in the text, since I find your work great.
I would encourage the authors to do a final reread of the text to catch some hard-to-read sentences, but overall the text is well written.
Author Response
Please see attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
-- The introduction should be improved; the authors should present the novelty of their work.
-- More important references should be cited in the related work, and the authors should compare their results with those of other works.
-- The authors must explain why they chose DNN algorithms rather than LSTM, ARIMA, etc.
-- It is not clear what the authors want to achieve using the two datasets, MNIST and FMNIST. I think they should describe these two datasets.
-- Figure 3 should be converted into a table, and more parameters could be compared.
-- The conclusion must be improved and rewritten.
-- English should be improved
Author Response
Please see attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The authors propose a self-testing procedure for deep neural network (DNN) accelerators. They employ an in-hardware DNN of interest for in-memory computing with non-volatile memory. A procedure is also introduced to randomly generate test instances and hence detect a broad range of hardware faults, which enables high coverage of the fault mechanisms. The proposed tools and procedures are demonstrated on DNNs for image classification. The authors use data from MNIST, Fashion-MNIST, and CIFAR-10. The results indeed show high coverage of hardware faults.
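To make the notion of fault coverage with randomly generated tests concrete, the following is a minimal sketch only, not the authors' procedure. It assumes a hypothetical single linear layer, a hypothetical stuck-at-zero fault model on individual weights, and uniformly distributed random test inputs; the names `classify` and `fault_coverage` are illustrative. A fault counts as detected when at least one test yields a different predicted class than the fault-free model.

```python
import random


def classify(weights, x):
    """Return the index of the largest output of a single linear layer."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return scores.index(max(scores))


def fault_coverage(weights, tests):
    """Fraction of single stuck-at-zero weight faults detected by the tests.

    Assumption: a fault is detected if any test changes the predicted class
    relative to the fault-free (golden) response.
    """
    golden = [classify(weights, x) for x in tests]
    faults = [(r, c) for r in range(len(weights)) for c in range(len(weights[0]))]
    detected = 0
    for r, c in faults:
        faulty = [row[:] for row in weights]  # copy, then inject one fault
        faulty[r][c] = 0.0
        if any(classify(faulty, x) != g for x, g in zip(tests, golden)):
            detected += 1
    return detected / len(faults)


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(4)]
tests = [[rng.uniform(0, 1) for _ in range(8)] for _ in range(200)]
coverage = fault_coverage(weights, tests)
print(f"fault coverage: {coverage:.2%}")
```

In this toy setting, faults that never change the predicted class for any generated test remain uncovered, which is why the choice of test distribution matters for the coverage achieved.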
I believe the topic of the paper is of great importance. The authors clearly state the novelties of their work after covering the relevant literature. Notably, obtaining test patterns specific to the targeted model enables the on-demand generation of test patterns, hence leading to lower storage requirements. Furthermore, I believe it is very important that the proposed methods are not dependent on the NVM cell technology.
The paper is clearly written, and the data and results are clearly presented. The authors explain in detail the proposed procedures and models. Test results are properly interpreted in my opinion.
Based on the above, I believe the paper is suitable for publication with minor modifications, which I list below along with a few suggestions.
· Why is it that “the problem of detecting soft errors caused by resistance drift is not considered in this paper”? Can you discuss the associated challenges, and possibly lay out related future work directions?
· At line 251, the authors mention “non-concurrent, online testing”. At line 255, they state “This offline analysis proceeds as follows”. I believe I understand the general procedure: the testing is online, but the BIST controller can schedule it so that it takes place when the system is idle and so on, making it less disruptive and offline-like. However, please consider rephrasing some of the text mentioned above to avoid confusion.
· In Section 5.2 (Testing using Structured Patterns) you describe how structured patterns are obtained. Have you considered, for instance, also varying the thickness of the pattern? In my experience, that might further diversify the patterns and yield even broader fault coverage in some cases. If you agree, you might want to mention it in future work directions (i.e., implementing and testing additional structured pattern generation procedures).
· In Lines 397-399 you state that “Though the ReLU activation function improves DNN performance, there is a downside in that some ReLU neurons may ‘die’ during training and always provide an output of zero for any input from the dataset.” Have you considered using different activation functions? In my experience, for some tasks (especially those involving edges and directional entities) other activation functions can also perform well. If you agree, once again you could suggest that as a future work direction (testing the framework with additional activation functions).
· Can you elaborate more on Lines 410-412 (“We theorize this is because tests ∈ TND sensitize easy to detect faults, whereas the augmentation provided by the structured tests along with TUD improves on this coverage by sensitizing harder to detect faults.”)? Why would the uniform distribution sensitize harder-to-detect faults?
· Style/typos:
o Line 67: please consider spelling out TPG, since it is the first time you mention it in the paper.
o Line 77: please replace “convolution neural networks” with “convolutional neural networks”
o Figure 1 consists of three panels (a, b and c), but no reference to them is in the caption. My suggestion is to add a brief reference to them in the caption, as a more detailed explanation is in the text.
o Figure 3: what are MAX_P1 and MAX_P2? I am assuming they are max pooling layers, but that should be explicitly mentioned somewhere (caption?)
o Figure 4: since fault coverage will be discussed later on, please consider showing just the metrics here, and then showing a fault coverage table/figure closer to where it is actually discussed.
o Line 379: please replace “Initialize status of all faults to uncovered” with “Initialize status of all faults to be uncovered”
o Line 410: please replace “architectures such as AlexNet, We” with “architectures such as AlexNet. We”
o Lines 410-412: please consider replacing “We theorize this is because tests ∈ TND sensitize easy to detect faults, whereas the augmentation provided by the structured tests along with TUD improves on this coverage by sensitizing harder to detect faults” with “We theorize this is because tests ∈ TND sensitize easy-to-detect faults, whereas the augmentation provided by the structured tests along with TUD improves on this coverage by sensitizing harder-to-detect faults”
Author Response
Please see attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
The authors have responded well to all points.
The paper is now well written and can be accepted for publication.