Peer-Review Record

FPGA-Oriented LDPC Decoder for Cyber-Physical Systems

Mathematics 2020, 8(5), 723; https://doi.org/10.3390/math8050723
by Mateusz Kuc *, Wojciech Sułek and Dariusz Kania
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 7 April 2020 / Revised: 28 April 2020 / Accepted: 29 April 2020 / Published: 4 May 2020

Round 1

Reviewer 1 Report

The paper presents a hardware (FPGA) implementation of the irregular QC-LDPC decoder and a few variants of decoding algorithm. 

The paper is nicely written and well organized. Hence, I recommend accepting the paper in its current form, with a slight modification as stated below.

  • The paper needs to address more state-of-the-art approaches in the Introduction section to clarify the scope of this work.

Author Response

Dear Sir/Madam,

Thank you for your comments. We have taken them into consideration and improved the paper. The complete "List of changes" (including changes made in response to the other reviews) is at the end of this letter. We would like to respond to your remarks as follows:

Point 1

“The paper needs to address more state-of-the-art approaches in the Introduction section to clarify the scope of this work.”

The Introduction section has been reorganized and expanded. We have outlined recent developments in the area and referenced a number of related works. The new references are listed below:

  1. J. Chen, A. Dholakia, E. Eleftheriou, M.P.C. Fossorier, X.Y. Hu, Reduced-Complexity Decoding of LDPC Codes, IEEE Transactions on Communications, (August 2005), vol. 53, no. 8, 1288-1299.
  2. K. Zhang, X. Huang, Z. Wang, High-Throughput Layered Decoder Implementation for Quasi-Cyclic LDPC Codes, IEEE Journal on Selected Areas in Communications, (August 2009), vol. 27, no. 6, 985-994.
  3. Z. Wang, Z. Cui, J. Sha, VLSI Design for Low-Density Parity-Check Code Decoding, IEEE Circuits and Systems Magazine, (First Quarter 2011), vol. 11, 52-69.
  4. J. Li, K. Liu, S. Lin, K. Abdel-Ghaffar, Algebraic Quasi-Cyclic LDPC Codes: Construction, Low Error-Floor, Large Girth and a Reduced-Complexity Decoding Scheme, IEEE Transactions on Communications, (August 2014), vol. 62, no. 8, 2626-2637.
  5. P. Hailes, L. Xu, R. G. Maunder, B. M. Al-Hashimi, L. Hanzo, A survey of FPGA-based LDPC decoders, IEEE Communications Surveys & Tutorials, (Second Quarter 2016), vol. 18, no. 2, 1098-1122.
  6. A. Tasdighi, A.H. Banihashemi, M.R. Sadeghi, Efficient Search of Girth-Optimal QC-LDPC Codes, IEEE Transactions on Information Theory, (April 2016), vol. 62, no. 4, 1552-1564.
  7. J. Andrade, N. George, K. Karras, D. Novo, F. Pratas, L. Sousa, P. Ienne, G. Falcao, V. Silva, Design Space Exploration of LDPC Decoders Using High-Level Synthesis, IEEE Access, (August 2017), vol. 5, 14600-14615.
  8. X. Liu, F. Xiong, Z. Wang, S. Liang, Design of Binary LDPC Codes With Parallel Vector Message Passing, IEEE Transactions on Communications, (April 2018), vol. 66, no. 4, 1363-1375.
  9. Y.C. Liao, C. Lin, H.C. Chang, S. Lin, A (21150, 19050) GC-LDPC Decoder for NAND Flash Applications, IEEE Transactions on Circuits and Systems–I: Regular Papers, (March 2019), vol. 66, no. 3, 1219-1230.

The article has been proofread by a native English speaker. This led to a number of minor language changes.

Yours sincerely,

Mateusz Kuc

Wojciech Sułek

Dariusz Kania

Author Response File: Author Response.pdf

Reviewer 2 Report

The literature on QC-LDPC codes is very dense, as is the literature on their FPGA implementations. The article presents a variation on these implementations. It would have been interesting to compare all the key performance characteristics (processing throughput, processing latency, hardware resource requirements, error correction capability, processing energy efficiency, bandwidth efficiency, and flexibility) with those of other known implementations, in order to truly expose all the advantages and disadvantages of this design.
Although the title of the article applies this implementation to CPS (distributed, low resources), this infrastructure is ultimately rather underdeveloped and little discussed in the text. We would have liked to see concrete results from a real experiment.

Example of articles, chosen at random, among the many possible:
A survey of FPGA-based LDPC decoders, IEEE Communications Surveys & Tutorials December 2015, Peter Hailes, Lei Xu, Robert G. Maunder, Bashir M. Al-Hashimi and Lajos Hanzo
Strategies for High-Throughput FPGA-based QC-LDPC Decoder Architecture, Swapnil Mhaske, Hojin Kee, Tai Ly, Ahsan Aziz, Predrag Spasojevic
Memory System Optimization for FPGA Based Implementation of Quasi-Cyclic LDPC Codes Decoders, Xiaoheng Chen, Jingyu Kang, Shu Lin, and Venkatesh Akella

minor corrections:
102 It is known [15] that significantly improved deoding performance => It is known [15] that significantly improved decoding performance
184 LogicModule int the case => LogicModule in the case

Author Response

Dear Sir/Madam,

Thank you for your comments. We have taken them into consideration and improved the paper. The complete "List of changes" (including changes made in response to the other reviews) is at the end of this letter. We would like to respond to your remarks as follows:

Point 1

“The literature on QC-LDPC codes is very dense, as is the literature on their FPGA implementations. The article presents a variation on these implementations. It would have been interesting to compare all the key performance characteristics (processing throughput, processing latency, hardware resource requirements, error correction capability, processing energy efficiency, bandwidth efficiency, and flexibility) with those of other known implementations, in order to truly expose all the advantages and disadvantages of this design.

Although the title of the article applies this implementation to CPS (distributed, low resources), this infrastructure is ultimately rather underdeveloped and little discussed in the text. We would have liked to see concrete results from a real experiment.”

 

The results presented in this article are part of broader research aimed at building an energy-efficient LDPC decoder. This article pays special attention to two aspects, both of which concern orienting the decoder implementation towards the specific logic resources of a LUT-based FPGA.

The original contributions are:

  • technological mapping of the normalization block in the LUT cell,
  • an approach to the implementation of the control system.

The proposed implementation of the normalization block performs the calculations within the LUT block, which offers greater possibilities than known methods while matching the solution more efficiently to a LUT-based FPGA. The distributed form of the control system eliminates clock skew and additionally makes clock gating easy to apply.

For this reason, the experimental part contains results that confirm the benefits of the proposed solutions. There are probably no references directly related to this specific part of our research. Although we can compare the overall decoder resources with reference FPGA implementations of WiMax decoders (another decoder that we implemented), we found these results to be only loosely connected with the main scope of the article.

Below are examples of results obtained for the WiMax standard (R=1/2):

WiMax

Design  Architecture        Algorithm           ALMs   Memory bits
Own     Partially-parallel  Normalized Min-Sum  19259  79528
[1]     Partially-parallel  Normalized Min-Sum  26505  -
[2]     Partially-parallel  Offset Min-Sum      11028  60288
[3]     Partially-parallel  Normalized Min-Sum  40138  76032

[1] Xiao Peng and Satoshi Goto, "Implementation of LDPC decoder for 802.16e," 2009 IEEE 8th International Conference on ASIC, Changsha, Hunan, 2009, pp. 501-504.

[2] K. K. Gunnam, G. S. Choi, W. Wang, E. Kim and M. B. Yeary, "Decoding of Quasi-cyclic LDPC Codes Using an On-the-Fly Computation," 2006 Fortieth Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, 2006, pp. 1192-1199.

[3] Shuangqu Huang, Bo Xiang, Bei Huang, Yun Chen and Xiaoyang Zeng, "A flexible architecture for multi-standard LDPC decoders," 2009 IEEE 8th International Conference on ASIC, Changsha, Hunan, 2009, pp. 493-496.

 

Point 2

“Example of articles, chosen at random, among the many possible:

A survey of FPGA-based LDPC decoders, IEEE Communications Surveys & Tutorials December 2015, Peter Hailes, Lei Xu, Robert G. Maunder, Bashir M. Al-Hashimi and Lajos Hanzo

Strategies for High-Throughput FPGA-based QC-LDPC Decoder Architecture, Swapnil Mhaske, Hojin Kee, Tai Ly, Ahsan Aziz, Predrag Spasojevic

Memory System Optimization for FPGA Based Implementation of Quasi-Cyclic LDPC Codes Decoders, Xiaoheng Chen, Jingyu Kang, Shu Lin, and Venkatesh Akella”

 

The Introduction section has been reorganized and expanded. We have outlined recent developments in the area and referenced a number of related works.

New references:

  1. J. Chen, A. Dholakia, E. Eleftheriou, M.P.C. Fossorier, X.Y. Hu, Reduced-Complexity Decoding of LDPC Codes, IEEE Transactions on Communications, (August 2005), vol. 53, no. 8, 1288-1299.
  2. K. Zhang, X. Huang, Z. Wang, High-Throughput Layered Decoder Implementation for Quasi-Cyclic LDPC Codes, IEEE Journal on Selected Areas in Communications, (August 2009), vol. 27, no. 6, 985-994.
  3. Z. Wang, Z. Cui, J. Sha, VLSI Design for Low-Density Parity-Check Code Decoding, IEEE Circuits and Systems Magazine, (First Quarter 2011), vol. 11, 52-69.
  4. J. Li, K. Liu, S. Lin, K. Abdel-Ghaffar, Algebraic Quasi-Cyclic LDPC Codes: Construction, Low Error-Floor, Large Girth and a Reduced-Complexity Decoding Scheme, IEEE Transactions on Communications, (August 2014), vol. 62, no. 8, 2626-2637.
  5. P. Hailes, L. Xu, R. G. Maunder, B. M. Al-Hashimi, L. Hanzo, A survey of FPGA-based LDPC decoders, IEEE Communications Surveys & Tutorials, (Second Quarter 2016), vol. 18, no. 2, 1098-1122.
  6. A. Tasdighi, A.H. Banihashemi, M.R. Sadeghi, Efficient Search of Girth-Optimal QC-LDPC Codes, IEEE Transactions on Information Theory, (April 2016), vol. 62, no. 4, 1552-1564.
  7. J. Andrade, N. George, K. Karras, D. Novo, F. Pratas, L. Sousa, P. Ienne, G. Falcao, V. Silva, Design Space Exploration of LDPC Decoders Using High-Level Synthesis, IEEE Access, (August 2017), vol. 5, 14600-14615.
  8. X. Liu, F. Xiong, Z. Wang, S. Liang, Design of Binary LDPC Codes With Parallel Vector Message Passing, IEEE Transactions on Communications, (April 2018), vol. 66, no. 4, 1363-1375.
  9. Y.C. Liao, C. Lin, H.C. Chang, S. Lin, A (21150, 19050) GC-LDPC Decoder for NAND Flash Applications, IEEE Transactions on Circuits and Systems–I: Regular Papers, (March 2019), vol. 66, no. 3, 1219-1230.

 

Point 3

“minor corrections:

102 It is known [15] that significantly improved deoding performance => It is known [15] that significantly improved decoding performance

184 LogicModule int the case => LogicModule in the case”

 

Thank you very much. Corrected.

 

Yours sincerely,

 

Mateusz Kuc

Wojciech Sułek

Dariusz Kania

Author Response File: Author Response.pdf

Reviewer 3 Report

In this work, the authors propose a hardware implementation, specifically in FPGA devices, of an irregular decoder for the Quasi-Cyclic (QC-LDPC) subclass of codes. The decoder can be configured to support typical decoding algorithms, including Min-Sum and Normalized Min-Sum. The motivation and the proposed approaches are demonstrated, and the paper is well organized. The main contributions are:

  1. Developing an irregular QC-LDPC decoder implementation for FPGA programmable chips
  2. Proposing a technology mapping approach for NMS algorithms
  3. Presenting the decoder design and implementing the LDPC decoder in LUT-based FPGA devices

However, some concerns should be addressed:

  1. The computed control node messages will be saved to the Rnm memory, and then read by the unit calculating bit node and cyclically shifted to the left by SL module. What is the overhead (e.g., time and space) for the storage, read and shift?
  2. The process from the Qn memory to the Communication Handling Module is not well demonstrated in Fig. 3 according to the explanation in line 126, since there is a conditional relationship on the arrow (when all control equations are met).
  3. Application of another (non-linear) normalization functions can result in not worse, possibly improved decoding correction performance. More discussion or evaluation results would be better to clarify the assumption.
  4. There are some grammar mistakes. (an FPGA devices; and sign according to the equation; control unit activate the next control unit; typical FPGA chip FPGAs contains; …)

Author Response

Dear Sir/Madam,

Thank you for your comments. We have taken them into consideration and improved the paper. The complete "List of changes" (including changes made in response to the other reviews) is at the end of this letter. We would like to respond to your remarks as follows:

Point 1

“The computed control node messages will be saved to the Rnm memory, and then read by the unit calculating bit node and cyclically shifted to the left by SL module. What is the overhead (e.g., time and space) for the storage, read and shift?”

 

Thank you very much for the interesting question. The sizes of the Qnm and Rmn memories depend strictly on the control matrix H and on the bit precision. Each size is the product of the number of non-zero submatrices P, the size of the submatrices P, and the bit precision. For the depicted implementation of the decoder with the matrix shown in Fig. 2, there are 51 non-zero submatrices P of size 64 with 4-bit precision, which gives 51 × 64 × 4 = 13056 memory bits.
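
The memory-size arithmetic above can be checked with a short sketch. The function and parameter names are hypothetical and chosen only for illustration; the three input values are the ones quoted in the response.

```python
def message_memory_bits(nonzero_submatrices: int,
                        submatrix_size: int,
                        bit_precision: int) -> int:
    """Bits needed by one message memory (hypothetical helper).

    Follows the product stated in the response: number of non-zero
    submatrices P, times the size of each submatrix, times the
    number of bits per stored message.
    """
    return nonzero_submatrices * submatrix_size * bit_precision


# Values quoted in the response for the matrix shown in Fig. 2.
bits = message_memory_bits(nonzero_submatrices=51,
                           submatrix_size=64,
                           bit_precision=4)
print(bits)  # 13056
```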

Writing to and reading from memory takes one clock cycle. To achieve this, it was necessary to analyze the memory control system thoroughly so as to ensure that it is activated at the proper time.
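
The cyclic left shift mentioned in the reviewer's question can be pictured with a minimal software sketch. This is not the authors' SL module: the function name, the list representation of a message block, and the example shift value are assumptions made only for illustration.

```python
def cyclic_shift_left(block: list[int], shift: int) -> list[int]:
    """Rotate a block of messages to the left by `shift` positions.

    Elements shifted out on the left re-enter on the right, so the
    block length is preserved.
    """
    if not block:
        return []
    shift %= len(block)  # a shift by the block length is the identity
    return block[shift:] + block[:shift]


# Illustrative block of 8 message values (the response quotes a
# submatrix size of 64; 8 is used here only to keep the output short).
block = [10, 11, 12, 13, 14, 15, 16, 17]
print(cyclic_shift_left(block, 3))  # [13, 14, 15, 16, 17, 10, 11, 12]
```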

 

Point 2

"The process from the Qn memory to Communication Handling Module is not well demonstrated in Fig. 3 accordingly to the explanation in line 126, since there exists a conditional relationship on the arrow (when all control equations are met)."

 

Thank you very much for pointing this out. We have introduced appropriate corrections in the text.

Lines 141-144: the description of the operation of the result transmission module has been clarified.

 

Point 3

"Application of another (non-linear) normalization functions can result in not worse, possibly improved decoding correction performance. More discussion or evaluation results would be better to clarify the assumption."

 

During the research, numerous variants of the Karnaugh map were verified. The article presents one of the best Karnaugh maps for the tested control matrix H.

Simulation and hardware verification take 3-4 days. Only the best result is included in the paper.
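
To illustrate what a table-based normalization function looks like, the following minimal sketch tabulates a linear normalization and one non-linear variant. It is not the authors' Karnaugh map: the 3-bit magnitude width, the scaling factor 0.75, the modified table entry, and all function names are assumptions chosen only for the example.

```python
def linear_normalization_table(bits: int = 3, alpha: float = 0.75) -> list[int]:
    """Table of floor(alpha * m) for every magnitude m of the given width."""
    return [int(alpha * m) for m in range(2 ** bits)]


def output_bit_truth_tables(table: list[int], bits: int = 3) -> list[list[int]]:
    """One truth table per output bit (least significant bit first).

    Each inner list gives the value of that output bit for inputs
    0, 1, ..., 2**bits - 1, i.e. the contents of a Karnaugh map
    written out in plain input order.
    """
    return [[(value >> b) & 1 for value in table] for b in range(bits)]


linear = linear_normalization_table()
print(linear)  # [0, 0, 1, 2, 3, 3, 4, 5]

# Hypothetical non-linear variant: identical except that magnitude 1
# is kept at 1 instead of being rounded down to 0.
nonlinear = list(linear)
nonlinear[1] = 1
print(nonlinear)  # [0, 1, 1, 2, 3, 3, 4, 5]

# Only the least significant output bit changes between the two tables.
print(output_bit_truth_tables(linear)[0])     # [0, 0, 1, 0, 1, 1, 0, 1]
print(output_bit_truth_tables(nonlinear)[0])  # [0, 1, 1, 0, 1, 1, 0, 1]
```

Because a lookup table can store any truth table of its inputs, both variants in this sketch would in principle occupy the same kind of logic cell; only the stored contents differ.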

 

Point 4

"There are some grammar mistakes. (an FPGA devices; and sign according to the equation; control unit activate the next control unit; typical FPGA chip FPGAs contains;)"

 

The above expressions have been corrected.

The paper has been proofread by a native English speaker. This led to a number of minor language changes.

 

 

Yours sincerely,

Mateusz Kuc

Wojciech Sułek

Dariusz Kania

 

 

 

List of changes:

  1. The Introduction section has been reorganized and expanded.
  2. Lines 141-144: the description of the operation of the result transmission module has been clarified.
  3. The article has been proofread by a native English speaker. This led to a number of minor language changes not included in this list.
  4. The list of references has been expanded. New references are listed below:
  • J. Chen, A. Dholakia, E. Eleftheriou, M.P.C. Fossorier, X.Y. Hu, Reduced-Complexity Decoding of LDPC Codes, IEEE Transactions on Communications, (August 2005), vol. 53, no. 8, 1288-1299.
  • K. Zhang, X. Huang, Z. Wang, High-Throughput Layered Decoder Implementation for Quasi-Cyclic LDPC Codes, IEEE Journal on Selected Areas in Communications, (August 2009), vol. 27, no. 6, 985-994.
  • Z. Wang, Z. Cui, J. Sha, VLSI Design for Low-Density Parity-Check Code Decoding, IEEE Circuits and Systems Magazine, (First Quarter 2011), vol. 11, 52-69.
  • J. Li, K. Liu, S. Lin, K. Abdel-Ghaffar, Algebraic Quasi-Cyclic LDPC Codes: Construction, Low Error-Floor, Large Girth and a Reduced-Complexity Decoding Scheme, IEEE Transactions on Communications, (August 2014), vol. 62, no. 8, 2626-2637.
  • P. Hailes, L. Xu, R. G. Maunder, B. M. Al-Hashimi, L. Hanzo, A survey of FPGA-based LDPC decoders, IEEE Communications Surveys & Tutorials, (Second Quarter 2016), vol. 18, no. 2, 1098-1122.
  • A. Tasdighi, A.H. Banihashemi, M.R. Sadeghi, Efficient Search of Girth-Optimal QC-LDPC Codes, IEEE Transactions on Information Theory, (April 2016), vol. 62, no. 4, 1552-1564.
  • J. Andrade, N. George, K. Karras, D. Novo, F. Pratas, L. Sousa, P. Ienne, G. Falcao, V. Silva, Design Space Exploration of LDPC Decoders Using High-Level Synthesis, IEEE Access, (August 2017), vol. 5, 14600-14615.
  • X. Liu, F. Xiong, Z. Wang, S. Liang, Design of Binary LDPC Codes With Parallel Vector Message Passing, IEEE Transactions on Communications, (April 2018), vol. 66, no. 4, 1363-1375.
  • Y.C. Liao, C. Lin, H.C. Chang, S. Lin, A (21150, 19050) GC-LDPC Decoder for NAND Flash Applications, IEEE Transactions on Circuits and Systems–I: Regular Papers, (March 2019), vol. 66, no. 3, 1219-1230.

Author Response File: Author Response.pdf
