Article
Peer-Review Record

Improving Structured Grid-Based Sparse Matrix-Vector Multiplication and Gauss–Seidel Iteration on GPDSP

Appl. Sci. 2023, 13(15), 8952; https://doi.org/10.3390/app13158952
by Yang Wang 1,2,3, Jie Liu 1,2, Xiaoxiong Zhu 1,2, Qingyang Zhang 1,2, Shengguo Li 1,2 and Qinglin Wang 1,2,*
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 28 June 2023 / Revised: 28 July 2023 / Accepted: 29 July 2023 / Published: 3 August 2023

Round 1

Reviewer 1 Report (Previous Reviewer 2)

This paper proposes an algorithm to perform sparse matrix-vector multiplication and Gauss-Seidel iteration on GPDSP processors. The authors explain the algorithms and the hardware architecture well. This submission is a great improvement over the previous version. I have only a few comments, mostly minor. My biggest comment concerns the explanation of the example on page 6.

Detailed comments:

Page 2:

- Spmv -> SpMV

Page 4:

- At each step of iteration -> At each iteration through the outer for loop

Page 5:

- mehods -> methods

- bu f f er_x: use \textit in math mode to avoid letters from being set apart

- Vector_x to the Buffer_idx: isn’t Buffer_x indexed by Buffer_idx?

Page 6:

- The overall calculation order is shown in the data arrangement order: this is overall mostly clear, apart from how the blocks relate to the grid. Which part of the grid is actually covered by a block? And shouldn’t there be green parts in Figure 5 below?

Page 9:

- Algorithm 4 and Algorithm 4 -> Algorithm 3 and Algorithm 4

Page 13:

- used in scientific -> used in scientific computing

Apart from the minor points listed in my detailed comments, I have no issues with the level of English at all.

Author Response

We have studied the comments carefully and have made corrections that we hope meet with approval. The main corrections in the paper and the responses to the reviewer’s comments are as follows:

Responses to the reviewer’s comments:

Reviewer 1:

  1. Response to comment:

Page 2:

- Spmv -> SpMV

Response:

- We have revised “Spmv” to “SpMV”.

  1. Response to comment:

Page 4:

- At each step of iteration -> At each iteration through the outer for loop

Response:

- We have revised “At each step of iteration” to “At each iteration through the outer for loop”.

  1. Response to comment:

Page 5:

- mehods -> methods

- bu f f er_x: use \textit in math mode to avoid letters from being set apart

- Vector_x to the Buffer_idx: isn’t Buffer_x indexed by Buffer_idx?

Response:

- We have revised “mehods” to “methods”.

- We have used \textit to prevent the letters from being set apart.

- We have revised “Vector_x to the Buffer_idx” to “Buffer_x indexed by Buffer_idx”.

  1. Response to comment:

Page 6:

- The overall calculation order is shown in the data arrangement order: this is overall mostly clear, apart from how the blocks relate to the grid. Which part of the grid is actually covered by a block? And shouldn’t there be green parts in Figure 5 below?

Response:

Suppose the dimension of the grid is nx * ny * nz and the size of each block is set to nx/3 * ny/3 * nz/3; the whole grid is then divided into 3 * 3 * 3 = 27 blocks. Within each block, a color is assigned to every node. Indeed, as you said, color0 to color7 in Figure 5 correspond to 8 different colors, which should include green, blue, etc., but these are omitted from Figure 5 and only color0 and color7 are shown. The "......" in Figure 5 stands for all colors other than color0 and color7.
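The partition described in this response can be sketched in a few lines of Python. The function names, the row-major block numbering, and the parity-based 8-coloring are assumptions made for illustration; this record does not state how the manuscript assigns colors within a block. The sketch also assumes that nx, ny, and nz are divisible by 3.

```python
from collections import Counter


def block_and_color(i, j, k, nx, ny, nz, blocks_per_dim=3):
    """Return (block_id, color) for grid node (i, j, k).

    Hypothetical helper: block numbering and coloring rule are assumptions.
    """
    # Block extents, assuming each dimension is divisible by blocks_per_dim.
    bx = nx // blocks_per_dim
    by = ny // blocks_per_dim
    bz = nz // blocks_per_dim
    # Row-major block index in the range 0..26 for a 3 x 3 x 3 partition.
    block_id = ((k // bz) * blocks_per_dim + (j // by)) * blocks_per_dim + (i // bx)
    # Assumed coloring: parity of the node's local coordinates inside its block,
    # giving color0..color7.
    li, lj, lk = i % bx, j % by, k % bz
    color = (li % 2) + 2 * (lj % 2) + 4 * (lk % 2)
    return block_id, color


def summarize(nx, ny, nz):
    """Collect the set of block ids and the number of nodes per color."""
    blocks = set()
    colors = Counter()
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                block_id, color = block_and_color(i, j, k, nx, ny, nz)
                blocks.add(block_id)
                colors[color] += 1
    return blocks, colors
```

On a 6 * 6 * 6 grid, for example, each block covers a 2 * 2 * 2 sub-cube of the grid, which is one way to answer the question of which part of the grid a block covers.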

 

  1. Response to comment:

Page 9:

- Algorithm 4 and Algorithm 4 -> Algorithm 3 and Algorithm 4

Response:

- We have revised “Algorithm 4 and Algorithm 4” to “Algorithm 3 and Algorithm 4”.

  1. Response to comment:

Page 13:

- used in scientific -> used in scientific computing

Response:

- We have revised “used in scientific” to “used in scientific computing”.

Special thanks to you for your careful review and good comments.

Reviewer 2 Report (Previous Reviewer 1)

The authors revised the paper considering my comments. I think the paper is suitable for publication in its current form.

Some polishing of the English is required (eliminating unnecessary commas, separating long sentences, etc.). However, I think this can be done by the scientific editor of the paper.

Author Response

The reviewer has no comments or suggestions.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The authors describe a modification of SpMV and Gauss-Seidel algorithms for GPDSP architecture. Computational experiments demonstrate significant speedup of the modified algorithms.

The paper is technically sound, but too specialized. I do not think that “Applied Sciences” is a good choice for submitting this paper. I would recommend a more specialized journal on high-performance computing, e.g. ACM Transactions on Mathematical Software or The International Journal of High Performance Computing Applications. Before resubmitting the paper elsewhere, I recommend considering the following comments:

1. The title of the paper “Optimizing structured grid-based sparse matrix-vector Multiplication and Gauss-Seidel Iteration on GPDSP” is misleading. I can’t find an optimization problem in the paper. What is the objective of the optimization? Which optimization method do you use? I recommend using “Modifying” or “Improving” instead of “Optimizing”.

2. Line 54. “The main work is as follows:..” Poorly written, please rephrase.

3. Line 162 “Due to the limited size of AM memory, it is necessary to computer AM_row which is the maximum number of matrix rows that can be accommodated in AM.” Not clear, please rephrase.

4. Line 209. “Experiment results” What do you mean?
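One possible reading of the sentence quoted in comment 3 is a simple capacity calculation, sketched below. The byte sizes, the per-row storage model (n_neib values and column indices plus one vector entry), and both function names are hypothetical; this record gives neither the AM size nor the exact data layout used in the manuscript.

```python
def am_row_capacity(am_bytes, n_neib, value_bytes=8, index_bytes=4, vector_bytes=8):
    """Largest number of matrix rows whose data fits in am_bytes of AM.

    Assumed storage per row: n_neib values, n_neib column indices,
    and one entry of the result vector.
    """
    bytes_per_row = n_neib * (value_bytes + index_bytes) + vector_bytes
    return am_bytes // bytes_per_row


def row_chunks(n_rows, am_row):
    """Split rows 0..n_rows-1 into half-open ranges of at most am_row rows."""
    if am_row <= 0:
        raise ValueError("AM cannot hold even one matrix row")
    return [(start, min(start + am_row, n_rows)) for start in range(0, n_rows, am_row)]
```

Under this reading, a matrix with more rows than AM_row is processed in several passes, one range of rows at a time.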

Careful English proofreading is recommended.

Reviewer 2 Report

The authors propose an approach to optimise grid-based sparse matrix-vector multiplication for GPDSP processors. They describe the algorithms used, and present experimental results obtained with implementations of those algorithms.

The basic idea behind the approach is reasonably well explained, but the background is really lacking in describing the setting and how the matrices are produced. Examples would help the understandability a lot, also in other parts of the paper (see the detailed comments below). The algorithms need to be explained in more detail. In addition, a related work section is completely missing, even though a lot of work has been published on parallel matrix-vector multiplication. How does your work compare to that? The current text needs to be extended considerably to address these points.

Detailed comments:

Page 2:

- Zhao et al. mapping -> Zhao et al. mapped

- However, the related research on memory-intensive programs is still a gap: please rephrase, there is a gap w.r.t. what? What is not addressed?

- Section 2.1: there is much I don't understand about this subsection. How do the formats in Figure 1 relate to partial differential equations? How exactly does such a format lead to a matrix? How can you derive the values of n_dof and n_neib from a format? An example is mentioned, but only cited, and not worked out in the current text. I advise the authors to include a small example, and to be more precise about how a matrix is derived.
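A small worked example of the kind requested here could look as follows. It is a sketch under stated assumptions, not the authors' construction: it uses a 7-point stencil on an nx * ny * nz grid with one unknown per node, reads n_neib as the number of stencil entries per row (7), and stores the matrix in ELL form as two row-by-n_neib arrays. The function names and the stencil coefficients (6 on the diagonal, -1 for each neighbor) are illustrative.

```python
def build_ell_7pt(nx, ny, nz):
    """Build ELL arrays (values, col_idx) for a 7-point stencil on a structured grid."""
    # Stencil offsets: the node itself, then its six axis-aligned neighbors.
    offsets = [(0, 0, 0), (-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]
    values, col_idx = [], []
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                row = (k * ny + j) * nx + i  # lexicographic node numbering
                row_vals, row_cols = [], []
                for slot, (di, dj, dk) in enumerate(offsets):
                    ii, jj, kk = i + di, j + dj, k + dk
                    if 0 <= ii < nx and 0 <= jj < ny and 0 <= kk < nz:
                        row_cols.append((kk * ny + jj) * nx + ii)
                        row_vals.append(6.0 if slot == 0 else -1.0)
                    else:
                        # Neighbor outside the grid: pad with a zero entry so that
                        # every row keeps exactly n_neib = 7 slots.
                        row_cols.append(row)
                        row_vals.append(0.0)
                values.append(row_vals)
                col_idx.append(row_cols)
    return values, col_idx


def ell_spmv(values, col_idx, x):
    """Compute y = A x for a matrix A stored in ELL form."""
    return [
        sum(v * x[c] for v, c in zip(row_vals, row_cols))
        for row_vals, row_cols in zip(values, col_idx)
    ]
```

Each grid node produces one matrix row, and the fixed stencil is what makes the number of stored entries per row constant.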

Page 3:

- What is a stiffness matrix?

- algorithms in ELL format: the algorithms themselves are not in ELL format, but the matrices they operate on.

- Gauss-Seidel ... can only be computed serially: that is not what Algorithm 2 implies, because the inner for-loop could be performed in parallel. Be more specific about which part needs to be computed serially.
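The distinction raised in the last comment can be made concrete with a minimal Gauss-Seidel sweep over an ELL matrix. This is a generic textbook formulation, not the manuscript's Algorithm 2; the function names and the small tridiagonal test matrix are made up for illustration. The outer loop over rows carries the dependency, because row r reads entries of x already updated by earlier rows, while the products inside one row are independent of each other.

```python
def gauss_seidel_sweep(values, col_idx, b, x):
    """Perform one in-place forward Gauss-Seidel sweep on an ELL matrix."""
    for row in range(len(b)):  # serial: later rows read x values written here
        diag = 0.0
        acc = b[row]
        # The terms of this inner loop do not depend on each other, so they
        # could be evaluated in parallel and then reduced.
        for v, c in zip(values[row], col_idx[row]):
            if c == row:
                diag += v  # diagonal entry (zero-valued padding adds nothing)
            else:
                acc -= v * x[c]
        x[row] = acc / diag
    return x


def tridiagonal_ell(n):
    """Hypothetical test matrix: 1D stencil (-1, 2, -1) in ELL form with 3 slots per row."""
    values, col_idx = [], []
    for row in range(n):
        left = (-1.0, row - 1) if row > 0 else (0.0, row)       # pad at the boundary
        right = (-1.0, row + 1) if row < n - 1 else (0.0, row)  # pad at the boundary
        values.append([2.0, left[0], right[0]])
        col_idx.append([row, left[1], right[1]])
    return values, col_idx
```

Starting from x = 0 with b = [1, 0, 0, 1], the first sweep already shows the dependency chain: each updated entry is computed from the entry updated just before it.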

Page 4:

- Floating point Multiply ACcumulator -> Floating point Multiply ACcumulators

Page 5:

- Section 4.2: a concrete worked out example would help a lot here. How are the points assigned colors?

Page 6:

- data completely independent -> completely data independent

- core_row denotes the matrix row number assigned to one DSP core: Is this correct, so the actual number (ID) of a matrix row, or do you mean the number of matrix rows assigned to one DSP core? What if multiple rows are assigned to a core? The same comment applies to block_row and color_row.

- Figure 5: please explain this figure and the next one in greater detail in the text.

- to computer -> to compute

- What if the conditions mentioned for AM memory do not hold? Then the matrix cannot be processed at all?

Page 7:

- Figures 7 and 8: is there a reason that the vector is called x in one figure and ax in the other? If not, please align this. Please also explain the figures in greater detail. What should I notice in particular?

Page 8:

- When discussing Algorithms 3 and 4, please refer to line numbers as you explain them.

Page 9:

- As you mention the two programming modes, please be more specific. How much higher is the Assembly mode performance, and why? Is the C compiler not as efficient?

Page 11:

- paper, We used -> paper, we used

Page 12:

- I plan: please rephrase (do not use "I" in the text; who is "I"? There are multiple authors).

The level of English is acceptable. There are some parts that need improving, and sometimes this caused confusion for me (see detailed comments).
