Peer-Review Record

Performance Analysis of RCU-Style Non-Blocking Synchronization Mechanisms on a Manycore-Based Operating System

Appl. Sci. 2022, 12(7), 3458; https://doi.org/10.3390/app12073458
by Changhui Kim, Euteum Choi, Mingyun Han, Seongjin Lee and Jaeho Kim *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 3 February 2022 / Revised: 19 March 2022 / Accepted: 24 March 2022 / Published: 29 March 2022

Round 1

Reviewer 1 Report

The experimental setup is flawed: the processor has only ten physical cores, and hyperthreading is enabled. First, the manuscript must state how thread affinity is configured. Please check the affinity setup for Intel processors; this is the reason for the peaks in Figure 5.

https://www.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support/openmp-library-support/thread-affinity-interface-linux-and-windows.html

You have to test the performance deliberately, with HT both enabled and disabled. When HT is active, you have to assign two threads to the same physical core. As presented, your results are not very useful.
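The placement the reviewer asks for can be sketched as follows. This is a minimal illustration for a user-level benchmark on Linux, not the authors' actual setup: it assumes the sibling information is read from `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`, and the helper names (`parse_cpu_list`, `group_by_physical_core`, `plan_affinity`, `pin_current_thread`) are hypothetical.

```python
import os


def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0,20" or "0-3,8" into sorted CPU IDs."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_physical_core(siblings):
    """Turn {logical_cpu: siblings_text} into one sibling group per physical core."""
    groups = {tuple(parse_cpu_list(text)) for text in siblings.values()}
    return [list(group) for group in sorted(groups)]


def plan_affinity(siblings, n_threads, use_smt):
    """Return the logical CPU that each benchmark thread should be pinned to."""
    cores = group_by_physical_core(siblings)
    if use_smt:
        # Compact placement: fill both hyperthreads of a core before moving on.
        order = [cpu for core in cores for cpu in core]
    else:
        # One thread per physical core: use only the first sibling of each core.
        order = [core[0] for core in cores]
    if n_threads > len(order):
        raise ValueError("more threads than available logical CPUs")
    return order[:n_threads]


def pin_current_thread(cpu):
    """Pin the caller to one logical CPU (available on Linux only)."""
    os.sched_setaffinity(0, {cpu})


# Hypothetical topology: 4 physical cores, logical CPUs i and i + 4 are siblings.
topology = {i: f"{i % 4},{i % 4 + 4}" for i in range(8)}
print(plan_affinity(topology, 4, use_smt=True))   # [0, 4, 1, 5]
print(plan_affinity(topology, 4, use_smt=False))  # [0, 1, 2, 3]
```

Running the same thread counts under both plans separates the effect of sharing a physical core from the effect of adding cores, which is the comparison requested above.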

Author Response

- Cover Letter -

 

Performance Analysis of RCU-style Non-blocking Synchronization Mechanisms on a Manycore based Operating System

 

ChangHui Kim, Euteum Choi, MinGyun Han, Seongjin Lee, Jaeho Kim

Gyeongsang National University, Republic of Korea

 

Dear Applied Sciences Editors,

 

We would like to submit our revised manuscript entitled “Performance Analysis of RCU-style Non-blocking Synchronization Mechanisms on a Manycore based Operating System” to Applied Sciences. Please note that this work has not been published elsewhere. Most modern computer systems are equipped with many-core CPUs, and the effective use of these systems is an important research topic. This study presents a performance evaluation and analysis of synchronization schemes for modern many-core systems. We appreciate the reviewers’ insightful comments. Below we provide a list of our answers to each reviewer’s comments.

 

  • Reviewer #1
  1. We have supplemented the explanation of the thread affinity (thread pinning) settings and the performance analysis of the NUMA effect.

Please refer to Sections 5.2 and 5.3.1 of our revised manuscript.

  • Reviewer #2
  1. We configured the user-level and kernel-level benchmark evaluation environments identically and analyzed the results.

Please refer to Sections 5.1 and 5.3.1 and Figure 5.

  2. An explanation of the reason for the low performance of kernel-level RLU has been added.

Please refer to Sections 4.2 and 5.3.2.

  3. Typos and grammatical errors were corrected and marked in red font.

Author Response File: Author Response.pdf

Reviewer 2 Report

This manuscript presents an evaluation of synchronization mechanisms for many-core platforms. The RCU, RLU, and MV-RLU mechanisms were implemented at both user and kernel levels in a research operating system, a variant of sv6, and were quantitatively compared. The reviewer agrees that such comparisons are important for selecting appropriate mechanisms and/or proposing new ones.

However, the evaluation was not properly designed to compare the mechanisms and discuss the differences. Also, one of the mechanisms may have been inadequately implemented, which led to extremely poor performance. The reviewer therefore recommends that the authors revise the manuscript with respect to the following points.

(1) Experimental environments.
The reasons for using different machines for the user-level and kernel-level experiments were not clear. If there is no particular reason, the same machine (or set of machines) should be used. Otherwise, it becomes unclear whether differences in performance come from the implemented software or from the machine itself.
In addition, the core counts of 40 and 72 seem to correspond to the number of logical cores; the reviewer thinks that it is more natural to express them as the number of physical cores (i.e., 20 and 36, respectively) and explain that each physical core has two logical cores using simultaneous multithreading.

(2) Kernel-level RLU implementation.
Figure 6 shows that the kernel-level implementation of the RLU mechanism performed poorly in comparison to the user-level implementation (Fig. 5) and the other mechanisms. Can the restriction of the kfree() function, described in l. 428, affect the performance? Possible reasons should be enumerated, and each of them should be discussed. It is desirable that the issue be resolved if possible.

Some additional comments are as follows:

  • Although the English writing is acceptable, inconsistencies between singular and plural forms are often found in the manuscript. For example:
    • ... the research community introduced RLU and MV-RLU synchronization algorithms ... (l. 7)
    • RLU and MV-RLU, which are called RCU-style synchronization mechanisms, are ... (l. 8)
    • ... for the disadvantages od RCU that are ... (l. 54)
  • MVCC in the keywords section may not be familiar to every reader: it should be spelled out as Multi-version Concurrency Control.

Author Response

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The reviewer confirmed that the manuscript was properly revised by incorporating all of the comments on the previous version. The reviewer now understands why the kernel-level RLU implementation performed poorly in some conditions.
