Article
Peer-Review Record

Human–Machine Collaborative Learning for Streaming Data-Driven Scenarios

Sensors 2025, 25(21), 6505; https://doi.org/10.3390/s25216505
by Fan Yang 1,2,*, Xiaojuan Zhang 1,2 and Zhiwen Yu 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 10 September 2025 / Revised: 14 October 2025 / Accepted: 15 October 2025 / Published: 22 October 2025
(This article belongs to the Special Issue Smart Sensing System for Intelligent Human–Computer Interaction)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The concept of human-machine collaboration for data flows is innovative and relevant; however, the paper has many limitations:

Superficial theoretical framework: The formalization L = {H,M, R, C,P} is a promising outline but remains too abstract and is not operationalized in the methods.

Vague methodology: The description of the algorithms and the integration of human feedback is too generic, making replication impossible.

Absent experimental evidence: Figures and tables of results are missing, invalidating any claims of superior performance.

Non-existent analysis of results: No precise numerical data or statistical tests are provided to support the conclusions.

Writing and structure need improvement: The quality of the language and overall clarity of the manuscript require thorough revision.

Comments on the Quality of English Language

English could be improved.

Author Response

1. The concept of human-machine collaboration for data flows is innovative and relevant; however, the paper has many limitations:

Response 1: Through in-depth analysis, I have clarified the characteristics of human-machine collaboration. Given the scenarios to which human-machine collaborative learning applies, it is a new and promising machine learning method.

2. Superficial theoretical framework: The formalization L = {H, M, R, C, P} is a promising outline but remains too abstract and is not operationalized in the methods.

Response 2: After reading related reference papers and reconsidering the framework, I have added a workflow of the framework at runtime (Figure 3), depicted each component and its interactions, and detailed them in Section 3.3, "Rules for HMLSDS". This section covers the task allocation mechanism, conflict resolution in decision-making, and model adjustment based on human feedback.

3. Vague methodology: The description of the algorithms and the integration of human feedback is too generic, making replication impossible.

Response 3: Based on the workflow architecture and the situations that may occur during framework execution, we provide robust rules that ensure the framework runs smoothly and can be reproduced by following the workflow. In addition, we have added practical content that makes our human-machine collaborative learning more feasible to apply.

4. Absent experimental evidence: Figures and tables of results are missing, invalidating any claims of superior performance.

Response 4: We have added extensive experimental content to verify the effectiveness of the method. Section 5.2, Part 2, "Performance Evaluation" (about three pages), demonstrates its effectiveness on three data-driven scenarios, reporting the increase in human workload, the improvement in task accuracy, and the ratio of workload to task accuracy (denoted Pe).

5. Non-existent analysis of results: No precise numerical data or statistical tests are provided to support the conclusions.

Response 5: In Section 5, we report task precision that is competitive with related classical methods, and we analyze the experimental data using our proposed evaluation metrics (Acc, Workload, and Pe).
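To make the relationship between these metrics concrete, the following minimal sketch shows one way they could be computed. The function and variable names are illustrative assumptions, not the paper's actual implementation, and Pe is taken here as workload divided by accuracy, per the description of Pe as the ratio of workload to task accuracy.

```python
def evaluate(n_human_labeled: int, n_total: int, n_correct: int) -> dict:
    """Compute the three metrics for one streaming task.

    Workload: fraction of samples that required human involvement.
    Acc:      fraction of samples handled correctly.
    Pe:       workload-to-accuracy ratio (lower means less human
              effort per unit of accuracy achieved).
    """
    workload = n_human_labeled / n_total
    acc = n_correct / n_total
    return {"Workload": workload, "Acc": acc, "Pe": workload / acc}

# Hypothetical run: 120 of 1000 streaming samples escalated to a human,
# 930 samples handled correctly overall.
metrics = evaluate(n_human_labeled=120, n_total=1000, n_correct=930)
print(metrics)  # Workload = 0.12, Acc = 0.93, Pe ≈ 0.129
```

Under this reading, a lower Pe indicates that less human involvement was spent per unit of task accuracy, which is the trade-off the experiments in Section 5.2 examine.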

6. Writing and structure need improvement: The quality of the language and overall clarity of the manuscript require thorough revision.

Response 6: I carefully read and revised the entire text, performed a proofreading pass, and redesigned several unclear figures. As a result, the current paper has significantly improved in readability and clarity.

Reviewer 2 Report

Comments and Suggestions for Authors

The following aspects must be improved or explained in more detail:

  • The paper frames the proposed framework as a novel paradigm, but its foundations strongly overlap with existing approaches such as human-in-the-loop machine learning, interactive machine learning, and hybrid intelligence – the authors' own contributions are not clearly separated from prior work, making it difficult to identify what is new
  • The “new evaluation criteria” reduce essentially to accuracy, time, and cost — which are standard metrics in machine learning, not novel evaluation methods
  • The abstract states "The model will be well trained in a shorter time", but no comparison regarding training time is done.
  • The pseudocode (Algorithms 1–3) is imprecise: parameters and workflow steps are not fully explained, reducing reproducibility
  • Strengthen the comparative analysis with classical ML approaches using quantitative experiments rather than high-level qualitative claims: the comparison with classical ML approaches (active learning, transfer learning, reinforcement learning, etc.) is superficial and qualitative
  • Expand the limitations discussion to include cost, scalability, bias, and ethical implications of human participation in ML pipelines: Key issues such as cost of human involvement, potential biases in human judgment, and ethical considerations are not adequately discussed

Author Response

1. The paper frames the proposed framework as a novel paradigm, but its foundations strongly overlap with existing approaches such as human-in-the-loop machine learning, interactive machine learning, and hybrid intelligence; the authors' own contributions are not clearly separated from prior work, making it difficult to identify what is new.

Response 1: Through reading the relevant literature, I gained a clearer understanding of the characteristics of human-machine collaborative learning and now describe them in more detail in the article. In particular, in situations such as identifying uncertain and difficult samples, the advantages of human-machine collaborative learning become very clear. I have described the strengths of our method in Section 1 (Introduction) and other related parts.

2. The “new evaluation criteria” reduce essentially to accuracy, time, and cost, which are standard metrics in machine learning, not novel evaluation methods.

Response 2: We have redefined the evaluation criteria as the amount of human involvement, task accuracy, and the ratio of human involvement to task accuracy, which are better suited to evaluating human-machine collaborative learning. In Section 5, we conduct extensive experiments to demonstrate the proposed evaluation criteria.

3. The abstract states “The model will be well trained in a shorter time”, but no comparison regarding training time is done.

Response 3: After careful consideration, we removed the statement about the relatively short training time and focused instead on the experiments and descriptions regarding the increase in human workload and the improvement in task performance.

4. The pseudocode (Algorithms 1–3) is imprecise: parameters and workflow steps are not fully explained, reducing reproducibility.

Response 4: We have further explained the parameters and steps of the pseudocode, and added sections on the workflow architecture and critical rules (Section 3.3), highlighting the feasibility and effectiveness of the method.

5. Strengthen the comparative analysis with classical ML approaches using quantitative experiments rather than high-level qualitative claims: the comparison with classical ML approaches (active learning, transfer learning, reinforcement learning, etc.) is superficial and qualitative.

Response 5: In our data scenario, a direct comparison with every major machine learning method is not realistic, because the limited applicability of some methods, such as federated learning and meta-learning, makes a horizontal comparison impossible. We therefore conducted a thorough analysis of the essential advantages and disadvantages of human-machine collaborative learning versus other mainstream methods, and provide a more persuasive comparison.

6. Expand the limitations discussion to include cost, scalability, bias, and ethical implications of human participation in ML pipelines: key issues such as the cost of human involvement, potential biases in human judgment, and ethical considerations are not adequately discussed.

Response 6: We are very grateful for this comprehensive suggestion. In the discussion section, we analyze in more depth several aspects that were not considered before, and we also provide some related descriptions in Section 3.3.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Formal framework better explained - The framework's mathematical modeling provides significant scientific rigor.

Good experiments - Validation on three distinct scenarios clearly demonstrates the versatility of the approach.

Human-machine interface needs to be refined - The description of interactive interfaces deserves more technical details.

Relevant metrics - The cost/benefit analysis (human effort vs. accuracy) is particularly convincing.

In-depth discussion - The comparison with other ML paradigms positions the contribution well within the state of the art.

Comments on the Quality of English Language

English could be improved.

Author Response

  1. Formal framework better explained - The framework's mathematical modeling provides significant scientific rigor.

We sincerely thank the reviewer for recognizing the scientific rigor of our framework’s mathematical modeling and for highlighting the value of a well-explained formal framework. To further strengthen the transparency, completeness, and interpretability of this section, we have expanded the mathematical details in the revised manuscript (marked in blue, Section 3.3) — focusing on explicit formula definitions, theoretical derivations, and connections to real-world streaming task constraints.

By supplementing these explicit formulas, derivations, and theoretical connections, we aim to make our framework’s mathematical foundation fully reproducible. Researchers can now replicate our parameter settings, adapt the loss functions to other streaming tasks, and validate the model’s behavior against the theoretical constraints of concept drift. We believe this revision further enhances the scientific rigor of our work and addresses the reviewer’s focus on a well-explained formal framework.

2. Good experiments - Validation on three distinct scenarios clearly demonstrates the versatility of the approach.

We conducted extensive experiments on three streaming data-driven scenarios, video anomaly detection, person re-identification, and sound event detection, covering video, image, and audio data. The experiments demonstrate the effectiveness of the method and indicate that it has a degree of versatility. At the end of the experimental section, we also provide an analysis highlighting the versatility and effectiveness of the method.

3. Human-machine interface needs to be refined - The description of interactive interfaces deserves more technical details.

We sincerely thank the reviewer for pointing out the need to refine the human-machine interface (HMI) description and supplement technical details. This suggestion is crucial for enhancing the practicality and reproducibility of our work, as the HMI serves as the core bridge between our human-machine collaborative learning method for streaming data and real-world user operations. We have thoroughly addressed this issue (marked in blue in the manuscript) and now elaborate on the HMI's architectural design, functional modules, and technical parameters in detail in Subsection 4.4.

By supplementing these technical details, we aim to make the HMI design more transparent and actionable—users (e.g., engineers deploying the system in surveillance or smart home scenarios) can now reproduce the interface configuration, adjust parameters based on specific task needs, and understand how the HMI interacts with our core model. We hope this revision fully addresses the reviewer’s concern and further strengthens the practical value of our work.

4. Relevant metrics - The cost/benefit analysis (human effort vs. accuracy) is particularly convincing.

We sincerely thank the reviewer for recognizing the persuasiveness of our cost/benefit analysis (focused on the trade-off between human effort and accuracy). This validation reinforces the practical value of our proposed method—especially for real-world streaming data tasks where labeling resources are often limited, and performance stability is critical. To further strengthen the rigor and transparency of this analysis, we have supplemented additional details in the revised manuscript (marked in red in Section 5.2, and marked in blue in Section 7, Comparison with other classical machine learning).

By supplementing these details—including metric definitions (in Section 3.2 marked in blue color), comparative baselines, performance evaluation—we aim to make the cost/benefit (human effort/accuracy) analysis more actionable and credible.

5. In-depth discussion - The comparison with other ML paradigms positions the contribution well within the state of the art.

We sincerely appreciate the reviewer’s positive feedback regarding the in-depth discussion section, particularly the recognition that our comparison with other machine learning (ML) paradigms effectively positions the contribution of our proposed method within the state of the art. This affirmation further validates our efforts to contextualize our work within the broader ML landscape, and we would like to elaborate on the key considerations and details that underpinned this comparison, to reinforce the clarity and rigor of our contribution.

To ensure the comparison was comprehensive and insightful, we systematically categorized the mainstream ML paradigms relevant to our research field (streaming data-driven tasks, including video anomaly detection, person re-identification, and sound event detection). These primarily include traditional supervised, unsupervised, semi-supervised, and online learning paradigms.

In summary, by contrasting our method with these key ML paradigms across dimensions of adaptability, efficiency, accuracy, and theoretical rigor, we aimed to clearly delineate its innovative value and position it as a practical, versatile solution for real-world streaming data-driven tasks. We hope this elaboration further strengthens the persuasiveness of our discussion, and we remain grateful for the reviewer’s recognition of this aspect of our work.

6. Comments on the Quality of English Language

   English could be improved.

We sincerely thank the reviewer for pointing out the need to improve the English language quality of the manuscript. We fully recognize that clear, precise, and grammatically accurate English is critical for effectively conveying our research findings and ensuring readability for the academic community. To address this comment comprehensively, we have implemented a multi-step revision process to refine the language throughout the manuscript (all revisions are marked in blue or red), with key improvements focused on the following aspects:

1. Grammatical Accuracy and Sentence Structure

We carefully reviewed each section to correct grammatical errors (e.g., subject-verb agreement, tense consistency, preposition usage) and optimize sentence flow—particularly in technical descriptions where complex concepts risk being obscured by awkward phrasing.

2. Technical Terminology Consistency

To avoid confusion, we standardized technical terminology across the manuscript—ensuring that key concepts (e.g., “concept drift,” “adaptive bitrate streaming,” “weakly supervised dynamic labeling”) are used consistently and defined clearly upon first mention.

3. Clarity of Logical Connections

We strengthened the logical flow between sentences and paragraphs by adding transitional phrases (e.g., “To address this gap,” “In contrast to existing approaches,” “For validation”) and reordering technical details to follow a “problem-solution-result” structure—especially in the experimental and discussion sections.

4. External Proofreading and Validation

To ensure the revisions meet academic English standards, we engaged a native English-speaking researcher with expertise in machine learning and human-machine interaction to conduct a final proofread.

Reviewer 2 Report

Comments and Suggestions for Authors

Some of my comments were addressed but the main important one: comparison with other method is not improved: the comparative analysis must be done with newer references / methods in order to provide the benefits.

Author Response

  1. Some of my comments were addressed but the main important one: comparison with other method is not improved: the comparative analysis must be done with newer references / methods in order to provide the benefits.

    We deeply appreciate the reviewer’s persistence in highlighting the critical need to anchor our comparative analysis in 2023–2025 state-of-the-art (SOTA) methods; this feedback is fundamental to ensuring our work’s relevance and rigor, and we acknowledge that our prior revision did not fully address the "newness" of references across all task dimensions. To resolve this, we have comprehensively updated Section 5.2 (Experimental results) to include six recently published methods (2023–2025), all from top-tier venues (CVPR 2023, ICCV 2023, CVPR 2024, WACV 2025, etc.), and restructured the analysis to explicitly link our method’s design choices to the limitations of these newer baselines.

    For the video anomaly detection scenario, two of our datasets, UCSD Pedestrian and the Subway Exit dataset, have rarely been used in the past five years; only the CUHK Avenue dataset remains in common use. The reason is that those two datasets are small, cover a single scene, and contain a limited number of anomalies. We therefore conduct a comparative experiment on the CUHK Avenue dataset against three methods published at CVPR and ICCV in 2023. For the person re-identification scenario, we compare against three new approaches published between 2023 and 2025.

    We deeply value the reviewer’s feedback, which has pushed us to ground our work more rigorously in the latest research. This revised comparative analysis now clearly demonstrates our method’s unique benefits against the most recent SOTA.
