Peer-Review Record

Evaluating Human versus Machine Learning Performance in a LegalTech Problem

Appl. Sci. 2022, 12(1), 297; https://doi.org/10.3390/app12010297
by Tamás Orosz 1,*, Renátó Vági 1,2, Gergely Márk Csányi 1, Dániel Nagy 1, István Üveges 1,3, János Pál Vadász 1,4 and Andrea Megyeri 5
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 13 December 2021 / Revised: 21 December 2021 / Accepted: 22 December 2021 / Published: 29 December 2021
(This article belongs to the Special Issue Development and Applications of AI on Legal Tech)

Round 1

Reviewer 1 Report

The paper describes a small experiment in which the human-level performance of legal experts is measured on a monotonous and time-consuming task.

 
The performance of the legal experts was compared with that of a machine-learning-based classifier.

The introduction of the paper presents the motivation behind the experiment. It was interesting to see that the overall performance of the machine learning classifier is not very high, about 50%, but it is significantly higher than that of the average expert.


The paper reviews similar contributions in the field, which used only relatively short texts to estimate human-level performance, so the novelty of the paper is clearly formulated.


The paper presents the results in depth and validates the results of the proposed algorithm. The paper can be interesting from a business point of view, where managers have to estimate the point at which a machine-learning-based solution becomes applicable in a workflow.

I have some minor remarks, which should be addressed before acceptance:

Comments: 
---------

line 19: use manual instead of handcrafted;

line 75: use in the the way instead of in that;

line 86: in business processes instead of in the business processes;

line 96: use can the machine instead of can machine

line 97: Please, use of legal experts and laymen or non-expert lawyers, instead of the legal experts, and the laymen or non-expert lawyers  

lines 116-118: these sentences are hard to understand, please correct them

line 143: the sentence that starts with Because is hard to understand, please correct it

line 230: Figure ??

line 232: Figure ?? van

line 322: also should take into account is missing

Author Response

Dear Reviewer,

 Thank you for your positive feedback and for highlighting these typos and errors. We have proofread the manuscript and modified it according to your suggestions. Please find the modifications in the new version of the paper, where they are marked in red.

Best regards,

 Tamás Orosz

Reviewer 2 Report

The paper compares a machine-learning-based algorithm with human-level performance. The comparison is based on Hungarian jurisprudence documents, which are long; this is the main difference between this survey and other, previously published comparisons. The authors are interested in estimating the point at which the machine learning methodology can be applied instead of human annotators. The paper shows two cases: first, when these algorithms can improve the discoverability of the database, and second, when they can replace the human annotators.
The paper describes and shows the results of the conducted survey. I have the following questions and comments for the authors:

 

In Section 2.2, can you explain in more detail what Hungarian judicial practice is like and how it differs from English/American judicial practice? Does this difference have important consequences for the topic of the paper?

Are the documents in the Hungarian language? How does this language differ from English? Most of the cited documents are in English or Chinese. Are there differences between the applied training methods when you examine an agglutinative language rather than English or other Indo-European languages?

In Section 2.3, can you explain in more detail the basic concepts behind the application of the scoring system?

How can this study help to further develop the proposed algorithms?

Author Response

Dear Reviewer,

Thank you for reviewing the manuscript and for your positive comments. Please find our answers highlighted in blue:

1) In Section 2.2, can you explain in more detail what Hungarian judicial practice is like and how it differs from English/American judicial practice? Does this difference have important consequences for the topic of the paper?

Thank you for your comment. We extended the description of the Hungarian legal system in 2.2: The legal system in Hungary is a limited precedent-based system. It formally distinguishes six different groups of matters, in other words, areas of law: criminal law, military criminal law, administrative law, labor law, civil law, and economic law.

2) Are the documents in the Hungarian language? How does this language differ from English? Most of the cited documents are in English or Chinese. Are there differences between the applied training methods when you examine an agglutinative language rather than English or other Indo-European languages?

Thank you for your comment. We corrected the text and inserted the following into Section 2.2: The published case law is entirely in Hungarian, and the agglutinative nature of the Hungarian language makes most natural language processing tasks quite difficult.

3) In Section 2.3, can you explain in more detail the basic concepts behind the application of the scoring system?

We extended the section with the answer to your question: With this scoring system, the area that represents the value of the information is in the same range as in Figure 1.

4) How can this study help to further develop the proposed algorithms?

One of the main results of this study is that the label set should be reviewed not only from a legal perspective; other domain knowledge should also be taken into account to increase the discoverability, usability, and value of the legal database in the further development of the original project.

Best regards,

  Tamás Orosz

Reviewer 3 Report

There are some weaknesses throughout the manuscript that need improvement. Therefore, the submitted manuscript cannot be accepted for publication in this form, but it has a chance of acceptance after a minor revision. My comments and suggestions are as follows:

1- The abstract gives information on the main feature of the performed study, but some details about the ML algorithm must be added.

2- The authors must clarify the necessity of the performed research. The objectives of the study must be clearly stated in the introduction.

3- The literature study must be enriched. In this respect, the authors must read and refer to the following papers: (a) application of ML: https://doi.org/10.1016/j.jmrt.2021.07.004 (b) text classification: https://doi.org/10.1016/j.ipm.2021.102798 and other research works.

4- It would be nice if the authors could add some schematic figures to show the concept and some conditions.

5- How is the main question of the study addressed?

6- For using AI methods instead of human activities, the authors can refer to industrial applications, such as https://doi.org/10.1007/s10845-019-01481-0 and https://doi.org/10.1016/j.promfg.2020.01.421.

7- Details of the accuracy calculation must be presented in the manuscript. In addition, the error in the calculation must be considered and discussed.

8- The manuscript should be considered for English language editing. There are sentences that have to be rewritten.

9- The conclusion must be more than just a summary of the manuscript. The list of references must be updated based on the proposed papers. Please mark all changes in red in the revised version.

 

 

Author Response

Dear Reviewer,

 Thank you for your comments and your positive feedback. Please find our answers to your remarks below, highlighted in blue; the corresponding changes in the paper are marked in red.

1- The abstract gives information on the main feature of the performed study, but some details about the ML algorithm must be added.

Thank you for your comment. We modified the abstract in the following way:
The machine-learning-based application used binary SVM-based classifiers to resolve the multi-label classification problem. The methods used were encapsulated and deployed as a digital twin in a production environment.
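
One common way to read "binary SVM-based classifiers" for a multi-label problem is the binary relevance (one-vs-rest) decomposition: one yes/no target per label, with the per-label decisions merged back into a label set for each document. This record does not say how the authors combined their classifiers, so the Python sketch below is an assumption. The function names and example labels are hypothetical, and the SVM training step itself is omitted.

```python
from typing import Dict, List, Set


def to_binary_targets(label_sets: List[Set[str]]) -> Dict[str, List[int]]:
    """Turn per-document label sets into one 0/1 target vector per label."""
    all_labels = sorted(set().union(*label_sets))
    return {
        label: [1 if label in labels else 0 for labels in label_sets]
        for label in all_labels
    }


def merge_binary_outputs(outputs: Dict[str, List[int]], n_docs: int) -> List[Set[str]]:
    """Recombine per-label 0/1 decisions into one label set per document."""
    return [
        {label for label, decisions in outputs.items() if decisions[i] == 1}
        for i in range(n_docs)
    ]


# Hypothetical label sets for three documents (not taken from the paper).
documents = [{"civil law", "contract"}, {"criminal law"}, {"civil law"}]

# Each entry would be the training target of one binary classifier.
targets = to_binary_targets(documents)

# Feeding the targets back in stands in for perfect per-label predictions.
recovered = merge_binary_outputs(targets, len(documents))
```

In this reading, each vector in `targets` would train one independent binary classifier, and `merge_binary_outputs` would be applied to the classifiers' predictions at inference time.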

2 - The authors must clarify the necessity of the performed research. The objectives of the study must be clearly stated in the introduction.

Thank you for your comment. We extended the introduction with the following answer:

 The goal of the survey is to highlight when and how a machine-learning-based application can be applied in business processes: when can these methods replace humans in data annotation tasks, and how can they improve the quality and the discoverability of a legal database?

3 - The literature study must be enriched. In this respect, the authors must read and refer to the following papers: (a) application of ML: https://doi.org/10.1016/j.jmrt.2021.07.004 (b) text classification: https://doi.org/10.1016/j.ipm.2021.102798 and other research works.

Citations were added to the paper.

4 - It would be nice if the authors could add some schematic figures to show the concept and some conditions.

This study focuses on the applicability of machine learning methods to an existing business process. Only one subsection describes the deployed machine-learning-based system that is the basis of the comparison; another paper will describe the details of the proposed system and the selection of the applied methodologies.

5 - How is the main question of the study addressed?

 The question is answered by Fig. 7 and by the Results and Discussion section, which show that the machine learning algorithm can reach the average performance of the human experts. The Conclusions of the paper were extended to better answer your question.

6 - For using AI methods instead of human activities, the authors can refer to industrial applications, such as https://doi.org/10.1007/s10845-019-01481-0 and https://doi.org/10.1016/j.promfg.2020.01.421.

A reference to ML-based industrial applications was added to the Introduction: "Generally, more and more AI-based solutions are created to replace human activities for industrial applications."

7 - Details of the accuracy calculation must be presented in the manuscript. In addition, the error in the calculation must be considered and discussed.

Thank you for your valuable comment. We modified the first paragraph of Section 3.2 (Accuracy) to present the calculated results more clearly. The modifications are as follows:
"We applied different metrics to compare the results.
Firstly, we calculated the accuracy of the labeled documents (Figure 3). This accuracy means the proportion of the documents that were completely or partially labeled correctly by each group. A document was considered as partially labeled when at least one correct label was found for a given judgment."

Thank you for your remark on the errors in the calculation. Since each group (editor without computer assistance, editor with computer assistance, etc.) consisted of three members, calculating the standard deviation of the results would not be meaningful.
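
The accuracy definition quoted in this response can be sketched in a few lines of Python. This is one reading of the quoted text, not the authors' implementation: a document is assumed to count as a hit when the assigned labels and the correct labels share at least one element, which also covers completely correct documents. The function names and example labels are hypothetical, and a stricter exact-match variant is included only for contrast.

```python
from typing import List, Set


def partial_match_accuracy(assigned: List[Set[str]], correct: List[Set[str]]) -> float:
    """Share of documents with at least one correctly assigned label."""
    if len(assigned) != len(correct) or not assigned:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for a, c in zip(assigned, correct) if a & c)
    return hits / len(assigned)


def exact_match_accuracy(assigned: List[Set[str]], correct: List[Set[str]]) -> float:
    """Share of documents whose assigned label set equals the correct set."""
    if len(assigned) != len(correct) or not assigned:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for a, c in zip(assigned, correct) if a == c)
    return hits / len(assigned)


# Hypothetical labels for four judgments (not data from the study).
assigned_labels = [{"contract", "damages"}, {"tax"}, {"custody"}, set()]
correct_labels = [{"contract", "damages"}, {"tax", "vat"}, {"inheritance"}, {"labor"}]

partial = partial_match_accuracy(assigned_labels, correct_labels)  # 2 of 4 documents
exact = exact_match_accuracy(assigned_labels, correct_labels)      # 1 of 4 documents
```

The gap between the two values shows how lenient the "at least one correct label" criterion is compared with requiring the full label set.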

8 - The manuscript should be considered for English language editing. There are sentences that have to be rewritten.

Thank you for your comment. We asked another colleague to correct the typos and improve the language of the paper.

9 - The conclusion must be more than just a summary of the manuscript. The list of references must be updated based on the proposed papers. Please mark all changes in red in the revised version.

The end of the Conclusion was extended and modified in the following way to highlight some more insights and future recommendations:

The study results show that the applied machine learning algorithm can reach the average performance of human experts. Moreover, machine learning methodologies can be advantageous for monotonous tasks where finding the correct solution needs deep focus and unique expertise, or where the exact solution is hard to define, as in the case of law.
Another insight gained by this study is that the label set should be reviewed from a legal perspective, and other domain knowledge should be taken into account to increase the agreement between the legal experts and create a new ontology for the labeling system. This new ontology and the newly trained models can further increase the legal database's discoverability, usability, and value.

Best regards,

 Tamás Orosz

 

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.

