Article
Peer-Review Record

System Design to Utilize Domain Expertise for Visual Exploratory Data Analysis†

Information 2021, 12(4), 140; https://doi.org/10.3390/info12040140
by Tristan Langer and Tobias Meisen *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 25 February 2021 / Revised: 17 March 2021 / Accepted: 20 March 2021 / Published: 24 March 2021
(This article belongs to the Section Information Systems)

Round 1

Reviewer 1 Report

The article has been greatly improved. The state-of-the-art section is well elaborated, and the concepts are adequately clarified. The use case is also satisfactorily discussed.

Some corrections are still necessary:

  • P. 2: The two advantages of the new method seem too general and thus do not demonstrate the originality of the approach.
  • Some formal improvements are needed (e.g., "trying to parsing", p. 7).
  • The figures, especially Figure 3, are not easily legible.

Author Response

Reviewer: P. 2: The two advantages of the new method seem too general and thus do not demonstrate the originality of the approach.

Reply: Line 53. We adjusted the advantages to be more specific about the difference to existing approaches.

Reviewer: Some formal improvements are needed (e.g., "trying to parsing", p. 7).

Reply: We checked the document and made formal improvements.

Reviewer: The figures, especially Figure 3, are not easily legible.

Reply: We edited all the graphics (e.g. by increasing image size or font size) to be more legible. 

Reviewer 2 Report

Dear authors, I do not see real progress in the paper. The paper is still very generic, without a concrete description of the algorithms, etc. Also, the review of the literature is not satisfactory. All my previous concerns remain valid for this version.

Author Response

This opinion contrasts with those of the other reviewers. We extensively revised our paper based on the previous reviews. Unfortunately, the report does not address these changes and mentions no further specific points for improvement.

Reviewer 3 Report

The paper presents a conceptual system design implementing a machine learning based guidance system able to recommend analysis operations based on recorded user interactions and analysis context. The usefulness of the approach has been demonstrated by applying the implemented prototype to an exemplary use case.

The paper is well written and has enough details and figures to make it easy to read. In the following, I provide my comments on the paper.

Introduction: The main goals of the paper are clearly explained. However, other details about the evaluation could improve this section.

Section 2 clearly describes the research questions underlying the study and the description of the problems.

Section 3: Line 185. The authors could also discuss the user intent understanding approach presented in https://doi.org/10.1016/j.jvlc.2015.10.022 to

Section 4: Line 279. Further details are required for the threshold. What kind of threshold are the authors referring to? Does the tool use this threshold? The authors introduce this threshold only once without describing which type of threshold it represents and if the proposed methodology uses it.

Section 5: The proposal focuses on a visual tool in the context of exploratory data analysis (EDA), which also seems to be related to the data profiling research area. In that area, domain experts are developing new visual tools capable of extracting knowledge from data by exploiting the metadata that holds for the data. I suggest that the authors include recent work in this area in order to enrich the research opportunities (how to extend the proposed approach to data profiling tasks). Some references in this area are:

Abedjan, Z., et al. Profiling relational data: A survey. The VLDB Journal 24(4), 557–581 (2015)

Caruccio et al. "Mining relaxed functional dependencies from data." Data Mining and Knowledge Discovery

Po, L., et al. "Linked Data Visualization: Techniques, Tools, and Big Data." Synthesis Lectures on Semantic Web: Theory and Technology 10.1 (2020)

Polese et al. Visualization of (multimedia) dependencies from big data. In Multimedia Tools and Applications

Zernichow, B. et al. A Visual Data Profiling Tool for Data Preparation. In DATA ANALYTICS: International Conference on Data Analytics, Barcelona, Spain.

Minor

- "Insight actions are actions that are used to indicate insight" This sentence shows a redundant concept. It is clear that "Insight actions" are actions that indicate insights.

- The representations of the time-series in Figure 6 are very simple. In this scenario, it would be useful to include more advanced charts on which you can pin bookmarks, highlight sections and customize individual parts (e.g., Iguanachart).

- "Those interactions are interesting for a graphical analysis of the interactions because they give" -> "Those interactions are interesting for a graphical analysis because they give"

Author Response

Reviewer: Introduction: The main goals of the paper are clearly explained. However, other details about the evaluation could improve this section.

Reply: Line 59. We added details about the evaluation to the introduction.

Reviewer: Section 3: Line 185. The authors could also discuss the user intent understanding approach presented in https://doi.org/10.1016/j.jvlc.2015.10.022 to

Reply: Line 200. That is a good point. We added the suggested paper to the section.

Reviewer: Section 4: Line 279. Further details are required for the threshold. What kind of threshold are the authors referring to? Does the tool use this threshold? The authors introduce this threshold only once without describing which type of threshold it represents and if the proposed methodology uses it.

Reply: Line 282. We agree that this section was a bit vague. We revised the section to be more precise about identification of similar domains and the threshold.

Reviewer: Section 5: The proposal focuses on a visual tool in the context of exploratory data analysis (EDA), which also seems to be related to the data profiling research area. In that area, domain experts are developing new visual tools capable of extracting knowledge from data by exploiting the metadata that holds for the data. I suggest that the authors include recent work in this area in order to enrich the research opportunities (how to extend the proposed approach to data profiling tasks). Some references in this area are:

Abedjan, Z., et al. Profiling relational data: A survey. The VLDB Journal 24(4), 557–581 (2015)

Caruccio et al. "Mining relaxed functional dependencies from data." Data Mining and Knowledge Discovery

Po, L., et al. "Linked Data Visualization: Techniques, Tools, and Big Data." Synthesis Lectures on Semantic Web: Theory and Technology 10.1 (2020)

Polese et al. Visualization of (multimedia) dependencies from big data. In Multimedia Tools and Applications

Zernichow, B. et al. A Visual Data Profiling Tool for Data Preparation. In DATA ANALYTICS: International Conference on Data Analytics, Barcelona, Spain.

Reply: Line 464. Thank you for the helpful suggestion and references. We added a paragraph about the potential of data profiling to the discussion section and will look into this topic in our future work.

Reviewer:  - "Insight actions are actions that are used to indicate insight" This sentence shows a redundant concept. It is clear that "Insight actions" are actions that indicate insights.

Reply: Line 250. We edited the sentence.

Reviewer:  - The representations of the time-series in Figure 6 are very simple. In this scenario, it would be useful to include more advanced charts on which you can pin bookmarks, highlight sections and customize individual parts (e.g., Iguanachart).

Reply: We agree that the visualization is simple. We are working on extending our initial prototype with more advanced visualizations which we will present in future publications.

Reviewer:  - "Those interactions are interesting for a graphical analysis of the interactions because they give" -> "Those interactions are interesting for a graphical analysis because they give"

Reply: Line 244. We edited the sentence.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The paper describes the use of visual analysis for exploratory data analysis. My first comment concerns whether visual analysis is a method for data analytics or is better suited to domain experts. In my view, this is a tool that has to be used mainly by domain experts. (I acknowledge that this may be only my personal opinion.)

My biggest concern is that all the technology used is described at a very generic level, without sufficient technical detail. The use of machine learning, semantic description, etc. is mentioned, but without any technical details. The description of the visualization and exploratory work is also lacking in detail.

Reviewer 2 Report

The paper presents a conceptual design of a system able to extract and utilize domain expertise for guiding the steps performed during visual exploratory data analysis. The presented system design makes use of analytic provenance and machine learning, it has been implemented as a prototype to test its feasibility and it has been shown on an exemplary use case.

The paper is very well written and details the main modules of the system design as well as the techniques employed in the presented methodology. I therefore recommend accepting the paper for publication with minor revisions. Furthermore, even though the presented work has been demonstrated on a specific use case, I found it general enough to be applied to other domains and applications, thus encouraging software development teams to carefully design and implement a system to guide visual exploratory data analysis.

I recommend improving Section 2 “Related Work” by highlighting what your system does with respect to the related work presented and how you expect it to complement or enhance the current tools. I suggest considering moving some paragraphs from the “Research Opportunity” section here. Finally, I invite you to add a table summarizing the main features of each tool compared to your approach.

I kindly suggest considering removing the “Research Opportunities” section, moving some of its content to the “Related Work” section (as stated above) and other appropriate content to the “Conclusion and Future Work” section.

Minor comment:

  • Page 4, lines 160-161 I would add the following sentence:
    "The data is stored in graph-based representation that describes view states and interactions which perform transitions between views (see Table 1 for an example of such kind of interactions and Figure 3 for an example of a graph-based representation.)."

Reviewer 3 Report

The paper is about a new approach to exploratory data analysis that intends to ease the integration of domain knowledge. The major concern with this contribution is its rather poor level of specific information. The discussion of related work lacks a coherent structure and a critical perspective. Throughout the article, references are made to several concepts that are described in very general terms, such as “analytics provenance tries … meaningful way” (lines 21-23). As a result, the article fails to show concrete progress in the context of the proposed research question and to demonstrate the originality of its contribution with respect to the state of the art. It does not present any concrete methodology that could be easily reproduced by others and could demonstrate a significant impact on the community.


From a formal point of view, many sentences are unclear or need formal corrections. For instance: (a) lines 8 to 13 in the abstract are obviously confused; (b) the introduction seems to provide only very general information; (c) the assumption in lines 44-46 looks quite straightforward; (d) the notion of “amount of insight” in line 51 mixes quantitative and qualitative assessments; (e) lines 316-318 need major rewriting.

For all these reasons, the paper does not provide clear and elaborated content ready for publication in a scientific journal.
