Article
Peer-Review Record

Modeling and Validating a News Recommender Algorithm in a Mainstream Medium-Sized News Organization: An Experimental Approach

Future Internet 2022, 14(10), 284; https://doi.org/10.3390/fi14100284
by Paschalia (Lia) Spyridou *, Constantinos Djouvas and Dimitra Milioni
Reviewer 1:
Reviewer 2:
Submission received: 8 August 2022 / Revised: 16 September 2022 / Accepted: 26 September 2022 / Published: 29 September 2022
(This article belongs to the Special Issue Theory and Applications of Web 3.0 in the Media Sector)

Round 1

Reviewer 1 Report

The paper is interesting and an excellent match for the special issue and, more broadly, for the journal's research focus. The use of language is good. However, there are some drawbacks that should be addressed.

- In some cases there is no consistency in titles/names, and/or some terminology is confusing. For example, the term "environment", which appears in lines 407, 423, 459, and elsewhere, is misleading. A term like "news collection", "view", or "screen" would be more appropriate. The frontpage of the news portal being evaluated is sometimes referred to as "editor's agenda" (most frequently), as "frontpage/editorial agenda" (line 419), as "frontpage" (lines 439, 443), or simply as "agenda" (Figure 7).

- On the other hand, there are many wordy descriptions, and there is no information about important research elements, such as the news recommendation algorithm itself. It would help to provide some technical details for the algorithm from the publication/documentation of the stated research project (incorporating the relevant references as well). If this is not possible, try to provide a functional description of the algorithm from the perspective of a user operating the recommendation system.

- One more important issue is the presentation of the results. For example, the average reader cannot understand the results presented in Figure 8. Is something wrong, or has the reviewer missed something? For the "Public Health", "Science & Technology", and "Greece" categories, did the system recommend no articles at all? It would be very helpful to clarify how many users were involved in each category (and overall in the experiment) and how many articles were involved in these categories. There is also room for clarification concerning the "User type" and "myNews Type" labels (either by updating these labels or by enhancing the descriptions in the caption and the associated text). Especially for "myNews Type", given that the recommender produces a collection of news items, it is important to know how the articles were collected and assigned to each class. Likewise, I am not accustomed to the use of confusion matrices with frequency values. What does the frequency term stand for? (Please elaborate on this in the revised paper.) Again, a clarification would help to maintain proper use of terms.

- Overall, I would expect some justification for the use of the selected evaluation metrics. For instance, I can understand that the F1-score was preferred (e.g., over accuracy) to avoid bias with regard to the sample distribution across classes. It would be useful to provide such information for all evaluation measures. Likewise, the Euclidean distance measure provides useful insights, but we need to know the exact configuration of the estimations. If no normalization was used, related metrics such as MSE or MAE would make good supplements.
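The two measurement concerns raised above (count-valued confusion matrices, and the choice of F1-score over accuracy under imbalanced classes) can be illustrated with a small sketch. All category labels, counts, and prediction vectors below are hypothetical and serve only to make the reviewer's point concrete; they are not drawn from the paper under review.

```python
import math
from collections import Counter

def confusion_counts(y_true, y_pred, labels):
    """Count-based confusion matrix: rows = true category, columns =
    predicted category. Each cell holds a raw frequency (number of
    articles), not a rate -- one plausible reading of "frequency"."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in labels] for t in labels]

# Hypothetical category assignments, for illustration only.
labels = ["Politics", "Sports", "Public Health"]
y_true = ["Politics", "Politics", "Sports", "Public Health", "Sports"]
y_pred = ["Politics", "Sports", "Sports", "Politics", "Sports"]
matrix = confusion_counts(y_true, y_pred, labels)
# Row sums expose the per-category sample sizes the review asks for.
row_totals = [sum(row) for row in matrix]

# Why F1 over accuracy under class imbalance: 90 negatives, 10 positives,
# and a classifier that finds only 2 of the positives. Accuracy stays
# high while F1 collapses.
tp, fp, fn, tn = 2, 2, 8, 88
accuracy = (tp + tn) / (tp + fp + fn + tn)            # 0.90
precision, recall = tp / (tp + fp), tp / (tp + fn)    # 0.50, 0.20
f1 = 2 * precision * recall / (precision + recall)    # ~0.29

# Unnormalized Euclidean distance is scale-sensitive, which is why MAE
# and MSE make useful supplements when no normalization is applied.
pred = [0.2, 0.8, 0.5]
true = [0.0, 1.0, 0.5]
euclid = math.dist(pred, true)
mae = sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)
mse = sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)
```

If the cells of Figure 8 are raw counts of this kind, stating so (and reporting the row totals) would resolve the ambiguity the review describes.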

Minor points

- The text is somewhat wordy without being specific to the current research. For example, Sections 1 to 3 could be 3-4 pages long instead of 6. More importantly, I feel that some descriptions are redundant and confuse or tire the reader. For example, lines 421 to 435 can be omitted.


Author Response

Please see attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors present an evaluation strategy for News Recommender Systems and implement it in a real-world case of a news organization, via a dedicated plugin that gathers information from collaborative and ordinary users. The topic of the research is interesting and relevant to the special issue, as it strengthens the journalistic aspects of applying novel algorithmic applications in news reporting through an interdisciplinary approach.

I have only a few comments and suggestions regarding the presentation and justification of the research.

The literature review is very informative and comprehensive, covering different aspects of automating editorial decisions and personalizing news distribution. One thing that could be further clarified is the description of algorithmic processes. A generic definition of algorithms is given in l.156, while big data and machine learning are referred to but not extensively introduced. Since algorithmic decision-making is at the core of the presented research, I propose drawing a distinction between rule-based and data-driven decision making. In rule-based systems, the automated process is carefully designed, so the automated decisions directly reflect the choices and criteria set by the creators. In data-driven systems, algorithmic bias is less direct, as it is caused by the attributes of the available data used to build the decision-making model through an algorithmic process. So, the way in which design choices introduce bias into the automated process and reflect a human-curated agenda differs depending on the system architecture. In my opinion, there should be some clarification regarding these issues.
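The rule-based versus data-driven distinction can be sketched in a few lines. Both functions below are hypothetical toy recommenders invented for this comparison; neither reflects the actual NRS under review. In the rule-based variant, the weights are explicit design choices; in the data-driven variant, an analogous weight is estimated from interaction data, so bias enters through the data rather than the rules.

```python
# Rule-based: the ranking criteria are explicit design choices, so any
# bias is directly traceable to the rules the creators wrote.
def rule_based_score(article):
    score = article["recency"] * 2.0        # designers chose to weight recency
    if article["category"] == "Politics":   # and to boost one category
        score += 1.0
    return score

# Data-driven: the same kind of category weight is instead estimated from
# user interaction data (here, click-through rates), so any bias reflects
# the attributes of the collected data.
def fit_category_weights(click_log):
    """Estimate per-category weights as observed click-through rates."""
    shown, clicked = {}, {}
    for category, was_clicked in click_log:
        shown[category] = shown.get(category, 0) + 1
        clicked[category] = clicked.get(category, 0) + int(was_clicked)
    return {c: clicked[c] / shown[c] for c in shown}

log = [("Politics", True), ("Politics", False), ("Sports", True)]
weights = fit_category_weights(log)
```

In the first case an audit reads the rules; in the second it must examine the training data, which is the architectural difference the distinction above turns on.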

Concerning the NRS under test: I understand that the authors treat it as a black box in order to examine its behavior. Given that its operating principles can be quite complex, as described in the problem definition, having some knowledge of its design could provide further insight. Since it is the deliverable of a research program, as mentioned, is it possible that there is some documentation? This knowledge could be combined with the research results of this paper to provide more robust conclusions. For example, using a dedicated browser means that browsing user preferences are excluded (in case the algorithm takes them into consideration), which can make it difficult to reverse-engineer the algorithmic design.

Finally, there seems to be a strong emphasis on the importance of the news category. I understand from the introduction that this is an important, but not the sole, criterion for personalization. Is it possible that the algorithm also applies a weighting scheme to the articles, meaning that some articles are ranked as being of greater importance and can appear in the My News area even though they do not appear relevant in terms of user behavior and category?

In Figures 9 and 10, it is unclear to me what the values of the x-axis (session) refer to exactly.

I suppose Section 4.1 is not supposed to be in bold.

Overall, I believe that it is an interesting and well presented work that could be improved with further clarifications.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have made an effort to address the review remarks, and the manuscript has improved considerably. However, there are still some issues that should be addressed. Most notably, the metrics related to Figure 9 and the presentation of the results are not clear. More specifically:

- The F1-score, precision, recall, and accuracy are inconsistent with the corresponding matrix.
- It is unclear how the matrix presented in Figure 9 was composed from the data of Figure 8.

In addition, the term "SigmaLive" first occurs in line 514. What is "SigmaLive"?

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 3

Reviewer 1 Report

The paper has again improved a lot. One remaining concern is Figure 9 and the corresponding metrics, which may confuse the reader. True Negatives sound somewhat strange to me in this context, as does the huge gap between accuracy and F1-score. Personally, I would keep only accuracy and remove Figure 9.

It is mainly subjective, so I choose "accept in present form".
