We, the authors, wish to make the following corrections to our paper [1]. We found that, in our experiment, the attention mechanism applied in our proposed models ASLM-LHS and ASLM-AP mistakenly utilized the user's future events, i.e., the ground-truth event and the events the user would interact with after it, as prior knowledge to predict the user's next event. This issue resulted from the overlap between the input data fed to the encoder and the ground-truth events used as labels, both of which were fed into the models based on Section 3.1. In an RNN architecture without the attention mechanism, the future events are not used. In our proposed model with the attention-based encoder–decoder architecture, however, the encoder first processes all the events in the input data to extract the user's sequential patterns and then generates the final output vector, which contains the context information from all events. Afterwards, the attention mechanism uses the entire information provided by this output vector to calculate the relevance scores, and therefore the information from future events is included during the calculation. This issue led to incorrect experimental results. We resolved it by providing the models only with the events with which the user interacted before the ground-truth event as the input data, and using the subsequent ground-truth event as the label to predict. We conducted the experiment again and would like to correct the values in Tables 5–9, Figures 8 and 9, and their corresponding sentences in the main text. This correction does not affect the scientific conclusions.
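As a minimal sketch of the corrected setup (the function and variable names here are hypothetical, not from the original paper), each training pair can be built so that the encoder only ever sees the events that strictly precede the ground-truth event:

```python
def make_causal_pairs(session):
    """Build (input_prefix, label) pairs with no future-event leakage:
    for each position t, the model input is the events strictly before t,
    and the label is the event at position t."""
    return [(session[:t], session[t]) for t in range(1, len(session))]

# Example session of four events:
pairs = make_causal_pairs(["e1", "e2", "e3", "e4"])
# pairs == [(["e1"], "e2"), (["e1", "e2"], "e3"), (["e1", "e2", "e3"], "e4")]
```

Under this construction, no prefix ever contains its own label or any later event, which is exactly the leakage the correction removes.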
2. Changes in the Statements Corresponding to the Tables
The statements of Section 5.1 corresponding to these tables would be changed to:
(The statements of Table 1, Table 2 and Table 3 remain the same as those in the original paper.) For all datasets, ASLM-LHS outperforms all baselines with moderate margins.
Table 4 shows the validation set performances of three different methods under the metrics of Recall@K and MRR@K on the Reddit and Last.fm datasets. Please note that, for the Reddit dataset, the validation set performance of ASLM-LHS is better than that of ASLM-AP, and, for the Reddit and Last.fm datasets, the validation set performances of ASLM-AP and ASLM-LHS are both better than the corresponding test set performances by moderate margins.
Table 5 shows the validation set performances of three different methods in 1/10 Reddit and 1/80 Last.fm datasets. The validation set performances of the two ASLM models in the 1/10 Reddit dataset are all better than the test set performances, while, in the 1/80 Last.fm dataset, the majority of the validation set performances of the two ASLM models are better than the test set performances.
The statements of Section 6.1 corresponding to these tables would be changed to:
For Group A, in Table 1, we observe that ASLM-LHS consistently outperforms all the baselines under all measurements for testing cases in both the Reddit and Last.fm datasets with moderate margins. In most cases, ASLM-AP also outperforms all the baselines in both datasets, except in the Recall@10 and Recall@20 scores on the Reddit test set. Specifically, ASLM-LHS improves the Recall@5 and MRR@5 scores by 9.12% and 22.10%, respectively, compared with the II-RNN-LHS method for test cases in the Reddit dataset. ASLM-AP improves Recall@5 and MRR@5 by 4.36% and 7.33%, respectively, compared with the II-RNN-AP method for test cases in the Last.fm dataset. The reason ASLM-AP's Recall@10 and Recall@20 scores are lower than those of II-RNN-LHS in the Reddit dataset is that the average pooling method only uses the average of the event embeddings as the representation of the current session, and the important context information, e.g., the sequential patterns and the user's intent captured by the attention mechanism, is lost when this representation is stored in the user's long-term session representations. As a result, the abundant context information cannot be fully utilized by ASLM-AP, and its performance declines. However, the performance of ASLM-AP is still better than that of II-RNN-AP, which also uses the average pooling method. The good performance of ASLM-LHS is attributed to the attention mechanism and the bidirectional LSTM we employed in the attention-based layer, together with the last hidden state method. One of the main characteristics of the attention mechanism is that it calculates the importance of each given event and thereby captures the user's short-term intent. The bidirectional LSTM can extract the sequential patterns from the given events in both the forward and the backward directions. Compared with the average pooling method, the last hidden state method used by both ASLM-LHS and II-RNN-LHS retains the context information from the current session and stores it as one part of the user's long-term session representations. Therefore, the model's subsequent training process benefits from this method by receiving valuable context information from history sessions. In addition, in Table 1 in the original paper, we observe that, in the Reddit dataset, the number of sessions per user (namely, 62.15) and the average number of events in a session (namely, 3.00) are much smaller than those in the Last.fm dataset (namely, 645.62 and 8.10), which shows that ASLM-LHS can perform better when the user's history information is less adequate.
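For intuition, the importance weighting described above can be sketched as plain dot-product attention (a generic formulation for illustration only; the paper's exact scoring function may differ):

```python
import math

def attention_weights(query, keys):
    """Softmax-normalized relevance of each event representation (key)
    to the current short-term intent (query), via dot-product scoring."""
    logits = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# An event aligned with the query receives the largest weight:
w = attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
# w[0] > w[1] > w[2], and the weights sum to 1 (up to floating point)
```

This is what lets the model score each event's contribution to the short-term intent instead of treating all events uniformly, as average pooling does.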
Table 1 and Table 4 show that the performance of ASLM for validation cases and that for testing cases in the Reddit and Last.fm datasets are at the same level, which assures us of the validity of our model.
For Group B, similar results can be seen in Table 2 and Table 3: ASLM-LHS and ASLM-AP outperform all the baselines under all measurements in the 1/10 Reddit, 1/80 Last.fm, and Tmall datasets. For the Tmall dataset, the evaluation result of ASLM-LHS shown in Table 3 improves the R@10, R@20, and MRR@20 scores by 70.86%, 49.06%, and 114.50%, respectively, compared with the SWIWO-I method. Please note that, for the Tmall dataset, as shown in Table 1 in the original paper, there are scarcely enough sessions for each user (namely, 3.99 per user), and the average session length (namely, 2.99) is short compared with those in Reddit and Last.fm. For 1/10 Reddit and 1/80 Last.fm, since the user history information is reduced dramatically compared with the full Reddit and Last.fm datasets, most of the models' performances decrease significantly accordingly. However, both ASLM models still outperform the strongest baselines, as shown in Table 2, which demonstrates that ASLM can reach better performance even when it is severely short of both long-term and short-term user behavior.
When K changes from 5 to 20, for the Reddit and 1/10 Reddit datasets, there is a downward trend in the relative scores of ASLM-LHS and ASLM-AP. For the Last.fm dataset, there is an upward trend in those of ASLM-LHS. The reason for the upward trend is that there is more abundant user history information in the Last.fm dataset, and, with the increase of K, ASLM-LHS still has the potential to further capture the context information implied by the Last.fm dataset, which results in some space to improve its performance.
In terms of the two versions of ASLM (ASLM-AP using average pooling and ASLM-LHS using the last hidden state), Table 1, Table 2 and Table 3 show that, for the Reddit dataset, the test performance of ASLM-LHS has an obvious advantage over that of ASLM-AP, while the two are at the same level for the other four datasets. As mentioned above, the model's subsequent training process benefits from the last hidden state method by receiving valuable context information from history sessions.
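The difference between the two session-representation strategies can be sketched as follows (a simplified illustration; in the actual models the inputs are encoder hidden states rather than raw embeddings):

```python
def average_pooling(vectors):
    """ASLM-AP style: the session representation is the element-wise mean
    of the event vectors, so ordering information is discarded."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def last_hidden_state(hidden_states):
    """ASLM-LHS style: the session representation is the encoder's final
    hidden state, which has already absorbed the session's sequential context."""
    return hidden_states[-1]

# Average pooling gives the same result for any ordering of the events,
# which is precisely the context loss discussed above:
assert average_pooling([[1.0, 2.0], [3.0, 4.0]]) == average_pooling([[3.0, 4.0], [1.0, 2.0]])
```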
3. Changes in Figures
Figures 8 and 9 in the original paper would be changed to Figure 1 and Figure 2 in this correction, respectively.
4. Changes in the Statements Corresponding to the Figures
The statements of Section 6.2 corresponding to these figures would be changed to:
In addition to the overall recommendations we reported in Sections 5 and 6.1 in the original paper, we evaluate the effectiveness of ASLM-LHS on recommendations for the first n time steps, for n = 1, …, 5, L, where L is the maximum session length (L = 19 in our experiment).
Figure 1 shows the comparison of Recall@5 performances at the first n recommendations of each session between ASLM and II-RNN for the Reddit and Last.fm datasets.
Notice that, for Reddit and Last.fm, where abundant user history behavior is given, our attention-based ASLM-LHS model achieves and when , respectively, and, under most circumstances, each score of the ASLM-LHS model increases throughout the session. ASLM-LHS has an advantage over II-RNN-LHS and II-RNN-AP at the start of a new session in the Reddit and Last.fm datasets, with improvements of 7.00% and 3.12%, respectively. When more events have been observed in the current session, the advantage becomes more obvious for the Reddit dataset, while, for the Last.fm dataset, since the performances of ASLM-LHS and II-RNN-AP are identical when , the advantage grows as n increases from 3.
Similar results are shown in Figure 2 for the Tmall dataset. In terms of the Tmall dataset, which is severely short of both long-term and short-term user behavior, ASLM-LHS still provides 0.5063 in Recall@5 with just a slight difference compared with the overall Recall@5 (i.e., 0.5074). Due to the lack of history information for each user in the Tmall dataset, its performance is not stable at the start of a session when n is between 1 and 3; however, it gets better as n increases from 3. As a result, the attention-based short-term intent layer can thoroughly reveal the variation in the user's short-term intent between events in the current session. To sum up, ASLM significantly alleviates the cold-start problem.
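The per-step evaluation described above can be sketched as follows (a hypothetical helper for illustration, not the paper's evaluation code): Recall@K is computed only over the first n recommendation steps of each session.

```python
def recall_at_k_first_n(topk_lists, truths, k, first_n=None):
    """topk_lists[s][t] is the ranked recommendation list at step t of
    session s; truths[s][t] is the ground-truth event at that step.
    Counts a hit when the truth appears in the top-k, optionally
    restricted to the first n steps of each session."""
    hits = total = 0
    for ranked_steps, truth_steps in zip(topk_lists, truths):
        steps = list(zip(ranked_steps, truth_steps))
        if first_n is not None:
            steps = steps[:first_n]           # keep only the first n steps
        for ranked, truth in steps:
            hits += truth in ranked[:k]
            total += 1
    return hits / total if total else 0.0
```

Sweeping `first_n` from 1 to L while holding K fixed reproduces the kind of per-step curves plotted in Figures 1 and 2.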
The authors would like to apologize for any inconvenience caused. The change does not affect the scientific conclusions. The manuscript will be updated, and the original version will remain online on the article webpage.
5. Change in the Description of the Last.fm Dataset
Since the link to the Last.fm dataset described in Section 4.1 in the original paper is no longer valid, the corresponding statement of Section 4.1 would be changed to: Last.fm dataset: https://blog.csdn.net/hopygreat/article/details/96444827.
Author Contributions
Conceptualization, R.H., S.M., M.S. and H.E.; Methodology, R.H.; Software, R.H.; Validation, R.H., S.M., M.S., H.E. and Z.O.; Formal Analysis, R.H.; Investigation, R.H.; Resources, R.H., S.M., M.S. and H.E.; Data Curation, R.H.; Writing—Original Draft Preparation, R.H.; Writing—Review & Editing, R.H., S.M., M.S., H.E. and Z.O.; Visualization, R.H.; Supervision, S.M., M.S., H.E. and Z.O.; Project Administration, Z.O.; Funding Acquisition, Z.O. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by National Natural Science Foundation of China (Grant No. 61702046).
Data Availability Statement
Conflicts of Interest
The authors declare no conflict of interest.
Reference
- Huang, R.; McIntyre, S.; Song, M.; E, H.; Ou, Z. An Attention-Based Recommender System to Predict Contextual Intent Based on Choice Histories across and within Sessions. Appl. Sci. 2018, 8, 2426.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).