A Book-Influence-Evaluation Method Based on User Ratings of E-Commerce Platform

Lu, Junwen; Zhan, Xinrong; Zhan, Xintao; Shi, Lihui

doi:10.3390/electronics11244198


by Junwen Lu, Xinrong Zhan *, Xintao Zhan and Lihui Shi

School of Computer and Information and Engineering, Xiamen University of Technology, Xiamen 361024, China

* Author to whom correspondence should be addressed.

Electronics 2022, 11(24), 4198; https://doi.org/10.3390/electronics11244198

Submission received: 21 November 2022 / Revised: 10 December 2022 / Accepted: 12 December 2022 / Published: 15 December 2022

(This article belongs to the Section Networks)


Abstract:

In online social networks, finding high-influence nodes is a crucial component of complex network research. This research proposes a new book influence evaluation method based on user ratings for the social network created by buying and selling behaviors on an e-commerce platform. It ranks the book nodes according to the feedback that customers leave after their purchases. The method calculates the influence score of a book by predicting its popularity from user evaluations of the book. To verify the validity and accuracy of the method, the research analyzes a real review dataset from Amazon, a large e-commerce platform, designs two comparison experiments with different time spans, and compares the method with five other network analysis metrics. The experimental findings show that the method is efficient and precise in evaluating the influence of book nodes.

Keywords:

node importance; complex networks; electronic commerce; product reviews

1. Introduction

The e-commerce sector continues to grow as Internet technology advances, particularly as infrastructure such as 5G, mobile payment, and smartphones gradually matures. Online social networks have gained considerable traction in recent years and have become a vital component of many applications, including e-commerce platforms [1]. On e-commerce platforms built on online social networks [2], such as Amazon and eBay, users can rate and comment on the items they have purchased.

Both researchers and policymakers need to rely on publicly available information to study the e-commerce platform market. However, many studies on e-commerce platforms are still hampered by the issue of data monopoly [3]. Despite widespread interest in e-commerce platforms, the companies that operate them retain control of their data and are frequently opaque and hesitant to share it with researchers. As a result, researchers are frequently limited to conducting experiments based on publicly available data.

Product prices are the most obvious public information. For instance, Chevalier and Goolsbee [4] studied the price sensitivity of online consumers. Baye et al. [5] revealed the relationship between online search cost and price. Hollenbeck et al. [6] studied how prices change with a dramatic increase in consumer information provided by online reputation mechanisms. One significant flaw in these studies is that, while price data are abundant and readily available, there are no corresponding quantity data. We use Amazon as an example to demonstrate why this happens. Amazon, the world’s largest e-commerce platform, does not directly display product sales volumes. To give new products more room to grow and to make users pay more attention to product reviews, Amazon weakens the influence of sales volume on purchase decisions by showing users a real-time “best seller rank” [7] instead. This rank is a key indicator of how well a product sells, and it stands in for the number of units sold in a given period. However, acquiring this sales ranking requires constantly tracking data changes on the Amazon platform, which is difficult and has a long acquisition cycle [3]. On the other hand, review datasets from various e-commerce platforms are rather simple to obtain. We therefore consider how to evaluate the extremely large-scale review data of large e-commerce platforms with the lowest possible time complexity in order to find high-impact products.

The following is a summary of the important contributions of this paper.

We propose a new book influence evaluation method based on user ratings of the e-commerce platform (URBI), whose calculation cost is only O(n), where n is the total number of comments.
We ran the experiments on a dataset of reviews from Amazon, a real, large-scale e-commerce platform. The effectiveness of our URBI method was confirmed by designing two experiments with different time spans and comparing URBI with five other node influence assessment methods.

The rest of this article is structured as follows. Section 2 reviews related work. Section 3 describes the preliminaries. Section 4 presents our URBI approach. Section 5 examines the efficacy of the proposed approach through two experiments with different time spans and a comparison with five other network analysis measures. Section 6 concludes the article.

2. Related Work

Recent research has found that a variety of systems in the actual world take the shape of network structures, including relational networks in social systems [8,9], protein networks [10,11], and collaborative networks [12]. The identification of key network nodes is a critical topic. As a result, screening nodes with high influence in networks has become a focal point of attention, particularly on how to evaluate the importance of nodes in complex networks [13].

Network structure and operation are more affected by key nodes than by other nodes. Many fields [14] can benefit from key node identification, including community structure networks [15,16], disease [17], social network service [18,19,20], resource allocation [21], biological information [22], and so on [23]. For instance, Wang et al. [24] presented a new approach to extract community structure from the network by employing co-inversion of the original community whose degree of adjacent vertices is less than its degree. They believe that some significant nodes play a central role in a community. Cui et al. [25] discovered that different communities overlap in some real-world situations and proposed an ACC algorithm to detect overlapping community structures in complex networks based on the clustering coefficients of two adjacent maximal subgraphs. Li et al. [26] proposed a new method for detecting overlapping communities in a weighted network by using seed communities. Through simulation experiment results, Nian et al. [27] demonstrated that node activity will also affect immunity, with the strongest immunological effect depending on node activity.

The relevance of nodes in complex networks can be assessed using a variety of techniques, such as degree centrality (DC) [28], eigenvector centrality (EC) [29], closeness centrality (CC) [30], betweenness centrality (BC) [31], PageRank (PC) [32], H-index [33], and so on [34].

In network analysis, DC is the most direct measure of node centrality. The higher the degree of a node, the more important it is in the network. According to EC, the importance of a node is determined by the number and importance of its neighbors. CC measures how close a node is to other nodes in the network. CC is extremely sensitive to the network structure, and even minor changes will cause the node order to change. BC uses the number of shortest paths through a node to calculate the importance of that node. Both CC and BC rely on shortest paths to determine the importance of a node. PC evaluates the impact of nodes by iterating over information about neighbors, and it is stable on scale-free networks but very sensitive to random networks.

In summary, the aforementioned approaches ignore the information that the nodes themselves carry when assessing the importance of nodes in complex networks. In particular, they do not consider the users’ assessments of a book when evaluating the relevance of book nodes in the user–book heterogeneous graph network created from the review dataset of an e-commerce platform. Consequently, there is room for improvement in the outcomes provided by the current approaches.

3. Preliminary

Let G = (V, E) be a graph, where V is the set of nodes and E is the set of edges. In this section, we review degree centrality (DC), eigenvector centrality (EC), closeness centrality (CC), betweenness centrality (BC), and PageRank (PC).

3.1. Degree Centrality (DC)

DC is the most direct measure of node centrality in network analysis. The higher the degree of a node, the higher its degree of centrality, and the more important the node in the network. The following is the definition of degree centrality DC:

DC(i) = k_{i} = \sum_{j} α_{ij}    (1)

In this equation, k_{i} stands for the degree of node i, α_{ij} indicates the link from node i to node j, and DC(i) is the centrality score of node i.
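As a minimal sketch of Equation (1), the snippet below counts the neighbors of every node in a small undirected edge list. The function name and the toy user and book identifiers are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return DC(i) = k_i, the number of distinct neighbors of each node."""
    neighbors = defaultdict(set)
    for u, v in edges:
        # Undirected link: each endpoint gains the other as a neighbor.
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {node: len(adj) for node, adj in neighbors.items()}


# Hypothetical toy edges between users (u*) and books (b*).
toy_edges = [("u1", "b1"), ("u2", "b1"), ("u3", "b1"), ("u1", "b2")]
dc = degree_centrality(toy_edges)
```

In a user–book network built this way, the degree of a book node is simply the number of distinct users who reviewed it.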

3.2. Eigenvector Centrality (EC)

The basic idea behind EC is that the relevance of a node is determined by both the number of its neighbors and the importance of its neighbors. The centrality of one node is a function of the centrality of neighboring nodes. In other words, the more significant the neighbor nodes are, the more significant the current node is. Given the n × n adjacency matrix A, x_{j} denotes the j-th entry of the normalized eigenvector that corresponds to the maximum eigenvalue. EC is defined as follows:

EC(j) = \frac{1}{λ} \sum_{i = 1}^{|V|} (α_{ij} x_{i})    (2)

where λ is the maximum eigenvalue of matrix A, and EC(j) is the centrality score of node j.
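One common way to approximate the eigenvector in Equation (2) is power iteration. The sketch below is an illustration under stated assumptions, not the authors' implementation: it iterates on A + I rather than A (a shift that leaves the eigenvector unchanged but avoids oscillation on bipartite graphs), and the function name, the toy star graph, and the tolerance are hypothetical.

```python
import math


def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Power iteration on (A + I); adj maps node -> set of neighbors (undirected)."""
    nodes = sorted(adj)
    x = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # (A + I) x: a node's own value plus the values of its neighbors.
        nxt = {v: x[v] + sum(x[u] for u in adj[v]) for v in nodes}
        norm = math.sqrt(sum(val * val for val in nxt.values()))
        nxt = {v: val / norm for v, val in nxt.items()}
        if max(abs(nxt[v] - x[v]) for v in nodes) < tol:
            return nxt
        x = nxt
    return x


# Hypothetical star graph: center "c" linked to leaves "a", "b", "d".
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
ec = eigenvector_centrality(star)
```

For this star graph the largest eigenvalue of A is the square root of 3, and the center receives the largest score because every leaf points to it.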

3.3. Closeness Centrality (CC)

CC measures how close a node is to the other nodes in the network. The closer a node is to other nodes, the larger its closeness centrality. CC is used to discover nodes that can efficiently spread information through the graph. The closeness centrality algorithm first computes the shortest paths between all node pairs, then sums, for each node, its distances to the other nodes, and finally takes the reciprocal of the result to obtain the closeness centrality score of the node:

CC(j) = \sum_{i} {(\frac{α_{ji}}{|V| - 1})}^{2} (\frac{1}{d_{ji}})    (3)

where α_{ji} denotes the link between node j and node i, and d_{ji} represents the shortest distance between node j and node i. The centrality score of node j is represented by CC(j).

3.4. Betweenness Centrality (BC)

BC is an index that describes the importance of a node by the number of shortest paths that pass through it. The more shortest paths pass through a node, the greater its betweenness centrality:

BC(j) = \sum_{i, k \neq j} (\frac{d_{ik}(j)}{N_{ik}})    (4)

where d_{ik}(j) is the number of shortest paths from node i to node k that pass through node j, and N_{ik} is the total number of shortest paths from node i to node k. The centrality score of node j is given by BC(j).
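The following brute-force sketch illustrates Equation (4) on a small unweighted, undirected graph. It is an illustration only: it sums over unordered pairs (i, k), which is an assumption about how the pairs are counted, and it is far slower than the algorithms used in practice on large networks. All names and the toy graph are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Return (dist, sigma): shortest distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                # Every shortest path to v extends to a shortest path to w.
                sigma[w] += sigma[v]
    return dist, sigma


def betweenness_centrality(adj):
    """BC(j) = sum over unordered pairs (i, k), both != j, of d_ik(j) / N_ik."""
    info = {s: bfs_counts(adj, s) for s in adj}
    bc = {v: 0.0 for v in adj}
    for i, k in combinations(sorted(adj), 2):
        dist_i, sigma_i = info[i]
        if k not in dist_i:
            continue  # i and k are not connected
        for j in adj:
            if j in (i, k) or j not in dist_i:
                continue
            dist_j, sigma_j = info[j]
            # j lies on a shortest i-k path iff the distances add up exactly.
            if k in dist_j and dist_i[j] + dist_j[k] == dist_i[k]:
                bc[j] += sigma_i[j] * sigma_j[k] / sigma_i[k]
    return bc


# Hypothetical 4-cycle a-b-c-d-a: each opposite pair has two shortest paths.
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
bc = betweenness_centrality(cycle)
```

In the 4-cycle, each node carries one of the two shortest paths between its two neighbors, so every node receives the same score.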

3.5. PageRank (PC)

The basic idea of the PC algorithm is to define a random walk model on a directed graph, that is, a first-order Markov chain. It depicts the behavior of a random walker who visits each node along a directed graph at random. In the limit situation, the chance of accessing each node converges to the stationary distribution, and the stationary probability value of each node equals its value of PC, which represents the relevance of the node. PC is defined recursively, and it can be calculated using an iterative process:

PC(j)^{t} = \sum_{i = 1}^{|V|} (α_{ji} \frac{PC(i)^{t - 1}}{k_{i}})    (5)

where k_{i} is the out-degree of node i, α_{ji} denotes the link between node j and node i, PC(j)^{t} is the importance of node j at step t, and PC(j) denotes the centrality score of node j.
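The iteration in Equation (5) can be sketched as below. The `damping` parameter is an assumption that does not appear in the equation; with its default of 1.0 the update reduces to Equation (5) exactly. The function name and the toy graph are hypothetical, and the sketch does not redistribute the score of nodes without out-links.

```python
def pagerank(adj, damping=1.0, iterations=100):
    """Iterate PC(j)^t = sum_i a_ji * PC(i)^(t-1) / k_i; adj maps node -> out-neighbors."""
    nodes = list(adj)
    n = len(nodes)
    pc = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # With damping == 1.0 the teleport term is zero, as in Equation (5).
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for i in nodes:
            if not adj[i]:
                continue  # node without out-links; none occur in the toy graph
            share = damping * pc[i] / len(adj[i])
            for j in adj[i]:
                nxt[j] += share
        pc = nxt
    return pc


# Hypothetical undirected toy graph: triangle a-b-c plus a pendant node d on c.
toy = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
pc = pagerank(toy)
```

On this connected, non-bipartite undirected graph the undamped iteration converges to scores proportional to node degree (0.25, 0.25, 0.375, 0.125).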

4. Proposed Method

This study proposes a book influence evaluation method based on user ratings of an e-commerce platform (URBI) that estimates the influence ranking of each book on the platform. The goal is to uncover the book influence information hidden in e-commerce review datasets. A user–book heterogeneous graph network is built from the review dataset, and the relationship between user rating information and book influence is evaluated with very low time complexity.

STEP 1: Construct network.

Each review in the book review dataset of the e-commerce platform identifies the reviewing user and the reviewed book. A user–book heterogeneous graph network can therefore be built by traversing the review dataset. The user–book network is represented by the graph G = (V, E), where V is the set of nodes and E is the set of edges.

STEP 2: Tag nodes.

In the e-commerce platform, the quantity of user comments far outnumbers the number of books. Calculating the influence of all nodes in the user-book heterogeneous graph network will take a long time. As a result, we tagged the book node and the user node separately, and we only calculated the influence of book nodes.
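Steps 1 and 2 can be sketched as a single pass over the reviews. The record layout `(user_id, book_id, rating)` and all names below are hypothetical; the actual fields of a review dataset may differ.

```python
def build_user_book_network(reviews):
    """Build the user-book network and tag node types in one traversal.

    reviews: iterable of (user_id, book_id, rating) tuples (hypothetical schema).
    Returns (users, books, edges), where edges keeps the rating of each review.
    """
    users, books, edges = set(), set(), []
    for user_id, book_id, rating in reviews:
        users.add(user_id)   # tag as user node
        books.add(book_id)   # tag as book node
        edges.append((user_id, book_id, rating))
    return users, books, edges


# Hypothetical toy reviews.
toy_reviews = [
    ("u1", "b1", 5),
    ("u2", "b1", 4),
    ("u3", "b1", 5),
    ("u1", "b2", 1),
]
users, books, edges = build_user_book_network(toy_reviews)
```

Keeping the two node types in separate sets means later steps can iterate over book nodes only, which is the point of the tagging step.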

STEP 3: Calculate influence of nodes.

A user comment on a book on the e-commerce platform records both a purchase by the user and the user's appraisal of the book. However, we do not know how many copies the customer purchased, and many people do not leave a comment after purchasing a book. As a result, the number of reviews only partly reflects the impact of a book. To better quantify the influence of books without knowing the quantity purchased by each user, we use the user ratings of books to forecast the chance that users purchase the book again. On Amazon, for example, customers can give the book they purchased a rating between 1 and 5, from lowest to highest. We assign a rating coefficient to each rating that reflects the probability that a user who gave that rating will buy the book once more. The URBI measure is defined as follows:

URBI(j) = \sum_{i \in N_{j}} α_{ji} R_{ji}, j \in C    (6)

where C represents all book nodes, N_{j} represents all neighbor nodes of book node j, α_{ji} represents the connection between node j and node i, R_{ji} represents the rating coefficient of user i for book j, and URBI(j) represents the influence score of node j.
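A minimal sketch of Equation (6): each review adds the coefficient of its rating to the score of the reviewed book. The coefficient values used here are the ones reported for Exp-1 in Section 5.3.1; the review tuples, the function name, and the assumption that each (user, book) pair appears once are hypothetical.

```python
from collections import defaultdict


def urbi_scores(reviews, coefficients):
    """Sum the rating coefficient of every review per book (one pass over reviews)."""
    scores = defaultdict(float)
    for _user, book, rating in reviews:
        scores[book] += coefficients[rating]
    return dict(scores)


# Rating coefficients as set in Exp-1 (Section 5.3.1).
coefficients = {1: 0.0, 2: 0.0, 3: 0.1, 4: 0.1, 5: 0.8}

# Hypothetical toy reviews: (user, book, rating).
toy_reviews = [
    ("u1", "b1", 5),
    ("u2", "b1", 5),
    ("u3", "b1", 4),
    ("u1", "b2", 1),
    ("u4", "b2", 3),
]
scores = urbi_scores(toy_reviews, coefficients)
ranking = sorted(scores, key=scores.get, reverse=True)
```

The loop touches each review exactly once, which matches the linear cost in the number of comments that the method aims for.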

COMPLEXITY ANALYSIS

Assume the user–book network has n nodes, of which n1 are book nodes and n2 are user nodes, and m comment edges, and let k be the number of rating levels. The time complexity of the method is divided into three parts:

1. Mark the type of all nodes (book node or user node). The time complexity is O(n);
2. Count the number of each rating of each book node. The time complexity is O(km);
3. Calculate the influence of each book. The time complexity is O(n1).

As a result, the overall time complexity of URBI is O(n + n1 + km). In the real user–book network, every node is attached to at least one comment edge, so n is at most 2m; the number of book nodes is far smaller than the number of comments, and the number of rating levels is negligible compared with the number of comments. Therefore, the time complexity of URBI is O(m).

5. Experiments

We compared the proposed URBI approach against DC, EC, CC, BC, and PC in our experiments. The six methods are described in Table 1. To validate the effectiveness of the proposed method, we examined the accuracy of the six methods in ranking the Top10, Top50, Top100, Top200, Top300, Top400, Top500, and Top1000 books.

5.1. DataSet

We used the Amazon Review Data (2018) dataset [35] collated and published by Ni et al., which contains 233.1 million reviews from May 1996 to October 2018. The dataset includes reviews (rating, text, and vote), product metadata (description, category information, price, brand, and image characteristics), and links (also viewed/also bought graphs).

The dataset is split into subsets by product category. For the experiments, we chose the books subset, which has the most review data and the most items.

5.2. Ground Truth

Amazon publishes a measure called “best seller rank”, the exact formula of which is a trade secret, but it converts actual sales over a specific time into a sequential ranking of products [36]. Amazon provides a “sales ranking” attribute for each book on the site. The sales ranking reflects the total sales of that book on the site relative to the sales of other books on the site. Note that the smaller the sales ranking value, the higher the sales volume of the item in a certain time. Chevalier and Goolsbee reported the following: According to Amazon, the top 10,000 books are ranked based on the previous 24 h and are updated hourly. The sales rankings are updated every day for books ranked between 10,000 and 100,000, and every month for books ranked beyond 100,000 [4]. Books that have not been purchased within the last month are not ranked by this procedure. However, there is a ranking for thousands of books that almost certainly sell fewer than one copy per month. Clay et al. [37] claimed that for these rarely purchased books, Amazon’s ranking is based on total sales since the inception of Amazon. So, except for books with a very large rank value (that is, books that sell very little), the rankings represent a snapshot of the book’s current sales. In other words, the “sales ranking” is real-time ranking information for a book, showing how well that book has sold compared to other books in a given time.

We used a public dataset from the Kaggle community: Amazon sales rank data for print and kindle books, https://www.kaggle.com/datasets/ucffool/amazon-sales-rank-data-for-print-and-kindle-books (accessed on 9 June 2022). For this dataset, sales rankings of books published on Amazon.com worldwide were collected through NovelRank.com. The data collection period was from 1 January 2017 to 29 June 2018. The sampling interval ranges from every hour to every 24 h. We take the average “sales ranking” of each book over a certain time as the ground truth.

5.3. Experimental Results and Analyses

5.3.1. Exp-1: Effectiveness (The Average of Sales Rank Every 7 Days Is Taken as the Ground Truth)

From the Amazon sales rank data for print and kindle books dataset, we took the “sales rank” data for the first 7 days of each of the 12 months of 2017 and used their average value as the ground truth. Because product reviews lag behind product sales, user reviews from the three months after the sales period in the Amazon Review Data (2018) dataset were selected as the experimental dataset. The specific information of the experimental datasets is shown in Table 2.

We measured the accuracy by comparing the books ranked Top10, Top50, Top100, Top200, Top300, Top400, Top500 and Top1000 retrieved by each method with the books in the corresponding ground truth.
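The text does not give an explicit formula for this accuracy. One plausible reading, sketched below as an assumption, is the fraction of the Top-N books returned by a method that also appear in the Top-N of the ground truth. The function name and the toy values are hypothetical.

```python
def top_n_accuracy(scores, sales_rank, n):
    """Fraction of the predicted top-n books that also appear in the ground-truth top-n.

    scores: book -> influence score (higher means more influential).
    sales_rank: book -> average sales rank (lower means better selling).
    """
    predicted = sorted(scores, key=scores.get, reverse=True)[:n]
    truth = sorted(sales_rank, key=sales_rank.get)[:n]
    return len(set(predicted) & set(truth)) / n


# Hypothetical toy values.
toy_scores = {"b1": 5.0, "b2": 3.0, "b3": 1.0, "b4": 0.5}
toy_sales_rank = {"b1": 10, "b3": 20, "b2": 30, "b4": 40}
acc_top2 = top_n_accuracy(toy_scores, toy_sales_rank, 2)
acc_top3 = top_n_accuracy(toy_scores, toy_sales_rank, 3)
```

Under this reading the order inside the Top-N does not matter, only membership does.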

In our URBI method, the rating coefficients are set to A_{1} = 0, A_{2} = 0, A_{3} = 0.1, A_{4} = 0.1, and A_{5} = 0.8, where A_{i} is the review coefficient for a user score of i. That is, we speculate that a user who gives a book a score of 5 has a high probability of paying for the book again, and that the probability of buying multiple copies in one transaction is also higher. We also speculate that users who give the book a score of 4 or 3 are less likely to buy the book again, while users who give the book a score of 2 or even 1 are not likely to buy the book again. A total of 12 experiments were conducted. The experimental results are shown in Figures 1–12, where the abscissa represents the Top-N books and the ordinate represents the accuracy rate compared with the ground truth.
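To see how these coefficients change a ranking, consider a worked example with two hypothetical books; the review counts are invented for illustration and c_{j,r} denotes the number of reviews of book j with score r. Grouping Equation (6) by score gives:

```latex
\begin{align*}
URBI(j) &= \sum_{r=1}^{5} A_{r}\, c_{j,r} \\[4pt]
\text{Book X: } (c_{1},\dots,c_{5}) &= (2, 1, 4, 10, 30), \quad \text{47 reviews} \\
URBI(X) &= 0\cdot 2 + 0\cdot 1 + 0.1\cdot 4 + 0.1\cdot 10 + 0.8\cdot 30 = 25.4 \\[4pt]
\text{Book Y: } (c_{1},\dots,c_{5}) &= (20, 10, 10, 10, 10), \quad \text{60 reviews} \\
URBI(Y) &= 0\cdot 20 + 0\cdot 10 + 0.1\cdot 10 + 0.1\cdot 10 + 0.8\cdot 10 = 10.0
\end{align*}
```

A method that only counts reviews would place Book Y (60 reviews) ahead of Book X (47 reviews), whereas URBI places Book X first because most of its reviews carry a score of 5.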

According to the experimental results shown in Figures 1–12, as the value of N in Top-N gradually increases, the DC, PC, and URBI methods are clearly superior to the other three methods, so we focus on these three methods. As described in Section 4, we build a user–book heterogeneous graph network from the Amazon review set. All algorithms run on this heterogeneous graph network. We mark the book nodes and the user nodes separately. In the experiment, we focus on the book nodes, while the user nodes are the neighbors of the book nodes and provide different influence information to the book nodes depending on the algorithm.

DC is concerned with the correlation of degree in the graph. In the graph network constructed in this paper, the influence of book nodes calculated by DC is equivalent to the number of users who have made comments. PC evaluates the influence of nodes by iterating neighbor information, that is, it obtains the influence of book nodes by iterating comments on user information. By analyzing the 12 groups of experiments conducted this time, we can also find that the accuracy of DC and PC methods is very close under each Top-N, mainly because of the attribute of the degree of book node, that is, the number of comments plays an important role in the user-book network.

However, the URBI method is different from the above two methods. DC and PC can indeed represent the influence of a book to some extent through the number of user comments. However, in this graph network they cannot reflect how many copies a user buys, nor can they predict whether a user will buy again, because both are based only on the number of user comments on a book. URBI considers not only the number of user reviews of a book but also the existing user ratings, from which it predicts the users’ purchase behavior. The 12 groups of experiments show that when the value of N in Top-N is small, the results of the URBI method are similar to those of the DC and PC methods, and as the value of N gradually increases, the accuracy of the URBI method is steadily higher than that of the DC and PC methods.

5.3.2. Exp-2: Effectiveness (The Average of Sales Rank for Each of the Three Months Is Regarded as the Ground Truth)

In Exp-1, we demonstrated the effectiveness of the URBI method in predicting the book influence ranking over a short time horizon, that is, using three months of review data to predict the book influence ranking for seven consecutive days. In the second experiment, we use six months of review data to predict the book influence ranking for three consecutive months, in order to show that the URBI method is also suitable for predicting the book influence ranking over a long time range. We divide the “sales rank” data for the 12 months of 2017 in the Amazon sales rank data for print and kindle books dataset into four three-month parts, and take the average value of each part as the ground truth. Meanwhile, user reviews from the six months after the sales period in the Amazon Review Data (2018) dataset are selected as the experimental dataset, and the specific information of the experimental datasets is shown in Table 3.

It can be seen from Exp-1 that EC, CC, and BC methods have poor effects, so only DC, PC, and URBI methods are compared in this experiment. We also compare the books ranked Top10, Top50, Top100, Top200, Top300, Top400, Top500, and Top1000 retrieved by each method with the books in the corresponding ground truth to measure the accuracy. The setting of the user rating coefficient in the URBI method is consistent with that in Exp-1. The experimental results are shown in Figure 13, Figure 14, Figure 15 and Figure 16.

As shown in Figure 13, Figure 14, Figure 15 and Figure 16, we used six months of review data to predict the book influence ranking for three consecutive months. Compared with Exp-1, Exp-2 had more book nodes, and the number of user nodes and edges nearly doubled. In this case, the URBI method behaves as it did in Exp-1: when the value of N in Top-N is small, the results of the URBI method are similar to those of the DC and PC methods, and as the value of N gradually increases, the accuracy of the URBI method is steadily higher than that of the DC and PC methods.

6. Conclusions

In this paper, we propose a book influence evaluation method based on user ratings of an e-commerce platform (URBI). In order to verify the effectiveness of the proposed method, we designed two experiments with different time spans for comparison and compared the proposed method with five other node influence evaluation methods. The experimental results show the effectiveness of the method.

In a nutshell, the URBI approach evaluates the influence of each book from its user ratings with very low time complexity. Experiments on real-world book review data from the e-commerce platform Amazon show that our URBI method outperforms the other five methods. We believe that the number of comments on a book indicates its influence to some extent, and that the influence of a book is portrayed better when the user ratings of the book are taken into account as well.

Author Contributions

Conceptualization, X.Z. (Xinrong Zhan) and J.L.; Methodology, X.Z. (Xinrong Zhan); Writing—original draft, X.Z. (Xinrong Zhan); Writing—review and editing, J.L. and X.Z. (Xinrong Zhan); Supervision, J.L., X.Z. (Xinrong Zhan), X.Z. (Xintao Zhan) and L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the 2022 Central Government Guided Local Development Science and Technology Special Project (2022L3029).

Data Availability Statement

The Amazon Review Data(2018) dataset is available at https://nijianmo.github.io/amazon/index.html, accessed on 6 June 2022; the Amazon sales rank data for print and kindle books dataset is available at https://www.kaggle.com/datasets/ucffool/amazon-sales-rank-data-for-print-and-kindle-books, accessed on 9 June 2022. Experimental data and code related to this paper can be obtained by contacting the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

Liu, G.; Zhu, F.; Zheng, K.; Liu, A.; Li, Z.; Zhao, L.; Zhou, X. TOSI: A trust-oriented social influence evaluation method in contextual social networks. Neurocomputing 2016, 210, 130–140. [Google Scholar] [CrossRef] [Green Version]
Jin, L.; Chen, Y.; Wang, T.; Hui, P.; Vasilakos, A.V. Understanding user behavior in online social networks: A survey. IEEE Commun. Mag. 2013, 51, 144–150. [Google Scholar] [CrossRef]
He, S.; Hollenbeck, B. Sales and Rank on Amazon.com. 2020. Available online: https://ssrn.com/abstract=3728281 (accessed on 3 June 2022).
Chevalier, J.; Goolsbee, A. Measuring prices and price competition online: Amazon. com and BarnesandNoble. com. Quant. Mark. Econ. 2003, 1, 203–222. [Google Scholar] [CrossRef]
Baye, M.R.; Morgan, J.; Scholten, P. Information, search, and price dispersion. Handb. Econ. Inf. Syst. 2006, 1, 323–375. [Google Scholar] [CrossRef]
Hollenbeck, B. Online reputation mechanisms and the decreasing value of chain affiliation. J. Mark. Res. 2018, 55, 636–654. [Google Scholar] [CrossRef]
Sharma, A.; Liu, H.; Liu, H. Best Seller Rank (BSR) to Sales: An empirical look at Amazon.com. In Proceedings of the 2020 IEEE 20th International Conference on Software Quality, Reliability and Security Companion (QRS-C), Macau, China, 1–14 December 2020; pp. 609–615. [Google Scholar] [CrossRef]
Deville, P.; Song, C.; Eagle, N.; Blondel, V.D.; Barabási, A.L.; Wang, D. Scaling identity connects human mobility and social interactions. Proc. Natl. Acad. Sci. USA 2016, 113, 7047–7052. [Google Scholar] [CrossRef] [Green Version]
Wang, D.; Song, C. Impact of human mobility on social networks. J. Commun. Netw. 2015, 17, 100–109. [Google Scholar] [CrossRef]
Zhao, Z.Q.; Yu, Z.G.; Anh, V.; Wu, J.Y.; Han, G.S. Protein folding kinetic order prediction from amino acid sequence based on horizontal visibility network. Curr. Bioinform. 2016, 11, 173–185. [Google Scholar] [CrossRef]
Hahn, K.; Massopust, P.R.; Prigarin, S. A new method to measure complexity in binary or weighted networks and applications to functional connectivity in the human brain. BMC Bioinform. 2016, 17, 87. [Google Scholar] [CrossRef]
Clough, J.R.; Evans, T.S. What is the dimension of citation space? Phys. A Stat. Mech. Its Appl. 2016, 448, 235–247. [Google Scholar] [CrossRef] [Green Version]
Wen, T.; Jiang, W. Identifying influential nodes based on fuzzy local dimension in complex networks. Chaos, Solitons Fractals 2019, 119, 332–342. [Google Scholar] [CrossRef] [Green Version]
Lü, L.; Chen, D.; Ren, X.L.; Zhang, Q.M.; Zhang, Y.C.; Zhou, T. Vital nodes identification in complex networks. Phys. Rep. 2016, 650, 1–63. [Google Scholar] [CrossRef] [Green Version]
Zhang, X.; Zhu, J.; Wang, Q.; Zhao, H. Identifying influential nodes in complex networks with community structure. Knowl.-Based Syst. 2013, 42, 74–84. [Google Scholar] [CrossRef]
Kitsak, M.; Gallos, L.K.; Havlin, S.; Liljeros, F.; Muchnik, L.; Stanley, H.E.; Makse, H.A. Identification of influential spreaders in complex networks. Nat. Phys. 2010, 6, 888–893. [Google Scholar] [CrossRef] [Green Version]
Yang, R.; Wang, B.H.; Ren, J.; Bai, W.J.; Shi, Z.W.; Wang, W.X.; Zhou, T. Epidemic spreading on heterogeneous networks with identical infectivity. Phys. Lett. A 2007, 364, 189–193. [Google Scholar] [CrossRef] [Green Version]
Tang, J.; Zhang, R.; Wang, P.; Zhao, Z.; Fan, L.; Liu, X. A discrete shuffled frog-leaping algorithm to identify influential nodes for influence maximization in social networks. Knowl.-Based Syst. 2020, 187, 104833.1–104833.12. [Google Scholar] [CrossRef]
Zengin Alp, Z.; Şule Gündüz Öğüdücü. Identifying topical influencers on twitter based on user behavior and network topology. Knowl.-Based Syst. 2018, 141, 211–221. [Google Scholar] [CrossRef]
Alp, Z.Z.; Şule Gündüz Öğüdücü. Influence Factorization for identifying authorities in Twitter. Knowl.-Based Syst. 2019, 163, 944–954. [Google Scholar] [CrossRef]
Huang, X.; Vodenska, I.; Wang, F.; Havlin, S.; Stanley, H.E. Identifying influential directors in the United States corporate governance network. Phys. Rev. E 2011, 84, 046101. [Google Scholar] [CrossRef]
Lei, X.; Yang, X.; Fujita, H. Random walk based method to identify essential proteins by integrating network topology and biological characteristics. Knowl.-Based Syst. 2019, 167, 53–67. [Google Scholar] [CrossRef]
Tao, Z.; Bing-Hong, W. Catastrophes in Scale-Free Networks. Chin. Phys. Lett. 2005, 22, 1072. [Google Scholar] [CrossRef] [Green Version]
Wang, X.; Li, J. Detecting communities by the core-vertex and intimate degree in complex networks. Phys. A Stat. Mech. Its Appl. 2013, 392, 2555–2563. [Google Scholar] [CrossRef]
Cui, Y.; Wang, X.; Li, J. Detecting overlapping communities in networks using the maximal sub-graph and the clustering coefficient. Phys. A Stat. Mech. Its Appl. 2014, 405, 85–91. [Google Scholar] [CrossRef]
Li, J.; Wang, X.; Eustace, J. Detecting overlapping communities by seed community in weighted complex networks. Phys. A Stat. Mech. Its Appl. 2013, 392, 6125–6134. [Google Scholar] [CrossRef]
Nian, F.; Hu, C.; Yao, S.; Wang, L.; Wang, X. An immunization based on node activity. Chaos Solitons Fractals 2018, 107, 228–233. [Google Scholar] [CrossRef]
Bonacich, P. Factoring and weighting approaches to status scores and clique identification. J. Math. Sociol. 1972, 2, 113–120. [Google Scholar] [CrossRef]
Bonacich, P.; Lloyd, P. Eigenvector-like measures of centrality for asymmetric relations. Soc. Netw. 2001, 23, 191–201. [Google Scholar] [CrossRef]
Freeman, L.C. Centrality in social networks conceptual clarification. Soc. Netw. 1978, 1, 215–239. [Google Scholar] [CrossRef] [Green Version]
Newman, M.E.J. A measure of betweenness centrality based on random walks. Soc. Netw. 2005, 27, 39–54.
Brin, S.; Page, L. The anatomy of a large-scale hypertextual Web search engine. Comput. Netw. ISDN Syst. 1998, 30, 107–117.
Lü, L.; Zhou, T.; Zhang, Q.M.; Stanley, H.E. The H-index of a network node and its relation to degree and coreness. Nat. Commun. 2016, 7, 1–7.
Zhou, J.; Yu, X.; Lu, J.A. Node Importance in Controlled Complex Networks. IEEE Trans. Circuits Syst. II Express Briefs 2019, 66, 437–441.
Ni, J.; Li, J.; McAuley, J. Justifying Recommendations Using Distantly-Labeled Reviews and Fine-Grained Aspects; Association for Computational Linguistics: Hong Kong, China, 2019; pp. 188–197.
He, S.; Hollenbeck, B.; Proserpio, D. The Market for Fake Reviews. Mark. Sci. 2022, 41, 896–921.
Clay, K.; Krishnan, R.; Wolff, E. Prices and Price Dispersion on the Web: Evidence from the Online Book Industry. J. Ind. Econ. 2001, 49, 521–539.

Figure 1. Experimental results for dataset No. 1, which contains 6665 book nodes.

Figure 2. Experimental results for dataset No. 2, which contains 6528 book nodes.

Figure 3. Experimental results for dataset No. 3, which contains 6647 book nodes.

Figure 4. Experimental results for dataset No. 4, which contains 6818 book nodes.

Figure 5. Experimental results for dataset No. 5, which contains 7105 book nodes.

Figure 6. Experimental results for dataset No. 6, which contains 7435 book nodes.

Figure 7. Experimental results for dataset No. 7, which contains 7842 book nodes.

Figure 8. Experimental results for dataset No. 8, which contains 8102 book nodes.

Figure 9. Experimental results for dataset No. 9, which contains 8213 book nodes.

Figure 10. Experimental results for dataset No. 10, which contains 8495 book nodes.

Figure 11. Experimental results for dataset No. 11, which contains 9075 book nodes.

Figure 12. Experimental results for dataset No. 12, which contains 9413 book nodes.

Figure 13. Experimental results for dataset No. 13, which contains 9174 book nodes.

Figure 14. Experimental results for dataset No. 14, which contains 9972 book nodes.

Figure 15. Experimental results for dataset No. 15, which contains 9128 book nodes.

Figure 16. Experimental results for dataset No. 16, which contains 10,724 book nodes.

Table 1. Description of the six methods used in the experiment.

Method	Explanation
DC	The higher a node’s degree, the more important it is in the network.
EC	The importance of a node is determined by the number and importance of its neighbors.
CC	The closer a node is to the other nodes, the larger its closeness centrality.
BC	The more shortest paths pass through a node, the greater its betweenness centrality.
PC	PageRank (PC) evaluates the influence of a node by iteratively propagating information from its neighbors.
URBI	URBI assesses the influence of book nodes in review networks by forecasting users’ buying behavior from their book ratings.
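
To make the first and fifth rows of Table 1 concrete, the following is a minimal sketch of degree centrality and PageRank on a toy user–book review graph. It is an illustration only, not the implementation used in the experiments: the node names, the damping factor of 0.85, the fixed iteration count, and the helper names `build_graph`, `degree_centrality`, and `pagerank` are all assumptions introduced here.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (user, book) review pairs."""
    adj = defaultdict(set)
    for user, book in edges:
        adj[user].add(book)
        adj[book].add(user)
    return dict(adj)


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree, n - 1."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def pagerank(adj, damping=0.85, iterations=100):
    """PageRank by power iteration; every node here has at least one neighbor."""
    n = len(adj)
    rank = {node: 1.0 / n for node in adj}
    for _ in range(iterations):
        # Each node starts with the teleportation share ...
        new_rank = {node: (1.0 - damping) / n for node in adj}
        # ... and then receives an equal split of each neighbor's damped score.
        for node, nbrs in adj.items():
            share = damping * rank[node] / len(nbrs)
            for nbr in nbrs:
                new_rank[nbr] += share
        rank = new_rank
    return rank


# Hypothetical toy data: three users review book_a, one of them also reviews book_b.
edges = [("u1", "book_a"), ("u2", "book_a"), ("u3", "book_a"), ("u3", "book_b")]
adj = build_graph(edges)
dc = degree_centrality(adj)
pr = pagerank(adj)

for book in ("book_a", "book_b"):
    print(book, round(dc[book], 3), round(pr[book], 3))
```

On this toy graph both measures rank `book_a` above `book_b`, because `book_a` has more reviewers. Neither measure uses the rating values themselves, which is the information URBI is described as adding.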

Table 2. Dataset-related information in Exp-1.

Dataset	Sales Rank Time	Review Time	Book Nodes	User Nodes	Links
No. 1	2017.01.01–2017.01.07	2017.01–2017.03	6665	162,802	185,008
No. 2	2017.02.01–2017.02.07	2017.02–2017.04	6528	128,209	145,610
No. 3	2017.03.01–2017.03.07	2017.03–2017.05	6647	132,974	151,066
No. 4	2017.04.01–2017.04.07	2017.04–2017.06	6818	131,930	149,784
No. 5	2017.05.01–2017.05.07	2017.05–2017.07	7105	148,561	169,167
No. 6	2017.06.01–2017.06.07	2017.06–2017.08	7435	165,332	189,044
No. 7	2017.07.01–2017.07.07	2017.07–2017.09	7842	188,745	206,126
No. 8	2017.08.01–2017.08.07	2017.08–2017.10	8102	188,216	215,297
No. 9	2017.09.01–2017.09.07	2017.09–2017.11	8213	161,416	185,307
No. 10	2017.10.01–2017.10.07	2017.10–2017.12	8495	154,251	178,178
No. 11	2017.11.01–2017.11.07	2017.11–2018.01	9075	167,453	194,818
No. 12	2017.12.01–2017.12.07	2017.12–2018.02	9413	176,501	206,043

Table 3. Dataset-related information in Exp-2.

Dataset	Sales Rank Time	Review Time	Book Nodes	User Nodes	Links
No. 13	2017.01–2017.03	2017.01–2017.06	9174	326,476	390,100
No. 14	2017.04–2017.06	2017.04–2017.09	9972	358,632	432,304
No. 15	2017.07–2017.09	2017.07–2017.12	9128	303,024	358,101
No. 16	2017.10–2017.12	2017.10–2018.03	10,724	317,820	384,000
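
As a rough illustration of how the Book Nodes, User Nodes, and Links columns in Tables 2 and 3 could be derived from raw review records, the sketch below filters reviews to a review-time window and counts distinct entities. The record layout `(user, book, date)`, the function name `network_stats`, and the choice to count each user–book pair only once are assumptions made for this example; the exact preprocessing behind the tables may differ.

```python
from datetime import date


def network_stats(reviews, start, end):
    """Count distinct books, users, and user-book links reviewed within [start, end]."""
    # Assumption: a repeated review of the same book by the same user is one link.
    links = {(user, book) for user, book, day in reviews if start <= day <= end}
    users = {user for user, _ in links}
    books = {book for _, book in links}
    return {"book_nodes": len(books), "user_nodes": len(users), "links": len(links)}


# Hypothetical review records: (user ID, book ID, review date).
reviews = [
    ("u1", "book_a", date(2017, 1, 5)),
    ("u2", "book_a", date(2017, 2, 10)),
    ("u2", "book_b", date(2017, 3, 31)),
    ("u2", "book_b", date(2017, 3, 31)),  # duplicate pair, counted once
    ("u3", "book_c", date(2017, 4, 1)),   # outside the window below
]

# A three-month review window, analogous to the Review Time column for dataset No. 1.
stats = network_stats(reviews, date(2017, 1, 1), date(2017, 3, 31))
print(stats)
```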

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

