Article

An Enhanced Credit Risk Evaluation by Incorporating Related Party Transaction in Blockchain Firms of China

1 School of Management and Engineering, Nanjing University, 22 Hankou Road, Nanjing 210093, China
2 Nanjing Institute of Digital Financial Industry Co., Ltd., 6 Tianpu Road, Nanjing 211899, China
3 Treasury Department, CITIC Group Corporation, 10 Guanghua Road, Beijing 100020, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(17), 2673; https://doi.org/10.3390/math12172673
Submission received: 29 June 2024 / Revised: 19 August 2024 / Accepted: 26 August 2024 / Published: 28 August 2024
(This article belongs to the Special Issue Applied Mathematics in Blockchain and Intelligent Systems)

Abstract

Related party transactions (RPTs) can serve as channels for the spread of credit risk events among blockchain firms. However, current credit risk-assessment models typically only consider a firm’s individual characteristics, overlooking the impact of related parties in the blockchain. We suggest incorporating RPT network analysis to improve credit risk evaluation. Our approach begins by representing an RPT network using a weighted adjacency matrix. We then apply DANE, a deep network embedding algorithm, to generate condensed vector representations of the firms within the network. These representations are subsequently used as inputs for credit risk-evaluation models to predict the default distance. Following this, we employ SHAP (Shapley Additive Explanations) to analyze how the network information contributes to the prediction. Lastly, this study demonstrates the enhancing effect of using DANE-based integrated features in credit risk assessment.

1. Introduction

Since 2012, over 60% of companies listed on China’s A-share stock market have reported related party transactions (RPTs), with the percentage climbing to 69.95% in 2015. The value of these transactions has been on an upward trajectory, reaching a peak of CNY 93,419.77 billion in 2018. This figure represented 19.98% of total revenue and 3.6% of total assets for all A-share listed companies. RPTs typically encompass various activities, including buying and selling goods and assets, exchanging labor services, providing guarantees, managing funds, leasing and acting as agents. The RPTs in blockchain firms tend to involve substantial amounts and a wide range of transaction types. These related party connections often serve as a means for controlling shareholders to maximize their own benefits. The resulting complex RPT networks create pathways for credit risk to spread throughout the blockchain.
Credit risk events stemming from RPTs have significant impacts on both local and national financial systems. (A notable example is the Mingtian Group’s illegal appropriation of CNY 1500 billion from Baoshang Bank, in which it held an 89% stake. This led to a severe credit crisis, resulting in joint regulation of Baoshang Bank by China’s central bank and banking regulator on 24 May 2019. The incident caused interbank interest rates to spike, with DR007 and R007 reaching 2.86% and 3.63%, respectively, on 28 May 2019, widening the spread to 77 basis points. Notably, the rating agency failed to mention the large shareholder’s illegal fund appropriation since 2015 in Baoshang Bank’s tier 2 capital bond prospectus.) Given these facts, RPTs have become a crucial factor in credit risk assessment. Regulatory authorities have focused on illegal RPTs—particularly fund misappropriation and profit manipulation by major shareholders—leading to ongoing improvements in the legal framework. (Various regulations, including Article 216 of China’s Company Law and guidelines from the Shanghai and Shenzhen Stock Exchanges, now govern the identification, disclosure and decision-making processes for RPTs.) Despite these regulatory efforts, the diverse nature and complex relationships involved in RPTs continue to pose significant challenges in financial risk management.
Existing research on credit risk-evaluation systems primarily focuses on a firm’s individual attributes [1]. However, these approaches often neglect the characteristics of related party firms. RPTs represent a unique and significant social relationship, functioning as conduits for resource and profit transfer, as well as channels for risk contagion, whether through tunneling [2] or propping [3]. Although networks’ financial risk contagion effect has been established [4], a practical method for representing RPT network features has been lacking. To address this gap, our study introduces a feature-representation approach that incorporates RPT network information into credit risk-evaluation models.
Drawing inspiration from deep network embedding research, this study employs DANE [5] to reduce the dimensionality of the RPT network. We generate low-dimensional vector representations of the features, which are then used as inputs for credit risk-evaluation models. These models, based on XGBoost [6], are utilized to predict the default distance [7]. Our findings demonstrate that the DANE-based integrated features, which combine the RPT network structure and firm-specific attributes, significantly enhance the forecasting performance.
Traditional credit risk models rely heavily on financial indicators. This study proposes a new method that leverages the power of RPT network analysis. By converting high-dimensional, nonlinear network data into manageable features, we improve the accuracy of credit risk predictions. This innovation offers significant benefits: regulatory authorities can better detect and prevent credit risk, listed firms can avoid risky related party transactions and investors can minimize losses from credit risk.
The remainder of this article proceeds as follows: Section 2 reviews the existing literature on credit risk evaluation, RPTs, network embedding algorithms and explainable machine learning tools. Section 3 is concerned with the principles of DANE and XGBoost. Section 4 presents the descriptive statistics of the dependent and independent variables. Section 5 derives low-dimensional vector representations of RPT network features from DANE, feeds them into XGBoost-based credit risk-evaluation models and compares the results. Section 6 presents the research conclusions, including key findings, applicability, limitations and further directions of the methodology.

2. Literature Review

2.1. RPTs and Credit Risk Evaluation

Research on credit risk evaluation dates back to the 1960s. Beaver [8] first used financial ratios to predict failure with a univariate analysis method, which compared a list of ratios individually between failed and non-failed firms. (“Failure” refers to the inability of a firm to pay its financial obligations as they mature, such as bankruptcy, bond default, an overdrawn bank account and nonpayment of a preferred stock dividend [8].) Current credit risk-evaluation systems primarily draw from firm performance indicators used by the Ministry of Finance, corporate credit-rating systems employed by commercial banks and assessments conducted by rating agencies. These systems predominantly focus on financial ratios encompassing liquidity, profitability, growth and solvency [1,9,10]. While significant emphasis is placed on macroeconomic factors and firm-specific attributes, these approaches often overlook crucial business connections represented by RPTs and their potential for risk contagion. RPTs serve as a vital mechanism for both tunneling [2] and propping [3] in listed firms. These practices function as conduits through which large shareholders transfer resources to maximize profits. Although the directionality of RPTs may differ between tunneling and propping, both phenomena are intrinsically linked to risk transmission, thereby influencing variations in firms’ credit risks. This oversight in existing models underscores the need for a more comprehensive approach that incorporates the complex dynamics of RPTs in credit risk evaluation.
Tunneling refers to the practice of controlling shareholders using RPTs to secretly transfer assets and profits for their own benefit, often to the detriment of smaller shareholders. This exploitation of internal capital markets allows them to extract private benefits and infringe upon the interests of minority stakeholders [11,12,13]. The non-arm’s-length nature of RPTs, which brings agency problems and hampers board monitoring efficacy, has already raised the concern of the Financial Accounting Standards Board in the United States [14]. This value-destroying phenomenon is more prevalent in emerging markets, especially China, for the following three reasons [15]. First, it is common for large business groups in China to use carved-out listed subsidiary companies to raise funds from the stock market. The highly concentrated ownership structure further facilitates the tunneling of resources through RPTs. Second, the majority of stock market participants in China are retail investors who lack information and cannot restrain shareholders’ tunneling practices through RPTs in time. Third, due to the weak investor protection mechanism in China, minority shareholders have difficulty in safeguarding their own interests, which encourages expropriation by large shareholders. Therefore, researchers have called for revising the regulatory framework in China to protect minority shareholders’ interests by taking the influence of firms’ CEO–board or CFO–board social ties into account [16].
In contrast, propping involves related parties engaging in mutually supportive RPTs to enhance their collective well-being. This can include purchases, sales, credit guarantees and product transactions aimed at optimizing resource allocation and mitigating imperfections in external markets. Propping typically emerges in two distinct circumstances. Firstly, controlling shareholders use their private resources to strengthen another firm’s solvency to survive credit crises. Secondly, if there are tax rate differences between a parent firm and its subsidiary, the parent firm will utilize RPTs for tax evasion. In particular, the subsidiary’s profits are transferred to the parent firm by selling high and buying low so as to increase the profits in a consolidated financial statement. The overall profitability and operating capacity are improved, thus reducing credit risk [17,18].
As can be seen above, the categories and implementations of RPTs vary significantly, which leads to a complicated network. Previous research found that risk contagions through network connections have a great impact on all members and the network as a whole [19,20]. A firm’s risk is affected by its connectedness with others [21]. A recent study found that supply chain networks have greater contagion effects than investment and information networks. The capital path, product path and information path are the primary means of transmission of credit risk [22]. Credit risk diffusion becomes more frequent as the average degree of the network grows, and it brings more serious consequences to the economy than to individual firms [23]. Most notably, if the risk of an enterprise group member increases, the risk of the other members tends to follow in the next year. An RPT network resembles an enterprise group with tight connections among the members. In unfair transactions, risks can transfer along fund chains and cause related firms to wind up in a financial crisis [24]. These studies demonstrated the effect of credit risk contagion in networks but did not propose a feasible way to incorporate listed firms’ RPT features into credit risk evaluation.

2.2. Deep Network Embedding Algorithms

The difficulty of incorporating RPT information into credit risk-evaluation models lies in the fact that high-dimensional, nonlinear and highly sparse networks cannot be directly fed into models. However, deep network embedding algorithms can learn and represent high-dimensional features using vectors in low-dimensional and dense space. Low-dimensional vector representations are further applied in tasks such as classification, visualization and link prediction. Therefore, deep network embedding algorithms effectively solve this challenge.
DeepWalk [25] is the first network embedding algorithm to use deep learning. It borrows the idea of word vectors from word2vec, a useful tool in natural language processing (NLP), and samples sequences of nodes with random walks. However, one of the limitations of DeepWalk is that it uses breadth-first search (BFS) to construct neighborhoods, which only applies to unweighted graphs. Large-scale information network embedding (LINE) [26] changes the method of neighborhood construction in DeepWalk to a depth-first search (DFS) so that it is applicable to weighted graphs. In addition, LINE defines first-order proximity and second-order proximity and sets them as optimization objectives. Node2vec [27] further considers BFS and DFS jointly when constructing a neighborhood and employs a biased random walk to obtain the neighborhood’s sequence of nodes. To solve the problem of DeepWalk’s failure to capture relationships with distant nodes, Walklets [28] was proposed based on a truncated random walk to ensure that each node in the network can be visited. Based on the definitions of first-order proximity and second-order proximity, structural deep network embedding (SDNE) captures the network’s nonlinear structure as much as possible by learning through deep autoencoders. SDNE can preserve both local and global structural features and is applicable to sparse networks. Apart from structural features, accelerated attributed network embedding (AANE) [29] further considers the attribute features of nodes. Unfortunately, AANE is too simple to capture nonlinear structure and attribute information. Building on AANE, deep attribute network embedding (DANE) [5] captures nonlinear information on both network structure and node attributes.
Comparing the functions and limitations of the models above, this study uses DANE for RPT network embedding. The low-dimensional vectors obtained serve as input for credit risk-evaluation models. In this way, the incorporation of information on both the RPT network structure and firms’ own attributes can be realized.

2.3. Application of Explainable Machine Learning in Credit Risk Evaluation

Machine learning methodologies for credit risk evaluation have gone through continuous iterations. Among statistical approaches, one representative is multiple discriminant analysis (MDA). Compared with the univariate analysis method [8], MDA has two advantages. First, it takes the interactions of characteristics into consideration. Second, it reduces dimensionality by transforming the values of the explanatory variables into a single discriminant score known as Altman’s Z-score [1]. However, such a model is limited to linear relationships between bankruptcy and financial ratios, and the coefficients need to be modified for different times and regions. In addition, binary response models, such as the probit [30] and logit [31] models, are applicable to binary classification problems. As weak classifiers, these models are inferior to artificial intelligence algorithms in learning capability, in the accuracy and stability of training outcomes and in adaptability to different datasets.
In recent years, applications of artificial intelligence algorithms in credit risk evaluation have attracted extensive attention. On the one hand, researchers have conducted comparative analyses of models on a wide range of data. For example, neural networks (NNs), such as a combination of multi-layer perceptrons and self-organizing maps, can display the probability of distress up to three years before bankruptcy occurs [32]. Compared with linear discriminant analysis (LDA), NNs perform better in less clear and more complex classification problems [33]. However, the support vector machine (SVM) surpasses the back-propagation NN in corporate bankruptcy prediction in terms of accuracy and generalization performance, especially as the training set gets smaller [34]. In addition, the classification and regression tree (CART) is more applicable to the prediction of imminent bankruptcies [35]. On the other hand, researchers have explored model combinations to enhance performance. For example, combining SVM with a genetic algorithm (GA), particle swarm optimization (PSO) or switching particle swarm optimization (SPSO) improves the explanatory power and stability of SVM [35,36]. An integration of LDA and NN outperforms either single model [33]. However, deep NNs fail to outperform their shallow counterparts and are more computationally expensive to construct. Instead, the ensemble method eXtreme Gradient Boosting (XGBoost) is preferred when classification performance is the main objective of credit scoring [37]. Apart from improvements to model structures, introducing synthetic features, which combine economic measures using arithmetic operations, further enhances the ensemble method [38].
Despite the fact that above-mentioned complex machine learning models have capabilities of nonlinear mapping and generalization in new environments, as well as high accuracy, they are trained in a “black box”. In other words, the economic meanings of network parameters, thresholds and training results are difficult to illustrate. Therefore, researchers investigate how to improve the explainability [39] of machine learning models to increase both transparency and accuracy. To solve this issue, they use structured interpretative and analytical frameworks of machine learning models, such as Shapley Additive Explanations (SHAP) [40], local interpretable model-agnostic explanations (LIME) [41] and iBreakDown to analyze the significance and contributions of variables from both local and global perspectives [42]. In contrast with LIME, SHAP has an edge in providing a more consistent, comprehensive and intuitive way of understanding the influence of variables on model prediction. This is because LIME approximates the “black box” model locally to ensure both interpretability and local fidelity. Meanwhile, the explanations can vary with different samples, which are generated from random perturbations.
This study uses XGBoost for credit risk evaluation, as it has been widely recognized in both industry and data mining contests such as Kaggle. Compared with neural networks, XGBoost is more explainable and skilled at tabular data processing, in addition to being easier for parameter adjustment. Then, we apply SHAP to assess the significance and contribution of RPT network features in credit risk evaluation.

3. Methodology

Machine learning is a significant branch of AI applications. Deep learning, on which DANE is based, is an extension of machine learning to be applied to more complicated and large-scale datasets. Node features typically encompass two key dimensions: attribute features and structural features. SDNE focuses solely on capturing structural features, while DANE recognizes the significant influence of attribute information on the learning of comprehensive node representations. For instance, in the case of two firms exhibiting similar structural features but possessing distinct attribute characteristics, only DANE, by leveraging both dimensions, can effectively differentiate these firms within a low-dimensional embedding space. RPT network features derived from DANE, along with the firm’s attribute characteristics, will then be fed into XGBoost for credit risk evaluation.

3.1. Framework of DANE

The deep attribute network embedding (DANE) approach [5] is a combination of two deep autoencoders. One is used for the capture of highly nonlinear network structures, and the other is to capture high nonlinearity in attributes.
An autoencoder is a type of unsupervised feedforward neural network that is widely applied in compressed data representation. An autoencoder is composed of two fully connected feedforward neural networks; they are the encoder and the decoder. The encoder connects the input layer with the hidden layer, and the decoder connects the hidden layer and output layer. Supposing that there are L layers in the encoder,
$$Z_1 = W_1 x + b_1$$
$$H_1 = \sigma(Z_1)$$
$$Z_l = W_l H_{l-1} + b_l, \quad l = 2, \ldots, L$$
$$H_l = \sigma(Z_l), \quad l = 2, \ldots, L$$
Here, $x$ represents the input data, and $H_L$ is the desired low-dimensional representation of the input. $\sigma(\cdot)$ denotes the nonlinear activation function. $W_l$ and $b_l$ are the model parameters of the $l$-th layer, which are learned by minimizing the reconstruction error as follows:
$$\min_{\theta} \sum_{i=1}^{n} \lVert \hat{x}_i - x_i \rVert_2^2$$
where $\hat{x}_i$ is the reconstruction of the $i$-th node obtained from its compressed data representation, and $\theta$ denotes the model parameters.
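As a minimal sketch of the layer recursion and the reconstruction error described above, the following pure-Python example runs one encoder and one decoder on two toy inputs. The weights, the layer sizes and the logistic choice for the activation are illustrative assumptions rather than values from this study; in practice the parameters would be learned by gradient descent.

```python
import math

def sigmoid(z):
    """Element-wise logistic activation, standing in for sigma(.)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def affine(W, h, b):
    """Compute W h + b for a weight matrix W (list of rows) and vectors h, b."""
    return [sum(w * x for w, x in zip(row, h)) + bias for row, bias in zip(W, b)]

def forward(layers, x):
    """Apply H_l = sigma(W_l H_{l-1} + b_l) for each (W_l, b_l) in order."""
    h = x
    for W, b in layers:
        h = sigmoid(affine(W, h, b))
    return h

def reconstruction_error(xs, xs_hat):
    """Sum over nodes of the squared L2 distance between input and reconstruction."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(x, x_hat))
        for x, x_hat in zip(xs, xs_hat)
    )

# Hypothetical toy parameters: a 4 -> 2 encoder and a 2 -> 4 decoder.
encoder = [([[0.5, -0.2, 0.1, 0.0], [0.3, 0.8, -0.5, 0.2]], [0.0, 0.1])]
decoder = [([[0.4, 0.1], [-0.3, 0.6], [0.2, 0.2], [0.7, -0.1]], [0.0, 0.0, 0.0, 0.0])]

inputs = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.3, 0.9]]
codes = [forward(encoder, x) for x in inputs]    # low-dimensional H_L per node
outputs = [forward(decoder, h) for h in codes]   # reconstructions of the inputs
loss = reconstruction_error(inputs, outputs)
```

Here `codes` plays the role of the low-dimensional representation, and `loss` is the quantity that training would minimize.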
The first autoencoder, which embeds the topological structure, is trained to minimize the reconstruction loss of the high-order proximity and to maximize the likelihood of the first-order proximity, which together preserve both the global and the local structure. (Given an attributed network $G = \{E, A\}$, $E$ is the adjacency matrix of all nodes in the network and $A$ denotes the attribute matrix. $M = \hat{E} + \hat{E}^2 + \cdots + \hat{E}^t$ is the high-order proximity matrix, where $\hat{E}$ is the 1-step probability transition matrix of the adjacency matrix $E$. The high-order proximity of any two nodes in the network depends on the similarity of their rows in the high-order proximity matrix: if two nodes share a similar neighborhood, they are similar. For the first-order proximity, $E_{ij}$ denotes the $j$-th element in the $i$-th row of the adjacency matrix and measures the strength of the link between the $i$-th node and the $j$-th node. A larger value of $E_{ij}$ indicates more similarity between the two nodes: if two nodes are connected, they are similar.)
$$\min_{\theta} \sum_{i=1}^{n} \lVert \hat{M}_i - M_i \rVert_2^2$$
$$\max_{\theta} \prod_{E_{ij} > 0} p_{ij}^{M}$$
$$p_{ij}^{M} = \frac{1}{1 + \exp\left(-H_i^{M} (H_j^{M})^{T}\right)}$$
where $M$ is the high-order proximity matrix of the input data, and $\hat{M}$ is the high-order proximity reconstructed from the compressed data representation of the first autoencoder. $p_{ij}^{M}$ is the joint probability between the $i$-th node and the $j$-th node. If $E_{ij}$ is positive, there is a link between the two nodes. $H_i^{M}$ and $H_j^{M}$ are the compressed representations of the topological structures of the $i$-th node and the $j$-th node.
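The two structural quantities used by the first autoencoder can be illustrated on a three-node toy network. The sketch below assumes that the 1-step transition matrix is obtained by row-normalizing the adjacency matrix, which is a common convention but is an assumption here; the adjacency values and the function names are hypothetical.

```python
import math

def transition_matrix(E):
    """Row-normalize the adjacency matrix into 1-step transition probabilities."""
    P = []
    for row in E:
        total = sum(row)
        P.append([v / total if total > 0 else 0.0 for v in row])
    return P

def matmul(A, B):
    """Plain matrix product for small list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def high_order_proximity(E, t):
    """Return M = P + P^2 + ... + P^t, where P is the transition matrix of E."""
    P = transition_matrix(E)
    power = P
    M = [row[:] for row in P]
    for _ in range(t - 1):
        power = matmul(power, P)
        M = [[m + p for m, p in zip(mr, pr)] for mr, pr in zip(M, power)]
    return M

def link_probability(h_i, h_j):
    """First-order proximity p_ij = 1 / (1 + exp(-h_i . h_j))."""
    dot = sum(a * b for a, b in zip(h_i, h_j))
    return 1.0 / (1.0 + math.exp(-dot))

# Hypothetical weighted adjacency matrix of three firms.
E = [
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 3.0, 0.0],
]
M = high_order_proximity(E, t=3)
```

Because every power of a row-stochastic matrix is row-stochastic, each row of `M` sums to `t` in this example, which gives a quick sanity check on the construction.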
The second autoencoder, which embeds the attributes, aims to minimize the reconstruction loss of the semantic proximity (semantic proximity is defined as the similarity between attribute matrices) and to maximize the likelihood of the first-order proximity:
$$\min_{\theta} \sum_{i=1}^{n} \lVert \hat{S}_i - S_i \rVert_2^2$$
$$\max_{\theta} \prod_{E_{ij} > 0} p_{ij}^{S}$$
$$p_{ij}^{S} = \frac{1}{1 + \exp\left(-H_i^{S} (H_j^{S})^{T}\right)}$$
where $S$ is the semantic proximity matrix of the input data, and $\hat{S}$ is the semantic proximity reconstructed from the compressed data representation of the second autoencoder. $H_i^{S}$ and $H_j^{S}$ are the compressed representations of the attributes of the $i$-th node and the $j$-th node.
In terms of the relationship between the inputs of two autoencoders, the topological structure and attributes describe node characteristics from two different but complementary perspectives. Meanwhile, they should be consistent due to the fact that they are information on the same network. Traditional methods such as concatenating the outputs of two autoencoders directly and forcing two autoencoders to share the same highest encoding layer fail to preserve both the complementarity and consistency of information. We can minimize the negative log-likelihood jointly [5]:
$$L = -\sum_{E_{ij} > 0} \log p_{ij}^{M} - \sum_{E_{ij} > 0} \log p_{ij}^{S} + \sum_{i=1}^{n} \lVert \hat{M}_i - M_i \rVert_2^2 + \sum_{i=1}^{n} \lVert \hat{S}_i - S_i \rVert_2^2 - \sum_{i} \Big\{ \log p_{ii} + \sum_{E_{ij} = 0} \log (1 - p_{ij}) \Big\}$$
The first two terms are the negative logarithms of Equations (7) and (9), which preserve the first-order proximity of the two branches. The third and the fourth terms minimize the reconstruction loss of the high-order proximity and the semantic proximity. The last term (this term is the negative logarithm of $\prod_{i=1}^{n} p_{ii} \prod_{i \neq j} (1 - p_{ij})$, where $p_{ij} = \frac{1}{1 + \exp\left(-H_i^{M} (H_j^{S})^{T}\right)}$) is used to ensure the consistency and complementarity of information from the two branches (Figure 1).
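To make the role of the last term more concrete, the sketch below evaluates it as a negative log-likelihood on toy embeddings. It assumes that the second product runs over unlinked pairs with $j \neq i$, and the embeddings and function names are hypothetical; it is an illustration of the idea rather than the reference implementation of DANE [5].

```python
import math

def cross_probability(h_m, h_s):
    """p_ij = 1 / (1 + exp(-H_i^M . H_j^S)) between a structure and an attribute code."""
    dot = sum(a * b for a, b in zip(h_m, h_s))
    return 1.0 / (1.0 + math.exp(-dot))

def consistency_loss(H_M, H_S, E):
    """Negative log-likelihood that pulls each node's two codes together
    and pushes apart the codes of different, unlinked nodes (assumed reading)."""
    n = len(H_M)
    loss = 0.0
    for i in range(n):
        loss -= math.log(cross_probability(H_M[i], H_S[i]))
        for j in range(n):
            if j != i and E[i][j] == 0:
                loss -= math.log(1.0 - cross_probability(H_M[i], H_S[j]))
    return loss

# Two unlinked nodes with hypothetical 2-dimensional codes.
E = [[0, 0], [0, 0]]
H_M = [[2.0, 0.0], [0.0, 2.0]]
aligned = consistency_loss(H_M, [[2.0, 0.0], [0.0, 2.0]], E)
swapped = consistency_loss(H_M, [[0.0, 2.0], [2.0, 0.0]], E)
```

In this toy case the loss is smaller when the structure and attribute codes of the same node agree (`aligned`) than when they are swapped between nodes (`swapped`), which is the behavior the term is meant to encourage.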

3.2. Framework of XGBoost

XGBoost [6] optimizes GBDT [43] in the following two ways. The first optimization concerns the loss function. XGBoost inherits the basic idea of steepest descent from GBDT, but it replaces the first-order Taylor expansion with a second-order Taylor expansion when fitting the loss function to improve efficiency. Meanwhile, an $L_2$ regularization term is added to control the complexity of the model and guard against overfitting. The loss function of XGBoost is
$$L = \sum_{i} l(\hat{y}_i, y_i) + \sum_{k} \Omega(f_k)$$
$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^2$$
where $T$ is the number of leaf nodes, $\gamma$ is the penalty coefficient on $T$, $w$ represents the weights of the leaf nodes, $\lVert \cdot \rVert^2$ denotes the $L_2$ regularization and $\lambda$ is the coefficient of the $L_2$-regularized weights of the leaf nodes. When $\gamma$ and $\lambda$ are both zero, $L$ reduces to the loss function of GBDT.
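The regularized objective can be checked by hand on a small ensemble. The following sketch assumes a squared-error loss for $l$ and represents each tree only by its list of leaf weights; both are simplifications made for illustration, and the numbers are hypothetical.

```python
def tree_penalty(leaf_weights, gamma, lam):
    """Omega(f) = gamma * T + 0.5 * lambda * ||w||^2 for one tree."""
    T = len(leaf_weights)
    return gamma * T + 0.5 * lam * sum(w * w for w in leaf_weights)

def regularized_objective(y_true, y_pred, trees, gamma, lam):
    """Squared-error training loss plus the penalty of every tree in the ensemble."""
    data_loss = sum((yp - yt) ** 2 for yt, yp in zip(y_true, y_pred))
    return data_loss + sum(tree_penalty(w, gamma, lam) for w in trees)

# Two hypothetical trees with 2 and 3 leaves.
trees = [[0.5, -0.5], [0.1, 0.2, -0.3]]
penalty = tree_penalty(trees[0], gamma=1.0, lam=2.0)   # 1*2 + 0.5*2*(0.25 + 0.25)
unpenalized = regularized_objective([1.0, 2.0], [1.5, 2.0], trees, gamma=0.0, lam=0.0)
```

With both coefficients set to zero, `unpenalized` equals the plain training loss, which mirrors the reduction to the unregularized objective.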
The second optimization is the more efficient weighted quantile sketch algorithm. It is common for previous decision trees to apply greedy algorithms to iterate through all of the possible split points. The obvious drawback is that there is a large amount of computation. However, the weighted quantile sketch algorithm splits the characteristic intervals according to the quantiles of characteristic distributions and then uses corresponding quantiles to substitute the real value of the characteristic to obtain a gradient vector and estimate the optimal split point.
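The idea of proposing split candidates from quantiles can be sketched with an exact, sort-based version. This is a simplified illustration and not the streaming sketch data structure used inside XGBoost, where the weights are the second-order gradient statistics of the samples; the function name and the data are hypothetical.

```python
def weighted_quantile_candidates(values, weights, n_bins):
    """Pick at most n_bins - 1 split candidates so that each bin
    holds roughly the same total weight."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    step = total / n_bins
    candidates = []
    cumulative = 0.0
    next_cut = step
    for value, weight in pairs:
        cumulative += weight
        if len(candidates) < n_bins - 1 and cumulative >= next_cut:
            candidates.append(value)
            # Skip every cut level that this sample has already crossed.
            while next_cut <= cumulative:
                next_cut += step
    return candidates

uniform = weighted_quantile_candidates([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8, n_bins=4)
skewed = weighted_quantile_candidates([1, 2, 3, 4], [5, 1, 1, 1], n_bins=2)
```

Only the returned candidates need to be scored as split points, instead of every distinct feature value, and a heavily weighted sample pulls the candidates toward itself.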

4. Data Description

4.1. Dependent Variable

The dependent variable in this study was firms’ default distance. Currently, the most widely used credit risk-evaluation models are Moody’s KMV, Credit Metrics of JPMorgan and CreditRisk+ of Credit Suisse.
KMV is based on Merton’s option pricing theory. It first estimates the market value and volatility of a firm’s assets and calculates default points based on the firm’s long-term and short-term debts. Then, the default distance is computed, and the expected default probability is derived in reference to the empirical distribution of the default distance.
The Credit Metrics model is based on a transition matrix of default probability and credit ratings published by credit-rating agencies. The value at risk (VaR) is calculated using the distribution curve of the value of a credit portfolio for a period of time. The Credit Metrics model also takes correlations between credit assets in the portfolio into consideration, but its assumptions have some defects. For example, it assumes that firms with the same credit rating have the same default probability based on the average of historical data, that asset returns obey the normal distribution and that the risk-free rate remains constant.
CreditRisk+ is inspired by casualty actuarial methods. It first grades loans in terms of risk exposure to calculate the distributions of defaults and losses in each band and then derives the probability distribution of loan default loss. This model assumes that each loan is independent of the others and that the default probability of the loan portfolios obeys a Poisson distribution. However, it is common for clients of commercial banks in China to provide related party guarantees, leading to interconnections between loans. Therefore, CreditRisk+ is not applicable in China. In reference to model comparisons [44], this study used KMV to calculate the quarterly default distance ($DD$) of non-financial A-share listed firms as the metric of credit risk.
Assuming that a firm raises funds through both equity and debt financing, $E$ is the equity value, $D$ is the debt value and $V$ is the asset value. When the debt matures, the firm can repay the debt if its asset value exceeds the debt value. Otherwise, a default occurs. KMV sets the default point ($DP$) as the sum of the short-term debt value ($SD$) and 50% of the long-term debt value ($LD$):
$$DP = SD + 0.5\,LD$$
As a firm’s asset value cannot be observed directly, the KMV model assumes that the market value of its assets obeys geometric Brownian motion and introduces the BSM option pricing model,
$$dV_t = \mu V_t\,dt + \sigma_V V_t\,dW$$
where $V_t$ denotes the market value of the firm’s assets at time $t$, $\mu$ is the expected growth rate of the asset value, $\sigma_V$ is the volatility of the asset value and $dW$ is a standard Wiener process.
The equity of a firm can be regarded as a European call option whose underlying asset is the firm’s total assets and whose strike price is its debt value. When the firm’s assets exceed its debt, the call option will be exercised. According to the BSM model, the relationship among the equity value $E_t$, asset value $V_t$ and debt value $D$ is as follows:
$$E_t = V_t N(d_1) - D e^{-rT} N(d_2)$$
$$d_1 = \frac{\ln\left(\frac{V_t}{D}\right) + \left(r + \frac{\sigma_V^2}{2}\right) T}{\sigma_V \sqrt{T}}$$
$$d_2 = \frac{\ln\left(\frac{V_t}{D}\right) + \left(r - \frac{\sigma_V^2}{2}\right) T}{\sigma_V \sqrt{T}} = d_1 - \sigma_V \sqrt{T}$$
where $r$ is the risk-free rate, which takes the value of the one-year lump-sum deposit and withdrawal interest rate announced annually by the People’s Bank of China. $N(\cdot)$ is the cumulative distribution function of the standard normal distribution. The asset value of the firm $V_t$ and its volatility $\sigma_V$ are unobservable.
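The BSM relationship above can be evaluated directly once values for the unobservable quantities are supposed. The sketch below uses only the Python standard library; the input numbers are hypothetical and serve only to show that the equity value lies between its intrinsic value and the asset value.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function N(.)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1_d2(V, D, r, sigma_V, T):
    """Return (d1, d2) for asset value V, debt D, rate r, volatility sigma_V, horizon T."""
    d1 = (math.log(V / D) + (r + 0.5 * sigma_V ** 2) * T) / (sigma_V * math.sqrt(T))
    d2 = d1 - sigma_V * math.sqrt(T)
    return d1, d2

def equity_value(V, D, r, sigma_V, T):
    """Equity as a European call on the firm's assets with strike D."""
    d1, d2 = d1_d2(V, D, r, sigma_V, T)
    return V * norm_cdf(d1) - D * math.exp(-r * T) * norm_cdf(d2)

# Hypothetical firm: assets 120, debt 100, 1.5% risk-free rate, 25% asset volatility.
equity = equity_value(V=120.0, D=100.0, r=0.015, sigma_V=0.25, T=1.0)
```

In practice the calculation runs in the opposite direction: the equity value is observed, and the asset value and its volatility have to be backed out.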
The equity value $E_t$ is the product of the quarterly averaged daily closing price and marketable volume. The daily closing price takes the impacts of the reinvestment of cash dividends into consideration. The equity value $E_t$ is a function of the asset value $V_t$ and time $t$. According to Ito’s lemma, the equity value $E_t$ also obeys geometric Brownian motion, and the volatility $\sigma_E$ satisfies
$$\sigma_E = \frac{V_t}{E_t} N(d_1)\,\sigma_V$$
In this study, $\sigma_E$ refers to the realized volatility [45]:
$$RV_k = \sum_{i=1}^{M} r_{k,i}^2$$
where $RV_k$ is the realized volatility in the $k$-th quarter, $M$ is the number of days in the quarter and $r_{k,i}$ is the logarithmic return of the firm’s stock on the $i$-th day in the $k$-th quarter. The quarterly equity market value is the product of the quarterly averaged closing price and marketable volume because there have been no non-tradable shares since China’s Split-Share Structure Reform in 2006. The asset market value $V$ and its volatility $\sigma_V$ are obtained by solving the equity value equation and the volatility equation simultaneously. The firm’s asset value at time $t$ is
$$V_t = V_0 \exp\left\{\left(\mu - \frac{\sigma_V^2}{2}\right) t + \sigma_V \sqrt{t}\,\varepsilon\right\}$$
where $\varepsilon$ is a standard normal random variable. When $t = T$, if $V_T \le D$, the firm will default. The default distance ($DD$) is defined as the distance between the asset value and the default point, scaled by the asset volatility:
$$DD = \frac{V_t - DP}{V_t\,\sigma_V}$$
The KMV Corporation constructed a default information database by collecting historical data on defaults of American firms and derived a mapping from the default distance to the expected default frequency ($EDF$). This empirical distribution is inapplicable to the Chinese market, and the assumption of a normal distribution lacks supporting evidence [46]. Therefore, this study used the default distance ($DD$).
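The steps of this subsection can be sketched end to end: back out the asset value and its volatility from the observed equity value and equity volatility, and then compute the default distance. The solver below alternates between a bisection on the equity equation and an update of the volatility link; this particular numerical scheme, the choice of total debt as the strike and all input numbers are assumptions made for illustration, since the solution method is not specified here.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function N(.)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1(V, D, r, sigma_V, T):
    return (math.log(V / D) + (r + 0.5 * sigma_V ** 2) * T) / (sigma_V * math.sqrt(T))

def equity_value(V, D, r, sigma_V, T):
    """BSM value of equity as a call on assets V with strike D."""
    x1 = d1(V, D, r, sigma_V, T)
    x2 = x1 - sigma_V * math.sqrt(T)
    return V * norm_cdf(x1) - D * math.exp(-r * T) * norm_cdf(x2)

def invert_equity(E, D, r, sigma_V, T):
    """Find V with equity_value(V) = E by bisection; equity increases with V."""
    lo, hi = E, E + D            # brackets the root when r >= 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if equity_value(mid, D, r, sigma_V, T) > E:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def solve_assets(E, sigma_E, D, r, T, tol=1e-12, max_iter=500):
    """Alternate between the equity equation and sigma_E = (V/E) N(d1) sigma_V."""
    sigma_V = sigma_E * E / (E + D)   # starting guess
    V = E + D
    for _ in range(max_iter):
        V = invert_equity(E, D, r, sigma_V, T)
        updated = sigma_E * E / (V * norm_cdf(d1(V, D, r, sigma_V, T)))
        converged = abs(updated - sigma_V) < tol
        sigma_V = updated
        if converged:
            break
    return V, sigma_V

def default_distance(V, sigma_V, SD, LD):
    """DD = (V - DP) / (V * sigma_V) with DP = SD + 0.5 * LD."""
    DP = SD + 0.5 * LD
    return (V - DP) / (V * sigma_V)

# Hypothetical firm-quarter; the strike is taken as total debt SD + LD (an assumption).
E, sigma_E, SD, LD, r, T = 40.0, 0.5, 50.0, 60.0, 0.015, 1.0
D = SD + LD
V, sigma_V = solve_assets(E, sigma_E, D, r, T)
DD = default_distance(V, sigma_V, SD, LD)
```

A larger `DD` indicates that the asset value sits further above the default point relative to its volatility, that is, lower credit risk.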

4.2. Independent Variable

Credit risk-evaluation indicator systems in the existing literature rely on firms’ attribute features, including the governance structure, liquidity, profitability, solvency, operation capability, growth, capital market performance and cash flow conditions. In addition to these features, this study incorporates RPT network features.

4.2.1. Construction of an RPT Network

It is given that, in contrast with unlisted firms, listed firms have more information disclosed, as well as greater impacts on the entire financial system. Therefore, this study built an RPT network based solely on listed firms. RPT data during the period from 2011 to 2019 were fetched from the Wind and CSMAR databases (the Wind Economic Database is the most widely used database in the financial industry in China, and its clients cover investment institutions, research institutions, academic institutions and regulatory agencies. The CSMAR, which is short for the China Stock Market and Accounting Research Database, is a comprehensive research-oriented database focusing on China’s Finance and Economy. CSMAR was developed by Shenzhen CSMAR Data Technology Co., Ltd. (Shenzhen, China), based on academic research needs, meeting international professional standards while adapting to China’s features) There were 231,964 records in total. This study used the deep network embedding algorithm. We processed the raw data into an adjacency matrix of RPTs between listed companies as the model input, the size of which was 2331 × 2331 . The detailed steps are explained later. Therefore, from the perspective of statistical learning, the sample size was adequate. To ensure the reliability and accuracy of the data, data from these two databases were cross-checked. We further referred to financial statements for random inspection of whether the RPTs of the companies were recorded correctly. The fields of RPTs included codes and abbreviations of securities, announcement dates, related parties and relations, amounts, trading methods and currencies. The attribute features of the disclosing party and related party included their full names in Chinese, their used names, IPO dates, listing segments, controlling shareholders and ownership, industrial sectors and business scope. Table 1 shows an overview of the annual RPTs in the last decade.
According to data from CSMAR, the related relations, transaction methods, payment methods and industries involved were diverse. In terms of categories of related relations, the most common scenarios were the disclosing party and related party belonging to the same controller or parent firm and the related party being the controlling shareholder or sharing the same key personnel. In terms of transaction methods, sales or purchases and the provision or acceptance of labor accounted for the greatest proportions. These were common upstream and downstream relationships in the supply chain. In contrast, mergers and acquisitions, contractual joint ventures and technology services accounted for a small proportion. In terms of payment methods, cash, bank transfers and acceptance bills were more common than equity, bonds and physical assets. In terms of industry distribution, 24 secondary industries were involved. Listed firms in materials, capital goods, technical hardware and equipment, transportation and the energy industry disclosed the largest number of RPTs, while those involved in telecommunication services, family and personal products, insurance, banking, food and main product retailing had the least disclosure. The industry classification adopted in this study was the Wind industry classification, which is based on the GICS (Global Industry Classification Standard) adapted to China’s market.
This study used a weighted adjacency matrix to represent the RPT network. We define the network as $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is a node set and $v_i$ denotes the $i$-th listed firm. All of the listed firms were sorted in ascending securities code order. $E = \{e_{ij}\}_{i,j=1}^{n}$ denotes an edge set, representing whether any two firms have related party connections.
In the weighted graph, if there is no related party connection between the $i$-th firm and the $j$-th firm, the weight of the edge between them, $s_{ij}$, equals zero. Otherwise, $s_{ij}$ is the ratio of the RPT amount to the total assets of the $j$-th firm. In the same way, the denominator of $s_{ji}$ is the total assets of the $i$-th firm instead. Therefore, this adjacency matrix is asymmetric. If several transactions are disclosed between two firms, the numerator is the sum of the transaction amounts. Elements on the diagonal are all zero because a firm has no RPTs with itself.
Then, this study processed the raw data on RPTs to construct the adjacency matrix. Firstly, records of unlisted firms were eliminated, and missing values were filled with zeros. The currencies of all the transactions were converted into yuan. After that, transactions with the same disclosing party and related party were combined to derive the sum of amounts. Table 2 shows the statistics on the two most important elements of the RPT network: nodes and edges. Taking the RPT network in 2019 as an example, there were only 505 non-isolated nodes, which was far below the total number of listed firms disclosing RPTs. This was because only RPTs between two listed firms were included in this network. There were 376 edges; in other words, the adjacency matrix had 752 non-zero elements. However, the dimension of the matrix was $2331 \times 2331$, so this network was highly sparse.
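The construction described above can be illustrated with a minimal, dependency-free sketch. It assumes that amounts disclosed between the same two firms are pooled regardless of which side disclosed them, so that each edge yields two non-zero entries; the function name, the toy securities codes and the amounts are hypothetical and are not taken from the study’s data.

```python
from collections import defaultdict

def build_rpt_matrix(firms, transactions, total_assets):
    """Return (ordered codes, weighted adjacency matrix S as a list of lists).

    firms: securities codes of listed firms (sorted ascending, as in the text).
    transactions: iterable of (disclosing_code, related_code, amount_in_yuan).
    total_assets: dict mapping securities code -> total assets in yuan.
    """
    order = sorted(firms)
    index = {code: k for k, code in enumerate(order)}
    # Sum all disclosed amounts between each unordered pair of listed firms
    # (pooling both directions is an assumption of this sketch).
    pair_amount = defaultdict(float)
    for a, b, amount in transactions:
        if a == b or a not in index or b not in index:
            continue  # drop self-pairs and records involving unlisted firms
        pair_amount[frozenset((a, b))] += amount
    n = len(order)
    S = [[0.0] * n for _ in range(n)]
    for pair, amount in pair_amount.items():
        a, b = tuple(pair)
        i, j = index[a], index[b]
        S[i][j] = amount / total_assets[order[j]]  # s_ij: scaled by firm j's assets
        S[j][i] = amount / total_assets[order[i]]  # s_ji: scaled by firm i's assets
    return order, S

# Hypothetical toy data: "999999" stands for an unlisted counterparty.
firms = ["000002", "000001", "600000"]
transactions = [
    ("000001", "000002", 50.0),
    ("000001", "000002", 150.0),
    ("600000", "999999", 10.0),
]
total_assets = {"000001": 1000.0, "000002": 4000.0, "600000": 500.0}
order, S = build_rpt_matrix(firms, transactions, total_assets)
```

In this toy case the two transactions between the first two firms are summed to 200, which is then divided by a different denominator in each direction, so the matrix is asymmetric while the third firm remains an isolated node.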

4.2.2. Structural Pattern Mining of the RPT Network

This study investigated the structural patterns of connected subgraphs in the RPT network using NetworkX in Python, which is widely used in the study of the structure, dynamics and functions of complex networks. Firstly, we computed the degrees of all the nodes in the network. The degree refers to the number of edges connected with a certain node. A firm disclosing no RPTs is an isolated node whose degree is zero. The higher the degree of a node, the more RPT relations it has with other firms, and the larger its impact on the network. A connected subgraph refers to an undirected graph in which every pair of nodes is joined by a path. Then, connected subgraphs with different numbers of nodes were counted. In Table 3, we took the network in 2019 as an example. It was divided into 187 connected subgraphs. Two-node connected subgraphs were the most common, with 127 instances. The largest connected subgraph consisted of 31 nodes.
Because connected subgraphs with the same number of nodes can still differ in structure depending on how the nodes are connected, this study classified subgraphs based on node degrees. Taking a three-node subgraph as an example, it had two different structures. One was circular, while the other was linear. The node degree lists were $[2, 2, 2]$ and $[1, 2, 1]$, respectively, which allowed differentiation between these two structures. Table 4 shows the statistics on the classification of the connected subgraphs’ structural patterns. Figure 2 is a schematic of each type of structural pattern.
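The degree-based classification can be sketched as follows. The study used NetworkX; this illustration instead uses only the Python standard library so that it runs without extra dependencies, and it sorts each degree list in descending order so that the pattern does not depend on node ordering (an assumption of this sketch). The edge list is hypothetical.

```python
from collections import Counter, defaultdict, deque

def connected_components(edges):
    """Return (adjacency dict, list of components) for an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        queue, comp = deque([start]), []
        seen.add(start)
        while queue:
            node = queue.popleft()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(comp)
    return adj, components

def structural_pattern(adj, comp):
    """Degree list of a component, sorted in descending order."""
    return tuple(sorted((len(adj[n]) for n in comp), reverse=True))

# Hypothetical edges: a pair, a triangle, a three-node chain and a four-node star.
edges = [(0, 16), (1, 2), (2, 3), (3, 1), (4, 5), (5, 6),
         (15, 18), (15, 20), (15, 26)]
adj, comps = connected_components(edges)
patterns = Counter(structural_pattern(adj, c) for c in comps)
```

Here the triangle and the chain both have three nodes but fall into different patterns, `(2, 2, 2)` and `(2, 1, 1)`, mirroring the circular and linear structures discussed above.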
Table 4 and Figure 2 show that the RPT network represented by the adjacency matrix was high-dimensional and sparse. Therefore, it could not be used directly as an input for machine learning tasks such as classification and regression. This study employed network embedding algorithms [5,47] to represent the structural features of nodes as vectors in a low-dimensional, dense space while preserving the local and global structures. The adjacency matrix was then fed into SDNE to learn a two-dimensional vector representation. The values of the vector in the two dimensions were $RPT_1$ and $RPT_2$, respectively, both of which were input into the credit risk-evaluation models.

4.3. Descriptive Statistics

This study investigated the significance of incorporating RPT network features into credit risk evaluation based on the quarterly data of all A-share firms in China from 2011 to 2019. Table 1 shows the total numbers of listed firms annually from 2011 to 2019. The number of observations varied with the numbers of listed and delisted firms. There were 103,629 observations in total, exclusive of financial firms and firms with over three missing values. Data of independent variables were fetched from the Wind and CSMAR databases. Table 5 shows the descriptive statistics of the main variables (Table 6). In terms of liquidity, the median of $QuickRatio$ was 1.115, showing that over half of the companies could pay their current debts using current assets. In terms of profitability, the minimum and the first quartile of $ProfitMargin$ were −0.152 and 0.031, respectively, indicating that some of the companies (fewer than a quarter, given the positive first quartile) suffered operating losses. In terms of solvency, the mean and the third quartile of $Property$ were 0.797 and 1.092, respectively, demonstrating that a majority of the companies had a stable financial structure. In terms of capital market performance, the intrinsic value ($NAPS$) of the companies varied significantly, ranging from −2.070 to 10.385. The median of $MNCF$ was 0, meaning that only about half of the companies had positive cashflows and capabilities of capital expenditure. In terms of governance structure, over a quarter of the companies had the same general manager and chairman. The mean of $Separation$ was only 0.052, which indicated that the degree of separation of ownership and managerial authority was still low. Meanwhile, the sample covered companies with different ownership concentrations, ranging from 0.030 to 0.834.
Table 7 shows the Pearson correlation coefficients of the main variables. It can be seen that $Equity/MarketValue$ and $Debt/MarketValue$ had significant correlations with the firms’ default distance, and the correlation coefficients were all below 50%. We performed Z-score normalization on continuous variables.
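As a brief illustration of the Z-score normalization mentioned above, each value is centered on the variable’s mean and divided by its standard deviation. The sketch below assumes the population standard deviation is used, which the text does not specify, and the input values are hypothetical.

```python
import statistics

def z_score(values):
    """Standardize a list of numbers to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    return [(v - mean) / std for v in values]

# Hypothetical raw values of one continuous variable.
raw = [1.0, 2.0, 3.0, 4.0, 5.0]
normalized = z_score(raw)
```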

5. Empirical Results

5.1. DANE-Based RPT Network Embedding

We randomly selected a 39-node subnetwork for a small-sample experiment. The input of DANE included edge sets, attributes and labels of nodes. Some of the nodes were connected directly, while the others were connected indirectly. Each edge directly connecting two nodes was expressed as a node pair consisting of a source and a target, which contained first-order proximity information. To obtain sets of indirectly connected nodes that provided high-order proximity information, we iterated over all of the connected subgraphs with more than two nodes in the network using NetworkX in Python. We used the logarithm of the Z-score-normalized total assets as a node attribute. The node label was set according to the rank of the default distance, which reflected the firm’s credit risk level. The larger the default distance, the lower the credit risk. The label of a firm whose default distance ranked in the top 50% was one; otherwise, it was zero.
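The labeling rule can be sketched in a few lines. The firm identifiers and default distances below are hypothetical, and the handling of ties and of an odd number of firms is an assumption of this sketch rather than something stated in the text.

```python
def binary_labels(default_distance):
    """Label 1 for firms whose default distance ranks in the top 50%, else 0."""
    # Rank firms from the largest default distance (lowest credit risk) downwards.
    ranked = sorted(default_distance, key=default_distance.get, reverse=True)
    top_half = set(ranked[: len(ranked) // 2])  # assumption: floor for odd counts
    return {firm: int(firm in top_half) for firm in default_distance}

# Hypothetical default distances for four firms.
labels = binary_labels({"A": 3.2, "B": 1.1, "C": 2.5, "D": 0.4})
```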
In Figure 3, the node labels are distinguished by two colors. Purple represents a smaller default distance, that is, higher credit risk, whereas yellow represents a larger default distance and lower credit risk. The numbers distinguish the nodes in the following subgroup analysis. For instance, $v_0$ connects directly with $v_{16}$ to form a connected subgraph. Both nodes in this subgraph had a degree of one, and their total assets were approximately equal. The two-dimensional vectors of these two nodes were therefore almost identical, owing to the similarity in their features. Another example is the connected subgraph composed of $v_{15}$, $v_{18}$, $v_{20}$ and $v_{26}$, whose degree list is $[3, 1, 1, 1]$. Node $v_{15}$ differed from the other three nodes in its structural features and its higher total assets. Therefore, $v_{18}$, $v_{20}$ and $v_{26}$ gathered at the bottom left, while $v_{15}$ was far away. The visualization results varied in the node colors and distributions across different subnetworks due to the different topological structures and attributes of the nodes.
DANE’s capability of distinguishing nodes with different attributes and structures was confirmed through a small-sample experiment on a sub-network. We fed the whole sample into the DANE model. As it required the input of a fully connected network, isolated nodes—in other words, firms disclosing no RPTs—were eliminated. The remainder were ranked and renumbered in ascending order of their security codes. The statistics showed that there were 505 non-isolated nodes in 2019. Each node had 24 features. There would be much information loss if the objective dimension of the network embedding was too low. Meanwhile, an excessively high dimension would negatively influence the forecasting performance of machine learning models in the next step. We optimized the model parameters with the goal of lowering the dimension of the objective space as much as possible. If the loss function did not significantly increase with the decrease in the dimension, the current dimension was adopted. The optimal parameters were as follows: the number of hidden layers was 2, the numbers of neurons were 300 and 200, respectively, and the dimension of objective low-dimensional space was 20.

5.2. XGBoost-Based Credit Risk Evaluation

To study the influence of incorporating integrated features of the RPT network structure and the firms’ attributes into credit risk evaluation, we constructed the following three models.
Model 1 ($XGB_1$): Referring to existing credit risk-evaluation indicator systems, this model predicted the default distance solely based on a firm’s attribute features, such as financial indicators and governance structures.
Model 2 ($XGB_2$): This model incorporated structural features of the SDNE-based RPT network into the baseline model $XGB_1$. Then, we used SHAP, an explainable machine learning package, to measure the significance of structural features. Finally, a comparative test verified whether incorporating structural features could improve prediction accuracy.
Model 3 ($XGB_3$): This model incorporated integrated features of the RPT network structure and firms’ attributes into the baseline model $XGB_1$. Then, a comparative test between $XGB_2$ and $XGB_3$ verified whether integrated features provided more information to help improve forecasting performance.
The parameters of XGBoost fall into three categories: general parameters, model parameters and learning task parameters. The parameters to be optimized were the learning rate $eta$, the maximum number of iterations $n\_estimator$, the minimum loss reduction required to split a leaf node $\gamma$, the maximum depth of a tree $max\_depth$, the minimum sum of instance weights needed in a child node $min\_child\_weight$, the column subsampling ratio $colsample\_bytree$ and the subsampling ratio of the training instances $subsample$.
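As a rough illustration, the parameters listed above can be arranged into a search grid whose combinations are then enumerated one by one. The candidate values below are hypothetical and are not the settings reported in Table 8; the keys follow the names used in the text (the scikit-learn style wrapper of XGBoost spells the iteration count `n_estimators`).

```python
from itertools import product

# Hypothetical candidate values; keys follow the parameter names in the text.
param_grid = {
    "eta": [0.05, 0.1],
    "n_estimator": [200, 400],
    "gamma": [0.0, 0.1],
    "max_depth": [4, 6],
    "min_child_weight": [1, 3],
    "colsample_bytree": [0.8],
    "subsample": [0.8],
}

names = list(param_grid)
# Every combination of candidate values becomes one configuration to evaluate.
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]
```

Each dictionary in `candidates` would be scored with cross-validation, and the configuration with the lowest average error retained.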
Grid search and cross-validation are common methods for the optimization of model parameters. The core concept of grid search is to iterate over the possible values of all of the parameters and then choose the combination with the best performance. A single split into a training set and a test set is simpler, but the final model parameters then depend on how the data are divided. Such a split does not make the most of the data, and different divisions can lead to large differences in model performance and weaker predictions. Therefore, this study employed five-fold cross-validation: the dataset was split into five mutually exclusive subsets, and in each round four subsets were used for training while the remaining subset was used for testing, so that each subset served as the test set exactly once. The root mean square error was calculated on each test set, and the average of the five values was $CV_5$.
$$CV_5 = \frac{1}{5}\sum_{i=1}^{5} RMSE_i$$
$$RMSE_i = \sqrt{\frac{1}{m}\sum_{j=1}^{m}\left(\hat{y}_j - y_j\right)^2}$$
where $m$ is the number of samples in the $i$-th test set, $y_j$ is the actual value of the default distance and $\hat{y}_j$ is the predicted value of the default distance. Table 8 shows the parameter settings of the credit risk-evaluation model.
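The computation of $CV_5$ can be sketched as follows. To stay self-contained, the sketch uses a trivial mean predictor as a stand-in for the XGBoost regressor and assigns samples to folds by index; both choices, along with the function names and toy data, are assumptions made for illustration only.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    m = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / m)

def five_fold_cv(X, y, fit_predict, k=5):
    """Return (average RMSE over k folds, list of per-fold RMSE values)."""
    n = len(y)
    folds = [list(range(f, n, k)) for f in range(k)]  # mutually exclusive subsets
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        scores.append(rmse([y[i] for i in test_idx], preds))
    return sum(scores) / k, scores

def mean_predictor(X_train, y_train, X_test):
    """Placeholder model: predicts the training mean (stands in for XGBoost)."""
    mean = sum(y_train) / len(y_train)
    return [mean] * len(X_test)

# Hypothetical toy data: ten samples with alternating targets.
X = [[float(i)] for i in range(10)]
y = [float(i % 2) for i in range(10)]
cv5, fold_rmse = five_fold_cv(X, y, mean_predictor)
```

In practice `fit_predict` would train the boosted-tree model on the four training folds and return its predictions for the held-out fold.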

5.3. Evaluation of the Significance of RPT Network Features

SHAP (Shapley Additive Explanations) was proposed to measure the contribution of each feature to model predictions [40].
$$prediction_i = baseprediction + f(x_{i1}) + f(x_{i2}) + \cdots + f(x_{ik})$$
where $prediction_i$ denotes the prediction of the $i$-th sample, $baseprediction$ is the average of the predictions of all samples in the dataset, and $f(x_{ij})$ is the SHAP value of the $j$-th feature of the $i$-th sample. If $f(x_{ij})$ is positive, the feature has a positive impact on the prediction; otherwise, it has a negative impact.
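The additive decomposition can be checked with a small worked example. The base prediction and SHAP values below are hypothetical numbers chosen for illustration; in the study they would come from the tree explainer applied to the fitted model.

```python
# Hypothetical base prediction and per-feature SHAP values for one sample.
base_prediction = 2.0
shap_values = {"RPT1": 0.5, "QuickRatio": -0.25, "ProfitMargin": 0.125}

# The prediction is the base value plus the sum of all SHAP values.
prediction = base_prediction + sum(shap_values.values())

# The sign of each SHAP value gives the direction of the feature's impact.
direction = {
    name: ("raises" if value > 0 else "lowers")
    for name, value in shap_values.items()
}
```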
We first constructed a tree explainer supporting XGBoost and used the feature matrix to calculate the SHAP values. Figure 4 shows the SHAP rankings of all features in descending order of the average SHAP value over all of the samples. Each point in the figure denotes a sample. The colors of the points stand for the values of the features: red points indicate higher feature values than blue ones. The abscissa stands for the SHAP value of a feature in one specific sample.
Figure 4 shows that $RPT_1$ ranked 7th and $RPT_2$ ranked 24th out of the 26 features, where $RPT_1$ and $RPT_2$ were the two dimensions of the features derived from SDNE-based network embedding. $RPT_1$ contributed far more than $RPT_2$ to the predictions, illustrating that $RPT_1$ contained more information on the RPT network structure. Therefore, we focus on exploring $RPT_1$ below.
There are two possible circumstances when a feature has a high SHAP value. The first scenario is that a minority of samples have extremely high SHAP values and increase the average. In such a case, this feature only has a significant impact on a few samples, while it has little impact on the others. Another scenario is that the feature has a large impact on a majority of the samples.
In addition to the macro-feature importance, this study further analyzed the statistics of the feature importance for each sample. In Figure 5, the horizontal axis stands for the importance rankings of the RPT network features. The vertical axis shows the proportions of samples whose RPT features had a specific importance ranking.
As shown in Figure 5, the importance of $RPT_1$ was ranked first in 30.49% of the samples and was ranked in the top half in 95.95% of the samples. Figure 6 shows that the importance of $RPT_2$ was ranked first in only 6.84% of the samples, but it was ranked in the top half in 82.96% of the samples. We can conclude that the RPT network’s structural features had significant contributions to the credit risk evaluation.
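The per-sample ranking statistics can be sketched as below. The sketch assumes that importance within a sample is ranked by the absolute SHAP value, which the text does not state explicitly; the feature names and SHAP values are hypothetical.

```python
def rank_share(shap_matrix, feature_names, target):
    """Share of samples in which `target` holds each importance rank.

    Rank 1 means the largest absolute SHAP value within the sample.
    """
    col = feature_names.index(target)
    counts = [0] * len(feature_names)
    for row in shap_matrix:
        # Rank = 1 + number of features with a strictly larger |SHAP| value.
        rank = 1 + sum(abs(v) > abs(row[col]) for v in row)
        counts[rank - 1] += 1
    return [c / len(shap_matrix) for c in counts]

# Hypothetical SHAP values: one row per sample, one column per feature.
feature_names = ["RPT1", "RPT2", "QuickRatio", "ProfitMargin"]
shap_matrix = [
    [0.9, 0.1, -0.3, 0.2],
    [-0.4, 0.05, 0.6, 0.1],
    [0.7, -0.2, 0.1, 0.3],
    [0.05, 0.5, -0.6, 0.2],
]
shares = rank_share(shap_matrix, feature_names, "RPT1")
top_half_share = sum(shares[: len(feature_names) // 2])
```

With these toy values, `shares[0]` is the proportion of samples in which the feature ranks first, and `top_half_share` corresponds to the "top half" proportion reported above.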
It was possible that RPT features contributed significantly to the prediction of the default distance only because they contained information on firms’ attribute features. In other words, they might merely have been an integration of multiple attribute features. To rule out this possibility, we utilized a linear regression to test whether the RPT network’s structural features could be explained by other features in the model. Taking $RPT_1$ as an example, we first calculated the Pearson correlations between the RPT network features and other attribute features.
It can be seen in Table 9 that the correlations were all below 50%. In other words, there were no significant correlations between $RPT_1$, the structural feature of the RPT network, and most independent variables. Then, the variance inflation factor (VIF) was used to detect whether multicollinearity existed between independent variables in the regression analysis by measuring how much the variance of an estimated coefficient was inflated. If left undetected, multicollinearity can cause the regression coefficients to be consistent but not statistically significant, leading to a type II error (a type II error is a false negative in statistical hypothesis testing, where the test fails to reject a false null hypothesis). Table 10 shows that the value of 1/VIF for all of the variables was above 0.2, which indicated that there was no serious multicollinearity.
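For reference, the 1/VIF threshold can be restated using the standard definition of the variance inflation factor. Here $R_j^2$, a symbol not used in the original text, denotes the coefficient of determination obtained by regressing the $j$-th independent variable on all the others.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad
\frac{1}{\mathrm{VIF}_j} = 1 - R_j^2 > 0.2
\;\Longleftrightarrow\; R_j^2 < 0.8
\;\Longleftrightarrow\; \mathrm{VIF}_j < 5 .
```

A value of 1/VIF above 0.2 therefore means that less than 80% of the variance of each variable is explained by the remaining variables.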
An F-test was also used to examine whether the coefficients of the regression equation were significantly non-zero. In other words, it examined whether the model that we specified significantly outperformed the intercept-only model, which had no predictors. The results showed that the F-statistic value was 0.8127, which was below the critical value of $F_{0.1}(25, 413)$. Therefore, we could not reject the null hypothesis.
$$H_0: b_1 = b_2 = \cdots = b_{k-1} = 0$$
where $b_i$ ($i = 1, 2, \ldots, k-1$) is the regression coefficient of independent variable $x_i$. This indicated that the RPT network features could not be explained by other attribute features and provided extra information for credit risk evaluation.

5.4. Comparing the Credit Risk-Evaluation Performance

To examine whether incorporating the RPT network features could significantly improve the performance in forecasting the default distance, we used the “Comparison Test” in Machine Learning [48] to compare the prediction accuracy of $XGB_1$, $XGB_2$ and $XGB_3$, of which $XGB_1$ was the baseline model. This study first compared $XGB_1$ and $XGB_2$ to verify that the SDNE-based RPT network features improved the prediction performance. The comparison between $XGB_2$ and $XGB_3$ examined whether the DANE-based integrated features of the RPT network structure and the firm attributes outperformed SDNE-based features in $XGB_2$.
We used a paired t-test for the comparison. William Sealy Gosset proposed the t-distribution in a classic paper entitled “The Probable Error of a Mean” in Biometrika in 1908. Ronald Fisher further developed the t-test, which has become one of the most widely used hypothesis testing methods in the field of statistics. Let the root mean square errors derived from five-fold cross-validation be $\epsilon_1^1, \epsilon_2^1, \epsilon_3^1, \epsilon_4^1, \epsilon_5^1$ and $\epsilon_1^2, \epsilon_2^2, \epsilon_3^2, \epsilon_4^2, \epsilon_5^2$, where $\epsilon_i^1$ and $\epsilon_i^2$ are the root mean square errors of the two models being compared ($XGB_1$ versus $XGB_2$, or $XGB_2$ versus $XGB_3$) on the $i$-th-fold test set. The difference in errors of each pair was $\Delta_i$.
$$\Delta_i = \epsilon_i^1 - \epsilon_i^2, \quad i = 1, 2, 3, 4, 5$$
$$\epsilon_i^k = \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(Y_j^k - f(x_j^k)\right)^2}, \quad k = 1, 2$$
where $N$ is the number of samples. The null hypothesis assumed that the two models had similar performances. Therefore, the test errors on the same training set and test set, $\epsilon_i^1$ and $\epsilon_i^2$, should be close to each other. In other words, $\mu$, the average of $\Delta_i$, would be equal to zero. Then, we conducted a t-test on the null hypothesis and compared the t-statistic with the critical value of the t-distribution with four degrees of freedom.
$$\tau_t = \frac{\sqrt{5}\,\mu}{\sigma}$$
where $\sigma$ is the standard deviation of the differences in test errors. We first conducted a paired t-test on $XGB_1$ and $XGB_2$ and calculated $\tau_t$, which was 2.424. This was above $t_{10\%/2,\,4} = 2.132$ and below $t_{5\%/2,\,4} = 2.776$. Under the null hypothesis, the probability of $\tau_t$ falling into the interval $[-2.132, 2.132]$ was 90%. Therefore, the null hypothesis was rejected, indicating that $XGB_2$ outperformed $XGB_1$ at a 10% significance level. In other words, incorporating the RPT network’s structural features significantly improved the model performance. This is in line with existing research on the network risk contagion effect: the path and scope of risk transmission differ with the features of the RPT network structure, thus making a difference in firms’ credit risk.
Then, we conducted a t-test on $XGB_2$ and $XGB_3$. $\tau_t$ was 2.351, which was above $t_{10\%/2,\,4} = 2.132$ and below $t_{5\%/2,\,4} = 2.776$. Under the null hypothesis, the probability of $\tau_t$ falling into the interval $[-2.132, 2.132]$ was 90%. Therefore, the null hypothesis was rejected, and $XGB_3$ outperformed $XGB_2$ at a 10% significance level. In other words, the DANE-based integrated features of the RPT network structure and firms’ attributes outperformed the SDNE-based structural features.
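The paired t-test on five-fold errors can be sketched as follows. The fold-level RMSE values are hypothetical, the sample standard deviation (with four degrees of freedom) is assumed for $\sigma$, and the absolute value is taken so that the statistic can be compared directly with the two-sided critical values quoted in the text.

```python
import math
import statistics

def paired_t_statistic(errors_a, errors_b):
    """Absolute paired t-statistic for fold-wise errors of two models."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    k = len(diffs)
    mu = statistics.mean(diffs)
    sigma = statistics.stdev(diffs)  # assumption: sample standard deviation
    return abs(math.sqrt(k) * mu / sigma)

# Hypothetical RMSE values of two models on the same five folds.
rmse_model_a = [0.50, 0.52, 0.48, 0.51, 0.49]
rmse_model_b = [0.47, 0.52, 0.47, 0.47, 0.48]

tau = paired_t_statistic(rmse_model_a, rmse_model_b)

# Critical values of the t-distribution with four degrees of freedom (from the text).
reject_at_10_percent = tau > 2.132
reject_at_5_percent = tau > 2.776
```

With these toy errors the statistic is roughly 2.45, so the null hypothesis of equal performance would be rejected at the 10% level but not at the 5% level, the same pattern as in the comparisons reported above.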
The reason for this result was that DANE took both structural and attribute information into consideration. At the same time, DANE emphasized the complementarity and consistency of these two aspects of information. Complementarity requires more comprehensive and less redundant information. Consistency eliminates noise and distractions. In addition, DANE captured the highly nonlinear deep structure of the attribute and structural features. In contrast, the firms’ own attribute features only provided linear information. Therefore, incorporating DANE-based integrated features did not cause collinearity problems and instead served as a complement by supplying nonlinear information. Had the results been non-significant, this would have implied that the topological structure and attribute features provided no additional information and, thus, failed to contribute to the improvement of model performance, which would have contradicted the existing literature.

6. Conclusions

This study first constructed a weighted adjacency matrix to represent the RPT network among China’s A-share listed firms and analyzed the structural patterns of connected subgraphs within this network. Subsequently, we employed DANE, a deep network embedding algorithm, to learn low-dimensional vector representations that captured both the network structure and firm attribute information. These enriched representations were then integrated into XGBoost-based credit risk-evaluation models. Finally, we leveraged SHAP and comparative analysis to rigorously assess the impact of incorporating RPT network information on model performance.
This study yielded several key findings. First, the SHAP values and macro-feature importance verified the significant contributions of the RPT network’s structural features to credit risk evaluation. Regression analysis further showed that RPT network features provide additional information beyond node attributes. Second, the paired t-test results showed that incorporating RPT network features significantly enhanced the credit risk-evaluation performance. Finally, the DANE-based approach, which incorporated both the RPT network’s topological structure and the firm’s attribute information in a complementary and consistent manner, outperformed the SDNE-based method that relied solely on structural features.
This study was based solely on A-share listed firms. Because many blockchain firms are unlisted, we will further expand the sample by cooperating with commercial banks and securities companies for more data sources and by building a federated learning mechanism. Federated learning will not only enable model training on datasets distributed across a multitude of partners but also ensure data privacy. Meanwhile, in addition to related party transactions, the methodology can be extended to more social relations between blockchain firms, covering the upstream and downstream areas of the industry chain, pledges and guarantees, and family businesses. This will provide practical implications in credit risk identification, assessment and prevention for regulators, financial institutions, blockchain firms and investors.

Author Contributions

Conceptualization, Y.C. and L.F.; methodology, L.L. and L.F.; software, L.L.; validation, Y.C. and L.L.; formal analysis, Y.C., L.L. and L.F.; data curation, L.L.; writing—original draft preparation, L.L.; writing—review and editing, Y.C. and L.F.; visualization, L.L.; supervision, Y.C. and L.F.; project administration, Y.C.; funding acquisition, Y.C. and L.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Key R&D Program of China (No. 2021YFC3340600, No. 2021YFC3340603) and the National Natural Science Foundation of China (No. 72342024).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

We are grateful to three anonymous reviewers and the editor for their insightful comments and constructive suggestions. Ying Chen acknowledges financial support from the National Key R&D Program of China (No. 2021YFC3340600, No. 2021YFC3340603); Libing Fang acknowledges financial support from the National Natural Science Foundation of China (No. 72342024).

Conflicts of Interest

Author Ying Chen was employed by the Nanjing Institute of Digital Financial Industry Co., Ltd. Author Lingjie Liu was employed by the CITIC Group Corporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
RPT: Related Party Transaction

References

  1. Altman, E.I. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. J. Financ. 1968, 23, 589–609. [Google Scholar] [CrossRef]
  2. Johnson, S.; La Porta, R.; Lopez-de-Silanes, F.; Shleifer, A. Tunneling. Am. Econ. Rev. 2000, 90, 22–27. [Google Scholar] [CrossRef]
  3. Bae, S.C.; Kwon, T.H. Do firms benefit from related party transactions with foreign affiliates? Evidence from Korea. Int. Rev. Financ. 2021, 21, 945–965. [Google Scholar] [CrossRef]
  4. Das, S.R.; Duffie, D.; Kapadia, N.; Saita, L. Common failings: How corporate defaults are correlated. J. Financ. 2007, 62, 93–117. [Google Scholar] [CrossRef]
  5. Gao, H.; Huang, H. Deep attributed network embedding. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, 13–19 July 2018. [Google Scholar]
  6. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  7. Merton, R.C. On the pricing of corporate debt: The risk structure of interest rates. J. Financ. 1974, 29, 449–470. [Google Scholar]
  8. Beaver, W.H. Financial ratios as predictors of failure. J. Account. Res. 1966, 4, 71–111. [Google Scholar] [CrossRef]
  9. Carling, K.; Jacobson, T.; Lindé, J.; Roszbach, K. Corporate credit risk modeling and the macroeconomy. J. Bank. Financ. 2007, 31, 845–868. [Google Scholar] [CrossRef]
  10. Bonfim, D. Credit risk drivers: Evaluating the contribution of firm level information and of macroeconomic dynamics. J. Bank. Financ. 1968, 33, 281–299. [Google Scholar] [CrossRef]
  11. Cheung, Y.L.; Rau, P.R.; Stouraitis, A. Tunneling, propping, and expropriation: Evidence from connected party transactions in Hong Kong. J. Financ. Econ. 2006, 82, 343–386. [Google Scholar] [CrossRef]
  12. Cheung, Y.L.; Qi, Y.; Rau, P.R.; Stouraitis, A. Buy high, sell low: How listed firms price asset transfers in related party transactions. J. Bank. Financ. 2009, 33, 914–924. [Google Scholar] [CrossRef]
  13. Jiang, G.; Lee, C.M.; Yue, H. Tunneling through intercorporate loans: The China experience. J. Financ. Econ. 2010, 98, 589–609. [Google Scholar] [CrossRef]
  14. Hope, O.-K.; Lu, H.; Saiy, S. Director compensation and related party transactions. Rev. Account. Stud. 2019, 24, 1392–1426. [Google Scholar] [CrossRef]
  15. Zhang, H.; Li, M.; Yang, Y. Does common institutional ownership constrain related party transactions? Evidence from China. Int. Rev. Econ. Financ. 2024, 93, 1015–1042. [Google Scholar] [CrossRef]
  16. Chen, G.-Z. Social ties and related party transactions. J. Int. Account. Audit. Tax. 2023, 53, 100577. [Google Scholar] [CrossRef]
  17. Friedman, E.; Johnson, S.; Mitton, T. Propping and tunneling. J. Comp. Econ. 2003, 31, 732–750. [Google Scholar] [CrossRef]
  18. Peng, W.Q.; Wei, K.C.J.; Yang, Z. Tunneling or propping: Evidence from connected transactions in China. J. Corp. Financ. 2011, 17, 306–325. [Google Scholar] [CrossRef]
  19. Cabrales, A.; Gottardi, P.; Vega-Redondo, F. Risk-Sharing and Contagion in Networks. Rev. Financ. Stud. 2017, 30, 3086–3127. [Google Scholar] [CrossRef]
  20. Veraart, M.L.A. Distress and default contagion in financial networks. Math. Financ. 2020, 30, 705–737. [Google Scholar] [CrossRef]
  21. Fang, L.; Sun, B.; Li, H.; Yu, H. Systemic risk network of Chinese financial institutions. Emerg. Mark. Rev. 2018, 35, 190–206. [Google Scholar] [CrossRef]
  22. Zhang, W.; Wang, J. Credit risk contagion in complex companies network–Empirical research based on listed agricultural companies. Econ. Anal. Policy 2024, 82, 938–953. [Google Scholar] [CrossRef]
  23. Zhao, Z.; Chen, D.; Wang, L.; Han, C. Credit Risk Diffusion in Supply Chain Finance: A Complex Networks Perspective. Sustainability 2018, 10, 4608. [Google Scholar] [CrossRef]
  24. Li, W.; Ben, S.; Hommel, U.; Paterlini, S.; Yu, J. Default contagion and systemic risk in loan guarantee networks. Account. Financ. 2019, 59, 1923–1946. [Google Scholar] [CrossRef]
  25. Perozzi, B.; Al-Rfou, R.; Skiena, S. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710. [Google Scholar]
  26. Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q. Line: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1067–1077. [Google Scholar]
  27. Grover, A.; Leskovec, J. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864. [Google Scholar]
  28. Perozzi, B.; Kulkarni, V.; Chen, H.; Skiena, S. Don’t walk, skip! Online learning of multi-scale network embeddings. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, Sydney, Australia, 31 July–3 August 2017; pp. 258–265. [Google Scholar]
  29. Huang, X.; Li, J.; Hu, X. Accelerated attributed network embedding. In Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, TX, USA, 27–29 April 2017; pp. 633–641. [Google Scholar]
  30. Zmijewski, M.E. Methodological issues related to the estimation of financial distress prediction models. J. Account. Res. 1984, 22, 59–82. [Google Scholar] [CrossRef]
  31. Campbell, J.Y.; Hilscher, J.; Szilagyi, J. In search of distress risk. J. Financ. 2008, 63, 2899–2939. [Google Scholar] [CrossRef]
  32. López, T.F.J.; Pastor, S.I. Bankruptcy visualization and prediction using neural networks: A study of US commercial banks. Expert Syst. Appl. 2015, 42, 2857–2869. [Google Scholar] [CrossRef]
  33. Altman, E.I.; Marco, G.; Varetto, F. Corporate distress diagnosis: Comparisons using linear discriminant analysis and neural networks (the Italian experience). J. Bank. Financ. 1994, 18, 505–529. [Google Scholar] [CrossRef]
  34. Shin, K.S.; Lee, T.S.; Kim, H.J. An application of support vector machines in bankruptcy prediction model. Expert Syst. Appl. 2005, 28, 127–135. [Google Scholar] [CrossRef]
  35. Chen, M.-Y. Bankruptcy prediction in firms with statistical and intelligent techniques and a comparison of evolutionary computation approaches. Comput. Math. Appl. 2011, 62, 4514–4524. [Google Scholar] [CrossRef]
  36. Lu, Y.; Zhu, J.; Zhang, N.; Shao, Q. A hybrid switching PSO algorithm and support vector machines for bankruptcy prediction. In Proceedings of the 2014 International Conference on Mechatronics and Control (ICMC), Jinzhou, China, 3–5 July 2014; pp. 1329–1333. [Google Scholar]
  37. Gunnarsson, B.R.; Vanden Broucke, S.; Baesens, B.; Óskarsdóttir, M.; Lemahieu, W. Deep learning for credit scoring: Do or don’t? Eur. J. Oper. Res. 2021, 295, 292–305. [Google Scholar] [CrossRef]
  38. Zięba, M.; Tomczak, S.K.; Tomczak, J.M. Ensemble boosted trees with synthetic features generation in application to bankruptcy prediction. Expert Syst. Appl. 2016, 58, 93–101. [Google Scholar] [CrossRef]
  39. Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A review of machine learning interpretability methods. Entropy 2020, 23, 18. [Google Scholar] [CrossRef] [PubMed]
  40. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
  41. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144. [Google Scholar]
  42. Bücker, M.; Szepannek, G.; Gosiewska, A.; Biecek, P. Transparency, auditability, and explainability of machine learning models in credit scoring. J. Oper. Res. Soc. 2022, 73, 70–90. [Google Scholar] [CrossRef]
  43. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
  44. Crouhy, M.; Galai, D.; Mark, R. A comparative analysis of current credit risk models. J. Bank. Financ. 2000, 24, 59–117. [Google Scholar] [CrossRef]
  45. Andersen, T.G.; Bollerslev, T.; Diebold, F.X.; Ebens, H. The distribution of realized stock return volatility. J. Financ. Econ. 2001, 61, 43–76. [Google Scholar] [CrossRef]
  46. Chen, Y.; Chu, G. Estimation of default risk based on KMV model—An empirical study for Chinese real estate companies. J. Financ. Risk Manag. 2014, 3, 40–49. [Google Scholar] [CrossRef]
  47. Wang, D.; Cui, P.; Zhu, W. Structural Deep Network Embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016. [Google Scholar]
  48. Zhou, Z.-H. Machine Learning; Springer Nature: Berlin/Heidelberg, Germany, 2021. [Google Scholar]
Figure 1. Architecture of DANE.
Figure 2. Structural patterns of connected subgraphs. The numbers correspond to the types in Table 4.
Figure 3. Visualization of subnetwork embedding results based on DANE.
Figure 4. Macro-feature importance in XGB_1.
Figure 5. Importance rankings of the RPT network feature RPT_1.
Figure 6. Importance rankings of the RPT network feature RPT_2.
Table 1. Statistics on RPTs in China.
| Year | Number of Listed Firms | Number of Listed Firms Engaged in RPTs | Number of with Listed Firms | Number of RPTs |
| --- | --- | --- | --- | --- |
| 2011 | 2342 | 1393 | 17,962 | 342 |
| 2012 | 2494 | 1692 | 27,272 | 572 |
| 2013 | 2489 | 1562 | 22,172 | 476 |
| 2014 | 2613 | 1669 | 23,358 | 514 |
| 2015 | 2827 | 1969 | 25,304 | 500 |
| 2016 | 3052 | 2074 | 25,203 | 544 |
| 2017 | 3485 | 2105 | 27,111 | 664 |
| 2018 | 3584 | 2256 | 30,869 | 876 |
| 2019 | 3777 | 2331 | 32,722 | 939 |
Table 2. Statistics on the nodes and edges of the RPT network.
| Year | Number of Nodes | Number of Non-Isolated Nodes | Number of Edges |
| --- | --- | --- | --- |
| 2011 | 1393 | 251 | 176 |
| 2012 | 1692 | 319 | 247 |
| 2013 | 1562 | 283 | 205 |
| 2014 | 1669 | 313 | 225 |
| 2015 | 1969 | 329 | 235 |
| 2016 | 2074 | 370 | 248 |
| 2017 | 2105 | 407 | 288 |
| 2018 | 2256 | 454 | 346 |
| 2019 | 2331 | 505 | 376 |
Table 3. Statistics on connected subgraphs in the RPT network.
| Number of Nodes in One Connected Subgraph | Number of Connected Subgraphs |
| --- | --- |
| 2 | 127 |
| 3 | 34 |
| 4 | 13 |
| 5 | 3 |
| 6 | 5 |
| 8 | 1 |
| 11 | 1 |
| 31 | 1 |
Table 4. Statistics on the structural patterns of connected subgraphs in the RPT network.
| Type | Node Degree List of Connected Subgraph | Number of Connected Subgraphs |
| --- | --- | --- |
| 1 | [1, 1] | 127 |
| 2 | [1, 2, 1] | 33 |
| 3 | [2, 2, 2] | 1 |
| 4 | [1, 2, 2, 1] | 5 |
| 5 | [1, 3, 1, 1] | 3 |
| 6 | [1, 3, 2, 2] | 5 |
| 7 | [2, 3, 3, 2] | 0 |
| 8 | [3, 3, 3, 3] | 0 |
| 9 | [3, 2, 3, 1, 2] | 2 |
| 10 | [1, 1, 1, 1, 4] | 1 |
| 11 | [1, 1, 1, 1, 2, 4] | 2 |
| 12 | [1, 1, 2, 2, 2, 2] | 1 |
| 13 | [1, 1, 1, 2, 3, 4] | 1 |
| 14 | [1, 1, 1, 2, 2, 3] | 1 |
| 15 | [1, 1, 1, 1, 1, 1, 2, 6] | 1 |
| 16 | [1, 3, 2, 2, 1, 4, 2, 2, 1, 1, 1] | 1 |
| 17 | [1, 2, 1, 6, 1, 1, 1, 6, 1, 1, 2, 2, 3, 1, 2, 1, 2, 2, 3, 2, 4, 2, 1, 1, 1, 2, 1, 1, 5, 4, 1] | 1 |
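As a reading aid, the length of each degree list in Table 4 is the number of nodes in that subgraph type, so the pattern counts can be grouped by size and compared with Table 3. The following is a minimal sketch of that cross-check; the names `TABLE4_PATTERNS`, `TABLE3_COUNTS`, and `tally_by_size` are illustrative and do not come from the paper.

```python
from collections import Counter

# (degree list, number of subgraphs) for types 1-17, copied from Table 4.
TABLE4_PATTERNS = [
    ([1, 1], 127),
    ([1, 2, 1], 33),
    ([2, 2, 2], 1),
    ([1, 2, 2, 1], 5),
    ([1, 3, 1, 1], 3),
    ([1, 3, 2, 2], 5),
    ([2, 3, 3, 2], 0),
    ([3, 3, 3, 3], 0),
    ([3, 2, 3, 1, 2], 2),
    ([1, 1, 1, 1, 4], 1),
    ([1, 1, 1, 1, 2, 4], 2),
    ([1, 1, 2, 2, 2, 2], 1),
    ([1, 1, 1, 2, 3, 4], 1),
    ([1, 1, 1, 2, 2, 3], 1),
    ([1, 1, 1, 1, 1, 1, 2, 6], 1),
    ([1, 3, 2, 2, 1, 4, 2, 2, 1, 1, 1], 1),
    ([1, 2, 1, 6, 1, 1, 1, 6, 1, 1, 2, 2, 3, 1, 2, 1, 2, 2, 3, 2,
      4, 2, 1, 1, 1, 2, 1, 1, 5, 4, 1], 1),
]

# Subgraph size -> number of connected subgraphs, copied from Table 3.
TABLE3_COUNTS = {2: 127, 3: 34, 4: 13, 5: 3, 6: 5, 8: 1, 11: 1, 31: 1}


def tally_by_size(patterns):
    """Sum the subgraph counts of all patterns that share a node count."""
    counts = Counter()
    for degrees, n_subgraphs in patterns:
        if n_subgraphs > 0:  # skip patterns that never occur
            counts[len(degrees)] += n_subgraphs
    return dict(counts)


if __name__ == "__main__":
    by_size = tally_by_size(TABLE4_PATTERNS)
    print(by_size)
    print("matches Table 3:", by_size == TABLE3_COUNTS)
```

Grouping by list length keeps the check independent of how the degrees are ordered within each list.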
Table 5. Descriptive statistics.
| Variables | Mean | Std. Dev. | Min | 25% | Median | 75% | Max |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CASHCL | 0.138 | 0.219 | −0.486 | 0.009 | 0.110 | 0.246 | 0.806 |
| Debt/MarketValue | 5.316 | 5.473 | 0.000 | 1.495 | 3.433 | 7.040 | 25.508 |
| Director | 0.300 | 0.459 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 |
| Equity/MarketValue | 0.374 | 0.215 | −0.283 | 0.218 | 0.334 | 0.488 | 1.045 |
| Finance/Debt | 0.267 | 0.255 | −0.303 | 0.054 | 0.202 | 0.405 | 1.136 |
| GrossMargin | 0.276 | 0.156 | −0.173 | 0.164 | 0.254 | 0.367 | 0.723 |
| IPO_sector | 0.785 | 0.411 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| MNCF | 0.014 | 0.355 | −0.997 | −0.143 | 0.000 | 0.157 | 1.099 |
| NAPS | 4.114 | 2.106 | −2.070 | 2.603 | 3.800 | 5.354 | 10.385 |
| NetAssetGrowth | 0.065 | 0.090 | −0.224 | 0.020 | 0.053 | 0.094 | 0.410 |
| Operate/Asset | 0.244 | 0.256 | −0.478 | 0.067 | 0.238 | 0.421 | 0.959 |
| Ownership | 0.314 | 0.464 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 |
| Profit/Asset | 0.027 | 0.029 | −0.056 | 0.008 | 0.022 | 0.043 | 0.114 |
| ProfitMargin | 0.081 | 0.078 | −0.152 | 0.031 | 0.070 | 0.122 | 0.315 |
| Property | 0.797 | 0.634 | −0.163 | 0.320 | 0.622 | 1.092 | 2.940 |
| QuickRatio | 1.350 | 0.893 | 0.000 | 0.739 | 1.115 | 1.688 | 4.521 |
| Retain/Asset | 0.179 | 0.114 | −0.158 | 0.107 | 0.170 | 0.247 | 0.502 |
| Separation | 0.052 | 0.078 | 0.000 | 0.000 | 0.001 | 0.086 | 0.561 |
| Sharehold | 0.322 | 0.148 | 0.030 | 0.210 | 0.299 | 0.414 | 0.834 |
| Turnover_Account | 3.754 | 3.489 | −0.691 | 1.468 | 2.661 | 4.630 | 18.161 |
| Turnover_Asset | 0.345 | 0.250 | −0.138 | 0.152 | 0.285 | 0.481 | 1.136 |
| Turnover_FixedAsset | 2.187 | 1.943 | 0.000 | 0.829 | 1.582 | 2.860 | 9.499 |
| Turnover_Stock | 2.345 | 2.086 | −0.066 | 0.847 | 1.740 | 3.164 | 9.963 |
| Volatility | 3.237 | 2.697 | 0.000 | 1.271 | 2.404 | 4.412 | 12.661 |
Table 6. Variable definitions.
| Symbol | Definition |
| --- | --- |
| CASHCL | Ratio of EBITDA to Interest |
| Director | Concurrently Serving as a Director or General Manager (Yes 1, otherwise 0) |
| Equity/MarketValue | Book-to-Market Ratio |
| Finance/Debt | Ratio of Financing Cashflow to Total Liabilities |
| GrossMargin | Gross Profit Margin |
| IPO_sector | Listed Sector (Main Board 1, otherwise 0) |
| Debt/MarketValue | Debt-to-Market Ratio |
| MNCF | Per-share Cashflow |
| NAPS | Per-share Net Assets |
| NetAssetGrowth | Growth of Net Assets |
| Operate/Asset | Ratio of Operating Capital to Total Assets |
| Ownership | Ownership (State-owned 1, otherwise 0) |
| Profit/Asset | Ratio of EBITDA to Total Assets |
| ProfitMargin | Profit Margin |
| Property | Debt–Equity Ratio |
| QuickRatio | Ratio of Quick Assets to Current Liabilities |
| Retain/Asset | Ratio of Retained Earnings to Total Assets |
| RPT_1 | Structural Feature of RPT Network (First Dimension) |
| RPT_2 | Structural Feature of RPT Network (Second Dimension) |
| Separation | Degree of Separation between Ownership and Control |
| Sharehold | Proportion of the Largest Shareholder |
Table 7. The correlation matrix of the variables.
| Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1) DD | 1 | | | | | | | | |
| (2) Sharehold | 0.171 | 1 | | | | | | | |
| (3) Volatility | −0.125 | −0.027 | 1 | | | | | | |
| (4) Property | 0.096 | 0.038 | 0.065 | 1 | | | | | |
| (5) NetAssetGrowth | −0.004 | 0.071 | −0.146 * | 0.011 | 1 | | | | |
| (6) MNCF | 0.002 | 0.064 | −0.001 | −0.009 | 0.129 | 1 | | | |
| (7) Retain/Asset | 0.067 | 0.145 | −0.107 | −0.403 *** | 0.204 | 0.094 | 1 | | |
| (8) Debt/MarketValue | −0.139 * | −0.027 | −0.023 | −0.453 *** | 0.033 | 0.042 | 0.336 *** | 1 | |
| (9) Equity/MarketValue | 0.315 *** | 0.038 | −0.034 | 0.179 | −0.13 | −0.05 | −0.177 ** | −0.392 *** | 1 |
The coefficients with *, ** and *** represent significance at the 10%, 5% and 1% levels, respectively.
Table 8. Parameter settings of the XGBoost models.
| Variables | XGB_1 | XGB_2 | XGB_3 |
| --- | --- | --- | --- |
| n_estimator | 50 | 50 | 50 |
| max_depth | 12 | 6 | 3 |
| min_child_weight | 10 | 8 | 10 |
| gamma | 1 | 1 | 1 |
| subsample | 1 | 1 | 1 |
| Colsample_bytree | 1 | 0.9 | 1 |
| eta | 0.1 | 0.1 | 0.1 |
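The settings in Table 8 can be written down as keyword arguments for the scikit-learn style XGBoost interface. The sketch below is illustrative only: it assumes the flattened cells read as 12/6/3 for max_depth and 10/8/10 for min_child_weight, that the models are regressors (the target, default distance, is continuous), and that the table's labels `n_estimator`, `Colsample_bytree`, and `eta` correspond to the `XGBRegressor` keywords `n_estimators`, `colsample_bytree`, and `learning_rate`. The names `TABLE8`, `RENAME`, and `to_sklearn_kwargs` are hypothetical helpers, not part of the paper.

```python
# Rows of Table 8 keyed by model; parameter labels are as printed in the table.
TABLE8 = {
    "XGB_1": {"n_estimator": 50, "max_depth": 12, "min_child_weight": 10,
              "gamma": 1, "subsample": 1, "Colsample_bytree": 1, "eta": 0.1},
    "XGB_2": {"n_estimator": 50, "max_depth": 6, "min_child_weight": 8,
              "gamma": 1, "subsample": 1, "Colsample_bytree": 0.9, "eta": 0.1},
    "XGB_3": {"n_estimator": 50, "max_depth": 3, "min_child_weight": 10,
              "gamma": 1, "subsample": 1, "Colsample_bytree": 1, "eta": 0.1},
}

# Assumed mapping from the table's labels to XGBRegressor keyword names.
RENAME = {
    "n_estimator": "n_estimators",
    "Colsample_bytree": "colsample_bytree",
    "eta": "learning_rate",  # eta is XGBoost's alias for the learning rate
}


def to_sklearn_kwargs(row):
    """Rename table labels to keyword names; other labels pass through unchanged."""
    return {RENAME.get(name, name): value for name, value in row.items()}


KWARGS = {model: to_sklearn_kwargs(row) for model, row in TABLE8.items()}

if __name__ == "__main__":
    for model, kwargs in KWARGS.items():
        print(model, kwargs)
    # With xgboost installed, a model could then be created as, for example:
    #   from xgboost import XGBRegressor
    #   model = XGBRegressor(**KWARGS["XGB_2"])
```

Keeping the table labels and the library keywords separate makes it easy to see which names were translated and which were taken over unchanged.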
Table 9. Pearson correlations of the RPT network’s structural features and the main variables.
| Variables | Correlation Coefficient | t-Value |
| --- | --- | --- |
| Separation | −0.038 | 0.424 |
| Director | −0.044 | 0.356 |
| Ownership | 0.051 | 0.288 |
| IPO_sector | −0.049 | 0.306 |
| Sharehold | −0.006 | 0.901 |
| Volatility | 0.003 | 0.954 |
| Property | −0.05 | 0.293 |
| NetAssetGrowth | −0.085 | 0.076 |
| Turnover_FixedAsset | 0.099 | 0.038 |
| Turnover_Stock | −0.014 | 0.765 |
| Turnover_Account | −0.021 | 0.656 |
| Turnover_Asset | 0.016 | 0.744 |
| Profit/Asset | 0.013 | 0.793 |
| Debt/BookValue | 0.059 | 0.221 |
| MNCF | 0.011 | 0.816 |
| CASHCL | −0.006 | 0.903 |
| Retain/Asset | 0.029 | 0.549 |
| Finance/Debt | 0.021 | 0.659 |
| Operate/Revenue | −0.011 | 0.811 |
| Debt/MarketValue | 0.042 | 0.381 |
| Equity/MarketValue | 0.046 | 0.337 |
| QuickRatio | 0.018 | 0.711 |
| ProfitMargin | 0 | 0.994 |
| GrossMargin | 0.012 | 0.808 |
Table 10. VIF test results.
| Variables | VIF | 1/VIF |
| --- | --- | --- |
| Separation | 1.551 | 0.645 |
| Director | 1.288 | 0.776 |
| Ownership | 3.199 | 0.313 |
| IPO_sector | 3.937 | 0.254 |
| Sharehold | 1.220 | 0.820 |
| Volatility | 1.086 | 0.921 |
| Property | 1.820 | 0.549 |
| NetAssetGrowth | 1.414 | 0.707 |
| Turnover_FixedAsset | 1.306 | 0.766 |
| Turnover_Stock | 1.409 | 0.710 |
| Turnover_Account | 1.174 | 0.852 |
| Turnover_Asset | 1.868 | 0.535 |
| Profit/Asset | 2.324 | 0.430 |
| Debt/BookValue | 1.320 | 0.757 |
| MNCF | 1.082 | 0.924 |
| CASHCL | 1.461 | 0.684 |
| Retain/Asset | 1.779 | 0.562 |
| Finance/Debt | 1.280 | 0.781 |
| Operate/Revenue | 1.399 | 0.715 |
| Debt/MarketValue | 2.145 | 0.466 |
| Equity/MarketValue | 1.574 | 0.635 |
| QuickRatio | 1.875 | 0.533 |
| ProfitMargin | 2.231 | 0.448 |
| GrossMargin | 1.904 | 0.525 |
| Mean | 1.735 | 0.576 |
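For readers less familiar with the statistic, the variance inflation factor of a regressor j is VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing that variable on all the others, so the 1/VIF column is the share of its variance not explained by the other regressors. The sketch below recomputes the summary figures from the VIF column of Table 10. The threshold of 10 is a widely used rule of thumb and is an assumption here, not a value taken from the table; `VIF` and `implied_r2` are illustrative names.

```python
# VIF values copied from Table 10 (the "Mean" row is recomputed below).
VIF = {
    "Separation": 1.551, "Director": 1.288, "Ownership": 3.199,
    "IPO_sector": 3.937, "Sharehold": 1.220, "Volatility": 1.086,
    "Property": 1.820, "NetAssetGrowth": 1.414, "Turnover_FixedAsset": 1.306,
    "Turnover_Stock": 1.409, "Turnover_Account": 1.174, "Turnover_Asset": 1.868,
    "Profit/Asset": 2.324, "Debt/BookValue": 1.320, "MNCF": 1.082,
    "CASHCL": 1.461, "Retain/Asset": 1.779, "Finance/Debt": 1.280,
    "Operate/Revenue": 1.399, "Debt/MarketValue": 2.145,
    "Equity/MarketValue": 1.574, "QuickRatio": 1.875,
    "ProfitMargin": 2.231, "GrossMargin": 1.904,
}

RULE_OF_THUMB = 10.0  # assumed cut-off for problematic collinearity


def implied_r2(vif):
    """Invert VIF = 1 / (1 - R^2) to get the auxiliary-regression R^2."""
    return 1.0 - 1.0 / vif


mean_vif = sum(VIF.values()) / len(VIF)
worst = max(VIF, key=VIF.get)  # variable with the largest VIF

if __name__ == "__main__":
    print(f"mean VIF = {mean_vif:.3f}")
    print(f"largest VIF: {worst} = {VIF[worst]}, implied R^2 = {implied_r2(VIF[worst]):.3f}")
    print("all below rule of thumb:", all(v < RULE_OF_THUMB for v in VIF.values()))
```

The recomputed mean agrees with the 1.735 reported in the last row of the table.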
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, Y.; Liu, L.; Fang, L. An Enhanced Credit Risk Evaluation by Incorporating Related Party Transaction in Blockchain Firms of China. Mathematics 2024, 12, 2673. https://doi.org/10.3390/math12172673

AMA Style

Chen Y, Liu L, Fang L. An Enhanced Credit Risk Evaluation by Incorporating Related Party Transaction in Blockchain Firms of China. Mathematics. 2024; 12(17):2673. https://doi.org/10.3390/math12172673

Chicago/Turabian Style

Chen, Ying, Lingjie Liu, and Libing Fang. 2024. "An Enhanced Credit Risk Evaluation by Incorporating Related Party Transaction in Blockchain Firms of China" Mathematics 12, no. 17: 2673. https://doi.org/10.3390/math12172673

