1. Introduction
In finance, a data-intensive industry, data is highly sensitive and valuable. Comprehensive protection of data and of the ownership of personal information has become an important issue in the use and management of data in the financial industry [1].
Nowadays, with the convergence of technology and finance, financial institutions tend to use big data technology to collect, store, correlate, and analyse massive amounts of financial data from scattered sources and in various formats, to draw useful insights from them, and to continuously upgrade the functions of financial services in the industry. In other words, financial institutions can use big data technology to map the overall situation of enterprises, combine information on industry growth, historical development, and regional economic characteristics, and clarify the direction of the economy. On this basis, finance within the industrial chain can be upgraded through innovative financial products and optimised credit approval models to help enterprises solve their real financing dilemmas. At the same time, an enterprise’s actual sales can be inferred by examining information such as tax payments, inventory, and capital turnover in multiple dimensions through big data technology, and its reported sales data can be reviewed and verified through analysis of the customer group’s account capital stock, financing needs, and actual risk. This can effectively help to identify quality enterprises and to better detect fraud, money laundering, and other risks while improving the financial support services of the industry chain [2,3].
At the same time, as data volume grows exponentially, improving the security of financial information is very important for the banking and financial industry [4,5,6]. As financial systems continue to expand and upgrade, more and more financial entities are interconnected in various ways, which also poses a huge challenge to information security regulation. Financial information is exposed to various potential threats, which may lead to serious consequences if financial information security is not well safeguarded [7,8].
In addition, the application of cloud computing can effectively reduce the burden on financial participants and enterprise groups in the industry chain that are trying to build their own information systems [9,10]. At present, the upstream and downstream of the industry chain consist mostly of small and medium-sized enterprises. These enterprises generally have limited technical investment, and their information systems only support regular business operations and cannot meet the needs of financial institutions and financial technology enterprises. Under such circumstances, on the one hand, the enterprises in the chain cannot enjoy the dividends brought by supply chain finance, and on the other hand, financial institutions face great difficulty in integrating with these systems when developing and carrying out supply chain finance. Cloud computing breaks through the constraints of hardware and software and allows on-demand access, which can relieve the pressure of constructing and integrating information systems. To a certain extent, this technology can meet the requirements of storage, calculation, and analysis of financial information in the industry chain, and significantly improve the digitalisation of the industry chain’s financial business. In terms of practical application, with the support of cloud computing technology, financial institutions can cross-analyse key information, such as the financial situation of enterprises across the whole industrial chain and the competitive relationships between enterprises, from multiple sources and in multiple dimensions, and make full use of it, so that all enterprises in the chain can share the dividends of supply chain finance [11]. Consequently, fin-tech, represented by big data technology and cloud computing applications, has played a significant role in the development of the financial industry. However, the massive use of financial data also carries a large number of potential risks.
At the macro level, improper use or leakage of personal or financial institution data will not only directly infringe upon the legitimate rights and interests of individual financial information subjects and affect the normal operation of financial industry institutions, but may also bring about systemic financial risks and threaten financial security, and in serious cases, may spread to the whole society along with the economic chain. At the same time, the occurrence of information infringement and other violations of the law by micro-entities makes it difficult for citizens to complain, and the failure to defend their rights brings huge economic and psychological damage to citizens, which may lead to incidents driven by irrational behaviour, affecting social stability and causing certain damage to the government’s credibility.
At a micro level, improper use or leakage of data infringes on the interests of consumers and financial institutions [12,13]. For consumers, there is a risk of data misuse and privacy breaches. With the rapid development of digital consumption, financial consumers who use financial applications or third-party software for financial services are required to grant the right to collect or query information through privacy agreements or various authorisations. All online information of users, including identity, location, shopping preferences, payment passwords, and other types of information, is recorded in the background. It is not uncommon for financial data to be misused and leaked, for example through over-marketing and big-data-based discriminatory pricing. For financial institutions, firstly, there is the risk of data leakage. Insiders may be bribed to illegally sell or deliberately leak information, or may be assisted by hackers in causing data leakage, which results in losses when market performance falls short of expectations. Secondly, there is the risk of data contamination. Because data is unlabelled, it is highly susceptible to copying and tampering in the course of digital transactions, and once a sample is maliciously damaged, the model results can differ greatly, which causes high data clean-up costs for financial institutions and affects institutional decision making [14].
Based on the above analysis, we can find that although there are many methods for data encryption and privacy protection, these methods have one or more of the following drawbacks:
They rely on specific hardware, such as a trusted execution environment (TEE), which makes their scope of application too narrow.
The computational complexity of the encryption algorithm is too high, so the computational overhead of encryption is too large.
They are not optimised for financial computing scenarios.
Many algorithms in the field of finance, such as Monte Carlo algorithms under linear models, n-dimensional discontinuous segmented linear financial market models, and financial time series analysis, are linear in their core computation, i.e., they reduce to matrix multiplication [15,16,17,18,19,20,21]. These computationally intensive operations are well suited to computing in the cloud (e.g., on cloud GPUs) [22].
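To make the remark about matrix multiplication concrete, the following is a minimal sketch of a Monte Carlo simulation under a linear factor model, written with NumPy. The function name `simulate_portfolio_returns`, the factor loadings, and the portfolio weights are illustrative assumptions and do not come from the cited works; the point is only that the costly step is a dense matrix product.

```python
import numpy as np

def simulate_portfolio_returns(weights, loadings, n_paths, seed=0):
    """Monte Carlo under an assumed linear factor model.

    loadings has shape (n_assets, n_factors); each simulated scenario draws
    one vector of standard-normal factor shocks.
    """
    rng = np.random.default_rng(seed)
    n_assets, n_factors = loadings.shape
    shocks = rng.standard_normal((n_paths, n_factors))  # one row per scenario
    asset_returns = shocks @ loadings.T                 # (n_paths, n_assets): the heavy matrix product
    return asset_returns @ weights                      # (n_paths,) portfolio return per scenario

# Hypothetical inputs: three assets driven by two factors.
loadings = np.array([[0.02, 0.00],
                     [0.01, 0.03],
                     [0.00, 0.01]])
weights = np.array([0.5, 0.3, 0.2])

pnl = simulate_portfolio_returns(weights, loadings, n_paths=100_000)
var_95 = -np.quantile(pnl, 0.05)  # empirical 95% value-at-risk of the simulated returns
```

Because almost all of the work sits in `shocks @ loadings.T`, this is the kind of operation that can be handed to a cloud GPU when the number of scenarios or assets becomes large.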
This paper, therefore, proposes a security model for distributed applications that can secure data even if an attacker has physical access to the cloud server. It provides four levels of privacy protection. At the highest level of protection, the server cannot access any information about the data user, the original text of the data, the computational characteristics of the data, such as computational weights and gradients, or the statistical characteristics of the data, such as the data distribution. Moreover, because linear computation rests on general mathematical principles, this model can effectively protect and accelerate all models based on linear computing. Experiments show that this approach can improve the inference speed of algorithms by up to 10 times compared with a baseline that uses only client-side computing power, without compromising privacy, and can effectively prevent the restoration of user data compared with accelerated operations without privacy protection.
In summary, in this paper, we design a privacy-preserving computational framework to address the specific computational properties of financial scenarios, and our main contributions are as follows:
This paper proposes a computing framework based on client-side encryption and decryption with accelerated computing in the cloud.
This paper adapts this framework to many algorithms in the field of finance, such as the Monte Carlo algorithm under a linear model, the n-dimensional discontinuous linear financial market model, and financial time-series analysis.
This paper designs a scheduling model for a cloud GPU shared by multiple clients.
This paper conducts a large number of experiments to verify the effectiveness of this method.
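The general idea behind the first contribution, encrypting on the client, multiplying in the cloud, and decrypting on the client, can be sketched with a generic masking scheme for outsourced matrix multiplication. This sketch is not the construction proposed in this paper: it is a well-known textbook-style illustration that assumes random orthogonal masks, and the names `client_encrypt`, `cloud_multiply`, and `client_decrypt` are hypothetical. Orthogonal masks hide individual entries but preserve norms and singular values, so the sketch should not be read as offering the protection levels described above.

```python
import numpy as np

def random_orthogonal(n, rng):
    """Draw a random n x n orthogonal matrix via QR decomposition."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def client_encrypt(a, b, rng):
    """Mask A (m x k) and B (k x n) before sending them to the server."""
    p = random_orthogonal(a.shape[0], rng)
    q = random_orthogonal(a.shape[1], rng)
    r = random_orthogonal(b.shape[1], rng)
    a_masked = p @ a @ q        # P A Q
    b_masked = q.T @ b @ r      # Q^T B R, so Q cancels in the product
    return a_masked, b_masked, (p, r)

def cloud_multiply(a_masked, b_masked):
    """Server side: sees only masked operands; result is P (A B) R."""
    return a_masked @ b_masked

def client_decrypt(c_masked, keys):
    """Remove the masks: P^T (P A B R) R^T = A B."""
    p, r = keys
    return p.T @ c_masked @ r.T

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))
b = rng.standard_normal((3, 5))
a_masked, b_masked, keys = client_encrypt(a, b, rng)
c = client_decrypt(cloud_multiply(a_masked, b_masked), keys)
```

In this sketch the client performs only cheap mask generation and two maskings per operand, while the server carries the large product; whether that trade-off pays off depends on matrix sizes and network latency.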
2. Related Work
Nowadays, blockchain technology is being used within the field of financial security. Blockchain is a P2P (peer-to-peer) distributed database and a web-based concept. It consists of a series of blocks containing transactions, which are timestamped, verified by the network community, and protected by a PKI (Public Key Infrastructure). An element cannot be modified after it has been added to the blockchain, so the blockchain retains a permanent record of earlier actions [23]. Furthermore, blockchain is expected to initiate an industrial and commercial revolution while contributing to global economic reform. Blockchain uses cryptography to create a secure code in digital form, so users can confirm purchases without having to provide any personal data. Since blockchain records are immutable, transactions are completed automatically and in a decentralised manner [24]. The blockchain is a secure database and decentralised transaction system driven by decentralised nodes [25]. In other words, blockchain is a game-changing technology that has attracted the attention of businesses and governments worldwide. Essentially, the term “distributed ledger technology” refers to the collection of transactions and data that are sequentially tracked and registered on a network of distributed ledgers [26]. A blockchain can also be described as an ever-expanding list of records, called blocks, linked together using cryptography [27]. Blockchains are used as transaction ledgers in cryptocurrency systems such as Bitcoin and Ether, where the blockchain records the current state and previous transactions. Blockchains can additionally be defined as collections of blocks, each of which stores hashed data and includes a timestamp and a link to the previous block [28]. A blockchain is a distributed database that only allows new data to be appended to existing data [29].
Based on this append-only feature, blockchain technology plays an important role in the financial sector. The first generation of blockchain technologies acted as currencies, such as Bitcoin. In addition to Bitcoin, the most prominent cryptocurrency, about 600 other cryptocurrencies have been established and are used as exchange tokens in Bitcoin-based applications. Ether, Monero, and Ripple are the most popular of these [30]. The second generation of blockchain technology covers not only cryptocurrency transactions but also bonds, smart contracts, futures, loans, and mortgages. The integration of smart contracts with the blockchain is the most critical feature of this phase. Smart contracts are pieces of code embedded in the blockchain that react in a specific way when certain criteria are met [30].
Furthermore, blockchain can address the problem of e-commerce reputation schemes being manipulated through the registration of a large number of fake customers to gain a high reputation: reputation information cannot be reversed because it is stored on the blockchain, and all reputation changes are easily detectable. For security enhancement, blockchain can remove the single point of failure of an important central node. Since it can reduce the impact of attacks on public and private key distribution devices, blockchain can also help build more reliable public–private key infrastructures [31]. Blockchain technology can also be used for privacy protection in the financial sector, where a data storage system built on blockchain can guarantee the anonymity of users while ensuring their ownership of data [32].
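The append-only, hash-linked structure described above can be illustrated with a minimal sketch using Python's standard `hashlib`. It deliberately leaves out consensus, digital signatures, and the PKI mentioned earlier, and the helper names `block_hash`, `append_block`, and `is_valid` are hypothetical; it only shows why modifying an earlier block becomes detectable.

```python
import hashlib
import json
import time

GENESIS_PREV_HASH = "0" * 64  # placeholder link used by the first block

def block_hash(block):
    """SHA-256 over the block's contents, including the link to its predecessor."""
    fields = {k: block[k] for k in ("index", "timestamp", "data", "prev_hash")}
    payload = json.dumps(fields, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_block(chain, data):
    """Append a new timestamped block that points at the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    block = {
        "index": len(chain),
        "timestamp": time.time(),
        "data": data,
        "prev_hash": prev_hash,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block

def is_valid(chain):
    """Recompute every hash and check every link; any edit breaks one of them."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev:
            return False
        if block["hash"] != block_hash(block):
            return False
    return True

chain = []
append_block(chain, "alice pays bob 5")
append_block(chain, "bob pays carol 2")
```

Changing the `data` field of an earlier block after the fact makes `is_valid(chain)` return `False`, because the stored hash no longer matches the recomputed one.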
In addition, cloud computing is being widely used in enterprises. For large enterprises and governments, much of the appeal of the cloud lies in better control and reduction of data centre costs. In particular, cloud computing allows the reclassification of IT expenses from capital expenditure to cash-based ‘operational expenditure’ [33]. Another business benefit of cloud computing is reduced demand for skilled labour in places where high-technology expertise is in short supply (e.g., Southeast Asia) [34]. Moreover, it lowers the barrier to entry into computing [35,36,37]. However, with its widespread use, cloud computing also raises security concerns. A key challenge in cloud computing security relates to the environment in which the computing takes place. Cloud computing providers operating globally distributed networks of data centres may face specific security risks (e.g., terrorism or cyber-attacks) and may also face unique legal issues regarding security tort liability [38,39]. Certain users, such as governments and financial institutions, have specific security requirements [40]. They may ask providers to raise the security level.
Furthermore, several studies have sought to improve data security by desensitising the data itself. Tuple repetition-based privacy-preserving data publishing schemes have been proposed to hide sensitive information about individuals when publishing data. They are effective in addressing the problem of data holders or publishers releasing data collected from data owners, which means they are capable of protecting sensitive information in an individual’s released data. Anonymisation techniques such as generalisation, suppression, swapping, disaggregation, and randomisation can suffer from loss of personal identity or other information, which can reduce the usefulness of the data [41]. Moreover, one study proposes a framework that uses random data perturbation techniques, systematically transforming the original data and presenting the modified data to parties as query results through a decision tree. This approach provides valid results for analytical purposes, while the actual data is not revealed and privacy is protected [42].
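The trade-off behind random data perturbation can be seen in a minimal sketch that adds zero-mean Gaussian noise to each record. This is a generic additive-noise example and not the decision-tree framework of the cited study; the function name `perturb`, the synthetic account balances, and the noise level are all assumptions chosen for illustration.

```python
import random
import statistics

def perturb(values, noise_std, seed=0):
    """Release each value with independent zero-mean Gaussian noise added."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, noise_std) for v in values]

# Synthetic account balances standing in for sensitive records.
data_rng = random.Random(42)
balances = [data_rng.uniform(1_000, 50_000) for _ in range(10_000)]

released = perturb(balances, noise_std=5_000)

true_mean = statistics.fmean(balances)
released_mean = statistics.fmean(released)
```

Individual released values differ substantially from the originals, while aggregate statistics such as the mean stay close because the noise averages out. A larger `noise_std` hides individual records better but makes the released data less useful, which is the loss of usefulness noted above.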
6. Conclusions
With the continuous expansion and upgrade of the financial system, more and more financial entities are interconnected in numerous ways, which brings great challenges to information security because financial information is exposed to a variety of potential threats. In this paper, we propose a privacy protection model based on cloud computing that can ensure data security even if an attacker has physical access to the cloud server. The model provides four levels of privacy protection according to users’ actual needs. At the highest level of protection, the server has no access to any information about the data user or to the original file information of the data, and it cannot restore the computational characteristics of the data, such as computational weights and gradients, or the statistical characteristics of the data, such as the data distribution. In addition, due to the generality of the mathematical principles of linear operators, the model can effectively protect and accelerate all models based on linear operations. The main innovations of this paper are as follows:
The paper proposes a computing framework based on client-side encryption and decryption with accelerated computing in the cloud, and designs multiple privacy protection levels.
The paper adapts the framework to many algorithms in the field of finance, such as the Monte Carlo algorithm under a linear model, the n-dimensional discontinuous linear financial market model, and financial time-series analysis.
The paper designs a scheduling model for a cloud GPU shared by multiple clients.
The paper conducts a large number of experiments to verify the effectiveness of this method.
The final results showed that the method can increase the speed by up to 10 times compared with a privacy protection method that uses only local computing power instead of the cloud server. It can also effectively prevent the user’s privacy from being leaked, at a relatively small delay cost compared with no privacy protection. Therefore, the main advantages of this approach are high speed, low computational overhead for privacy protection, optimisation for the computational characteristics of financial scenarios, and the ability to provide different levels of protection for different usage scenarios.
Although the method in this paper depends on the communication efficiency of the usage scenario, this limitation is somewhat alleviated by the prevalence of high-bandwidth fibre connections and 5G high-speed networks.
In addition, although the model in this paper is tested on the large ImageNet dataset, how to deploy it in environments with larger data volumes and how to apply it to nonlinear computing scenarios are the authors’ next research directions. At the same time, a large number of algorithms based on linear computing exist in the fields of smart agriculture, smart medicine, and remote sensing and mapping [48,49,50,51,52,53,54,55,56]. Migrating the privacy-preserving algorithms proposed in this paper to more application scenarios will be the authors’ future work.