Bibliometric Analysis of the Machine Learning Applications in Fraud Detection on Crowdfunding Platforms

Cardona, Luis F.; Guzmán-Luna, Jaime A.; Restrepo-Carmona, Jaime A.

doi:10.3390/jrfm17080352

Open Access Systematic Review

by Luis F. Cardona, Jaime A. Guzmán-Luna * and Jaime A. Restrepo-Carmona

Facultad de Minas, Universidad Nacional de Colombia Sede Medellín, Medellín 050001, Colombia

* Author to whom correspondence should be addressed.

J. Risk Financial Manag. 2024, 17(8), 352; https://doi.org/10.3390/jrfm17080352

Submission received: 9 July 2024 / Revised: 1 August 2024 / Accepted: 10 August 2024 / Published: 13 August 2024

(This article belongs to the Special Issue Machine Learning Applications in Finance, 2nd Edition)

Abstract

Crowdfunding platforms are important for startups because they offer diverse financing options, market validation, and promotional opportunities through an investor community. These platforms provide detailed company information, aiding informed investment decisions within a regulated and secure environment. Machine learning (ML) techniques are important for analyzing large data sets, detecting anomalies and fraud, and enhancing decision-making and business strategies. This systematic review follows PRISMA guidelines to study how ML improves fraud detection on digital crowdfunding platforms. The analysis includes English-language studies from peer-reviewed journals published between 2018 and 2023, covering the periods before and after the COVID-19 pandemic. The findings indicate that ML techniques such as Random Forest, Support Vector Machine, and Artificial Neural Networks significantly enhance the predictive accuracy and utility of tax planning for startups considering equity crowdfunding. The United States, Germany, Canada, Italy, and Turkey stand out for their notable academic visibility and do not present statistically significant differences at the 95% confidence level. Florida Atlantic and Cornell Universities, the publishers Springer and John Wiley & Sons Ltd., and the journals Journal of Business Ethics and Management Science present the highest citation counts, without statistically significant differences at the 95% confidence level.

Keywords: crowdfunding; fraud; machine learning; PRISMA; statistical analysis

1. Introduction

Crowdfunding is a fundraising method in which people contribute different amounts of money through online platforms, facilitating access to capital for projects and businesses (Brown et al. 2017). These platforms positively impact the economy, benefiting small and medium-sized businesses by reducing transaction costs, improving investment efficiency, attracting non-professional investors, promoting financial accessibility, fostering innovation and entrepreneurship, and validating market concepts (Ellman and Hurkens 2019a, 2019b). In this vein, crowdfunding promotes job creation, increases tax revenue, and supports innovative high-tech companies. Through these mechanisms, crowdfunding drives economic growth and strengthens the financial ecosystem (Cicchiello et al. 2019). According to Freedman and Jin (2010), the crowdfunding market in North America generates more than USD 17 billion annually and is projected to grow to USD 300 billion by 2030. There are different types of crowdfunding platforms, including those based on rewards, equity, donations, and loans (Markas and Wang 2019; Petrov and Emelyanova 2021). The elements of a crowdfunding campaign include the creator, project, funding goal, investors, incentives, and sponsors. The platform selected should earn user trust through ethical practices and security measures that prevent fraud (Markas and Wang 2019). Additionally, companies can gain a competitive advantage by identifying and prioritizing key innovation opportunities (Markas and Wang 2019).

Crowdfunding platforms are a powerful tool for financing projects, businesses, and personal causes. However, fraud can appear in the form of fake projects or misappropriation of funds (Ellman and Hurkens 2019b; Bafna et al. 2023). These anomalous behaviors affect investor confidence and reduce public–private funding. Additionally, fraud can attract negative media and public attention, affecting the flexibility and accessibility of crowdfunding (Bafna et al. 2023; Xu et al. 2023). Platforms should therefore invest in security and control systems and in algorithms that prevent and detect fraud. Such measures include identity verification, monitoring of suspicious transactions, and channels for reporting fraud. Traditional fraud detection methods in crowdfunding include rule-based systems, basic statistical analysis, and manual review to report suspicious or unusual transactions (Cumming et al. 2021). However, these methods are vulnerable to evasion by fraudsters and can be slow and error-prone (Petrov and Emelyanova 2021; Winoto and Wulandari 2023).

Behl et al. (2022) evaluated the use of Artificial Intelligence (AI) tools in donation-based crowdfunding platforms. Employing AI technology can improve operational performance in disaster relief operations. However, the community perceives indicators of risk, privacy, transaction security, and fraud negatively. Burtch et al. (2016) studied digital platforms and found that they must balance information control to avoid harming campaigns’ visibility and success. It is important to note that crowdfunding platforms must be visible and clearly show their finances to all investors. For these reasons, Machine Learning (ML) significantly enhances fraud detection, since these techniques can identify complex patterns in transactions and fraudulent behaviors. Elitzur (2024) invites businesses and industries to leverage ML to improve real-time fraud alert systems, optimize resources, and minimize losses, making it a powerful tool in the fight against fraud. ML provides advanced data visualization tools essential for identifying complex and non-linear patterns that can influence the success of campaigns. Other authors motivate the study of fraud and anomalous-behavior detection in financial processes in other areas, such as business (Goodell et al. 2021) and healthcare (Bassani et al. 2019).

ML is a subset of Artificial Intelligence (AI) that uses algorithms to improve predictions from historical data without explicit programming. ML is divided into different approaches, such as supervised and unsupervised algorithms. Supervised methods are algorithms trained on labeled data, while unsupervised methods identify patterns in unlabeled data (Butt et al. 2020; Sharifani and Amini 2023). These algorithms are used in fraud detection, recommendation systems, image recognition, natural language processing, and medical diagnosis (Sharifani and Amini 2023; Yadav et al. 2023). Traditional fraud detection methods are less precise and require manual review, while ML-based methods are faster and more accurate, analyze large volumes of data in real time, and adapt to new frauds (Yadav et al. 2023). However, ML algorithms require significant computational resources and high-quality data to avoid false positives (Sharifani and Amini 2023; Yadav et al. 2023). Artificial Neural Networks (ANNs), k-Nearest Neighbors (k-NN), Support Vector Machines (SVMs), Naive Bayes (NB), K-Means, and Singular Value Decomposition (SVD) are examples of ML methods employed in fraud detection (Sharifani and Amini 2023; Yadav et al. 2023; Cardona et al. 2024). ANNs detect fraud anomalies in real time by recognizing patterns in network traffic. k-NN classifies data to implement suitable security measures. SVMs and NB identify threats by differentiating data classes, the latter using a probabilistic model. Unsupervised learning methods such as K-Means and SVD are effective for anomaly detection and data dimensionality reduction (Butt et al. 2020).

This study employs bibliometric analysis to report which ML methods are used for identifying and preventing fraudulent activities. It is important to clarify that this manuscript is based on the strategies used in the literature to detect fraud in crowdfunding platforms and does not study the prediction of crowdfunding campaigns’ success. Creating a fraud detection system with machine learning presents significant challenges in enhancing political e-government systems and industry 4.0. These include the need for high-quality data, the explanation of decisions made by complex models, compliance with regulations such as the General Data Protection Regulation, and ethical considerations. Despite these challenges, the global fraud detection and prevention market is expected to grow significantly, driven by the increasing use of digital technologies and the adoption of risk-based authentication and fraud analysis solutions. Digital crowdfunding platforms are increasingly used in different contexts, showing exponential growth (Freedman and Jin 2010). However, these platforms are susceptible to different types of fraud, leading to distrust and discouraging user investment. For these reasons, it is necessary to integrate AI, especially ML techniques, to improve the early detection of fraud and anomalous behavior on these platforms. The use of digital platforms has increased since the COVID-19 pandemic (Zribi 2022), so this study evaluates works published from 2018 to April 2024. This research undertakes a comprehensive and systematic analysis to identify the countries, universities, and industries investing in the study of fraud on crowdfunding platforms using machine learning techniques. The analysis, conducted through a statistical approach, reveals the most significant associations based on bibliometric variables such as the year of publication, total citations, number of institutions and authors, and the journal’s quartile. This approach also identifies the most frequently used machine learning techniques and their effectiveness in fraud identification and control on digital crowdfunding platforms. In summary, this work aims to compile and analyze the primary studies on the subject, providing a comprehensive understanding of the use of machine learning techniques in detecting fraud on digital crowdfunding platforms. The PRISMA methodology (Page et al. 2021) is employed for this purpose. The study is guided by the following research questions, which are explored and answered in this work.

▪: RQ1: What are the most promising machine learning techniques in detecting fraud on digital crowdfunding platforms?
▪: RQ2: Which types of fraud are commonly found in studies?
▪: RQ3: What representative works were developed to detect fraud in digital crowdfunding platforms?
▪: RQ4: Which countries or groups of countries have the highest academic output in fraud detection on digital crowdfunding platforms? Are there any significant differences in their contributions compared to the total citations?
▪: RQ5: What are the clusters and statistical comparison analyses between the universities, publishers, and journals, compared to the total citations? Are there any statistical differences between those variables compared to the total citations?
▪: RQ6: What future trends and developments are expected in this area of research?

2. Materials and Methods

2.1. Bibliometric Analysis

To address these questions, the PRISMA methodology (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) is employed. This method improves the transparency and quality of systematic reviews and meta-analyses. It is based on a set of guidelines and a checklist that ensures that studies are identified, selected, and reported clearly and systematically. PRISMA helps researchers conduct and present systematic reviews rigorously and transparently. In addition, it allows us to identify the most relevant works, countries, and research centers that are most notable in academic production, whether articles, conferences, or books (Page et al. 2021; Pranckutė 2021). The PRISMA methodology characteristics employed in this work are detailed below.

▪: Research must be related to fraud detection and types of fraud on digital crowdfunding platforms. For this reason, research topics related to cryptocurrencies, marketing, blockchain, and e-commerce are excluded.
▪: For this study, two of the most remarkable and widely used databases in the academic and scientific world, Scopus and Web of Science (WoS), were selected. These two databases have been used in various review works, providing robust conceptual tools for the state of a particular investigation (Pranckutė 2021).
▪: The search spans from 2018 to April 2024, covering the pre- and post-COVID-19 pandemic periods. This choice corresponds to the period of growth in digital platforms, underscoring the pertinence and timeliness of our research. Combinations of the following keywords were joined with the AND connector: equity, crowdfunding, fraud, detection, machine learning, tax planning, and security.
▪: Journals, working papers, and conference documents must be peer-reviewed and belong to institutional universities or research centers. It is important to clarify that conference papers were selected because they typically provide information on the latest innovations and trends before they appear in journal publications.
▪: All PRISMA steps were carried out manually by the authors, without artificial intelligence tools. The authors reviewed the title, abstract, and conclusions and then read the full document.

Figure 1 shows the number of documents extracted using the PRISMA methodology. Of the 925 records retrieved up to April 2024 from the two databases (Scopus and WoS), a first data cleansing based on the year and the English language excluded 167 records (13 documents after 2018 and 154 duplicate records). Keyword-based filters were then applied to the remaining 758 records, excluding terms such as cryptocurrency, blockchain, commerce, and marketing; in total, 623 records were excluded using these keywords. After this filter, 135 records remained, of which 39 were excluded after a manual reading of the title, abstract, and conclusions, leaving 96 reports assessed for eligibility. Finally, each article was carefully read, focusing on those that met the study’s purpose of detecting fraud through machine learning. This process resulted in 26 records that are analyzed in depth. A bibliometric extraction from these 26 records is carried out to make relevant inferences.
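
The record counts in this flow can be checked with a few lines of arithmetic. The sketch below is only a consistency check, not part of the authors' procedure. It assumes that 39 is the number of records excluded at the title/abstract screening (so that 135 − 39 = 96), and that 70 reports were excluded at full-text reading, a figure that is not stated in the text and is inferred here from 96 − 26. The stage labels and the function name `prisma_flow` are illustrative.

```python
# PRISMA flow counts taken from the text; stage labels are illustrative.
identified = 925
stages = [
    ("year/language cleansing and duplicates", 13 + 154),  # 167 records
    ("keyword filter", 623),
    ("title/abstract/conclusions screening", 39),  # assumed to be exclusions
    ("full-text reading", 96 - 26),  # inferred, not stated in the text
]


def prisma_flow(start, exclusions):
    """Return (label, removed, remaining) for each exclusion stage."""
    remaining = start
    trail = []
    for label, removed in exclusions:
        remaining -= removed
        trail.append((label, removed, remaining))
    return trail


flow = prisma_flow(identified, stages)
for label, removed, remaining in flow:
    print(f"{label}: -{removed} -> {remaining}")
```

Under these assumptions the remaining counts after each stage are 758, 135, 96, and 26, which matches the figures reported above.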

2.2. Statistical Analysis

Statistical analyses with the list of the most representative works are carried out. A table that presents key variables is constructed, enabling us to address the guiding research questions. The variables considered, such as the journal’s title and its quartile, the year of publication, the type of fraud studied, the number of citations and self-citations, the number of institutions involved in the studies, the publisher, the country, and the number of countries that have participated in the study, are crucial in our understanding of fraud studies. Subsequently, the information is tabulated, and a series of descriptive graphs are crafted, effectively enhancing the visualization of the data.

The Principal Component Analysis (PCA) is used to optimize manufacturing and control processes, analyze failures, and optimize product design (Minh et al. 2023). Additionally, PCA concentrates the most relevant information in the first principal components, making it easier to visualize the data and identify patterns, similarities, and clustering (Granato et al. 2018). The biplot and dendrogram graphs were used to analyze the relationships between the variables under study. In this work, the PCA analysis is employed to identify variable aggregation.
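
To make the idea of a first principal component concrete, the following is a minimal sketch in plain Python. It is not the authors' Statgraphics analysis: it uses only three columns (citations, institutions, countries) of eight rows transcribed from Table 1, a selection made here purely for illustration, and it extracts the leading component of the correlation matrix by power iteration. All function names are hypothetical.

```python
import math

# (citations, institutions, countries) for studies 18, 20, 14, 17, 2, 10, 15, 3
# of Table 1. The choice of rows and columns is illustrative only.
rows = [
    (185, 5, 3), (136, 2, 1), (50, 2, 1), (40, 2, 2),
    (1, 5, 3), (4, 3, 3), (12, 3, 2), (0, 1, 1),
]


def standardize(data):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in data) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in data]


def covariance(z):
    """Sample covariance matrix; on standardized data this is the correlation matrix."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def leading_component(m, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    p = len(m)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v


z = standardize(rows)
corr = covariance(z)
eigenvalue, loadings = leading_component(corr)
explained = eigenvalue / len(corr)  # share of total variance on Component 1
scores = [sum(l * x for l, x in zip(loadings, r)) for r in z]

print("loadings:", [round(l, 3) for l in loadings])
print("explained variance ratio:", round(explained, 3))
```

The loadings play the role of the arrows in a biplot: a variable with a large absolute loading is closely aligned with the component axis, and the scores give each study's position along that axis.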

In addition, an analysis of variance (ANOVA) is used to compare countries, universities, journals, and publishers in terms of the number of citations at the 95% confidence level. The results identify which factors show statistically significant differences. Total citations are the key variable in the comparison, since they provide valuable information about the impact and relevance of academic publications in different contexts (Aksnes et al. 2019). If the intervals around a pair of means overlap, there is no statistically significant difference between the two means; if they do not overlap, the difference is statistically significant at the 5% significance level (Montgomery and Runger 2020). Statistical analyses are performed using Statgraphics Centurion XVI software. The normality test results determine the type of group comparison test, whether parametric or non-parametric. Parametric tests, including ANOVA and Fisher’s LSD intervals, are used when the data follow a normal distribution. Otherwise, the non-parametric Kruskal–Wallis test is used, which compares three or more independent groups and determines whether they come from the same distribution (Montgomery and Runger 2020).
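
As a rough illustration of the two group-comparison statistics named above, the sketch below computes the one-way ANOVA F statistic and the Kruskal–Wallis H statistic in plain Python. It does not reproduce the Statgraphics procedure: the three groups are hypothetical citation counts, the function names are illustrative, and H is computed with average ranks but without a tie correction. The critical values are the standard tabulated ones for a 5% significance level with 2 and 6 degrees of freedom (F) and 2 degrees of freedom (chi-square).

```python
def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def kruskal_wallis_h(groups):
    """H statistic using average ranks for tied values (no tie correction)."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical citation counts for three groups (e.g., three publishers).
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

F_CRIT_2_6 = 5.14     # tabulated F(0.05; 2, 6)
CHI2_CRIT_2 = 5.991   # tabulated chi-square(0.05; 2)

f_stat = one_way_anova_f(groups)
h_stat = kruskal_wallis_h(groups)
print("ANOVA significant:", f_stat > F_CRIT_2_6)
print("Kruskal-Wallis significant:", h_stat > CHI2_CRIT_2)
```

For these toy groups F is 27 and H is approximately 7.2, so both statistics exceed their critical values. With samples this small, the chi-square approximation behind the Kruskal–Wallis threshold is coarse, so the comparison should be read as illustrative only.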

3. Results

3.1. Overview of Main Works

Table 1 summarizes the most important characteristics of the collected works and supports the following observations.

▪: 1.4% of the works were published before 2018, so 98.6% of the documents were published in recent years. The above reaffirms that the post-pandemic period has represented the development of digital crowdfunding platforms and the need to analyze and detect fraud using machine learning techniques.
▪: The machine learning algorithm most employed by the different studies is Random Forest (with 9% of the total methods used in the studies), followed by Latent Dirichlet Allocation and Support Vector Machine (both methods with 7% of the total studies), and finally the methods of Decision Trees, Logistic Regression, Long Short-Term Memory and Neural Networks (RQ1 is Answered).
▪: 42% of the studies were published in high-impact Q1 journals, followed by Q2 with 8% and Q3 and Q4 with 12% and 8%, respectively. It is important to highlight that Q1 articles concentrate the highest share of citations, around 91%, followed by Q2 journals with 6% and unclassified journals with 3%.
▪: Cumming et al. (2021) is the most cited article, with 185 citations and 20 self-citations, followed by Belavina et al. (2020), with 136 citations and ten self-citations. These two studies use Propensity Score Matching (PSM) and sequential testing to model the early detection of fraud on digital crowdfunding platforms. Although some recent studies have few or no citations, which is common for recent publications, the diversity of methods used and the international collaboration underline the importance of and growing interest in fraud detection using advanced techniques.
▪: The most frequent publishers include Elsevier, Springer, and MDPI, indicating their relevance in this field of research. Institutional collaboration is mostly between one or two institutions.
▪: The main types of fraud on crowdfunding platforms include personal fraud, such as plagiarism of content and collusion with auditors, and the creation of fraudulent projects, which may be feasible but undeliverable, impractical, or technically unfeasible. There are also frauds related to hacking attacks on microgrid investment platforms and fraud in rewards-based campaigns. Other types of fraud include misinformation about technologies such as 5G, diversion of funds and misuse of money, and fraud in donation-based crowdfunding campaigns. Also included are prevented fraud, where campaigns were stopped before funds were transferred, and attempted fraud, where the fraud was discovered after funds were received (RQ2 is Answered).
▪: Digital platforms used in the studies include GoFundMe, Kickstarter, Indiegogo (Belavina et al. 2020; Perez et al. 2022), LendingClub, and Seedrs (Huo et al. 2024).
▪: Research in fraud detection and risk management on crowdfunding platforms has explored different methodologies. Xu et al. (2023) and Bafna et al. (2023) propose decentralized systems based on blockchain and Ethereum smart contracts to ensure proper utilization of funds, avoiding external influences and fraud. On the other hand, Hou and Qu (2023) and Li and Qu (2022) use machine learning models such as BERT, BNB, MT5, and hybrid classifiers to detect logical contradictions and misleading narratives in crowdfunding projects. Prateek et al. (2021) and Shafqat et al. (2020) demonstrate the effectiveness of combining machine learning classifiers with rule-based models to identify fraud. Wu et al. (2022) highlight the importance of evaluating crowdfunding platforms for investments in microgrids, while Cumming et al. (2021) and Lee et al. (2022) highlight the precision of logistic regression to detect fraud. Furthermore, studies such as those of Winoto and Wulandari (2023) and Meoli et al. (2022) analyze the impact of regulation and financial literacy on the crowdfunding ecosystem to mitigate risks such as adverse selection and moral hazard. Finally, Elmer and Ward-Kimola (2023) and Han and Dang (2020) investigate how crowdfunding platforms can spread disinformation. These authors also conclude that ML algorithms are important for early warning systems in proactive fraud prevention. Table 1 displays the most representative works (RQ3 is Answered). Some of the studies report high accuracy values, around 98%, in fraud detection (Shafqat et al. 2020; Choi et al. 2022). It is important to note that machine learning techniques have been successfully used in other contexts and applications. For example, in predicting the success of crowdfunding campaigns, Logistic Regression, with or without PCA, showed an appropriate performance, reaching an accuracy of 84%. Random Forest with PCA obtained an accuracy of 82%, while XGBoost with PCA achieved an accuracy of 83% (Raflesia et al. 2023). Other applications detect fake news, where the Bi-LSTM algorithm showed an accuracy of 96.77% (Hamed et al. 2023). These applications show an excellent opportunity for machine learning algorithms to be used in fake news and fraud prediction.
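
The quartile shares quoted in the observations above can be recomputed directly from Table 1. The sketch below is a simple tally, assuming the quartile and citation columns are transcribed correctly in row order; "NC" marks the entries that Table 1 lists without a quartile, and the variable names are illustrative.

```python
from collections import Counter, defaultdict

# (journal quartile, number of citations) for studies 1 to 26 of Table 1.
table_rows = [
    ("Q3", 0), ("Q1", 1), ("Q4", 0), ("NC", 3), ("Q1", 6), ("Q4", 0),
    ("NC", 2), ("NC", 0), ("Q1", 5), ("Q1", 4), ("Q3", 0), ("Q1", 2),
    ("Q3", 0), ("Q1", 50), ("NC", 12), ("NC", 0), ("Q1", 40), ("Q1", 185),
    ("NC", 0), ("Q1", 136), ("NC", 0), ("NC", 2), ("Q2", 4), ("Q2", 27),
    ("Q1", 18), ("Q1", 51),
]

# Share of studies per quartile, as rounded percentages.
study_counts = Counter(q for q, _ in table_rows)
study_share = {q: round(100 * c / len(table_rows)) for q, c in study_counts.items()}

# Share of total citations per quartile, as rounded percentages.
citations = defaultdict(int)
for quartile, cited in table_rows:
    citations[quartile] += cited
total_citations = sum(citations.values())
citation_share = {q: round(100 * c / total_citations) for q, c in citations.items()}

print("studies (%):", study_share)
print("citations (%):", citation_share)
```

The tally gives 42%, 8%, 12%, and 8% of studies for Q1 to Q4, and 91%, 6%, and 3% of citations for Q1, Q2, and unclassified venues, in agreement with the percentages stated above.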

Table 1. Overview of PRISMA selected studies that employed machine learning on fraud detection in crowdfunding digital platforms.

Study	First Author (Reference)	Journal Title	Type of Document	Publication Year	Fraud Type Evaluated	Method Used for Fraud Detection in Crowdfunding Platforms	Number of Citations (Self-Citations)	Number of Institutions Involved	Publisher	Number of Countries	Journal Quartile
1	Mohammadi et al. (2025)	International Journal of Finance & Managerial Accounting	Journal Article	2025	Fraudsters may use asymmetric information and regulatory loopholes to deceive investors.	NN	0 (0)	1	Inderscience	1	Q3
2	Huo et al. (2024)	Information Processing & Management	Journal Article	2024	Risks and challenges associated with information disclosure on crowdfunding platforms and how they affect resource acquisition for digital ventures.	STM	1 (0)	5	Elsevier	3	Q1
3	Bafna et al. (2023)	Journal of Information and Computational Science	Journal Article	2023	Embezzlement of funds, lack of value returned to contributors, misuse of money.	Not Applied	0 (0)	1	Binary Information Press	1	Q4
4	Bianda et al. (2023)	Journal of World Science	Journal Article	2023	Types of frauds related to crowdfunding services based on Sharia principles (Riba, Gharar, Maysir, Tadlis, Dharar, Al-Ta’addi, Al-Taqshir, Mukhalafah al-shurut, Zhulm).	Not Applied	3 (0)	1	Riviera Publishing	1	NC
5	Elmer and Ward-Kimola (2023)	Media, Culture & Society	Journal Article	2023	Electoral fraud, disinformation about 5G.	Not Applied	6 (2)	2	SAGE Publications Ltd.	1	Q1
6	Hou and Qu (2023)	Current applied science and technology	Journal Article	2023	Feasible but fraudulent projects, impractical fraudulent projects.	BERT, MT5, SP, AFT	0 (0)	1	King Mongkut’s Institute of Technology Ladkrabang	2	Q4
7	Lathifah et al. (2022)	2022 10th International Conference on Cyber and IT Service Management (CITSM)	conference	2022	Unauthorized access control in administrative areas, advanced SQL injection vulnerabilities, insecure design, misconfiguration of security, vulnerable and outdated components, software and data integrity failures.	Not Applied	2 (0)	1	IEEE Xplore	1	NC
8	Winoto and Wulandari (2023)	Management Studies and Entrepreneurship Journal	Journal Article	2023	Fraud in data verification, explorative analysis, fraud in project presentation, fraud in project execution, asymmetric information risk.	Not Applied	0 (0)	1	YRPI	1	NC
9	Xu et al. (2023)	Information Sciences	Journal Article	2023	Personal fraud, content plagiarism, collusion with auditors.	Not Applied	5 (0)	2	Elsevier	1	Q1
10	Zkik et al. (2023)	Electronic Commerce Research	Journal Article	2024	Re-entry attacks, infinite loop attacks, block timestamp attacks, Advanced Persistent Threats (APTs), malware, Distributed Denial-of-Service (DDoS) attacks.	AdaBoost, XGBoost, RF	4 (0)	3	Springer	3	Q1
11	Choi et al. (2022)	KSII Transactions on Internet and Information Systems	Journal Article	2022	Detect fraudulent activities within health-related crowdfunding campaigns on platforms like GoFundMe.	LDA, CF	0 (0)	3	Korea Society of Internet Information	1	Q3
12	Lee et al. (2022)	Sensors	Journal Article	2022	Fraud in reward-based crowdfunding campaigns.	LR, FSLR	2 (0)	2	MDPI	2	Q1
13	Li and Qu (2022)	Songklanakarin Journal of Science & Technology	Journal Article	2022	Projects with logically feasible and practical concepts and technically infeasible projects.	BNB, NO, BK	0 (0)	1	Songklanakarin Journal of Science & Technology	2	Q3
14	Meoli et al. (2022)	Corporate Governance: An International Review	Journal Article	2022	Fraud related to equity-based crowdfunding, especially concerning investor financial literacy.	Not Applied	50 (7)	2	Emerald Group Publishing Ltd.	1	Q1
15	Perez et al. (2022)	Proceedings of the 14th ACM Web Science Conference 2022	conference	2020	Embezzlement fraud, opportunistic fraud, total fraud.	RF, AdaBoost, DT, k-NN, NB, SVM, EC, MLP	12 (1)	3	arXiv	2	NC
16	Riadi et al. (2022)	Engineering Science Letter	Journal Article	2022	Fraud related to the security of crowdfunding services in charitable organizations.	Not Applied	0 (0)	1	The Indonesian Institute of Science and Technology	1	NC
17	Wu et al. (2022)	Financial Innovation	Journal Article	2022	Fraud and hacking attacks on crowdfunding platforms for investments in microgrid projects.	q-ROFSs, M-SWARA, DEMATEL, TOPSIS, IFS and PFS	40 (17)	2	Springer	2	Q1
18	Cumming et al. (2021)	Journal of Business Ethics	Journal Article	2021	Prevented fraud, attempted fraud, fraud in campaigns with delays of more than a year, no communication for six months, and no rewards.	PSM	185 (20)	5	Springer	3	Q1
19	Prateek et al. (2021)	WISP 2021 Proceedings. 2.	conference	2021	Fraudulent campaigns on donation-based crowdfunding platforms.	RF, SVM	0 (0)	1	AIS Electronic Library	1	NC
20	Belavina et al. (2020)	Management Science	Journal Article	2020	Embezzlement of funds, and performance opacity.	ST	136 (10)	2	INFORMS	1	Q1
21	Han and Dang (2020)	Proceedings of the 7th International Conference on Management of e-Commerce and e-Government	conference	2020	Failure to pay promised returns, delayed return payments, breach of established agreements, and lack of post-sale services.	RF, SVM	0 (0)	1	ACM DL	1	NC
22	Petrov and Emelyanova (2021)	CEUR Workshop Proceedings	conference	2021	Bankruptcy risks, fraud or unfair practices, risks associated with public offerings and unlicensed activities, information disclosure risks, and illegal platform use risks.	SA, LR, NN, DT, NBC	2 (0)	2	CEUR Workshop Proceedings	1	NC
23	Shafqat et al. (2020)	Applied Sciences	Journal Article	2020	Successfully funded but not delivered projects, canceled projects, and projects suspended for fraud.	LDA, LSTM.	4 (0)	2	MDPI	1	Q2
24	Ellman and Hurkens (2019a)	Economics Letters	Journal Article	2019	Fraud in the context of reward-based crowdfunding.	Not Applied	27 (3)	1	Elsevier	1	Q2
25	Shafqat and Byun (2019)	Applied Sciences	Journal Article	2019	Misrepresentation of ideas, advance fee fraud, investment fraud, non-payment or non-delivery, personal data breach.	LSTM-LDA	18 (4)	1	MDPI	1	Q1
26	Zenone and Snyder (2019)	Policy & Internet	Journal Article	2019	Faking or exaggerating one’s own illness, faking or exaggerating someone else’s illness, identity theft, and misuse of funds.	Not Applied	51 (3)	1	John Wiley and Sons Ltd.	1	Q1

Figure 2 shows the countries with the most citations. The United States has the highest number of citations, indicating the academic and scientific impact of its research. Western European countries such as Germany and Asian countries such as China and Japan also have representative citation counts, reflecting their strong presence in this field of research. Africa, Latin America, and some Asian countries have significantly fewer citations, which indicates less contribution or recognition in the international academic literature.

Figure 3 shows the distribution of total citations by university. Concordia University leads with 376 citations, followed by the University of Pennsylvania with 272. Florida Atlantic University, Poschingerstraße, the University of Birmingham, and the University of Bremen are tied, with 185 citations each. The University of Bergamo has 150 citations, Cornell University 136, Simon Fraser University-Health Sciences 102, and İstanbul Medipol University 80. Concordia University and the University of Pennsylvania stand out, suggesting the need to analyze their publication and collaboration strategies.

Figure 4 shows the authors with the most citations and publications in the field of research. Cumming, D. stands out with 360 citations in three publications, reflecting strong influence, while the results for Belavina, E., with 130 citations in one publication, suggest great relevance and innovation. Cumming’s citation efficiency is 120 citations per publication, and Belavina’s is 130. In contrast, Shafqat, W. and Shang, W. L., with three publications each, do not exceed 120 citations, indicating lower comparative efficiency.

3.2. Principal Component Analysis (PCA) Results

In PCA, components 1 and 2 are represented on the X and Y axes. These components are linear combinations of the original variables and capture most of the variability present in the data. Figure 5a shows the biplot for countries. The biplot axes represent the first two principal components (Component 1 and Component 2). The variables used to carry out the analysis in Figure 5 are the countries, the total number of publications, the number of authors, the year, the number of institutions, and the number of articles published in Q1 or Q2. The first principal component (Component 1) is influenced by the “number of authors”, “number of institutions”, and “Q1”, since these vectors are mainly aligned with the axis of the first component. The second principal component (Component 2) is influenced by “year”, since this vector is mainly aligned with the axis of the second component. Figure 5b shows the biplot for universities. The “total citations” arrow indicates a strong association with the second principal component (Component 2). The variables “year” and “Journal Article” have similar directions, suggesting a possible correlation between them. The variable “conference” points in the opposite direction to “Journal Article”, suggesting contrasting characteristics between these types of publications. The variables “Q1” and “Q2”, which refer to the journal’s quartile, show different patterns of association. Finally, the “number of authors” is oriented horizontally, indicating a stronger relationship with the first principal component (Component 1).

Figure 6 shows the dendrogram of the country analysis, built using the nearest neighbor method and the squared Euclidean distance. The dendrogram shows the similarity between countries based on several variables: total citations, number of authors per year, number of institutions, articles in Q1, and articles in Q2. The distance in the dendrogram represents the dissimilarity among the countries; the greater the distance, the greater the difference in the values of the variables studied. Several groups are identified. Canada, France, Morocco, Thailand, India, the United Kingdom, and Italy form a close group, indicating similar values in the mentioned variables. Germany and Iran, although grouped together, show moderate differences from the other groups. The United States and Japan are further apart, indicating more significant differences, while Spain and Indonesia are also together but at some distance. The United States and Spain are also similar, but present notable differences from the previous group. Finally, China and Korea are in the most distant group, suggesting significantly different values in the analyzed variables (RQ4 is Answered).
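
The nearest neighbor (single-linkage) method with squared Euclidean distance can be sketched in a few lines of plain Python. The example is a simplified illustration: the labels and feature values are hypothetical, and the function names are invented here. Each merge step it returns corresponds to one junction in a dendrogram, with the distance giving the height of that junction.

```python
from itertools import combinations


def squared_euclidean(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def single_linkage(points):
    """Agglomerative clustering with nearest-neighbor (single) linkage.

    points: dict mapping a label to a tuple of numeric features.
    Returns a list of merges (members_a, members_b, distance) in merge order.
    """
    def linkage(pair):
        # Cluster distance = smallest distance between any two members.
        a, b = pair
        return min(squared_euclidean(points[i], points[j]) for i in a for j in b)

    clusters = [frozenset([label]) for label in points]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((sorted(a), sorted(b), linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Hypothetical (citations, institutions) profiles, NOT the study's data.
profiles = {
    "Country A": (10, 2),
    "Country B": (12, 3),
    "Country C": (30, 6),
    "Country D": (90, 15),
}

for left, right, distance in single_linkage(profiles):
    print(f"merge {left} + {right} at squared distance {distance}")
```

Because squared Euclidean distance is sensitive to scale, variables measured in very different units are usually standardized before clustering; whether the study did so is not stated in the text.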

Figure 7 shows the dendrogram for universities, built using the nearest neighbor method and the squared Euclidean distance. The closest universities in the dendrogram have similar characteristics in the variables considered. The top 14 universities included in the analysis are as follows: Concordia University (U1), University of Pennsylvania (U2), Florida Atlantic University (U3), Poschingerstraße (U4), University of Birmingham (U5), University of Bremen (U6), University of Bergamo (U7), Cornell University (U8), Simon Fraser University-Health Sciences (U9), İstanbul Medipol University (U10), Campus UAB (U11), Jeju National University (U12), Normal University (U13) and Hunan University (U14). Institutions that cluster at greater distances (such as U12 and U14) have unique profiles compared to the others. The primary groups identified present different levels of similarity among the institutions. Group 1 is composed of U1, U2, U3, U4, U5, U6, U7, U8, U9, and U10, and shows a very high level of similarity, with distances less than 8, which indicates that these institutions have quite similar profiles in the mentioned variables. Group 2 is composed of U11, U12, U13, and U14, and also presents relatively low distances, although not as low as the first group, suggesting a moderate similarity between them. Finally, Group 3 is composed of U15, U16, U17, U18, U19, and U20, constituting a different group with moderate similarity. Within Group 1, a subgroup from U1 to U6 has a very low distance, approximately 2, and another subgroup from U7 to U10 has a slightly larger distance, approximately 4 to 6. On the other hand, U21 is an isolated institution that is not grouped with any other until a distance greater than 16, which indicates a very different profile in the variables analyzed (RQ5 is Answered).

3.3. Statistical Comparison Analysis

The normal probability plot of the residuals showed that the standardized skewness and kurtosis values were outside the range of −2 to +2, which indicates a significant departure from normality. For this reason, the authors employed the nonparametric Kruskal–Wallis test for statistical comparison. Figure 8 shows the statistical comparison of total citations between countries, universities, publishers, and journals. In this figure, the error bars correspond to the Kruskal–Wallis intervals at a 95% confidence level, and the point between the lines corresponds to the mean of the total citations. The p-values of the Kruskal–Wallis test for the different groups are as follows: 6.61 × 10⁻⁵ for countries, 1.26 × 10⁻⁴ for universities, 1.89 × 10⁻⁸ for publishing houses, and 5.67 × 10⁻⁹ for academic journals. All of these p-values are less than 0.05, which implies a statistically significant difference amongst the medians at the 95% confidence level. Below are the observations derived from the results in Figure 8.

▪: The comparison of the five countries with the highest citations shows no statistically significant differences at the 95% confidence level. This implies that the United States, Germany, Canada, Italy, and Turkey are the most representative countries in terms of total citations, and are the countries whose works have had remarkable visibility in the academic community (RQ4 is Answered).
▪: The comparison of the universities reveals two groups. The first group comprises Florida Atlantic University, Poschingerstraße, University of Birmingham, University of Bremen, Cornell University, University of Pennsylvania, and Concordia University. The second group includes Simon Fraser University-Health Sciences and the University of Bergamo. For the universities in the first group, no statistically significant differences are observed at the 95% confidence level, which indicates that their academic output is being recognized and that they are among the most productive universities in this research area. However, the second group, with fewer citations, presents significant differences from the first group (RQ5 is Answered).
▪: The comparison of the publishers shows three groups. The first group, with the highest citations, consists of Springer and John Wiley and Sons Ltd. The second group includes Emerald Group Publishing Ltd. and Proceedings of the 14th ACM Web Science. The last group consists of the Proceedings of the 14th ACM Web Science. The publishers in the first group present more citations and significant differences from the other two groups. It is essential to highlight that these publishers focus on publishing works related to fraud identification and control using machine learning techniques (RQ5 is Answered).
▪: The comparison of journals shows two groups. The first group includes the Journal of Business Ethics and Management Science, while the second group includes Policy & Internet, Corporate Governance: An International Review, Financial Innovation, and Economics Letters. All these journals and their editors are interested in improving their research on fraud identification using machine learning methods on crowdfunding platforms (RQ5 is Answered).
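
For readers who want to reproduce this kind of comparison, the following sketch implements the Kruskal–Wallis H statistic with tie correction and its chi-square p-value using only the Python standard library. It illustrates the test itself rather than the authors' software; the citation counts are hypothetical and the helper names are invented for this example. The chi-square p-value is a large-sample approximation and is rough for very small groups.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        term = total = math.exp(-half)
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) * math.exp(-half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return total


def kruskal_wallis(groups):
    """Return (H, p_value) for a list of samples, corrected for ties."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum**2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t**3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


# Hypothetical total-citation counts for three groups (NOT the study's data).
groups = [[120, 95, 150, 130], [40, 55, 35, 60], [10, 5, 20, 15]]
h_stat, p_value = kruskal_wallis(groups)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```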

Finally, the effect sizes of the nonparametric test are estimated. The effect size complements tests of statistical significance by providing information about the magnitude of the observed effect (Aarts et al. 2014). This approach not only helps validate the results, but also provides a more complete and accurate view of the academic landscape, facilitating the identification of the most influential and prominent entities in the field of study (Aarts et al. 2014). The estimated effect sizes are 48% for countries, 91% for universities, 56% for publishing houses, and 100% for academic journals. Universities and journals have the largest effect sizes, implying that citation counts differ most strongly across these groupings, followed by publishers and countries. The authors highlight the importance of conducting a bibliometric analysis using different statistical methods to identify the most prolific countries and universities in a research topic.
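
The text does not state which effect-size measure was used. As an illustration only, the sketch below computes two measures commonly paired with the Kruskal–Wallis test, epsilon-squared and the H-based eta-squared, from the H statistic, the total sample size n, and the number of groups k. The input values are hypothetical and the function names are invented for this example.

```python
def epsilon_squared(h: float, n: int) -> float:
    """Epsilon-squared effect size for Kruskal-Wallis: H / (n - 1)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return h / (n - 1)


def eta_squared_h(h: float, k: int, n: int) -> float:
    """H-based eta-squared: (H - k + 1) / (n - k)."""
    if n <= k:
        raise ValueError("n must exceed the number of groups k")
    return (h - k + 1) / (n - k)


# Hypothetical inputs (NOT values from the study): H = 7.2, 9 observations, 3 groups.
h_stat, n_obs, k_groups = 7.2, 9, 3
print(f"epsilon-squared = {epsilon_squared(h_stat, n_obs):.1%}")
print(f"eta-squared (H) = {eta_squared_h(h_stat, k_groups, n_obs):.1%}")
```

Both measures grow as a larger share of the rank variation is attributable to group membership, which is the sense in which a value near 100% indicates very strong separation between groups.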

4. Challenges and Further Studies on Fraud Detection in Crowdfunding Platforms

4.1. Data Challenges and ML Algorithms

High-quality, well-structured, labeled data are fundamental for building precise and efficient machine learning models to learn relevant patterns and make accurate predictions (Mvula et al. 2023). Machine learning provides continuous feedback, allowing models to adapt, improve, and address new fraud tactics. Through data analysis, user and transaction segmentation can be performed, enhancing fraud detection strategies by targeting specific segments prone to fraudulent activities. Advanced data visualization tools are crucial in understanding complex, non-linear patterns, helping analysts quickly identify problematic areas and make informed decisions (Elitzur 2024). Additionally, companies must identify and access valuable internal and external data sources, ensuring availability through crawling algorithms or database access. Innovative and research projects utilizing machine learning can uncover new insights, while a data-driven approach helps identify innovative product development opportunities (Bafna et al. 2023; Xu et al. 2023).

Further work can be carried out with advanced linguistic techniques to improve the detection of fraudulent reviews (Lee and Sohn 2019; Raflesia et al. 2023). Evaluating other algorithms, such as ensemble methods, deep neural networks, and SVM, could increase accuracy (Lee and Sohn 2019; Raflesia et al. 2023). Creating adaptive methods that respond to changes in reviewer behavior will improve the robustness of the model (Raflesia et al. 2023). Optimizing hyperparameters with techniques such as GridSearchCV can improve the performance of ML algorithms (Butt et al. 2020; Raflesia et al. 2023). Furthermore, policies and regulations tailored to specific contexts must be developed to protect crowdfunding stakeholders (Wonglimpiyarat 2018). Figure 9 presents recent ML approaches for cybersecurity intrusion detection in different applications. Louati et al. (2024) employ unsupervised learning with a multi-agent reinforcement learning (MARL) model based on the NSL-KDD dataset, achieving high accuracy and low false-positive and false-negative rates. Kalinin and Krundyshev (2023) and Talukder et al. (2024) use supervised learning, achieving accuracies of up to 99.95% using neural networks and oversampling techniques. Jemili et al. (2023) and Roshan and Zafar (2024) studied online supervised and adaptive learning models, combining algorithms such as Random Forest, XGBoost, and K-Nearest Neighbors. These authors found that the evaluated ML algorithms achieve accuracies above 98% on different cybersecurity data sets (RQ6 is Answered).
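
As an illustration of the hyperparameter optimization mentioned above, the sketch below runs scikit-learn's GridSearchCV over a small Random Forest grid. Everything about the setup is an assumption made for the example: the data are synthetic and imbalanced (roughly 10% positive, standing in for fraudulent campaigns), and the grid values, scoring metric, and fold count are arbitrary choices, not settings reported by the cited studies.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for labeled campaign data (NOT real crowdfunding data).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)

# Small illustrative grid; real searches usually cover more values.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    estimator=RandomForestClassifier(class_weight="balanced", random_state=0),
    param_grid=param_grid,
    scoring="f1",  # F1 is more informative than accuracy on imbalanced labels
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("mean cross-validated F1:", round(search.best_score_, 3))
```

Stratified folds keep the share of positive cases similar in every split, which matters when fraudulent examples are rare.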

4.2. Improvements in Crowdfunding Platforms: Design, Policies and Inter-Operability

Figure 10 shows the main topics for further research for detecting fraud and anomalous behaviors on digital crowdfunding platforms. Future research on crowdfunding platforms can focus on the following aspects (RQ6 is Answered).

▪: Analyze how new technologies such as blockchain, big data, augmented reality, and virtual reality can increase transparency and trust in these platforms. These technologies can improve project presentation and sponsor engagement (Zhou et al. 2023; Ratten 2023; Yang et al. 2023). In addition, it is important to explore innovative business models that integrate analytical and digital marketing services. The above could strengthen the long-term financial sustainability of crowdfunding platforms and make them more resilient (Zhou et al. 2023; Ratten 2023; Yang et al. 2023).
▪: Studying the competition between platforms and how it influences their strategies can help them differentiate themselves effectively in the market (Zhou et al. 2023). Further work in AI can be carried out to personalize the user experience and adapt it to users’ interests and behaviors (Yang et al. 2023; Gawer 2021; Chen et al. 2022).
▪: Ensuring transparency and ethical practices to safeguard user data privacy and foster trust (Yang et al. 2023; Chen et al. 2022; Vicari and Kirby 2023). Implementing incentive and control mechanisms can enhance relationship management with project creators and contributors (Chen et al. 2022).
▪: The role of crowdfunding platforms in supporting entrepreneurship, particularly in times of global crises, is not just a theoretical concept but a practical solution for entrepreneurs and small businesses (Ratten 2023). This aspect warrants further investigation.
▪: Facilitating networking and inter-operability among entrepreneurs, investors, and consumers, and examining effective digital marketing strategies can attract more participants to these platforms (Ratten 2023; Yang et al. 2023).
▪: The regulatory environment can be studied to maintain competitiveness and innovation while mitigating risks such as fraud and privacy issues (Gawer 2021, 2022). Future studies should consider how platform- and policy-design changes can improve well-being and market efficiency, testing different design elements and policy interventions to observe their impact on contributor behavior and campaign success.
▪: Deeper research is needed to understand the endogenous choice of anonymity and its broader implications, exploring the various factors that lead contributors to choose anonymity and how these choices affect the overall crowdfunding ecosystem (Burtch et al. 2016; Wonglimpiyarat 2018).
▪: It is important to integrate crowdfunding campaigns with traditional marketing channels such as social media, email marketing, and public relations to maximize their reach. Furthermore, these campaigns contribute to brand building and long-term customer loyalty and have great potential in B2B markets, facilitating the launch of industrial products and technologies. It is important to explore the impact of equity crowdfunding on investor relations and corporate governance and its ability to foster innovation and support new technologies and business models. Considering the regulatory environment and legal implications ensures compliance and effectiveness of campaigns (Brown et al. 2017). Addressing ethical considerations and social impact ensures responsible practices and a positive societal effect.
▪: Finally, comparative studies between different countries can provide valuable insights into how cultural, economic, and regulatory environments influence the success of crowdfunding platforms, helping to identify best practices and areas for improvement (Bassani et al. 2019).

4.3. Strategies and Challenges in Cybersecurity and Fraud Education: Methods, Academic Programs and Gamification Learning Platforms

Figure 11 shows different studies of educational and social communications. Cybersecurity and fraud education for children and younger people and equipment management processes in an industry are most effective when self-directed instruction, collaboration, and traditional teaching methods are used, significantly improving assessment results. Best practices for topic selection include using age-appropriate materials published by government agencies, which educators should actively seek out. Nationwide implementation faces challenges such as lack of standardized curriculum, shortage of trained professionals, and resource limitations, especially in low-income and rural schools, highlighting the need for better funding and increased fraud awareness. Educational programs have proven effective, significantly increasing student interest and knowledge, with a 15% increase after workshops (Solis-Diaz 2023). Furthermore, game-based learning platforms, especially those using mobile applications, are highly engaging for primary school students, and improve learning outcomes through personalized content.

4.4. Implementing Machine Learning for Fraud Detection in Crowdfunding Platforms

To carry out a machine learning (ML) project to identify fraud on a crowdfunding platform, it is essential first to pinpoint the issue and the specific sector that requires improvement. Then, high-quality data must be collected and prepared, which includes tasks such as data cleaning, identifying outliers, handling missing values, and normalization. Once the data are prepared, the most appropriate ML algorithm, such as Random Forest, SVM, ANNs, KNN, or Naive Bayes, is selected, depending on the characteristics of the data and the problem to be solved. The data must be divided into training and test sets to evaluate the model’s performance. Evaluation metrics such as Recall, F1-score, and ROC-AUC serve to evaluate and validate the model, which is then monitored continuously so that adjustments can be made when new data appear or conditions change. Throughout this process, ethical and security aspects must be considered to ensure compliance with current legal and privacy regulations. Figure 12 summarizes this workflow. To develop a machine learning system, libraries such as Scikit-Learn, TensorFlow, or PyTorch are used to build models. These models are often deployed on cloud platforms such as AWS, Google Cloud, or Microsoft Azure. In addition, a data team composed of data scientists, data engineers, and domain-specific experts is essential for managing and optimizing the process.
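
The evaluation metrics named above can be computed directly from test-set labels and model scores. The following plain-Python sketch shows one way to do so; the labels, scores, and the 0.5 decision threshold are hypothetical, and the function names are invented for this example. ROC-AUC is computed through its rank interpretation: the probability that a randomly chosen fraudulent case receives a higher score than a randomly chosen legitimate one.

```python
def recall_and_f1(y_true, y_pred):
    """Recall and F1-score for the positive (fraud = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    if precision + recall == 0:
        return recall, 0.0
    return recall, 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """ROC-AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and model scores (NOT results from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.90, 0.60, 0.30, 0.70, 0.40, 0.20, 0.10, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

recall, f1 = recall_and_f1(y_true, y_pred)
print(f"recall = {recall:.3f}, F1 = {f1:.3f}, ROC-AUC = {roc_auc(y_true, scores):.3f}")
```

Recall and F1 depend on the chosen threshold, while ROC-AUC summarizes ranking quality across all thresholds, which is why the three are usually reported together for imbalanced fraud data.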

5. Limitations of the Study

This study has several significant limitations. First, the amount of data available fluctuates, due to continuous updates in the WoS and Scopus databases, which may affect the consistency of the results. Second, the study topics were selected during the data retrieval phase, which could introduce bias in the results obtained. Third, the search terms used were derived from the existing scientific literature, which could have excluded some relevant keywords; further research could help identify new search keywords. Fourth, the bibliometric analyses were limited to articles written in English, which could have introduced sampling bias and consequently influenced the study results. Future research should consider including articles in other languages to provide a more complete view and reduce potential bias.

6. Conclusions

▪: In this work, a bibliometric analysis is carried out to identify the most relevant studies on fraud detection in digital crowdfunding platforms using machine learning techniques. The analyses were carried out during the COVID-19 pandemic and post-pandemic period, from 2018 to 2024. Using the PRISMA methodology, 26 works were retrieved from the two databases, Scopus and Web of Science. The common fraud methods on these digital platforms are fraud in reward-based crowdfunding campaigns, fraudulent donation-based campaigns, embezzlement and misuse of funds, and the use of information asymmetry and regulatory loopholes to deceive investors. Furthermore, machine learning techniques used at industrial and academic levels include Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), Neural Networks (NN), Naive Bayes (NB), Decision Trees (DT), Latent Dirichlet Allocation (LDA), and Long Short-Term Memory (LSTM), among others.
▪: The results of the analysis reveal that Cumming, D., followed by Belavina, E., are the authors with the most publications and citations. At the same time, Shafqat, W. and Shang, W. L. have the lowest citations. It is important to note that the universities with the highest number of publications and citations are Concordia University, with more than 350 citations; the University of Pennsylvania, with around 250 citations; and Florida Atlantic University and Poschingerstraße, with about 150 citations each. Other institutions among the most prolific universities, although with fewer than 150 citations, include Birmingham, Bremen, Bergamo, and Cornell Universities.
▪: The analysis of the principal components shows that two principal components influence the countries where the authors are affiliated. The first component is influenced by “number of authors”, “number of institutions”, and “Q1”, while Component 2 is influenced by “year.” Regarding institutions, the first component is related to “year” and “Journal Article”, while “conference” is in the opposite direction to “Journal Article”, showing contrasting characteristics. The authors prefer to publish in scientific journals rather than conferences, and the journal quartile (Q1 and Q2) reflects different patterns of association. The “number of authors” relates primarily to Component 1. The analysis also shows that Canada, France, Morocco, Thailand, India, the United Kingdom, and Italy form a close group, indicating similar values in publications, citations, and several institutions participating in the research. Germany and Iran present moderate differences. The United States and Japan are further apart, pointing to significant differences. Spain and Indonesia are together, but at some distance, while the United States and Spain are similar to each other but different from the previous group. China and Korea form the most distant group, suggesting very different values.
▪: Further studies that will impact this research topic are highlighted by the need to expand the data set and integrate advanced linguistic analyses, allowing for a more complete analysis. Incorporating real-time data and developing adaptive versions of the method will improve responsiveness. Exploring new algorithms and optimizing models are important to increase accuracy. Developing appropriate policies and regulations aims to ensure a practical operational framework. These joint approaches will strengthen early anomaly detection and improve efficiency in industrial contract management.
▪: This work provides tools for understanding recent studies of fraud detection on crowdfunding platforms using machine learning techniques. The above enables accurate and faster identification of fraudulent activities, thereby protecting investors and reducing associated financial risks. Furthermore, collaboration between academic institutions and industries facilitates the development of new technologies and methodologies to address practical problems and strengthens the regulatory framework and trust in these platforms. So, this work identifies global trends, future research directions, and opportunities for continued innovations in the security and efficiency of crowdfunding platforms.

Author Contributions

The three authors participated in all steps of the manuscript preparation: methodology, investigation, formal analysis, writing—original draft, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Universidad Nacional de Colombia and the Contraloría General de la República (CGR) de Colombia with the inter-administrative contract No. 1550 of 2023.

Data Availability Statement

No new data were created or generated in the article.

Acknowledgments

The authors are grateful for the support of the Universidad Nacional de Colombia and the Contraloría General de la República (CGR) de Colombia with the inter-administrative contract No. 1550 of 2023. The contract aims to provide scientific and technological services to the Dirección de Información, Análisis y Reacción Inmediata through applied research activities, technological development, innovation, and knowledge transfer to promote the deployment of methodologies, techniques, and technological solutions for developing the Digital Government Policy in the Contraloría General de la República de Colombia to strengthen surveillance and fiscal control.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

List of Abbreviations

AFT	Adaptive Fuzzy Training
AI	Artificial Intelligence
ANAC	Italy’s National Anti-Corruption Authority
ANN	Artificial Neural Network
AR	Augmented Reality
BBN	Bayesian Belief Network
BERT	Bidirectional Encoder Representations from Transformers
BK	Balanced K-means
BNB	Binomial Naive Bayes
CART	Classification and Regression Trees
CBS	Cost Breakdown Structure
CF	Collaborative Filtering
DEMATEL	Decision-Making Trial and Evaluation Laboratory
DNN	Deep Neural Network
DT	Decision Trees
EmPULIA	Digital platform for public tenders in Apulia
EU	European Union
FSLR	Forward Stepwise Logistic Regression
GBM	Gradient Boosting Machine
IEEE	Institute of Electrical and Electronics Engineers
IF	Isolation Forest
IFS	Intuitionistic Fuzzy Sets
INCM	Portuguese Mint and Official Printing Office
IoT	Internet of Things
KNN	K-Nearest Neighbors
k-NN	k-Nearest Neighbors
LaBSE	Language-agnostic BERT Sentence Embedding
LASER	Language-Agnostic SEntence Representations
LDA	Latent Dirichlet Allocation
LR	Logistic Regression
LSTM	Long Short-Term Memory
LSTM-LDA	Long Short-Term Memory—Latent Dirichlet Allocation
MBERT	Multilingual Bidirectional Encoder Representations from Transformers
MDL	Minimum Description Length
ML	Machine Learning
MLP	Multilayer Neural Network
MLP	Multilayer Perceptron
M-SWARA	Modified Step-wise Weight Assessment Ratio Analysis
MT5	Multilingual Text-to-Text Transfer Transformer
NB	Naive-Bayes
NN	Neural Network
NLP	Natural Language Processing
PCA	Principal Component Analysis
PFS	Pythagorean Fuzzy Sets
PRISMA	Preferred Reporting Items for Systematic Reviews and Meta-Analyses
PSA	Production Sharing Agreements
PSM	Propensity Score Matching
Q	Quartile
q-ROFSs	q-Rung Orthopair Fuzzy Sets
RBFN	Radial Basis Function Network
RF	Random Forest
RQ	Research Question
SciML	Scientific Machine Learning
SCSC	Supply Chain Smart Contract
SD	Standard Deviation
SMS	Safety Management Systems
SP	Stochastic Programming
ST	Sequential Testing
STM	Structural Topic Modelling
SVM	Support Vector Machine
TOPSIS	Technique for Order of Preference by Similarity to Ideal Solution
UAV	Unmanned Aerial Vehicles
WBS	Work Breakdown Structure
XGBoost	Extreme Gradient Boosting
XLMR	Cross-lingual Language Model Pretraining

References

Aarts, Sil, Marjan Van Den Akker, and Bjorn Winkens. 2014. The Importance of Effect Sizes. The European Journal of General Practice 20: 61–64.
Ahmad, Norita, Phillip A. Laplante, Joanna F. DeFranco, and Mohamad Kassab. 2021. A Cybersecurity Educated Community. IEEE Transactions on Emerging Topics in Computing 10: 1456–63.
Aksnes, Dag W., Liv Langfeldt, and Paul Wouters. 2019. Citations, Citation Indicators, and Research Quality: An Overview of Basic Concepts and Theories. Sage Open 9: 2158244019829575.
Bafna, Bhavana, Vedant Daigavane, Shlok Shaha, Gaurav Shinde, and Sachin Shelke. 2023. Decentralized Transaction System for Detection and Prevention of Fraud in Crowdfunding Platforms. Journal of Information and Computational Science 13: 133–38.
Bassani, Gaia, Nicoletta Marinelli, and Silvio Vismara. 2019. Crowdfunding in Healthcare. The Journal of Technology Transfer 44: 1290–310.
Behl, Abhishek, Pankaj Dutta, Zongwei Luo, and Pratima Sheorey. 2022. Enabling Artificial Intelligence on a Donation-Based Crowdfunding Platform: A Theoretical Approach. Annals of Operations Research 319: 761–89.
Belavina, Elena, Simone Marinesi, and Gerry Tsoukalas. 2020. Rethinking Crowdfunding Platform Design: Mechanisms to Deter Misconduct and Improve Efficiency. Management Science 66: 4980–97.
Bianda, Ryan, Aang Gunaepi, and Muhammad Misbakul Munir. 2023. Offering Sharia Securities Through Technology-Based Crowdfunding Services Based on Sharia Principles According to MUI Fatwa. Journal of World Science 2: 332–40.
Brown, Terrence E., Edward Boon, and Leyland F. Pitt. 2017. Seeking Funding in Order to Sell: Crowdfunding as a Marketing Tool. Business Horizons 60: 189–95.
Burtch, Gordon, Anindya Ghose, and Sunil Wattal. 2016. Secret Admirers: An Empirical Examination of Information Hiding and Contribution Dynamics in Online Crowdfunding. Information Systems Research 27: 478–96.
Butt, Umer Ahmed, Muhammad Mehmood, Syed Bilal Hussain Shah, Rashid Amin, M. Waqas Shaukat, Syed Mohsan Raza, Doug Young Suh, and Md. Jalil Piran. 2020. A Review of Machine Learning Algorithms for Cloud Computing Security. Electronics 9: 1379.
Cardona, Luis F., Jaime A. Guzmán-Luna, and Jaime A. Restrepo-Carmona. 2024. Bibliometric Analysis of Intelligent Systems for Early Anomaly Detection in Oil and Gas Contracts: Exploring Recent Progress and Challenges. Sustainability 16: 4669.
Chen, Liang, Tong W. Tong, Shaoqin Tang, and Nianchen Han. 2022. Governance and Design of Digital Platforms: A Review and Future Research Directions on a Meta-Organization. Journal of Management 48: 147–84.
Choi, Jaewon, Jaehyoun Kim, and Ho Lee. 2022. Hybrid Fraud Detection Model: Detecting Fraudulent Information in the Healthcare Crowdfunding. KSII Transactions on Internet and Information Systems (TIIS) 16: 1006–27.
Cicchiello, Antonella Francesca, Francesca Battaglia, and Stefano Monferrà. 2019. Crowdfunding Tax Incentives in Europe: A Comparative Analysis. The European Journal of Finance 25: 1856–82.
Cumming, Douglas, Lars Hornuf, Moein Karami, and Denis Schweizer. 2021. Disentangling Crowdfunding from Fraudfunding. Journal of Business Ethics 1: 26.
Elitzur, Ramy. 2024. Machine Learning and Non-Investment Crowdfunding Research: A Tutorial. Journal of Alternative Finance 1: 109–27.
Ellman, Matthew, and Sjaak Hurkens. 2019a. Fraud Tolerance in Optimal Crowdfunding. Economics Letters 181: 11–16.
Ellman, Matthew, and Sjaak Hurkens. 2019b. Optimal Crowdfunding Design. Journal of Economic Theory 184: 104939.
Elmer, Greg, and Sabrina Ward-Kimola. 2023. Crowdfunding (as) Disinformation: ‘Pitching’ 5G and Election Fraud Campaigns on GoFundMe. Media, Culture and Society 45: 578–94.
Freedman, Seth, and Ginger Zhe Jin. 2011. Learning by Doing with Asymmetric Information: Evidence from Prosper.com. Working Paper. New York: National Bureau of Economic Research, Inc.
Gawer, Annabelle. 2021. Digital Platforms’ Boundaries: The Interplay of Firm Scope, Platform Sides, and Digital Interfaces. Long Range Planning 54: 102045.
Gawer, Annabelle. 2022. Digital Platforms and Ecosystems: Remarks on the Dominant Organizational Forms of the Digital Age. Innovation 24: 110–24.
Goodell, John W., Satish Kumar, Weng Marc Lim, and Debidutta Pattnaik. 2021. Artificial Intelligence and Machine Learning in Finance: Identifying Foundations, Themes, and Research Clusters from Bibliometric Analysis. Journal of Behavioral and Experimental Finance 32: 100577.
Granato, Daniel, Jânio S. Santos, Graziela B. Escher, Bruno L. Ferreira, and Rubén M. Maggio. 2018. Use of Principal Component Analysis (PCA) and Hierarchical Cluster Analysis (HCA) for Multivariate Association Between Bioactive Compounds and Functional Properties in Foods: A Critical Perspective. Trends in Food Science & Technology 72: 83–90.
Hamed, Suhaib Kh, Ab Aziz, Mohd Juzaiddin, and Mohd Ridzwan Yaakub. 2023. Fake news detection model on social media by leveraging sentiment analysis of news content and emotion analysis of users’ comments. Sensors 23: 1748.
Han, Wenying, and Hao Dang. 2020. Product Crowdfunding Default Risk Warning Based on Random Forest Model. Paper presented at the 7th International Conference on Management of e-Commerce and e-Government, Jeju Island, Republic of Korea, July 1–3; pp. 99–105.
Hou, Wenting, and Jian Qu. 2023. BM5-SP-SC: A Dual Model Architecture for Contradiction Detection on Crowdfunding Projects. Current Applied Science and Technology 10: 55003.
Huo, Hong, Chen Wang, Chunjia Han, Mu Yang, and Wen-Long Shang. 2024. Risk Disclosure and Entrepreneurial Resource Acquisition in Crowdfunding Digital Platforms: Evidence from Digital Technology Ventures. Information Processing and Management 61: 103655.
Jemili, Farah, Rahma Meddeb, and Ouajdi Korbaa. 2023. Intrusion detection based on ensemble learning for big data classification. Cluster Computing 27: 3771–98.
Kalinin, Maxim, and Vasiliy Krundyshev. 2023. Security Intrusion Detection Using Quantum Machine Learning Techniques. Journal of Computer Virology and Hacking Techniques 19: 125–36.
Kumar, Naveen, Deepak Venugopal, Liangfei Qiu, and Subodha Kumar. 2018. Detecting Review Manipulation on Online Platforms with Hierarchical Supervised Learning. Journal of Management Information Systems 35: 350–80.
Lathifah, Ari, Faaza Bil Amri, and Ani Rosidah. 2022. Security Vulnerability Analysis of the Sharia Crowdfunding Website Using OWASP-ZAP. Paper present at the 2022 10th International Conference on Cyber and IT Service Management (CITSM), Yogyakarta, Indonesia, September 20–21; New York: IEEE, pp. 1–5. [Google Scholar]
Lee, SeungHun, Wafa Shafqat, and Hyun-Chul Kim. 2022. Backers Beware: Characteristics and Detection of Fraudulent Crowdfunding Campaigns. Sensors 22: 7677. [Google Scholar] [CrossRef]
Lee, Won Sang, and So Young Sohn. 2019. Discovering Emerging Business Ideas Based on Crowdfunded Software Projects. Decision Support Systems 116: 102–13. [Google Scholar] [CrossRef]
Li, Qi, and Jian Qu. 2022. A Novel BNB-NO-BK Method for Detecting Fraudulent Crowdfunding Projects. Songklanakarin Journal of Science and Technology 44: 1209–19. [Google Scholar]
Louati, Faten, Farah Barika Ktata, and Ikram Amous. 2024. Big-IDS: A Decentralized Multi Agent Reinforcement Learning Approach for Distributed Intrusion Detection in Big Data Networks. Cluster Computing 27: 6823–41.
Markas, Ruhaab, and Yisha Wang. 2019. Dare to Venture: Data Science Perspective on Crowdfunding. SMU Data Science Review 2: 19.
Meoli, Michele, Alice Rossi, and Silvio Vismara. 2022. Financial Literacy and Security-Based Crowdfunding. Corporate Governance: An International Review 30: 27–54.
Minh, Pham Son, Hung-Son Dang, and Nguyen Canh Ha. 2023. Optimization of 3D Cooling Channels in Plastic Injection Molds by Taguchi-Integrated Principal Component Analysis (PCA). Polymers 15: 1080.
Mohammadi, Ali, Fraydoon Rahnamay Roodposhti, Hoda Hemmati, and Narges Yazdanian. 2025. Identification and Modeling of Crowdfunding Risk Indicators in FinTech-Based Businesses Based on the Combined Approach of Thematic Analysis and Partial Least Squares in SEM. International Journal of Finance and Managerial Accounting 10: 13–24.
Montgomery, Douglas C., and George Runger. 2020. Applied Statistics and Probability for Engineers. New York: John Wiley & Sons Ltd.
Mvula, Paul K., Paula Branco, Guy-Vincent Jourdan, and Herna L. Viktor. 2023. A Systematic Literature Review of Cyber-Security Data Repositories and Performance Assessment Metrics for Semi-Supervised Learning. Discover Data 1: 4.
Page, Matthew J., Joanne E. McKenzie, Patrick M. Bossuyt, Isabelle Boutron, Tammy C. Hoffmann, Cynthia D. Mulrow, Larissa Shamseer, Jennifer M. Tetzlaff, Elie A. Akl, Sue E. Brennan, et al. 2021. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. BMJ 372: n71.
Perez, Beatrice, Sara Machado, Jerone Andrews, and Nicolas Kourtellis. 2022. I Call BS: Fraud Detection in Crowdfunding Campaigns. Paper presented at the 14th ACM Web Science Conference, Barcelona, Spain, June 26–29; pp. 1–11.
Petrov, Lev F., and Ellina S. Emelyanova. 2021. The Crowdfunding: Financial Flows and Risks. CEUR Workshop Proceedings 2830: 41–51.
Pranckutė, Raminta. 2021. Web of Science (WoS) and Scopus: The Titans of Bibliographic Information in Today’s Academic World. Publications 9: 12.
Prateek, Pranay, Dan J. Kim, and Ling Ge. 2021. Detection of Fraudulent Campaigns on Donation-Based Crowdfunding Platforms Using a Combination of Machine. Paper presented at the 16th Pre-ICIS Workshop on Information Security and Privacy, Austin, TX, USA, December 12; pp. 1–9.
Raflesia, Sarifah Putri, Dinda Lestarini, Rizka Dhini Kurnia, and Dinna Yunika Hardiyanti. 2023. Using Machine Learning Approach Towards Successful Crowdfunding Prediction. Bulletin of Electrical Engineering and Informatics 12: 2438–45.
Ratten, Vanessa. 2023. Digital Platforms and Transformational Entrepreneurship During the COVID-19 Crisis. International Journal of Information Management 72: 102534.
Riadi, Imam, Ariqah Adliana Siregar, and Adiniah Gustika Pratiwi. 2022. Security on Charity Crowdfunding Services Using KAMI Index 4.1. Engineering Science Letter 1: 15–19.
Roshan, Khushnaseeb, and Aasim Zafar. 2024. Ensemble Adaptive Online Machine Learning in Data Stream: A Case Study in Cyber Intrusion Detection System. International Journal of Information Technology.
Shafqat, Wafa, and Yung-Cheol Byun. 2019. Topic Predictions and Optimized Recommendation Mechanism Based on Integrated Topic Modeling and Deep Neural Networks in Crowdfunding Platforms. Applied Sciences 9: 5496.
Shafqat, Wafa, Yung-Cheol Byun, and Namje Park. 2020. Effectiveness of Machine Learning Approaches Towards Credibility Assessment of Crowdfunding Projects for Reliable Recommendations. Applied Sciences 10: 9062.
Sharifani, Koosha, and Mahyar Amini. 2023. Machine Learning and Deep Learning: A Review of Methods and Applications. World Information Technology and Engineering Journal 10: 3897–904.
Solis-Diaz, Christian Javier. 2023. Education as a Solution to Combat Rising Cybercrime Rates against Children and Teenagers. Electronic Theses, Projects, and Dissertations. Available online: https://scholarworks.lib.csusb.edu/etd/1811 (accessed on 12 July 2024).
Talukder, Alamin, Manowarul Islam, Ashraf Uddin, Khondokar Fida Hasan, Selina Sharmin, Salem A. Alyami, and Mohammad Ali Moni. 2024. Machine Learning-Based Network Intrusion Detection for Big and Imbalanced Data Using Oversampling, Stacking Feature Embedding and Feature Extraction. Journal of Big Data 11: 33.
Vicari, Stefania, and Daniel Kirby. 2023. Digital Platforms as Socio-Cultural Artifacts: Developing Digital Methods for Cultural Research. Information, Communication & Society 26: 1733–55.
Winoto, Wahyu, and Permata Wulandari. 2023. Explorative Analysis of Securities Crowdfunding: Pillars, Business Flow and Risk Mitigation in MSME Funding in Indonesia. Management Studies and Entrepreneurship Journal (MSEJ) 4: 3206–21.
Wonglimpiyarat, Jarunee. 2018. Challenges and Dynamics of FinTech Crowdfunding: An Innovation System Approach. The Journal of High Technology Management Research 29: 98–108.
Wu, Xiaohang, Hasan Dinçer, and Serhat Yüksel. 2022. Analysis of Crowdfunding Platforms for Microgrid Project Investors via a Q-Rung Orthopair Fuzzy Hybrid Decision-Making Approach. Financial Innovation 8: 52.
Xu, Yang, Quanlin Li, Cheng Zhang, Yunlin Tan, Ping Zhang, Guojun Wang, and Yaoxue Zhang. 2023. A Decentralized Trust Management Mechanism for Crowdfunding. Information Sciences 638: 118969.
Yadav, Pavinder, Nidhi Gupta, and Pawan Kumar Sharma. 2023. A Comprehensive Study Towards High-Level Approaches for Weapon Detection Using Classical Machine Learning and Deep Learning Methods. Expert Systems with Applications 212: 118698.
Yang, Yunpeng, Nan Chen, and Hongmin Chen. 2023. The Digital Platform, Enterprise Digital Transformation, and Enterprise Performance of Cross-Border E-Commerce-From the Perspective of Digital Transformation and Data Elements. Journal of Theoretical and Applied Electronic Commerce Research 18: 777–94.
Zenone, Marco, and Jeremy Snyder. 2019. Fraud in Medical Crowdfunding: A Typology of Publicized Cases and Policy Recommendations. Policy and Internet 11: 215–34.
Zhou, Xiaoyang, He Liu, Jialu Li, Kai Zhang, and Benjamin Lev. 2023. Channel Strategies When Digital Platforms Emerge: A Systematic Literature Review. Omega 120: 102919.
Zkik, Karim, Anass Sebbar, Oumaima Fadi, Sachin Kamble, and Amine Belhadi. 2023. Securing Blockchain-Based Crowdfunding Platforms: An Integrated Graph Neural Networks and Machine Learning Approach. Electronic Commerce Research 1: 37.
Zribi, Sirine. 2022. Effects of Social Influence on Crowdfunding Performance: Implications of the COVID-19 Pandemic. Humanities and Social Sciences Communications 9: 192.

Figure 1. PRISMA flow diagram for crowdfunding-platform fraud detection.

Figure 2. Country distribution of the total citations of the main works gathered.

Figure 3. Distribution by university of the total citations and the number of authors of the main works gathered.

Figure 4. Distribution by author of the total citations and the number of publications in the research field.

Figure 5. Biplot analysis of the component distribution by (a) country and (b) university.

Figure 6. Dendrogram analysis of the country grouping.

Figure 7. Dendrogram analysis of the university grouping using the nearest-neighbor method with squared Euclidean distance.

Figure 8. Statistical comparison of total citations by (a) country, (b) university, (c) publisher, and (d) journal, at a 95% confidence level.

Figure 9. Advanced machine learning approaches for cybersecurity intrusion detection (Kalinin and Krundyshev 2023; Louati et al. 2024; Talukder et al. 2024; Jemili et al. 2023; Roshan and Zafar 2024).

Figure 10. Challenges of Machine Learning in Fraud Detection on Crowdfunding Platforms (Lee and Sohn 2019; Butt et al. 2020; Raflesia et al. 2023; Kumar et al. 2018; Wonglimpiyarat 2018).

Figure 11. Aspects of social education and knowledge appropriation of fraud (Solis-Diaz 2023; Ahmad et al. 2021).

Figure 12. Machine Learning implementation process for fraud detection on crowdfunding platforms.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
