1. Introduction
Today, the use and management of data in various fields have increased significantly, facilitating business and industry analysis for decision making [1]. The amount of data generated is growing exponentially [2]. It is estimated that around 2.5 quintillion bytes of data are created every day [3]. In other words, data are readily available to those willing to mine them. Data are a valuable asset for businesses in the 21st century [4]. In 2006, Clive Humby, a British mathematician, stated that ‘data is the new oil’. Like oil, data are not valuable in their raw state; their value comes when they are collected quickly, completely and accurately, and linked to other relevant data [5].
The public sector is also seizing the opportunity to use data and tax analytics to combat tax evasion [6]. Public resource management is a critical government function that helps ensure fiscal responsibility and accountability. The public budget, usually set for one year, is a fundamental plan of action that reflects the government’s priorities and objectives through the amounts allocated for revenue and expenditure [7]. Public expenditure management ensures that resources are used efficiently and effectively and supports the delivery of essential public services. It is, therefore, a key element of good governance.
In addition, the management of public expenditures is essential to achieve macroeconomic stability and can help control inflation and reduce deficits [8]. It also promotes transparency and accountability of public finances [9,10]. When governments provide a clear and understandable picture of spending, citizens can better assess economic factors and their impact on society. This allows them to support or challenge spending decisions in a more informed way, while governments can demonstrate progress against their commitments [11].
Today, problems such as corruption, lack of transparency and inefficiency are among the main difficulties facing the public sector [12]. These problems impede the proper functioning and management of government, which has a negative impact on the development of a nation and the well-being of its citizens. The institutional design of local governments has a significant impact on poverty: an increased risk of corruption and inefficiency in the use of transfers for education and health spending increases poverty. Conversely, transparent governance can reduce it; in fact, a one percentage point increase in transparency and relative efficiency indices reduces the poverty rate by 0.6 percentage points [13].
The risk of corruption tends to increase with the size of the local state and decrease with improvements in fiscal performance, tax collection, and average years of education [14]. It is essential to implement strategies to mitigate this situation, and one of the most promising is the use of analytics and data management in the public sector [15]. Technology in public administration (e-government) has enabled the delivery of public services over the Internet, promoting efficiency in data collection, processing, and reporting, and improving decision-making [16].
Advances in smart technologies, better informed and connected citizens, and globally interconnected economies have created new opportunities [17]. Governments are extending the concept of e-government by recognizing the power of data and heuristic processing through artificial intelligence (AI) to improve services, interact with citizens and society, propose new policies, and implement solutions for the well-being of the community, thereby becoming smart governments [18].
However, the diverse sources and formats of public sector documents make their collection, processing and organization challenging from a data analytics perspective [19]. It is important to develop approaches to managing these data, as their analysis allows for greater citizen involvement by giving citizens more access to public decisions and spending [20], increases transparency in the public sector, and gives citizens a greater sense of accountability by providing different views on government performance in meeting its public policy objectives. As an example, Valle-Cruz et al. [16] presented the use of AI as a promising tool for intelligent monitoring of the public sector and combating tax evasion. The availability of this information is argued to help stakeholders make better-informed decisions. The use of AI would work like any other decision support system, providing several alternatives and a wealth of information to enable the final decision to be made.
Some of the advantages of AI-based techniques relate to their ability to analyze any data, regardless of organization, size, or format [21]. However, some limitations of AI-based methods are related to the computational capacity available at the time. Among recent applications, Yahyaoui and Tkiouat [22] showed how reinforcement learning based on Markov models and partially observable computations of the behavior of a contributing agent can refine the analysis of an audit policy. They showed that by synchronizing procedures and dynamically updating intelligent behaviors, a hybrid model can be built that combines bottom-up agent-based execution with various partially observable intelligent behaviors. This can serve as a platform for testing the effectiveness of an audit policy and for training an intelligent auditor. Rukanova et al. [23] developed a framework for the value of data analysis in government oversight that is intended to serve as a tool for analysis and understanding. In their study, they found that collective capacity building processes in data analytics and their link to capacity processes in an individual organization are very interesting but not yet well understood.
This paper contributes to the field of information by describing how effective data management and processing can significantly enhance the quality of information used for financial control and auditing practices in the digital era, in which several industries use digital technologies to support their operations [24]. To do so, we defined the following research questions (RQ). RQ1: What are the research trends regarding data capture, storage, processing, interoperability, and visualization in the context of fiscal control? RQ2: Which countries have contributed the most to these topics in recent years? RQ3: Which groups of co-authors are most representative of collaborative research on these topics? The study emphasizes the critical role of accurate data collection and secure storage as foundational elements for reliable fiscal oversight. It provides information on advanced data processing techniques that enable proactive decision-making in public expenditure management. In addition, the paper underscores the importance of interoperable systems that facilitate seamless data exchange between government entities, promoting efficiency and transparency. Finally, the article advocates for the adoption of appropriate data visualization strategies to improve communication of financial insights and support evidence-based policy-making. The paper also contributes a stakeholder-driven framework for designing data management architectures that enhance real-time supervision and interoperability in public expenditure oversight.
The organization of the paper is as follows. Section 2 contains the bibliometric analysis on the role of data collection, storage, processing, interoperability, and visualization in improving the smart monitoring of public expenditure. Section 3 provides an overview of data capture techniques, tools, and types, and a data capture architecture proposed for the Colombian Comptroller General’s Office (Contraloría General de la República, CGR). Section 4 describes the data storage options, on-premises and in the cloud, and the data storage architecture proposed for the CGR. Section 5 covers the three stages of data processing (data cleaning, data transformation, and data analysis) and a data processing architecture proposed for the CGR. Section 6 refers to the architecture that facilitates effective interoperability. Section 7 provides an overview of the basic principles and advanced technologies related to data visualization in the modern era. Section 8 contains a case report for fiscal control in Colombia and provides user requirement lists generated from workshops that involved multiple stakeholders. Section 9 contains the discussion and, finally, conclusions are presented in Section 10.
8. Results: A Case Study for Fiscal Control in Colombia
The Colombian Comptroller General’s Office (CGR, or Contraloría General de la República) is the nation’s highest fiscal oversight authority and is responsible for ensuring the efficient and lawful use of public resources. Established as a constitutional entity, the CGR has evolved significantly since the 1945 constitutional reform that transformed it from a technical accounting department into a comprehensive supervisory body. Subsequent reforms in 1968 and 1976 refined its role [129], emphasizing financial and legal oversight, and introducing prior and perceptual control mechanisms. The 1991 constitutional reform shifted to posterior and selective control [130], reducing interference in audited entities’ decisions and promoting citizen participation in fiscal monitoring. In 2019, further modifications introduced concurrent and preventive control measures [131]. Over time, the CGR has integrated advanced technologies such as big data analytics and AI to enhance its real-time oversight capabilities while continuing to foster transparency, accountability, and citizen involvement in safeguarding public resources.
Recognizing the need to process large volumes of data to ensure the proper use of public resources, the CGR’s strategic plan for 2018–2022 focused on enabling digital transformation through enterprise architecture [132]. The plan included the creation of the Integrated Information Center, a hub designed to enhance data monitoring, analysis, and interoperability between the organization and its strategic partners. The Center leverages technologies such as AI, machine learning (ML), and blockchain to support fiscal control actions, strategic decision-making, and the optimization of institutional processes, while also fostering inter-institutional collaboration and enhancing participatory fiscal control. In this regard, the CGR established the Directorate of Information, Analysis, and Immediate Reaction (DIARI) as part of the country’s broader efforts to integrate advanced technologies into the supervision of public resources.
The establishment of DIARI was formalized by Legislative Act 04 of 2019 and regulated by Decree Law 2037 of 2019 [133]. The DIARI is structured into three units: Information, Analysis, and Immediate Reaction, each playing a crucial role in accessing and analyzing data to issue preventive alerts and assess risks to public resources [134]. The Information Unit connects to relevant data sources for fiscal supervision, ensuring data quality before passing the data to the Analysis Unit. The Analysis Unit develops and validates analytical models to address critical business questions, producing data reports, alerts, and dashboards that highlight potential fiscal risks. The Immediate Reaction Unit is responsible for implementing timely and effective real-time supervision actions to protect public resources at risk of imminent loss. Together, these units ensure comprehensive monitoring, analysis, and rapid response in safeguarding public resources [134].
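The division of labor among the three DIARI units can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `Contract` fields, the budget-overrun rule, and the 1.2 threshold are assumptions, not the CGR's actual data model or analytical models. It shows only the flow described above, in which a data-quality gate feeds an analysis step that raises alerts, which are then prioritized for reaction.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Contract:
    """Hypothetical record; field names are illustrative only."""
    contract_id: str
    entity: str
    budgeted: Optional[float]  # planned amount
    paid: Optional[float]      # amount disbursed so far


@dataclass
class Alert:
    contract_id: str
    entity: str
    overrun_ratio: float


def information_unit(records: Iterable[Contract]) -> List[Contract]:
    """Data-quality gate: keep only records with usable amounts."""
    return [
        r for r in records
        if r.budgeted is not None and r.paid is not None
        and r.budgeted > 0 and r.paid >= 0
    ]


def analysis_unit(records: Iterable[Contract],
                  threshold: float = 1.2) -> List[Alert]:
    """Flag contracts whose payments exceed budget by an assumed threshold."""
    alerts = []
    for r in records:
        ratio = r.paid / r.budgeted
        if ratio > threshold:
            alerts.append(Alert(r.contract_id, r.entity, ratio))
    return alerts


def immediate_reaction_unit(alerts: Iterable[Alert]) -> List[Alert]:
    """Order alerts so that the largest overruns are handled first."""
    return sorted(alerts, key=lambda a: a.overrun_ratio, reverse=True)


def run_pipeline(records: Iterable[Contract]) -> List[Alert]:
    """Chain the three stages: quality gate, analysis, prioritization."""
    return immediate_reaction_unit(analysis_unit(information_unit(records)))
```

Keeping each stage as a separate function mirrors the separation of responsibilities between units: the quality rules, the analytical rule, and the prioritization policy can each be replaced without touching the others.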
This new fiscal control model, which combines preventive, concurrent, and subsequent oversight, aims to improve the efficiency and timeliness of fiscal control in Colombia. To support this model, the CGR needs advanced data capture systems that can efficiently gather information from diverse sources, robust data storage solutions that handle large volumes of information securely, and powerful data processing capabilities to analyze data and extract actionable insights in real time. Furthermore, achieving interoperability between various data systems is essential to ensure seamless information exchange across platforms. Effective data visualization tools are also necessary to present complex data in an accessible and intuitive way, allowing decision-makers to quickly identify potential risks and respond accordingly. The DIARI’s multidisciplinary team, composed of professionals from over 15 fields, is tasked with integrating these needs by connecting information sources, structuring data, and coordinating continuous monitoring efforts, ensuring that the CGR can respond quickly to potential threats to public resources.
9. Discussion
The literature review examines essential components of effective data management crucial for enhancing public expenditure oversight. It begins by highlighting the importance of understanding diverse data types and acquisition methodologies, focusing on recent research trends in data capture, storage, processing, interoperability, and visualization within the context of fiscal control. A significant trend is the increasing adoption of cloud-based solutions, which provide scalability and accessibility advantages over traditional on-premise data storage. This shift reflects broader global movements toward digital transformation in the public sector.
A key finding is the critical role of interoperability in facilitating information exchange among various systems and organizations within the public sector. Recent studies emphasize the integration of smart government processes—such as financial reporting and auditing—strengthened by technologies like the Internet of Things (IoT), Software as a Service (SaaS), cloud computing, and APIs. Countries such as the United States, Canada, and members of the European Union have been at the forefront of research in these areas, contributing significantly to the literature and best practices for effective fiscal oversight.
The role of public managers at various levels, from central government to municipal offices, presents both challenges and opportunities in the realm of data management and fiscal control. Public managers often face challenges such as resource constraints, resistance to change, and the need for training in advanced technologies. At the central government level, managers are tasked with implementing comprehensive data management strategies across multiple agencies, requiring coordination and integration of disparate systems. At the same time, this complexity presents an opportunity for central managers to champion cross-agency collaboration, promoting a unified approach to fiscal oversight that can enhance overall transparency and efficiency.
At the municipal level, public managers deal with unique challenges, including limited budgets and a narrower range of technological resources. However, municipalities also have the advantage of being closer to citizens, which allows for more direct engagement and feedback. This proximity enables managers to tailor data management initiatives to the specific needs of their communities, fostering innovation in public service delivery. By leveraging local partnerships and community resources, municipal managers can implement agile solutions that enhance data capture and visualization, improving oversight of public expenditures.
Furthermore, the discussion must include the evolving role of citizens in Society 5.0, where technology enhances human-centric approaches to governance. Citizens are not only users of public services but also active participants in monitoring public spending. In this context, open data initiatives empower citizens to engage with government data, enhancing transparency and accountability. By providing tools and platforms for data visualization, governments can encourage citizen participation in fiscal oversight, allowing communities to identify potential risks and hold officials accountable.
This participatory model shifts the perception of citizens from passive recipients of services to active co-producers of governance, aligning with the principles of Society 5.0, which emphasizes a symbiotic relationship between humans and technology. Citizens equipped with data literacy skills can analyze public spending patterns and advocate for responsible resource allocation, thereby reinforcing democratic engagement and civic responsibility.
Ongoing advancements in technologies supporting public spending supervision are essential. These developments aim to produce more accurate reviews, leveraging automation to reduce constraints on fiscal supervision capacity and optimize resource allocation. Notably, collaborative research efforts have emerged as a vital strategy in this domain. Analysis of co-authorship networks indicates that research groups from academia, governmental organizations, and industry are increasingly collaborating on projects related to fiscal control. Identifying leading co-authors in this field reveals a network of prominent scholars and practitioners who are shaping research trends and fostering interdisciplinary approaches.
The SWOT analysis conducted with diverse stakeholders proved to be a vital tool for understanding user requirements in the design of the respective architectures. By involving DIARI officials, industry experts, and academic representatives, the SWOT analysis offered a comprehensive view of the strengths, weaknesses, opportunities, and threats in current data management processes. This collaborative approach highlighted specific needs and challenges, facilitating the identification of strategic improvement areas. Insights from this analysis were instrumental in ensuring that the developed architectures are both technically robust and aligned with practical realities and user expectations.
User requirement lists, generated from multi-stakeholder workshops, are critical for designing a robust data management architecture tailored to organizational needs. These lists ensure that the architecture addresses diverse functional requirements, aligns with strategic goals, and prioritizes key features like real-time data capture, secure storage, and interoperability. Reflecting stakeholder input enhances system flexibility, collaboration, and ownership while also mitigating risks such as integration challenges and data quality issues. This comprehensive approach ensures effective support for smart oversight of public expenditure.
9.2. Proposed Research Agenda
The literature review and case study on Colombia highlight several critical areas for future research in data management for public expenditure oversight. A proposed research agenda focuses on key topics, existing gaps, and methodologies to enhance fiscal control through effective data handling.
First, data capture methodologies need further exploration, particularly in acquiring data from real-time sources such as IoT devices and social media. Research should investigate the role of emerging technologies like blockchain in improving data capture efficiency and security, utilizing comparative case studies across countries.
Second, studies on cloud-based storage solutions versus traditional systems are essential, especially concerning sustainability, cost-effectiveness, and security in developing countries like Colombia. Longitudinal studies could map performance and user satisfaction over time, informing better decision-making.
Research on interoperability frameworks is also crucial, focusing on integrating data systems across public institutions. A design science research approach could develop frameworks that address the specific data-sharing needs of different government levels and citizens.
The contribution of data processing and analytics to fiscal oversight should be examined, particularly the application of machine learning for predictive analytics in risk assessment. Experimental designs can identify effective algorithms for detecting financial anomalies.
Data visualization for public engagement warrants focused research to enhance citizen oversight of public funds. Investigating how different demographic groups interact with visualization formats will provide insights into engagement strategies, utilizing qualitative methods like surveys and focus groups.
The impact of citizen participation in Society 5.0 also deserves attention, since citizens are key players in holding governments accountable for public spending. Research should explore participatory methods that enable citizens to co-design oversight tools, testing their effectiveness in promoting transparency.
Lastly, examining collaborative research networks will reveal trends in co-authorship and collaborative efforts in public expenditure oversight. Bibliometric analysis can identify major research networks and their contributions, while qualitative studies can explore collaboration dynamics.
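As an illustration of the anomaly-detection item in the agenda above, the sketch below flags unusual payment amounts with a modified z-score based on the median and the median absolute deviation (MAD). This is a simple robust-statistics baseline, not a machine learning model, and it is only one of many candidates that such experimental designs could compare. The function name, the sample amounts, and the 3.5 cutoff (a commonly used convention) are assumptions made for illustration.

```python
from statistics import median
from typing import List, Sequence


def mad_outliers(amounts: Sequence[float], cutoff: float = 3.5) -> List[int]:
    """Return the indices of amounts whose modified z-score exceeds `cutoff`.

    Modified z-score: 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation of the sample.
    """
    if not amounts:
        return []
    med = median(amounts)
    mad = median([abs(x - med) for x in amounts])
    if mad == 0:
        # No spread around the median: the score is undefined, flag nothing.
        return []
    return [
        i for i, x in enumerate(amounts)
        if 0.6745 * abs(x - med) / mad > cutoff
    ]


# Hypothetical payment amounts; the last one is far from the rest.
payments = [100, 102, 98, 101, 99, 103, 97, 1000]
flagged = mad_outliers(payments)
```

Median-based statistics are used here because a single extreme payment inflates the mean and standard deviation, which can mask the very values an auditor wants to surface.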
Author Contributions
Conceptualization, J.A.R.-C., C.A.E., J.S.-P. and R.E.V.; methodology, J.A.R.-C., C.A.E., J.S.-P. and R.E.V.; validation, J.A.R.-C., J.C.Z., R.M.V., O.M., Á.M.H., J.S.-P. and R.E.V.; formal analysis, J.A.R.-C., M.V., C.Z., C.A.E., J.S.-P. and R.E.V.; investigation, J.A.R.-C., J.C.Z., R.M.V., M.V., C.Z., O.M., Á.M.H., C.A.E., J.S.-P. and R.E.V.; writing—original draft preparation, M.V., C.Z., J.S.-P. and R.E.V.; writing—review and editing, J.A.R.-C., J.S.-P. and R.E.V.; supervision, J.A.R.-C., J.C.Z. and R.M.V. All authors have read and agreed to the published version of the manuscript.
Funding
This work was developed with the funding of Contraloría General de la República (CGR) and Universidad Nacional de Colombia (UNAL) in the frame of contracts CGR-373-2023 and CGR-379-2023, with the support of Universidad Pontificia Bolivariana (UPB).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are available on request from the authors.
Conflicts of Interest
Authors Jaime A. Restrepo-Carmona, Carlos A. Escobar, Julián Sierra-Pérez and Rafael E. Vásquez were employed by the company Corporación Rotorr, Universidad Nacional de Colombia. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
AI | Artificial Intelligence |
API | Application Programming Interface |
CGR | Contraloría General de la República (Office of the Comptroller General of the Republic) |
CPU | Central Processing Unit |
DAMA | Data Management International |
DIARI | Directorate of Information, Analysis, and Immediate Reaction of the CGR |
DT | Digital Transformation |
ETL | Extract, Transform, and Load |
GFS | Google File System |
HDFS | Hadoop Distributed File System |
IoT | Internet of Things |
OCR | Optical Character Recognition |
PCA | Principal Component Analysis |
RQ | Research Question |
SaaS | Software as a Service |
SQL | Structured Query Language |
TOGAF | The Open Group Architecture Framework |
UI | User Interface |
UX | User Experience |
References
- DAMA-International. The Data Management Body of Knowledge (DAMA DMBOK); Technics Publications: Sedona, AZ, USA, 2017. [Google Scholar]
- Lone, K.; Sofi, S.A. Effect of Accountability, Transparency and Supervision on Budget Performance. Utopía Y Prax. Latinoam. 2023, 25, 130–143. [Google Scholar]
- Bharany, S.; Sharma, S.; Khalaf, O.I.; Abdulsahib, G.M.; Al Humaimeedy, A.S.; Aldhyani, T.H.H.; Maashi, M.; Alkahtani, H. A Systematic Survey on Energy-Efficient Techniques in Sustainable Cloud Computing. Sustainability 2022, 14, 6256. [Google Scholar] [CrossRef]
- Kubina, M.; Varmus, M.; Kubinova, I. Use of Big Data for Competitive Advantage of Company. Procedia Econ. Financ. 2015, 26, 561–565. [Google Scholar] [CrossRef]
- Spiekermann, S.; Novotny, A. A vision for global privacy bridges: Technical and legal measures for international data markets. Comput. Law Secur. Rev. 2015, 31, 181–200. [Google Scholar] [CrossRef]
- Fang, C. Taxation with information: Impacts of customs data exchange on tax evasion in Pakistan. Econ. Syst. 2024, 101243. [Google Scholar] [CrossRef]
- Sisto, R.; Garcia, J.; Quintanilla, A.; deJuanes, A.; Mendoza, D.; Lumbreras, J.; Mataix, C. Quantitative Analysis of the Impact of Public Policies on the Sustainable Development Goals through Budget Allocation and Indicators. Sustainability 2020, 12, 10583. [Google Scholar] [CrossRef]
- Alsaadi, A. Financial-tax reporting conformity, tax avoidance and corporate social responsibility. J. Financ. Report. Account. 2020, 18, 639–659. [Google Scholar] [CrossRef]
- Petropoulos, T.; Thalassinos, Y.; Liapis, K. Greek Public Sector’s Efficient Resource Allocation: Key Findings and Policy Management. J. Risk Financ. Manag. 2024, 17, 60. [Google Scholar] [CrossRef]
- Gao, S. An Exogenous Risk in Fiscal-Financial Sustainability: Dynamic Stochastic General Equilibrium Analysis of Climate Physical Risk and Adaptation Cost. J. Risk Financ. Manag. 2024, 17, 244. [Google Scholar] [CrossRef]
- Dammak, S.; Jmal Ep Derbel, M. Social responsibility and tax evasion: Organised hypocrisy of Tunisian professionals. J. Appl. Account. Res. 2023, 25, 325–354. [Google Scholar] [CrossRef]
- Adam, I.; Fazekas, M. Are emerging technologies helping win the fight against corruption? A review of the state of evidence. Inf. Econ. Policy 2021, 57, 100950. [Google Scholar] [CrossRef]
- Gidigbi, M.O. Assessing the impact of poverty alleviation programs on poverty reduction in Nigeria: Selected programs. Poverty Public Policy 2023, 15, 76–97. [Google Scholar] [CrossRef]
- Valle-Cruz, D.; García-Contreras, R. Towards AI-driven transformation and smart data management: Emerging technological change in the public sector value chain. Public Policy Adm. 2023, 09520767231188401. [Google Scholar] [CrossRef]
- Thomas, M.A.; Cipolla, J.; Lambert, B.; Carter, L. Data management maturity assessment of public sector agencies. Gov. Inf. Q. 2019, 36, 101401. [Google Scholar] [CrossRef]
- Valle-Cruz, D.; Fernandez-Cortez, V.; Gil-Garcia, J.R. From E-budgeting to smart budgeting: Exploring the potential of artificial intelligence in government decision-making for resource allocation. Gov. Inf. Q. 2022, 39, 101644. [Google Scholar] [CrossRef]
- Oliveira, T.A.; Oliver, M.; Ramalhinho, H. Challenges for Connecting Citizens and Smart Cities: ICT, E-Governance and Blockchain. Sustainability 2020, 12, 2926. [Google Scholar] [CrossRef]
- Kankanhalli, A.; Charalabidis, Y.; Mellouli, S. IoT and AI for Smart Government: A Research Agenda. Gov. Inf. Q. 2019, 36, 304–309. [Google Scholar] [CrossRef]
- Bendre, M.R.; Thool, V.R. Analytics, challenges and applications in big data environment: A survey. J. Manag. Anal. 2016, 3, 206–239. [Google Scholar] [CrossRef]
- Aftabi, S.Z.; Ahmadi, A.; Farzi, S. Fraud detection in financial statements using data mining and GAN models. Expert Syst. Appl. 2023, 227, 120144. [Google Scholar] [CrossRef]
- Parycek, P.; Schmid, V.; Novak, A.S. Artificial Intelligence (AI) and Automation in Administrative Procedures: Potentials, Limitations, and Framework Conditions. J. Knowl. Econ. 2023, 15, 8390–8415. [Google Scholar] [CrossRef]
- Yahyaoui, F.; Tkiouat, M. Partially observable Markov methods in an agent-based simulation: A tax evasion case study. Procedia Comput. Sci. 2018, 127, 256–263. [Google Scholar] [CrossRef]
- Rukanova, B.; Tan, Y.H.; Slegt, M.; Molenhuis, M.; van Rijnsoever, B.; Migeotte, J.; Labare, M.L.; Plecko, K.; Caglayan, B.; Shorten, G.; et al. Identifying the value of data analytics in the context of government supervision: Insights from the customs domain. Gov. Inf. Q. 2021, 38, 101496. [Google Scholar] [CrossRef]
- Wynn, M.; Jones, P. Corporate Responsibility in the Digital Era. Information 2023, 14, 324. [Google Scholar] [CrossRef]
- Haddaway, N.R.; Page, M.J.; Pritchard, C.C.; McGuinness, L.A. PRISMA2020: An R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and Open Synthesis. Campbell Syst. Rev. 2022, 18, e1230. [Google Scholar] [CrossRef] [PubMed]
- Maté Jiménez, C. Big data. Un nuevo paradigma de análisis de datos. An. De Mecánica Y Electr. 2014, 91, 10–16. [Google Scholar]
- Muñoz, L.; Mazon, J.N.; Trujillo, J. ETL Process Modeling Conceptual for Data Warehouses: A Systematic Mapping Study. IEEE Lat. Am. Trans. 2011, 9, 358–363. [Google Scholar] [CrossRef]
- Mositsa, R.J.; Van der Poll, J.A.; Dongmo, C. Towards a Conceptual Framework for Data Management in Business Intelligence. Information 2023, 14, 547. [Google Scholar] [CrossRef]
- Cretu, C.; Gheonea, V.; Talaghir, L.; Manolache, G.; Iconomesu, T. Budget—Performance Tool in Public Sector. In Proceedings of the 5th WSEAS International Conference on Economy and Management Transformation, Timisoara, Romania, 24–26 October 2010. [Google Scholar]
- Dawar, K.; Oh, S.C. The Role of Public Procurement Policy in Drivingindustrial Development; Technical Report; United Nations Industrial Development Organization (UNIDO): Vienna, Austria, 2017. [Google Scholar]
- Adam, I.; Hernandez-Sanchez, A.; Fazeka, M. Global Public Procurement OpenCompetition Index; Technical report; Government Transparency Institute: Budapest, Hungary, 2021. [Google Scholar]
- CABRI. Value for Money in Public Spending; Technical Report; CABRI: Kabri, Israel, 2015. [Google Scholar]
- Popov, M.P.; Prykhodchenko, L.L.; Lesyk, O.V.; Dulina, O.V.; Holynska, O.V. Audit as an Element of Public Governance. Stud. Appl. Econ. 2021, 39, 1–9. [Google Scholar] [CrossRef]
- Abdullah, A.S.B.; Zulkifli, N.F.B.; Zamri, N.F.B.; Harun, N.W.B.; Abidin, N.Z.Z. The Benefits of Having Key Performance Indicators (KPI) in Public Sector. Int. J. Acad. Res. Account. Financ. Manag. Sci. 2022, 12, 719–726. [Google Scholar] [CrossRef]
- Leite, P.; George, T.; Sun, C.; Jones, T.; Lindert, K. Social Registries for Social Assistance and Beyond: A Guidance Note & Assessment Tool; Technical report; World Bank: Washington, DC, USA, 2017. [Google Scholar]
- Han, Y. The impact of accountability deficit on agency performance: Performance-accountability regime. Public Manag. Rev. 2020, 22, 927–948. [Google Scholar] [CrossRef]
- Vassiliadis, P. A Survey of Extract-Transform-Load Technology. Int. J. Data Warehous. Min. (IJDWM) 2009, 5, 1–27. [Google Scholar] [CrossRef]
- Yaqoob, I.; Hashem, I.A.T.; Gani, A.; Mokhtar, S.; Ahmed, E.; Anuar, N.B.; Vasilakos, A.V. Big data: From beginning to future. Int. J. Inf. Manag. 2016, 36, 1231–1247. [Google Scholar] [CrossRef]
- Eito-Brun, R. Gestión de Contenidos; Editorial UOC: Barcelona, Spain, 2014. [Google Scholar]
- Hernandez, A.T.; Vazquez, E.G.; Rincon, C.A.B.; Montero-García, J.; Calderon-Maldonado, A.; Ibarra-Orozco, R. Metodologías para analisispolitico utilizando web scraping. Res. Comput. Sci. 2015, 95, 113–121. [Google Scholar] [CrossRef]
- Kumar, N.; Gupta, M.; Sharma, D.; Ofori, I. Technical Job Recommendation System Using APIs and Web Crawling. Comput. Intell. Neurosci. 2022, 2022, 7797548. [Google Scholar] [CrossRef] [PubMed]
- Puñales, E.M.; Salgueiro, A.P. Aplicación de minería de datos a lainformación recuperada de la intranet para agrupar los resultados relevantes. In Jornada Científica ICIMAF-2015; Instituto de Cibernética, Matemática y Física: Havana, Cuba, 2015. [Google Scholar]
- Güemes, V.L. Business Intelligence Para la Toma de Decisiones Estratégicas: Un Casode Aplicación de Minería de Datos Dentro del Sector Bancario. Master’s Thesis, Universidad de Cantabria, Santander, Spain, 2019. [Google Scholar]
- Olson, D.L.; Lauhoff, G. Descriptive Data Mining; Springer: Singapore, 2019.
- He, S.; Zhu, J.; He, P.; Lyu, M.R. Experience Report: System Log Analysis for Anomaly Detection. In Proceedings of the 2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), Ottawa, ON, Canada, 23–27 October 2016.
- Bahri, M.; Bifet, A.; Gama, J.; Gomes, H.M.; Maniu, S. Data stream analysis: Foundations, major tasks and tools. WIREs Data Min. Knowl. Discov. 2021, 11, e1405.
- Krishnamurthi, R.; Kumar, A.; Gopinathan, D.; Nayyar, A.; Qureshi, B. An Overview of IoT Sensor Data Processing, Fusion, and Analysis Techniques. Sensors 2020, 20, 6076.
- Bazzaz Abkenar, S.; Haghi Kashani, M.; Mahdipour, E.; Jameii, S.M. Big data analytics meets social media: A systematic review of techniques, open issues, and future directions. Telemat. Inform. 2021, 57, 101517.
- Patnaik, S.K.; Babu, C.N.; Bhave, M. Intelligent and adaptive web data extraction system using convolutional and long short-term memory deep learning networks. Big Data Min. Anal. 2021, 4, 279–297.
- Gangavarapu, T.; Jaidhar, C.D.; Chanduka, B. Applicability of machine learning in spam and phishing email filtering: Review and approaches. Artif. Intell. Rev. 2020, 53, 5019–5081.
- Taleb, I.; Serhani, M.A.; Dssouli, R. Big Data Quality: A Survey. In Proceedings of the 2018 IEEE International Congress on Big Data (BigData Congress), San Francisco, CA, USA, 2–7 July 2018.
- World Economic Forum. Data Integrity; Technical report; World Economic Forum: Cologny, Switzerland, 2020.
- OECD. Data Accessibility: Open, Free and Accessible Formats; OECD: Paris, France, 2019.
- Nikiforova, A. Data Security as a Top Priority in the Digital World: Preserve Data Value by Being Proactive and Thinking Security First. In Springer Proceedings in Complexity; Springer International Publishing: Cham, Switzerland, 2023; pp. 3–15.
- Gupta, I.; Singh, A.K.; Lee, C.N.; Buyya, R. Secure Data Storage and Sharing Techniques for Data Protection in Cloud Environments: A Systematic Review, Analysis, and Future Directions. IEEE Access 2022, 10, 71247–71277.
- Blumzon, C.F.I.; Pănescu, A.T. Data Storage. In Good Research Practice in Non-Clinical Pharmacology and Biomedicine; Springer International Publishing: Berlin/Heidelberg, Germany, 2019; pp. 277–297.
- Mishra, S.P.; Sahoo, S.K.; Jena, B.; Tirthankar. Migrating on-premise application workloads to a hybrid cloud architecture. J. Inf. Optim. Sci. 2022, 43, 1099–1108.
- Sriramoju, S. A Comprehensive Review on Data Storage. Int. J. Sci. Res. Sci. Technol. 2019, 6, 236–241.
- Sen, R.; Sharma, A. Optimization of Cost: Storage over Cloud Versus on Premises Storage. In Proceedings of the 2020 IEEE 9th International Conference on Communication Systems and Network Technologies (CSNT), Gwalior, India, 10–12 April 2020; pp. 179–181.
- Yang, P.; Xiong, N.; Ren, J. Data Security and Privacy Protection for Cloud Storage: A Survey. IEEE Access 2020, 8, 131723–131740.
- Syed, A.; Purushotham, K.; Shidaganti, G. Cloud Storage Security Risks, Practices and Measures: A Review. In Proceedings of the 2020 IEEE International Conference for Innovation in Technology (INOCON), Bengaluru, India, 6–8 November 2020.
- Nachiappan, R.; Javadi, B.; Calheiros, R.N.; Matawie, K.M. Cloud storage reliability for Big Data applications: A state of the art survey. J. Netw. Comput. Appl. 2017, 97, 35–47.
- Saadoon, M.; Ab. Hamid, S.H.; Sofian, H.; Altarturi, H.H.; Azizul, Z.H.; Nasuha, N. Fault tolerance in big data storage and processing systems: A review on challenges and solutions. Ain Shams Eng. J. 2022, 13, 101538.
- Shvachko, K.; Kuang, H.; Radia, S.; Chansler, R. The Hadoop Distributed File System. In Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), Incline Village, NV, USA, 3–7 May 2010.
- Strohbach, M.; Daubert, J.; Ravkin, H.; Lischka, M. Big Data Storage. In New Horizons for a Data-Driven Economy; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; pp. 119–141.
- Shin, H.; Lee, K.; Kwon, H.Y. A comparative experimental study of distributed storage engines for big spatial data processing using GeoSpark. J. Supercomput. 2021, 78, 2556–2579.
- Verma, C.; Pandey, R. Comparative Analysis of GFS and HDFS: Technology and Architectural landscape. In Proceedings of the 2018 10th International Conference on Computational Intelligence and Communication Networks (CICN), Esbjerg, Denmark, 17–19 August 2018.
- Wang, M.; Li, B.; Zhao, Y.; Pu, G. Formalizing Google File System. In Proceedings of the 2014 IEEE 20th Pacific Rim International Symposium on Dependable Computing, Singapore, 18–21 November 2014.
- Rao, M.V. 16. Data duplication using Amazon Web Services cloud storage. In Data Deduplication Approaches; Thwel, T.T., Sinha, G., Eds.; Academic Press: Cambridge, MA, USA, 2021; pp. 319–334.
- Mondal, A.S.; Sanyal, M.; Barua, H.B.; Chattopadhyay, S.; Mondal, K.C. Comparative Analysis of Object-Based Big Data Storage Systems on Architectures and Services: A Recent Survey. J. Inst. Eng. (India) Ser. B 2024, 105, 685–700.
- Nambiar, A.; Mundra, D. An Overview of Data Warehouse and Data Lake in Modern Enterprise Data Management. Big Data Cogn. Comput. 2022, 6, 132.
- Oracle. What is MySQL? 2024. Available online: https://www.oracle.com/mysql/what-is-mysql/ (accessed on 15 August 2024).
- Oracle. MySQL Documentation. 2024. Available online: https://dev.mysql.com/doc/ (accessed on 15 August 2024).
- IBM. What is PostgreSQL? 2024. Available online: https://www.ibm.com/topics/postgresql (accessed on 15 August 2024).
- The PostgreSQL Global Development Group. PostgreSQL 16.3. 2024. Available online: https://www.postgresql.org/docs/release/16.3/ (accessed on 15 August 2024).
- Microsoft. SQL Server Technical Documentation. 2024. Available online: https://learn.microsoft.com/en-us/sql (accessed on 15 August 2024).
- MongoDB. MongoDB Documentation. 2024. Available online: https://www.mongodb.com/docs/ (accessed on 15 August 2024).
- The Apache Software Foundation. Cassandra Documentation. 2024. Available online: https://cassandra.apache.org/doc/latest/ (accessed on 15 August 2024).
- Maharana, K.; Mondal, S.; Nemade, B. A review: Data pre-processing and data augmentation techniques. Glob. Transit. Proc. 2022, 3, 91–99.
- Luengo, J.; García-Gil, D.; Ramírez-Gallego, S.; García, S.; Herrera, F. Big Data Preprocessing: Enabling Smart Data; Springer International Publishing: Berlin/Heidelberg, Germany, 2020.
- Liang, W.; Tadesse, G.A.; Ho, D.; Fei-Fei, L.; Zaharia, M.; Zhang, C.; Zou, J. Advances, challenges and opportunities in creating data for trustworthy AI. Nat. Mach. Intell. 2022, 4, 669–677.
- Shehab, N.; Badawy, M.; Arafat, H. Big Data Analytics and Preprocessing. In Machine Learning and Big Data Analytics Paradigms: Analysis, Applications and Challenges; Springer International Publishing: Cham, Switzerland, 2021; pp. 25–43.
- Yang, C.; Huang, Q.; Li, Z.; Liu, K.; Hu, F. Big Data and cloud computing: Innovation opportunities and challenges. Int. J. Digit. Earth 2016, 10, 13–53.
- Ahmed, N.; Barczak, A.L.C.; Susnjak, T.; Rashid, M.A. A comprehensive performance analysis of Apache Hadoop and Apache Spark for large scale datasets using HiBench. J. Big Data 2020, 7, 110.
- L’Esteve, R. Databricks. In The Azure Data Lakehouse Toolkit; Apress: Berkeley, CA, USA, 2022; pp. 83–139.
- Sreemathy, J.; Joseph, V.I.; Nisha, S.; Prabha, I.C.; Priya, R.M.G. Data Integration in ETL Using TALEND. In Proceedings of the 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 6–7 March 2020.
- Dolev, S.; Florissi, P.; Gudes, E.; Sharma, S.; Singer, I. A Survey on Geographically Distributed Big-Data Processing Using MapReduce. IEEE Trans. Big Data 2019, 5, 60–80.
- Zhang, J.; Lin, M. A comprehensive bibliometric analysis of Apache Hadoop from 2008 to 2020. Int. J. Intell. Comput. Cybern. 2022, 16, 99–120.
- Sharma, M.; Kaur, J. A comparative study of big data processing: Hadoop vs. spark. In Proceedings of the 2019 6th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 13–15 March 2019; pp. 690–701.
- Ibtisum, S.; Bazgir, E.; Rahman, S.M.A.; Hossain, S.M.S. A comparative analysis of big data processing paradigms: Mapreduce vs. apache spark. World J. Adv. Res. Rev. 2023, 20, 1089–1098.
- Bawankule, K.L.; Dewang, R.K.; Singh, A.K. Historical data based approach to mitigate stragglers from the Reduce phase of MapReduce in a heterogeneous Hadoop cluster. Clust. Comput. 2022, 25, 3193–3211.
- Jordan, M.I.; Mitchell, T.M. Machine learning: Trends, perspectives, and prospects. Science 2015, 349, 255–260.
- Ariyaluran-Habeeb, R.A.; Nasaruddin, F.; Gani, A.; Hashem, I.A.T.; Ahmed, E.; Imran, M. Real-time big data processing for anomaly detection: A survey. Int. J. Inf. Manag. 2019, 45, 289–307.
- Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 160.
- Maulud, D.; Abdulazeez, A.M. A Review on Linear Regression Comprehensive in Machine Learning. J. Appl. Sci. Technol. Trends 2020, 1, 140–147.
- Bansal, M.; Goyal, A.; Choudhary, A. A comparative analysis of K-Nearest Neighbor, Genetic, Support Vector Machine, Decision Tree, and Long Short Term Memory algorithms in machine learning. Decis. Anal. J. 2022, 3, 100071.
- Ezugwu, A.E.; Ikotun, A.M.; Oyelade, O.O.; Abualigah, L.; Agushaka, J.O.; Eke, C.I.; Akinyelu, A.A. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. Eng. Appl. Artif. Intell. 2022, 110, 104743.
- Psycharis, Y. Public Spending Patterns. In Contributions to Economics; Physica-Verlag HD: Heidelberg, Germany, 2008; pp. 41–71.
- Aliyu, G.; Umar, I.E.; Aghiomesi, I.E.; Onawola, H.J.; Rakshit, S. Anomaly Detection of Budgetary Allocations Using Machine-Learning-Based Techniques. In Engineering Innovation for Addressing Societal Challenges; Trans Tech Publications Ltd.: Lausanne, Switzerland, 2021.
- Wolniak, R.; Grebski, W. Functioning of predictive analytics in business. Sci. Pap. Silesian Univ. Technol. Organ. Manag. Ser. 2023, 2023, 631–649.
- Bhikaji, V.; Abdul, S. Trends of public expenditure in India: An empirical analysis. Int. J. Soc. Sci. Econ. Res. 2019, 4, 3307–3318.
- Mishra, R.; Kaur, I.; Sahu, S.; Saxena, S.; Malsa, N.; Narwaria, M. Establishing three layer architecture to improve interoperability in Medicare using smart and strategic API led integration. SoftwareX 2023, 22, 101376.
- Hagen, L.; Keller, T.E.; Yerden, X.; Luna-Reyes, L.F. Open data visualizations and analytics as tools for policy-making. Gov. Inf. Q. 2019, 36, 101387.
- Ramadhan, A.N.; Pane, K.N.; Wardhana, K.R.; Suharjito. Blockchain and API Development to Improve Relational Database Integrity and System Interoperability. Procedia Comput. Sci. 2023, 216, 151–160.
- Borgogno, O.; Colangelo, G. Data sharing and interoperability: Fostering innovation and competition through APIs. Comput. Law Secur. Rev. 2019, 35, 105314.
- Platenius-Mohr, M.; Malakuti, S.; Grüner, S.; Schmitt, J.; Goldschmidt, T. File- and API-based interoperability of digital twins by model transformation: An IIoT case study using asset administration shell. Future Gener. Comput. Syst. 2020, 113, 94–105.
- Amazon Web Services. What Is Interoperability? 2024. Available online: https://aws.amazon.com/what-is/interoperability/ (accessed on 15 August 2024).
- Chen, J.X. Data Visualization and Virtual Reality. In Handbook of Statistics; Elsevier: Amsterdam, The Netherlands, 2005; pp. 539–563.
- Chandra, T.B.; Dwivedi, A.K. Data visualization: Existing tools and techniques. In Advanced Data Mining Tools and Methods for Social Computing; Elsevier: Amsterdam, The Netherlands, 2022; pp. 177–217.
- Prokofieva, M. Using dashboards and data visualizations in teaching accounting. Educ. Inf. Technol. 2021, 26, 5667–5683.
- Bina, S.; Kaskela, T.; Jones, D.R.; Walden, E.; Graue, W.B. Incorporating evolutionary adaptions into the cognitive fit model for data visualization. Decis. Support Syst. 2023, 171, 113979.
- Ryan, L. Data visualization as a core competency. In The Visual Imperative; Elsevier: Amsterdam, The Netherlands, 2016; pp. 221–242.
- Zaki, T.; Islam, M.N. Neurological and physiological measures to evaluate the usability and user-experience (UX) of information systems: A systematic literature review. Comput. Sci. Rev. 2021, 40, 100375.
- Lindholm, M.; Sarjakoski, T. Designing a Visualization User Interface. In Visualization in Modern Cartography; Elsevier: Amsterdam, The Netherlands, 1994; pp. 167–184.
- Ryan, L. The importance of visual design. In The Visual Imperative; Elsevier: Amsterdam, The Netherlands, 2016; pp. 153–175.
- Ware, C. Color. In Information Visualization; Elsevier: Amsterdam, The Netherlands, 2021; pp. 95–141.
- Ware, C. Foundations for an Applied Science of Data Visualization. In Information Visualization; Elsevier: Amsterdam, The Netherlands, 2021; pp. 1–29.
- Tufte, E.R. Beautiful Evidence; Graphics Press LLC: Cheshire, CT, USA, 2006.
- Midway, S.R. Principles of Effective Data Visualization. Patterns 2020, 1, 100141.
- Ware, C. Images, Narrative, and Gestures for Explanation. In Information Visualization; Elsevier: Amsterdam, The Netherlands, 2021; pp. 331–358.
- Zhang, Y.; Cao, T.; Li, S.; Tian, X.; Yuan, L.; Jia, H.; Vasilakos, A.V. Parallel processing systems for big data: A survey. Proc. IEEE 2016, 104, 2114–2136.
- Pazzi, S.; Svetlova, E. NGOs, public accountability, and critical accounting education: Making data speak. Crit. Perspect. Account. 2023, 92, 102362.
- Tableau. Tableau. 2024. Available online: https://www.tableau.com (accessed on 15 August 2024).
- Microsoft. Power BI. 2024. Available online: https://www.microsoft.com/en-us/power-platform (accessed on 15 August 2024).
- Google. Data Studio: Make Interactive Data Visualizations. 2024. Available online: https://newsinitiative.withgoogle.com/resources/trainings/data-studio-make-interactive-data-visualizations/ (accessed on 15 August 2024).
- Plotly. Plotly. 2024. Available online: https://plotly.com/ (accessed on 15 August 2024).
- Grafana. Grafana. 2024. Available online: https://grafana.com/ (accessed on 15 August 2024).
- IBM. IBM Cognos Analytics. 2024. Available online: https://www.ibm.com/products/cognos-analytics (accessed on 15 August 2024).
- Congreso de la República de Colombia. Ley 20 de 1975. 1975. Available online: https://www.funcionpublica.gov.co/eva/gestornormativo/norma_pdf.php?i=79924 (accessed on 15 June 2024).
- Congreso de la República de Colombia. Ley 42 de 1993. 1993. Available online: https://www.funcionpublica.gov.co/eva/gestornormativo/norma_pdf.php?i=289 (accessed on 15 June 2024).
- Congreso de la República de Colombia. Acto Legislativo 04 de 2019. 2019. Available online: https://www.funcionpublica.gov.co/eva/gestornormativo/norma_pdf.php?i=100251 (accessed on 15 June 2024).
- Contraloría General de la República. Informe de Gestión al Congreso y al Presidente de la República 2021–2022. 2022. Available online: https://www.contraloria.gov.co/en/resultados/informes/informes-constitucionales/historico-informes-constitucionales (accessed on 15 June 2024).
- Presidencia de la República de Colombia. Decreto Ley 2037 de 2019. 2020. Available online: https://www.funcionpublica.gov.co/eva/gestornormativo/norma.php?i=102213 (accessed on 15 April 2024).
- Economía Colombiana. DIARI, la Herramienta Que Revolucionó en Tiempo y Resultados las Inspecciones Fiscales. 2024. Available online: https://www.economiacolombiana.co/desarrollo-futuro/plataforma-diari-herramienta-aliada-de-cgr-4055 (accessed on 15 June 2024).
- Restrepo-Carmona, J.A.; Zuluaga, J.C.; Flórez, D.A.; Gómez, M.S.; Londoño, L.; Gómez, G.; Villamil, R.M.; Morales, O.; Hurtado, A.M.; Escobar, C.A.; et al. The Design of a Strategic Platform for the Smart Supervision of Public Expenditure for Colombia in the Context of Society 5.0. Urban Sci. 2024, 8, 117. [Google Scholar] [CrossRef]
- The Open Group. The TOGAF Standard, 10th ed. 2024. Available online: https://www.opengroup.org/togaf (accessed on 15 August 2024).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).