A Unified Vendor-Agnostic Solution for Big Data Stream Processing in a Multi-Cloud Environment
Abstract
1. Introduction
1.1. Research Context
1.2. Aim and Objectives
1.3. Rationale
1.4. Novel Contribution
- Promoting an infrastructure hosted on commercial clouds leveraging the cost-effective pay-as-you-go model. This model is particularly beneficial for SMEODs as it allows for financial flexibility and scalability.
- Shifting from a traditional SaaS model towards a standardised form of PaaS. This transition is crucial for SMEODs as it offers a more controlled and customisable computing environment suitable for specific big data needs.
- Reducing the risk of vendor lock-in by recommending the use of portable, interoperable, and vendor-agnostic components. This strategy is vital for ensuring flexibility and independence in a multi-cloud environment.
- Providing a domain-specific reference architecture that offers guidance and simplifies the implementation process. This aspect is particularly valuable for SMEODs that may lack the technical expertise or resources to navigate the complexities of big data systems.
2. Related Literature Review
2.1. Reference Architectures
2.2. Architectures, Technology Stacks, and Concrete Implementations
3. MC-BDP Reference Architecture
3.1. Methods
3.2. Experimental Procedure
3.3. MC-BDP Reference Architecture for Big Data Stream Processing
3.4. MC-BDP Architectural Layers
3.4.1. The Persistence Layer
3.4.2. The Node Layer
3.4.3. The Container Layer
3.4.4. The Networking Layer
- Pipework, a legacy networking tool based on Linux’s cgroups and namespaces [76].
- Flannel, a virtual network that provides a subnet to each host to use with container runtimes [77].
- Open vSwitch, which creates either an underlay or overlay network to connect containers running on multiple hosts [78].
- Weave Net, a Docker plugin that creates a virtual network to connect Docker containers across multiple hosts and enable their automatic discovery [79].
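The per-host subnet model used by tools such as Flannel can be illustrated with a short sketch. The helper below is hypothetical (it is not Flannel's API): it carves a /16 overlay network into /24 subnets, one per host, so that containers on each host draw their addresses from that host's subnet.

```python
import ipaddress

def allocate_host_subnets(overlay_cidr, hosts, prefix=24):
    """Assign each host its own subnet carved out of a shared overlay
    network, mirroring Flannel's model: containers on a host take their
    IPs from that host's subnet. Hypothetical helper, not Flannel's API."""
    overlay = ipaddress.ip_network(overlay_cidr)
    subnets = overlay.subnets(new_prefix=prefix)
    return {host: str(next(subnets)) for host in hosts}

# Three nodes spanning two clouds share one 10.244.0.0/16 overlay.
print(allocate_host_subnets("10.244.0.0/16",
                            ["azure-vm-1", "azure-vm-2", "google-vm-1"]))
```

Because each host owns a disjoint subnet, cross-host container traffic can be routed by host prefix without any central address coordination.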
3.4.5. The Orchestration Layer
3.4.6. The Service Layer
3.4.7. The Security Layer
3.4.8. The Monitoring Layer
3.4.9. The Messaging Layer
3.5. MC-BDP Prototype Implementation
4. MC-BDP Prototype Evaluation
4.1. Experimental Results
4.1.1. Average Container CPU Utilisation
4.1.2. Maximum Container CPU Utilisation
4.1.3. Average Container Memory Utilisation
4.1.4. Maximum Container Memory Utilisation
4.1.5. Container Network Utilisation (GB Sent by Workers)
4.1.6. Container Network Utilisation (GB Received by Workers)
4.1.7. Container Network Utilisation (GB Sent and Received by the Job Manager)
4.1.8. Total Records Processed and Data Loss
4.1.9. Data Ingested
5. Conclusions and Future Work
- Leverage the cloud’s economies of scale by promoting an infrastructure hosted on commercial clouds;
- Move away from the traditional SaaS approach towards a standardised form of PaaS;
- Mitigate the risk of vendor lock-in by prescribing the use of portable, interoperable, and vendor-agnostic components deployed to multiple clouds; and
- Alleviate concerns around complexity and skill shortages in the domain by providing a domain-specific reference architecture to guide implementers.
5.1. Recommendations for Future Research
- Investigating how MC-BDP handles batch processing, comparing [108]’s approach of using a stream engine to process both batches and streams with an alternative approach based on the Lambda architecture.
- Following a trend observed in [109]’s research on container co-location, observing how MC-BDP performs with jobs of different characteristics. Memory-intensive, network-intensive, and disk I/O-intensive jobs could be created and compared to the CPU-intensive job utilised in the experiments.
- Integrating into MC-BDP’s orchestration layer recent research on scheduling algorithms, such as RTSATD, which uses task duplication to optimise the performance of big data stream processing across different cloud regions [52], or [110]’s algorithm developed to minimise data transfers over the network.
- Adding a cost perspective to MC-BDP’s evaluation by integrating fine-grained billing information obtained from cloud providers, in line with [45]’s and [106]’s research. Understanding non-functional requirements from a cost as well as a performance perspective would benefit budget-constrained organisations.
- Extending MC-BDP’s evaluation to include other domain-specific industry case studies.
- Conducting additional case studies with other types of SMEODs to validate the proposition that MC-BDP is beneficial to them.
- Strengthening the statistical significance of the quantitative results by widening the scale of the experiments: using more than two commercial cloud providers, using a greater number of virtual machines and containers, and extending the experiments in duration and in volume of incoming data.
5.2. Limitations of the Study
5.2.1. Internal Validity
5.2.2. External Validity
5.2.3. Construct Validity
5.3. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Appendix A.1. Supplementary Tables
Layer | Description |
---|---|
Horizontal Layers | |
Persistence Layer | The persistence layer consists of file stores, disk space, and relational and non-relational databases. This layer is used differently by components in other layers, namely, the node, container, service, and messaging layers. |
Node Layer | The node layer is composed of virtual machines to which containers are deployed. In order to maximise economies of scale, MC-BDP recommends an infrastructure based on commercial clouds and, where possible, it recommends that multiple clouds be utilised simultaneously. This is a known strategy for mitigating the risk of vendor lock-in, as identified in the literature [18,22,103]. |
Container Layer | The container layer consists of containers running services based on container images. Since images contain all the necessary configuration that a service needs to run, nothing needs to be installed on the platform other than the container runtime, thus allowing consumers to compare offerings by different cloud providers more easily and to migrate to a different provider if needed. |
Networking Layer | The networking layer represents a network which allows containers deployed to different nodes at different locations to communicate seamlessly. MC-BDP departs from the traditional approach of networking machines to that of networking containers. |
Orchestration Layer | MC-BDP’s orchestration layer is responsible for launching, stopping, and managing containers in a cluster. It is therefore responsible for managing services deployed to containers, registering additional nodes from different clouds (or removing them), scaling the number of containers that a service runs on, controlling which containers run on which nodes, and monitoring the overall state of the cluster. |
Service Layer | The service layer comprises the applications deployed to the container cluster, which range from smaller-scale deployments such as front-end user interfaces to larger-scale frameworks for big data processing distributed across hundreds of containers. |
Vertical Layers | |
Security Layer | The security layer is orthogonal to all other layers since it is implemented in multiple contexts. Encryption, for example, can be configured independently at the framework, orchestration, networking, messaging, and persistence levels. Due to the complexity inherent to the security aspect of large-scale cloud-based systems, our study recommends addressing it through a systematic and comprehensive framework which is multi-layer and multi-purpose, such as the Cloud Computing Adoption Framework (CCAF) [62]. |
Monitoring Layer | The monitoring layer consists of services aimed at providing metrics related to the performance of specific components. Since diverse aspects of a system can be, and usually are, monitored, this layer is also orthogonal to the others, and would likewise benefit from a systematic approach such as a multi-layer and multi-purpose framework. |
Messaging Layer | The messaging layer primarily facilitates the transmission of data from one system to another. In the context of streaming architectures, for example, it can act as a sink (output) for real-time data from IoT devices and as a source (input) for big data processing frameworks. Likewise, the messaging layer could be configured as a sink for the results produced by the big data framework and as a source for subsequent processing by the same or a different framework. |
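The orchestration layer's placement duties described above can be sketched in miniature. The Python fragment below is illustrative only: the round-robin "spread" strategy and all names are assumptions for the sketch, not MC-BDP's actual scheduler.

```python
from itertools import cycle

def spread(replicas, nodes):
    """Place a service's container replicas across cluster nodes in
    round-robin fashion, akin to the 'spread' placement strategy found
    in container orchestrators. Illustrative sketch only."""
    placement = {node: [] for node in nodes}
    node_cycle = cycle(nodes)
    for replica_id in range(replicas):
        placement[next(node_cycle)].append(replica_id)
    return placement

# Six worker containers spread over a multi-cloud cluster of three nodes.
print(spread(6, ["azure-vm-1", "azure-vm-2", "google-vm-1"]))
```

A spread strategy keeps the load even across nodes regardless of which cloud each node belongs to, which is what allows nodes from different providers to be registered and removed transparently.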
Experimental Design: Environment Setup Using Multi-Cloud Clusters of Azure and Google Cloud Virtual Machines | | | |
---|---|---|---|
Experiment | Parallelism | Azure VMs | Google VMs |
Scalability | ||||
Single-Cloud Three Workers | 3 | 3 | 0 | |
Multi-Cloud Three Workers | 3 | 2 | 1 | |
Single-Cloud Six Workers | 6 | 6 | 0 | |
Multi-Cloud Six Workers | 6 | 2 | 4 | |
Fault Tolerance | ||||
Single-Cloud Three Workers | 2 | 3 | 0 | |
Multi-Cloud Three Workers | 2 | 2 | 1 | |
Single-Cloud Six Workers | 4 | 6 | 0 | |
Multi-Cloud Six Workers | 4 | 4 | 2 | |
Technology Agnosticism | ||||
Single-Cloud Three Workers Beam | 3 | 3 | 0 | |
Single-Cloud Three Workers Flink | 3 | 3 | 0 | |
Multi-Cloud Three Workers Beam | 3 | 2 | 1 | |
Multi-Cloud Three Workers Flink | 3 | 2 | 1 | |
Single-Cloud Six Workers Beam | 6 | 6 | 0 | 
Single-Cloud Six Workers Flink | 6 | 6 | 0 | 
Multi-Cloud Six Workers Beam | 6 | 4 | 2 | 
Multi-Cloud Six Workers Flink | 6 | 4 | 2 | 
Windowing Rate vs. Resource Utilisation | ||||
Multi-Cloud Three Workers | 3 | 2 | 1 | |
Multi-Cloud Six Workers | 6 | 3 | 3 | |
Multi-Cloud Ten Workers | 10 | 6 | 4 | |
Container Co-location | ||||
One Container per node | 8 | 4 | 4 | |
Two Containers per node | 8 | 2 | 2 | |
Four Containers per node | 8 | 1 | 1 |
Experiments Conducted | | | | | |
---|---|---|---|---|---|
Scalability | | | | | |
Single-Cloud Three Workers | Exp. 1 2 records/min | Exp. 2 2 records/s | Exp. 3 2 records/100 ms | Exp. 4 2 records/10 ms | Exp. 5 2 records/ms | 
Multi-Cloud Three Workers | Exp. 6 2 records/min | Exp. 7 2 records/s | Exp. 8 2 records/100 ms | Exp. 9 2 records/10 ms | Exp. 10 2 records/ms | 
Single-Cloud Six Workers | Exp. 11 2 records/min | Exp. 12 2 records/s | Exp. 13 2 records/100 ms | Exp. 14 2 records/10 ms | Exp. 15 2 records/ms | 
Multi-Cloud Six Workers | Exp. 16 2 records/min | Exp. 17 2 records/s | Exp. 18 2 records/100 ms | Exp. 19 2 records/10 ms | Exp. 20 2 records/ms | 
Fault Tolerance | ||||||
Single-Cloud Three Workers | Exp. 21 2 records/min | Exp. 22 2 records/s | Exp. 23 2 records/100 ms | Exp. 24 2 records/10 ms | Exp. 25 2 records/ms | 
Multi-Cloud Three Workers | Exp. 26 2 records/min | Exp. 27 2 records/s | Exp. 28 2 records/100 ms | Exp. 29 2 records/10 ms | Exp. 30 2 records/ms | 
Single-Cloud Six Workers | Exp. 31 2 records/min | Exp. 32 2 records/s | Exp. 33 2 records/100 ms | Exp. 34 2 records/10 ms | Exp. 35 2 records/ms | 
Multi-Cloud Six Workers | Exp. 36 2 records/min | Exp. 37 2 records/s | Exp. 38 2 records/100 ms | Exp. 39 2 records/10 ms | Exp. 40 2 records/ms | 
Technology Agnosticism | ||||||
Single-Cloud Three Workers Beam | Exp. 41 2 records/min | Exp. 42 2 records/s | Exp. 43 2 records/100 ms | Exp. 44 2 records/10 ms | Exp. 45 2 records/ms | 
Single-Cloud Three Workers Flink | Exp. 46 2 records/min | Exp. 47 2 records/s | Exp. 48 2 records/100 ms | Exp. 49 2 records/10 ms | Exp. 50 2 records/ms | 
Multi-Cloud Three Workers Beam | Exp. 51 2 records/min | Exp. 52 2 records/s | Exp. 53 2 records/100 ms | Exp. 54 2 records/10 ms | Exp. 55 2 records/ms | 
Multi-Cloud Three Workers Flink | Exp. 56 2 records/min | Exp. 57 2 records/s | Exp. 58 2 records/100 ms | Exp. 59 2 records/10 ms | Exp. 60 2 records/ms | 
Single-Cloud Six Workers Beam | Exp. 61 2 records/min | Exp. 62 2 records/s | Exp. 63 2 records/100 ms | Exp. 64 2 records/10 ms | Exp. 65 2 records/ms | 
Single-Cloud Six Workers Flink | Exp. 66 2 records/min | Exp. 67 2 records/s | Exp. 68 2 records/100 ms | Exp. 69 2 records/10 ms | Exp. 70 2 records/ms | 
Multi-Cloud Six Workers Beam | Exp. 71 2 records/min | Exp. 72 2 records/s | Exp. 73 2 records/100 ms | Exp. 74 2 records/10 ms | Exp. 75 2 records/ms | 
Multi-Cloud Six Workers Flink | Exp. 76 2 records/min | Exp. 77 2 records/s | Exp. 78 2 records/100 ms | Exp. 79 2 records/10 ms | Exp. 80 2 records/ms | 
Windowing Rate versus Resource Utilisation | ||||||
Multi-Cloud Three Workers | Exp. 81 R = 2.0 5 s start every 10 s | Exp. 82 R = 1.5 5 s start every 7.5 s | Exp. 83 R = 1.0 5 s start every 5 s | Exp. 84 R = 0.2 5 s start every 1 s | Exp. 85 R = 0.1 5 s start every 0.5 s | |
Multi-Cloud Six Workers | Exp. 86 R = 2.0 5 s start every 10 s | Exp. 87 R = 1.5 5 s start every 7.5 s | Exp. 88 R = 1.0 5 s start every 5 s | Exp. 89 R = 0.2 5 s start every 1 s | Exp. 90 R = 0.1 5 s start every 0.5 s | |
Multi-Cloud Ten Workers | Exp. 91 R = 2.0 5 s start every 10 s | Exp. 92 R = 1.5 5 s start every 7.5 s | Exp. 93 R = 1.0 5 s start every 5 s | Exp. 94 R = 0.2 5 s start every 1 s | Exp. 95 R = 0.1 5 s start every 0.5 s | |
Container Co-location | ||||||
One Container per node | Exp. 96 R = 2.0 5 s start every 10 s | Exp. 97 R = 1.5 5 s start every 7.5 s | Exp. 98 R = 1.0 5 s start every 5 s | Exp. 99 R = 0.2 5 s start every 1 s | Exp. 100 R = 0.1 5 s start every 0.5 s | |
Two Containers per node | Exp. 101 R = 2.0 5 s start every 10 s | Exp. 102 R = 1.5 5 s start every 7.5 s | Exp. 103 R = 1.0 5 s start every 5 s | Exp. 104 R = 0.2 5 s start every 1 s | Exp. 105 R = 0.1 5 s start every 0.5 s | |
Four Containers per node | Exp. 106 R = 2.0 5 s start every 10 s | Exp. 107 R = 1.5 5 s start every 7.5 s | Exp. 108 R = 1.0 5 s start every 5 s | Exp. 109 R = 0.2 5 s start every 1 s | Exp. 110 R = 0.1 5 s start every 0.5 s |
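In the windowing-rate experiments above, R appears to be the ratio of the window slide to the window length, consistent with the table (R = 2.0 pairs with 5 s windows starting every 10 s; R = 0.2 with 5 s windows starting every 1 s). A minimal sketch (a hypothetical helper, not the prototype's Beam/Flink windowing code) shows how R controls replication and gaps:

```python
def windows_containing(t, length, slide, horizon):
    """Start times of sliding windows [start, start + length) that contain
    an event at time t, for windows starting every `slide` seconds up to
    `horizon`. With R = slide / length: R > 1 leaves gaps between windows,
    R = 1 gives tumbling windows, and R < 1 copies each event into 1/R
    overlapping windows, multiplying the data volume processed."""
    starts = []
    k = 0
    while k * slide < horizon:
        start = k * slide
        if start <= t < start + length:
            starts.append(start)
        k += 1
    return starts

# R = 0.2 (5 s windows starting every 1 s): an event at t = 7 s lands in 5 windows.
print(windows_containing(7.0, length=5.0, slide=1.0, horizon=60.0))
# R = 2.0 (5 s windows starting every 10 s): the same event falls in a gap.
print(windows_containing(7.0, length=5.0, slide=10.0, horizon=60.0))
```

This is why lower values of R drive resource utilisation up: each incoming record is processed once per overlapping window.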
Performance Metrics | Container CPU Utilisation | Container Memory Utilisation | Container Network Utilisation | Additional Metric |
---|---|---|---|---|
Scalability Metrics | | | | |
Average and maximum CPU utilisation by a container during experiment execution time. | Average and maximum memory utilisation by a container during experiment execution time (in MB). | Total number of bytes transmitted and received over a network by a container during experiment execution time. | ||
Fault Tolerance Metrics | ||||
Average and maximum CPU utilisation by a container during experiment execution time. | Average and maximum memory utilisation by a container during experiment execution time (in MB). | Total number of bytes transmitted and received over a network by a container during experiment execution time. | Records Processed: Percentage of the total number of records processed compared to the total number of transmitted records | 
Technology Agnosticism Metrics | ||||
Average and maximum CPU utilisation by a container during experiment execution time. | Average and maximum memory utilisation by a container during experiment execution time (in MB). | Total number of bytes transmitted and received over a network by a container during experiment execution time. | ||
Windowing Rate versus Resource Utilisation Metrics | ||||
Average and maximum CPU utilisation by a container during experiment execution time. | Average and maximum memory utilisation by a container during experiment execution time (in MB). | Total number of bytes transmitted and received over a network by a container during experiment execution time. | Data Volume Processed: Total number of kilobytes and total number of records received by each worker for processing after the windowing function is applied. | 
Container Co-Location Metrics | ||||
Average and maximum CPU utilisation by data processing containers running on a node during experiment execution time. | Average and maximum memory utilisation by data processing containers running on a node during experiment execution time (in MB). | Total number of bytes transmitted and received over a network by the data processing containers during experiment execution time. | Data Volume Processed: Total number of kilobytes and total number of records received by each worker for processing after the windowing function is applied.
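The metric definitions above reduce to simple aggregations over sampled values. The sketch below uses hypothetical sample values; in the study itself these figures were obtained from the monitoring layer (Azure Log Analytics queries), not computed by code like this.

```python
def summarise(samples):
    """Average and maximum of a series of per-container utilisation
    samples (e.g. CPU % or memory MB), as reported per experiment."""
    return sum(samples) / len(samples), max(samples)

def records_processed_pct(processed, transmitted):
    """Fault-tolerance metric: records processed as a percentage of
    records transmitted; any shortfall from 100% is data loss."""
    return 100.0 * processed / transmitted

cpu_samples = [12.5, 30.0, 22.5, 35.0]     # hypothetical CPU % per sampling interval
avg_cpu, max_cpu = summarise(cpu_samples)
print(avg_cpu, max_cpu)                     # average and peak container CPU
print(records_processed_pct(9900, 10000))   # percentage processed; remainder lost
```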
Average CPU Utilisation by Velocity | |||
---|---|---|---|
Scalability Metrics | |||
Fault Tolerance Metrics | |||
Technology Agnosticism Metrics (Beam and Flink) for Three Workers | |||
Technology Agnosticism Metrics (Beam and Flink) for Six Workers | |||
Windowing Rate versus Resource Utilisation Metrics | |||
Container Co-location Metrics | |||
Average Node CPU Utilisation by Node Cluster | |||
Maximum CPU Utilisation by Velocity | |
---|---|
Scalability Metrics | |
Fault Tolerance Metrics | |
Technology Agnosticism Metrics (Beam and Flink) for Three Workers | |
Technology Agnosticism Metrics (Beam and Flink) for Six Workers | |
Windowing Rate versus Resource Utilisation Metrics | |
Container Co-location Metrics | |
Maximum Node CPU Utilisation by Node Cluster | |
Average Memory Utilisation by Velocity | |||
---|---|---|---|
Scalability Metrics | |||
Fault Tolerance Metrics | |||
Technology Agnosticism Metrics (Beam and Flink) for Three Workers | |||
Technology Agnosticism Metrics (Beam and Flink) for Six Workers | |||
Windowing Rate versus Resource Utilisation Metrics | |||
Container Co-location Metrics | |||
Average Node Memory Utilisation by Node Cluster | |||
Maximum Memory Utilisation by Velocity | |||
---|---|---|---|
Scalability Metrics | |||
Fault Tolerance Metrics | |||
Technology Agnosticism Metrics (Beam and Flink) for Three Workers | |||
Technology Agnosticism Metrics (Beam and Flink) for Six Workers | |||
Windowing Rate versus Resource Utilisation Metrics | |||
Container Co-location Metrics | |||
Maximum Node Memory Utilisation by Node Cluster | |||
Network Utilisation: Total Number of Gigabytes Sent over the Network | |
---|---|
Scalability Metrics | |
Fault Tolerance Metrics | |
Technology Agnosticism Metrics (Beam and Flink) for Three Workers | |
Technology Agnosticism Metrics (Beam and Flink) for Six Workers | |
Windowing Rate versus Resource Utilisation Metrics | |
Container Co-location Metrics | |
Maximum Node Network Utilisation by Node Cluster | |
Network Utilisation: Total Number of Gigabytes Received over the Network | |
---|---|
Scalability Metrics | |
Fault Tolerance Metrics | |
Technology Agnosticism Metrics (Beam and Flink) for Three Workers | |
Technology Agnosticism Metrics (Beam and Flink) for Six Workers | |
Windowing Rate versus Resource Utilisation Metrics | |
Container Co-location Metrics | |
Maximum Node Network Utilisation by Node Cluster | |
Total Gigabytes Received via the Network by the Job Manager | |
---|---|
Technology Agnosticism (Beam and Flink) for Three Workers | |
Technology Agnosticism (Beam and Flink) for Six Workers | |
Total Gigabytes Sent via the Network by the Job Manager | 
---|---|
Technology Agnosticism Metrics (Beam and Flink) for Three Workers | |
Technology Agnosticism Metrics (Beam and Flink) for Six Workers | |
Total Gigabytes Received via the Network by the Job Manager | |
---|---|
Windowing Rate versus Resource Utilisation | |
Total Gigabytes Sent via the Network by the Job Manager | 
---|---|
Windowing Rate versus Resource Utilisation | |
Total Gigabytes Received by the Job Manager over the Network by Number of Containers per Node | |
---|---|
Container Co-Location | |
Maximum Node Network Utilisation by Node Cluster | |
Total Number of GB Sent by the Job Manager over the Network by Number of Containers per Node | 
---|---|
Container Co-Location | |
Maximum Node Network Utilisation by Node Cluster | |
Appendix A.2. Azure Log Analytics Queries
References
- Dean, J.; Ghemawat, S. MapReduce: Simplified data processing on large clusters. Commun. ACM 2008, 51, 107–113. [Google Scholar] [CrossRef]
- Patel, K.; Sakaria, Y.; Bhadane, C. Real Time Data Processing Frameworks. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 49–63. [Google Scholar] [CrossRef]
- Li, J.; Maier, D.; Tufte, K.; Papadimos, V.; Tucker, P.A. Semantics and Evaluation Techniques for Window Aggregates in Data Streams. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 14–16 June 2005; pp. 311–322. [Google Scholar] [CrossRef]
- Akidau, T.; Balikov, A.; Bekiroğlu, K.; Chernyak, S.; Haberman, J.; Lax, R.; McVeety, S.; Mills, D.; Nordstrom, P.; Whittle, S. MillWheel: Fault-tolerant stream processing at internet scale. Proc. VLDB Endow. 2013, 6, 1033–1044. [Google Scholar] [CrossRef]
- Kreps, J. Questioning the Lambda Architecture—O’Reilly Media. 2 July 2014. Available online: https://www.oreilly.com/ideas/questioning-the-lambda-architecture (accessed on 28 October 2016).
- Chen, G.J.; Wiener, J.L.; Iyer, S.; Jaiswal, A.; Lei, R.; Simha, N.; Wang, W.; Wilfong, K.; Williamson, T.; Yilmaz, S. Realtime Data Processing at Facebook. In Proceedings of the 2016 International Conference on Management of Data, New York, NY, USA, 25 June 2016; pp. 1087–1098. [Google Scholar] [CrossRef]
- Krishnan, S. Discovery and Consumption of Analytics Data at Twitter. Twitter Engineering Blog. 29 June 2016. Available online: https://blog.twitter.com/engineering/en_us/topics/insights/2016/discovery-and-consumption-of-analytics-data-at-twitter.html (accessed on 9 February 2018).
- Kashlev, A.; Lu, S.; Mohan, A. Big Data Workflows: A Reference Architecture and The Dataview System. Serv. Trans. Big Data 2017, 4, 19. [Google Scholar] [CrossRef]
- Ta, V.-D.; Liu, C.-M.; Nkabinde, G.W. Big data stream computing in healthcare real-time analytics. In Proceedings of the 2016 IEEE International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), Chengdu, China, 5–7 July 2016; pp. 37–42. [Google Scholar] [CrossRef]
- Klein, J.; Buglak, R.; Blockow, D.; Wuttke, T.; Cooper, B. A Reference Architecture for Big Data Systems in the National Security Domain. In Proceedings of the 2016 IEEE/ACM 2nd International Workshop on Big Data Software Engineering (BIGDSE), Austin, TX, USA, 16 May 2016; pp. 51–57. [Google Scholar]
- Ardagna, C.A.; Ceravolo, P.; Damiani, E. Big data analytics as-a-service: Issues and challenges. In Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA, 17–20 December 2016; pp. 3638–3644. [Google Scholar] [CrossRef]
- Kalan, R.S.; Ünalir, M.O. Leveraging big data technology for small and medium-sized enterprises (SMEs). In Proceedings of the 2016 6th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 20–21 October 2016; pp. 1–6. [Google Scholar] [CrossRef]
- Liu, Y.; Soroka, A.; Han, L.; Jian, J.; Tang, M. Cloud-based big data analytics for customer insight-driven design innovation in SMEs. Int. J. Inf. Manag. 2020, 51, 102034. [Google Scholar] [CrossRef]
- Sen, D.; Ozturk, M.; Vayvay, O. An Overview of Big Data for Growth in SMEs. Procedia Soc. Behav. Sci. 2016, 235, 159–167. [Google Scholar] [CrossRef]
- Vecchio, P.D.; Minin, A.D.; Petruzzelli, A.M.; Panniello, U.; Pirri, S. Big data for open innovation in SMEs and large corporations: Trends, opportunities, and challenges. Creat. Innov. Manag. 2018, 27, 6–22. [Google Scholar] [CrossRef]
- Shetty, J.P.; Panda, R. An overview of cloud computing in SMEs. J. Glob. Entrep. Res. 2021, 11, 175–188. [Google Scholar] [CrossRef]
- Sultan, N.A. Reaching for the “cloud”: How SMEs can manage. Int. J. Inf. Manag. 2011, 31, 272–278. [Google Scholar] [CrossRef]
- Opara-Martins, J.; Sahandi, R.; Tian, F. Critical analysis of vendor lock-in and its impact on cloud computing migration: A business perspective. J. Cloud Comput. 2016, 5, 4. [Google Scholar] [CrossRef]
- Bange, C.; Grosser, T.; Janoschek, N. Big Data Use Cases 2015—Getting Real on Data Monetization. July 2015. Available online: http://barc-research.com/research/big-data-use-cases-2015/ (accessed on 15 February 2019).
- Assis, M.R.M.; Bittencourt, L.F. A survey on cloud federation architectures: Identifying functional and non-functional properties. J. Netw. Comput. Appl. 2016, 72, 51–71. [Google Scholar] [CrossRef]
- Naik, N. Docker container-based big data processing system in multiple clouds for everyone. In Proceedings of the 2017 IEEE International Systems Engineering Symposium (ISSE), Vienna, Austria, 11–13 October 2017; pp. 1–7. [Google Scholar] [CrossRef]
- Satzger, B.; Hummer, W.; Inzinger, C.; Leitner, P.; Dustdar, S. Winds of Change: From Vendor Lock-In to the Meta Cloud. IEEE Internet Comput. 2013, 17, 69–73. [Google Scholar] [CrossRef]
- Silva, G.C.; Rose, L.M.; Calinescu, R. Towards a Model-Driven Solution to the Vendor Lock-In Problem in Cloud Computing. In Proceedings of the 2013 IEEE 5th International Conference on Cloud Computing Technology and Science, Bristol, UK, 2–5 December 2013; Volume 1, pp. 711–716. [Google Scholar] [CrossRef]
- Toosi, A.N.; Calheiros, R.N.; Buyya, R. Interconnected Cloud Computing Environments: Challenges, Taxonomy, and Survey. ACM Comput. Surv. 2014, 47, 1–47. [Google Scholar] [CrossRef]
- Bernstein, D. Cloud Foundry Aims to Become the OpenStack of PaaS. IEEE Cloud Comput. 2014, 1, 57–60. [Google Scholar] [CrossRef]
- Leung, A.; Spyker, A.; Bozarth, T. Titus: Introducing Containers to the Netflix Cloud. Queue 2017, 15, 53–77. [Google Scholar] [CrossRef]
- Al-Dhuraibi, Y.; Paraiso, F.; Djarallah, N.; Merle, P. Elasticity in Cloud Computing: State of the Art and Research Challenges. IEEE Trans. Serv. Comput. 2018, 11, 430–447. [Google Scholar] [CrossRef]
- Rodriguez, M.A.; Buyya, R. Container-based cluster orchestration systems: A taxonomy and future directions. Softw. Pract. Exp. 2019, 49, 698–719. [Google Scholar] [CrossRef]
- Pahl, C. Containerization and the PaaS Cloud. IEEE Cloud Comput. 2015, 2, 24–31. [Google Scholar] [CrossRef]
- Pahl, C.; Lee, B. Containers and Clusters for Edge Cloud Architectures—A Technology Review. In Proceedings of the 2015 3rd International Conference on Future Internet of Things and Cloud, Rome, Italy, 24–26 August 2015; pp. 379–386. [Google Scholar] [CrossRef]
- Vergilio, T.; Ramachandran, M. Non-functional Requirements for Real World Big Data Systems—An Investigation of Big Data Architectures at Facebook, Twitter and Netflix. In Proceedings of the 13th International Conference on Software Technologies, Porto, Portugal, 26–28 July 2018; pp. 833–840. [Google Scholar] [CrossRef]
- Silva, G.C.; Rose, L.M.; Calinescu, R. A Systematic Review of Cloud Lock-In Solutions. In Proceedings of the 2013 IEEE 5th International Conference on Cloud Computing Technology and Science, Bristol, UK, 2–5 December 2013; Volume 2, pp. 363–368. [Google Scholar] [CrossRef]
- Jokonya, O. Investigating Open Source Software Benefits in Public Sector. In Proceedings of the 2015 48th Hawaii International Conference on System Sciences, Kauai, HI, USA, 5–8 January 2015; pp. 2242–2251. [Google Scholar] [CrossRef]
- Palyart, M.; Murphy, G.C.; Masrani, V. A Study of Social Interactions in Open Source Component Use. IEEE Trans. Softw. Eng. 2018, 44, 1132–1145. [Google Scholar] [CrossRef]
- Al-Hazmi, Y.; Campowsky, K.; Magedanz, T. A monitoring system for federated clouds. In Proceedings of the 2012 IEEE 1st International Conference on Cloud Networking (CLOUDNET), Paris, France, 28–30 November 2012; pp. 68–74. [Google Scholar] [CrossRef]
- Palos-Sanchez, P.R. Drivers and Barriers of the Cloud Computing in SMEs: The Position of the European Union. Harv. Deusto Bus. Res. 2017, 6, 116–132. [Google Scholar] [CrossRef]
- Hui, K. AWS 101: Regions and Availability Zones. Rackspace Blog. 16 February 2017. Available online: https://blog.rackspace.com/aws-101-regions-availability-zones (accessed on 22 February 2019).
- Scott, R. Mitigating an AWS Instance Failure with the Magic of Kubernetes. Medium. 1 March 2017. Available online: https://medium.com/spire-labs/mitigating-an-aws-instance-failure-with-the-magic-of-kubernetes-128a44d44c14 (accessed on 24 January 2018).
- Brodkin, J. Amazon EC2 Outage Calls “Availability Zones” into Question. Network World. 21 April 2011. Available online: https://www.networkworld.com/article/2202805/cloud-computing/amazon-ec2-outage-calls--availability-zones--into-question.html (accessed on 22 February 2019).
- Dayaratna, A. Microsoft Azure Recovers From Multi-Region Azure DNS Service Disruption. Cloud Computing Today. 15 September 2016. Available online: https://cloud-computing-today.com/2016/09/15/microsoft-azure-recovers-from-multi-region-azure-dns-service-disruption/ (accessed on 22 February 2019).
- Rattihalli, G. Exploring Potential for Resource Request Right-Sizing via Estimation and Container Migration in Apache Mesos. In Proceedings of the 2018 IEEE/ACM International Conference on Utility and Cloud Computing Companion (UCC Companion), Zurich, Switzerland, 17–20 December 2018; pp. 59–64. [Google Scholar] [CrossRef]
- Bass, L.; Clements, P.; Kazman, R. Software Architecture in Practice, 3rd ed.; Addison-Wesley Professional: Upper Saddle River, NJ, USA, 2012. [Google Scholar]
- Chang, W.L.; Boyd, D.; Levin, O.; NIST Big Data Public Working Group. NIST Big Data Interoperability Framework; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2019. [CrossRef]
- Maier, M. Towards a Big Data Reference Architecture. Master’s Thesis, Eindhoven University of Technology, Eindhoven, The Netherlands, 2013. Available online: https://pure.tue.nl/ws/files/46951182/761622-1.pdf (accessed on 7 December 2020).
- Heilig, L.; Voß, S. Managing Cloud-Based Big Data Platforms: A Reference Architecture and Cost Perspective. In Big Data Management; Márquez, F.P.G., Lev, B., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 29–45. [Google Scholar] [CrossRef]
- Geerdink, B. A reference architecture for big data solutions introducing a model to perform predictive analytics using big data technology. In Proceedings of the 8th International Conference for Internet Technology and Secured Transactions (ICITST-2013), London, UK, 9–12 December 2013; pp. 71–76. [Google Scholar] [CrossRef]
- Pääkkönen, P.; Pakkala, D. Reference Architecture and Classification of Technologies, Products and Services for Big Data Systems. Big Data Res. 2015, 2, 166–186. [Google Scholar] [CrossRef]
- Belli, L.; Cirani, S.; Davoli, L.; Melegari, L.; Mónton, M.; Picone, M. An Open-Source Cloud Architecture for Big Stream IoT Applications. In Interoperability and Open-Source Solutions for the Internet of Things, Proceedings of the International Workshop, FP7 OpenIoT Project, Held in Conjunction with SoftCOM 2014, Split, Croatia, 18 September 2014; Springer: Berlin/Heidelberg, Germany, 2015; pp. 73–88. [Google Scholar] [CrossRef]
- Pellegrini, R.; Rottmann, P.; Strieder, G. Preventing vendor lock-ins via an interoperable multi-cloud deployment approach. In Proceedings of the 2017 12th International Conference for Internet Technology and Secured Transactions (ICITST), Cambridge, UK, 11–14 December 2017; pp. 382–387. [Google Scholar] [CrossRef]
- Scolati, R.; Fronza, I.; El Ioini, N.; Samir, A.; Pahl, C. A Containerized Big Data Streaming Architecture for Edge Cloud Computing on Clustered Single-board Devices. In Proceedings of the 9th International Conference on Cloud Computing and Services Science, Heraklion, Greece, 2–4 May 2019; pp. 68–80. [Google Scholar] [CrossRef]
- Moreno, J.; Serrano, M.A.; Fernández-Medina, E.; Fernández, E.B. Towards a Security Reference Architecture for Big Data. In Proceedings of the 20th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data Co-Located with 10th EDBT/ICDT Joint Conference (EDBT/ICDT 2018), Vienna, Austria, 26–29 March 2018; Volume 2062. Available online: http://ceur-ws.org/Vol-2062/paper04.pdf (accessed on 6 August 2020).
- Chen, H.; Wen, J.; Pedrycz, W.; Wu, G. Big Data Processing Workflows Oriented Real-Time Scheduling Algorithm using Task-Duplication in Geo-Distributed Clouds. IEEE Trans. Big Data 2020, 6, 131–144. [Google Scholar] [CrossRef]
- Verbitskiy, I.; Thamsen, L.; Kao, O. When to Use a Distributed Dataflow Engine: Evaluating the Performance of Apache Flink. In Proceedings of the 2016 Intl IEEE Conferences on Ubiquitous Intelligence Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), Toulouse, France, 18–21 July 2016; pp. 698–705. [Google Scholar] [CrossRef]
- Verma, A.; Mansuri, A.H.; Jain, N. Big data management processing with Hadoop MapReduce and spark technology: A comparison. In Proceedings of the 2016 Symposium on Colossal Data Analysis and Networking (CDAN), Indore, India, 18–19 March 2016; pp. 1–4. [Google Scholar] [CrossRef]
- Sang, G.M.; Xu, L.; de Vrieze, P. A reference architecture for big data systems. In Proceedings of the 2016 10th International Conference on Software Knowledge, Information Management Applications (SKIMA), Chengdu, China, 15–17 December 2016; pp. 370–375. [Google Scholar] [CrossRef]
- Kant, I. Critique of Pure Reason; Penguin Classics: London, UK, 1781. [Google Scholar]
- Vergilio, T.; Ramachandran, M.; Mullier, D. Requirements Engineering for Large Scale Big Data Applications. In Software Engineering in the Era of Cloud Computing; Ramachandran, M., Mahmood, Z., Eds.; Springer: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
- Pattinson, C.; Kor, A.L.; Cross, R. Measuring Data Centre Efficiency; Leeds Beckett University: Leeds, UK, 2012. [Google Scholar]
- Dubé, L.; Paré, G. Rigor in information systems positivist case research: Current practices, trends, and recommendations. MIS Q. 2003, 27, 597–635. [Google Scholar] [CrossRef]
- Vergilio, T.; Ramachandran, M. PaaS-BDP—A Multi-Cloud Architectural Pattern for Big Data Processing on a Platform-as-a-Service Model. In Proceedings of the 3rd International Conference on Complexity, Future Information Systems and Risk, Funchal, Portugal, 20–21 March 2018; pp. 45–52. [Google Scholar] [CrossRef]
- Kolajo, T.; Daramola, O.; Adebiyi, A. Big data stream analysis: A systematic literature review. J. Big Data 2019, 6, 47. [Google Scholar] [CrossRef]
- Ramachandran, M.; Chang, V. Towards performance evaluation of cloud service providers for cloud data security. Int. J. Inf. Manag. 2016, 36, 618–625. [Google Scholar] [CrossRef]
- Ylonen, T. SSH-Secure Login Connections over the Internet. In Proceedings of the 6th USENIX Security Symposium, Focusing on Applications of Cryptography, San Jose, CA, USA, 22–25 July 1996; Available online: https://ci.nii.ac.jp/naid/10019459981/ (accessed on 10 April 2019).
- Weave Cloud: Kubernetes Automation for Developers. Weave Cloud. 2019. Available online: https://www.weave.works/product/cloud/ (accessed on 10 April 2019).
- Hiroishi, Y.; Fukuda, K.; Tagawa, I.; Iwasaki, H.; Takenoiri, S.; Tanaka, H.; Mutoh, H.; Yoshikawa, N. Future Options for HDD Storage. IEEE Trans. Magn. 2009, 45, 3816–3822. [Google Scholar] [CrossRef]
- ‘State Backends’. Apache Flink 1.3 Documentation. 2017. Available online: https://ci.apache.org/projects/flink/flink-docs-release-1.3/ops/state_backends.html (accessed on 10 April 2019).
- Myers, T.; Schonning, N.; King, J.; Stephenson, A.; Gries, W. Data redundancy in Azure Storage. Azure Storage Redundancy. 18 January 2019. Available online: https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy (accessed on 25 March 2019).
- Verbitski, A.; Gupta, A.; Saha, D.; Brahmadesam, M.; Gupta, K.; Mittal, R.; Krishnamurthy, S.; Maurice, S.; Kharatishvili, T.; Bao, X. Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; pp. 1041–1052. [Google Scholar] [CrossRef]
- Paz, J.R.G. Introduction to Azure Cosmos DB. In Microsoft Azure Cosmos DB Revealed: A Multi-Model Database Designed for the Cloud; Paz, J.R.G., Ed.; Apress: Berkeley, CA, USA, 2018; pp. 1–23. [Google Scholar] [CrossRef]
- ‘Datastore–NoSQL Schemaless Database’. Datastore. 2019. Available online: https://cloud.google.com/datastore/ (accessed on 8 June 2019).
- Yasrab, R.; Gu, N. Multi-cloud PaaS Architecture (MCPA): A Solution to Cloud Lock-in. In Proceedings of the 2016 3rd International Conference on Information Science and Control Engineering (ICISCE), Beijing, China, 8–10 July 2016; pp. 473–477. [Google Scholar] [CrossRef]
- Ranjan, R. The Cloud Interoperability Challenge. IEEE Cloud Comput. 2014, 1, 20–24. [Google Scholar] [CrossRef]
- Poulton, N. Docker Deep Dive; Independently Published: Chicago, IL, USA, 2017. [Google Scholar]
- Walli, S. Demystifying the Open Container Initiative (OCI) Specifications. Docker Blog. 19 July 2017. Available online: https://www.docker.com/blog/demystifying-open-container-initiative-oci-specifications/ (accessed on 8 May 2020).
- Carter, E. Sysdig 2019 Container Usage Report. Sysdig. 29 October 2019. Available online: https://sysdig.com/blog/sysdig-2019-container-usage-report/ (accessed on 8 May 2020).
- Combe, T.; Martin, A.; Pietro, R.D. To Docker or Not to Docker: A Security Perspective. IEEE Cloud Comput. 2016, 3, 54–62. [Google Scholar] [CrossRef]
- Petazzoni, J. GitHub—Software-Defined Networking Tools for LXC (LinuX Containers). 22 August 2017. Available online: https://github.com/jpetazzo/pipework (accessed on 27 March 2019).
- Yakubovich, E.; Denham, T. ‘Flannel’. CoreOS. March 2019. Available online: https://github.com/coreos/flannel (accessed on 27 March 2019).
- Open Virtual Networking with Docker. Open vSwitch Documentation. 2016. Available online: http://docs.openvswitch.org/en/latest/howto/docker/ (accessed on 27 March 2019).
- How the Weave Net Docker Network Plugins Work. Weaveworks. 2019. Available online: https://www.weave.works/docs/net/latest/install/plugin/plugin-how-it-works/ (accessed on 27 March 2019).
- Dua, R.; Kohli, V.; Konduri, S.K. Learning Docker Networking; Packt Publishing: Birmingham, UK, 2016. [Google Scholar]
- Vergilio, T. Multi-Cloud Big Data Processing with Flink, Docker Swarm and Weave Plugin. Weaveworks. 1 August 2018. Available online: https://www.weave.works/blog/multi-cloud-big-data-processing-with-flink-docker-swarm-and-weave-plugin (accessed on 15 February 2018).
- Zismer, A. Performance of Docker Overlay Networks; University of Amsterdam: Amsterdam, The Netherlands, 2016. [Google Scholar]
- Production-Grade Container Orchestration. Kubernetes. 2019. Available online: https://kubernetes.io/ (accessed on 27 March 2019).
- Syed, H.J.; Gani, A.; Nasaruddin, F.H.; Naveed, A.; Ahmed, A.I.A.; Khan, M.K. CloudProcMon: A Non-Intrusive Cloud Monitoring Framework. IEEE Access 2018, 6, 44591–44606. [Google Scholar] [CrossRef]
- Zacheilas, N.; Maroulis, S.; Priovolos, T.; Kalogeraki, V.; Gunopulos, D. Dione: A Framework for Automatic Profiling and Tuning Big Data Applications. In Proceedings of the 2018 IEEE 34th International Conference on Data Engineering (ICDE), Paris, France, 16–19 April 2018; pp. 1637–1640. [Google Scholar] [CrossRef]
- ‘Metrics’. Apache Flink 1.11 Documentation. 2020. Available online: https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html (accessed on 27 June 2020).
- Luzzardi, A. Announcing Swarm 1.0: Production-Ready Clustering at Any Scale. Docker Blog. 3 November 2015. Available online: https://blog.docker.com/2015/11/swarm-1-0/ (accessed on 8 September 2017).
- Confluent Control Center. Confluent Platform. 27 June 2020. Available online: https://docs.confluent.io/current/control-center/index.html (accessed on 27 June 2020).
- What Is Cloud Pub/Sub? Cloud Pub/Sub Documentation. 15 March 2019. Available online: https://cloud.google.com/pubsub/docs/overview (accessed on 28 March 2019).
- Amazon Kinesis Data Streams. 2020. Available online: https://aws.amazon.com/kinesis/data-streams/ (accessed on 10 July 2020).
- Apache Kafka. The Apache Software Foundation. 2019. Available online: https://github.com/apache/kafka (accessed on 28 March 2019).
- Vergilio, T. Data-Interpolator. 31 May 2018. Available online: https://bitbucket.org/vergil01/data-interpolator/src/master/ (accessed on 28 June 2020).
- Vergilio, T. Energy-Consumption-Producer. 21 September 2018. Available online: https://bitbucket.org/vergil01/energy-consumption-producer/src/master/ (accessed on 28 June 2020).
- Vergilio, T. Energy-Consumption-Simulator. 5 October 2018. Available online: https://bitbucket.org/vergil01/energy-consumption-simulator/src/master/ (accessed on 28 June 2020).
- Apache Beam Capability Matrix. 2017. Available online: https://beam.apache.org/documentation/runners/capability-matrix/ (accessed on 9 August 2017).
- Heitlager, I.; Kuipers, T.; Visser, J. A Practical Model for Measuring Maintainability. In Proceedings of the 6th International Conference on the Quality of Information and Communications Technology (QUATIC 2007), Lisbon, Portugal, 12–14 September 2007; pp. 30–39. [Google Scholar] [CrossRef]
- Bridgmon, K.D.; Martin, W.E. Quantitative and Statistical Research Methods: From Hypothesis to Results: 42, 1st ed.; Jossey-Bass: San Francisco, CA, USA, 2012. [Google Scholar]
- ‘Pods—Kubernetes’. Kubernetes. 12 May 2019. Available online: https://kubernetes.io/docs/concepts/workloads/pods/pod/#motivation-for-pods (accessed on 8 June 2019).
- Karabek, M.R.; Kleinert, J.; Pohl, A. Cloud Services for SMEs—Evolution or Revolution? Bus. Innov. 2011, 2, 26–33. [Google Scholar] [CrossRef]
- Hamburg, I.; Marian, M. Learning as a Service—A Cloud-based Approach for SMEs. In Proceedings of SERVICE COMPUTATION 2012, the Fourth International Conference on Advanced Service Computing, Nice, France, 22–27 July 2012; pp. 53–57. Available online: https://www.thinkmind.org/index.php?view=article&articleid=service_computation_2012_3_30_10065 (accessed on 31 July 2020).
- Oyekola, O.; Xu, L. Selecting SaaS CRM Solution for SMEs. In Proceedings of the ICIST 2020: 10th International Conference on Information Systems and Technologies, Lecce, Italy, 4–5 June 2020; Available online: http://eprints.bournemouth.ac.uk/33047/ (accessed on 31 July 2020).
- Martino, B.D. Applications Portability and Services Interoperability among Multiple Clouds. IEEE Cloud Comput. 2014, 1, 74–77. [Google Scholar] [CrossRef]
- Finta, G. Mitigating the Effects of Vendor Lock-in in Edge Cloud Environments with Open-Source Technologies. October 2019. Available online: https://aaltodoc.aalto.fi:443/handle/123456789/40884 (accessed on 31 July 2020).
- Cammert, M.; Kramer, J.; Seeger, B.; Vaupel, S. A Cost-Based Approach to Adaptive Resource Management in Data Stream Systems. IEEE Trans. Knowl. Data Eng. 2008, 20, 230–245. [Google Scholar] [CrossRef]
- Li, P.; Guo, S.; Yu, S.; Zhuang, W. Cross-Cloud MapReduce for Big Data. IEEE Trans. Cloud Comput. 2020, 8, 375–386. [Google Scholar] [CrossRef]
- Košeleva, N.; Ropaitė, G. Big data in building energy efficiency: Understanding of big data and main challenges. Procedia Eng. 2017, 172, 544–549. [Google Scholar] [CrossRef]
- Akidau, T.; Bradshaw, R.; Chambers, C.; Chernyak, S.; Fernández-Moctezuma, R.J.; Lax, R.; Mcveety, S.; Mills, D.; Perry, F.; Schmidt, E.; et al. The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing. Proc. VLDB Endow. 2015, 8, 1792–1803. [Google Scholar] [CrossRef]
- Jha, D.N.; Garg, S.; Jayaraman, P.P.; Buyya, R.; Li, Z.; Ranjan, R. A Holistic Evaluation of Docker Containers for Interfering Microservices. In Proceedings of the 2018 IEEE International Conference on Services Computing (SCC), San Francisco, CA, USA, 2–7 July 2018; pp. 33–40. [Google Scholar] [CrossRef]
- Zhao, D.; Mohamed, M.; Ludwig, H. Locality-Aware Scheduling for Containers in Cloud Computing. IEEE Trans. Cloud Comput. 2020, 8, 635–646. [Google Scholar] [CrossRef]
- Goedtel, M.; Swathi, D.; Bradley, M.; Wren, B.; Wallace, G. Collect and Analyze Performance Counters in Azure Monitor. Azure Monitor Documentation. 28 November 2018. Available online: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/data-sources-performance-counters (accessed on 17 May 2019).
- Prometheus—Monitoring System & Time Series Database. Prometheus. 2019. Available online: https://prometheus.io/ (accessed on 17 May 2019).
Contribution | Domain | Primary Quantitative Evaluation | Data Processing Focus | Infrastructure | Virtualisation Level | Primary Qualitative Evaluation |
---|---|---|---|---|---|---
[8] | Scientific Workflows | Y | Batch | Cloud | Hypervisor | N |
[43] | Generic | N | Batch/Stream | Cloud/Bare-Metal | Hypervisor | N |
[44] | Large Corporations | N | Batch | Cloud/Bare-Metal | N/A * | N |
[9] | Healthcare | N | Batch/Stream | N/A * | N/A * | N |
[45] | Generic | Y | Batch/Stream | Cloud | Hypervisor | N |
[46] | Generic | N | Batch | N/A * | N/A * | Y † |
[10] | National Security | Y | Batch/Stream | Bare-Metal | N/A * | N |
[47] | Large Corporations | N | Batch/Stream | Cloud/Bare-Metal | Any | N |
[48] | IoT | Y | Stream | Cloud/Bare-Metal | N/A * | N |
[49] | Generic | N | N/A * | Cloud | Container | N |
[50] | Edge Computing | Y | Stream | Bare-Metal | Container | N |
[51] | Security | N | N/A * | Cloud | Any | N |
[52] | Generic | Y | Stream | Cloud | Any | N |
MC-BDP | SMEOD | Y | Stream | Cloud/Bare-Metal | Container | Y |
Technology | Purpose | Alternatives Considered | Rationale
---|---|---|---
Virtual Machines from Azure, Google, and OSDC | Run Kafka server, orchestration, data processing, and monitoring containers | Virtual machines from different providers, private cloud, bare metal, CaaS | The experiments were designed to evaluate a multi-cloud setup using commercial providers. Grants were received from the selected providers.
Linux Operating System | Run containers on a PaaS model | Windows, macOS, ChromeOS | Native integration with Docker, open-source, and free of cost.
Docker Containers | Run the big data framework's data processing containers | Linux Containers (LXC), OpenVZ, Linux-VServer | The most popular container technology, open-source.
Docker Swarm | Container orchestration | Kubernetes, Mesos | Although Kubernetes is more feature-rich and more widely used, Docker swarms are simpler to configure and native to Docker.
Weave Net | Container networking across clouds | Pipework, Flannel, Open vSwitch | Simple to install, with support for multi-cloud networking.
Kafka | Stream data source | Pub/Sub, Kinesis, RabbitMQ | Open-source, persistent, scalable, and suitable for high data throughput.
Flink | Stream processing | Spark, Dataflow | At the time of implementation, Flink was the most capable open-source runner for Apache Beam [95]. It was also available as a set of containerised services for Docker.
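In practice, the combination of Docker Swarm, Weave Net, and containerised Flink listed above can be wired together with a handful of commands. The following deployment fragment is an illustrative sketch only — the node address placeholder, network name, service names, replica count, and image tag are assumptions for the example, not the paper's actual configuration:

```shell
# On the manager node, initialise swarm mode
docker swarm init --advertise-addr <manager-ip>
# (then run the printed `docker swarm join` command on each worker VM)

# Install the Weave Net plugin on every node, then create one
# attachable network spanning hosts in all clouds
docker plugin install weaveworks/net-plugin:latest_release --grant-all-permissions
docker network create --driver weaveworks/net-plugin:latest_release --attachable weavenet

# Deploy containerised Flink as Swarm services on that network
docker service create --name jobmanager --network weavenet flink:1.3 jobmanager
docker service create --name taskmanager --network weavenet \
  --env JOB_MANAGER_RPC_ADDRESS=jobmanager --replicas 3 flink:1.3 taskmanager
```

Because the Weave network is attachable and spans every node that joined the swarm, the TaskManager replicas can be scheduled on VMs from different providers while still reaching the JobManager by service name.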
Average CPU Utilisation Hypothesis Testing Results

Average Container CPU Utilisation by Velocity — Scalability Metrics (for Workers)

Velocity (2 records/time) | Three Workers (n = 3), Single % | Three Workers (n = 3), Multi % | Six Workers (n = 6), Single % | Six Workers (n = 6), Multi %
---|---|---|---|---
1 min | 3.52 | 3.44 | 2.51 | 3.61
1 sec | 37.94 | 40.21 | 18.22 | 23.17
100 ms | 42.25 | 36.39 | 25.46 | 26.24
10 ms | 41.58 | 45.46 | 23.02 | 29.27
1 ms | NA | NA | 36.26 | 33.02
Average | 31.32 | 31.38 | 17.30 | 20.57

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): p-value = 0.8857, accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): p-value = 0.1905 (S,S); 0.2857 (S,M); 0.1905 (M,S); 0.2857 (M,M); accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): p-value = 0.5476, accept Ho

Average Container CPU Utilisation by Velocity — Fault Tolerance Metrics

Velocity (2 records/time) | Three Workers (n = 3), Single % | Three Workers (n = 3), Multi % | Six Workers (n = 6), Single % | Six Workers (n = 6), Multi %
---|---|---|---|---
1 min | 10.08 | 3.58 | 4.97 | 5.02
1 sec | 7.11 | 5.92 | 6.25 | 8.30
100 ms | 11.09 | 14.83 | 8.44 | 7.17
10 ms | 22.68 | 8.83 | 16.55 | 16.90
1 ms | NA | NA | NA | NA
Average | 12.74 | 8.29 | 9.05 | 9.35

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): p-value = 0.3429, accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): p-value = 0.3429 (S,S); 0.4857 (S,M); 0.8857 (M,S); 0.8857 (M,M); accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): p-value = 0.8857, accept Ho

Average Container CPU Utilisation by Velocity — Technology Agnosticism Metrics (for Beam and Flink SDK)

Velocity (2 records/time) | Beam S % (3 Workers) | Beam M % (3 Workers) | Flink S % (3 Workers) | Flink M % (3 Workers) | Beam S % (6 Workers) | Beam M % (6 Workers) | Flink S % (6 Workers) | Flink M % (6 Workers)
---|---|---|---|---|---|---|---|---
1 min | 3.4 | 2.9 | 3.4 | 1.9 | 1.5 | 2.6 | 2.1 | 2.6
1 sec | 33.8 | 27.9 | 36.5 | 34.3 | 19.7 | 17.3 | 20.0 | 16.6
100 ms | 44.7 | 36.5 | 38.5 | 36.1 | 23.2 | 23.2 | 18.7 | 20.2
10 ms | 46.4 | 35.9 | 37.4 | 36.8 | 23.2 | 29.0 | 18.6 | 20.8
1 ms | 64.8 | 66.7 | 35.2 | 36.5 | 35.6 | 37.2 | 17.7 | 18.4
Average | 36.82 | 33.98 | 30.20 | 29.12 | 20.64 | 21.86 | 15.42 | 15.72

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): Beam p-value = 0.6905; Flink p-value = 0.4633; accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): Beam p-value = 0.1425 (S,S), 0.1508 (S,M), 0.1425 (M,S), 0.4206 (M,M); Flink p-value = 0.09524 (S,S), 0.09524 (S,M), 0.1508 (M,S), 0.1508 (M,M); accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): Beam p-value = 0.8325; Flink p-value = 0.8413; accept Ho

Average Container CPU Utilisation by Windowing Rate (Period/Duration) — Windowing Rate versus Resource Utilisation Metrics

Windowing Rate (Period/Duration) | Three Workers (n = 3) (%) | Six Workers (n = 6) (%) | Ten Workers (n = 10) (%)
---|---|---|---
0.1 | 80 | 46 | 29
0.2 | 39 | 19 | 16
1.0 | 16 | 9 | 6
1.5 | 10 | 6 | 6
2.0 | 8 | 7 | 5
Average | 30.60 | 10.25 | 12.40

Mann–Whitney U test results:
- Three Workers and Six Workers: p-value = 0.4206, accept Ho
- Three Workers and Ten Workers: p-value = 0.1719, accept Ho
- Six Workers and Ten Workers: p-value = 0.3976, accept Ho

Average Node CPU Utilisation by Node Cluster — Container Co-Location Metrics

Cluster (n = 1) (%) | Cluster (n = 2) (%) | Cluster (n = 4) (%)
---|---|---
13.39 | 15.96 | 14.47
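The Mann–Whitney U results above can be reproduced from the printed averages. Taking the three-worker scalability rows (Single vs. Multi, omitting the 1 ms NA entries) as two samples of four, an exact permutation version of the test yields the reported p-value of 0.8857. The sketch below is illustrative only — it is not the authors' analysis code — and assumes the samples contain no tied values:

```python
from itertools import combinations

def mann_whitney_u(a, b):
    """Smaller of the two U statistics, computed from rank sums.
    Assumes all values are distinct (no tie correction)."""
    ranks = {v: i + 1 for i, v in enumerate(sorted(a + b))}
    r_a = sum(ranks[v] for v in a)
    u_a = r_a - len(a) * (len(a) + 1) // 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

def exact_two_sided_p(a, b):
    """Exact two-sided p-value: the share of all possible group
    assignments whose U is at least as extreme as the observed one."""
    pooled = a + b
    u_obs = mann_whitney_u(a, b)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        if mann_whitney_u(group_a, group_b) <= u_obs:
            hits += 1
    return hits / total

# Average CPU %, three workers, by velocity (1 min, 1 sec, 100 ms, 10 ms)
single = [3.52, 37.94, 42.25, 41.58]
multi = [3.44, 40.21, 36.39, 45.46]
print(mann_whitney_u(single, multi))               # 7
print(round(exact_two_sided_p(single, multi), 4))  # 0.8857, as reported
```

Accepting Ho here means the single-cloud and multi-cloud deployments show no statistically detectable difference in CPU utilisation, which is the paper's central scalability claim.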
Maximum Container CPU Utilisation by Velocity — Scalability Metrics

Velocity | Three Workers (n = 3), Single % | Three Workers (n = 3), Multi % | Six Workers (n = 6), Single % | Six Workers (n = 6), Multi %
---|---|---|---|---
1 min | 25 | 41 | 39 | 36
1 sec | 70 | 70 | 71 | 56
100 ms | 86 | 67 | 84 | 60
10 ms | 90 | 89 | 64 | 88
1 ms | NA | NA | 88 | 92
Average | 67.75 | 66.75 | 64.50 | 60.00

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): p-value = 0.7715, accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): p-value = 0.9048 (S,S); 1.0952 (S,M); 1.0952 (M,S); 0.9048 (M,M); accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): p-value = 0.9166, accept Ho

Maximum Container CPU Utilisation by Velocity — Fault Tolerance Metrics

Velocity | Three Workers (n = 3), Single % | Three Workers (n = 3), Multi % | Six Workers (n = 6), Single % | Six Workers (n = 6), Multi %
---|---|---|---|---
1 min | 24 | 7 | 17 | 15
1 sec | 17 | 13 | 13 | 28
100 ms | 25 | 24 | 22 | 20
10 ms | 58 | 14 | 47 | 54
1 ms | NA | NA | NA | NA
Average | 31.00 | 14.50 | 24.75 | 29.25

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): p-value = 0.08143, accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): p-value = 0.3836 (S,S); 0.8857 (S,M); 0.3836 (M,S); 0.1143 (M,M); accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): p-value = 0.6857, accept Ho

Maximum Container CPU Utilisation by Velocity — Technology Agnosticism Metrics (for Beam and Flink SDK)

Velocity | Beam S % (3 Workers) | Beam M % (3 Workers) | Flink S % (3 Workers) | Flink M % (3 Workers) | Beam S % (6 Workers) | Beam M % (6 Workers) | Flink S % (6 Workers) | Flink M % (6 Workers)
---|---|---|---|---|---|---|---|---
1 min | 27 | 27 | 27 | 20 | 27 | 38 | 19 | 18
1 sec | 70 | 53 | 44 | 41 | 59 | 44 | 29 | 25
100 ms | 79 | 79 | 62 | 47 | 70 | 49 | 30 | 27
10 ms | 90 | 75 | 54 | 55 | 66 | 71 | 29 | 38
1 ms | 92 | 95 | 50 | 46 | 92 | 91 | 31 | 28
Average | 71.60 | 65.80 | 47.40 | 41.80 | 62.80 | 58.60 | 27.60 | 27.20

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): Beam p-value = 0.8335; Flink p-value = 0.5476; accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): Beam p-value = 0.4606 (S,S), 0.5476 (S,M), 0.7533 (M,S), 0.5476 (M,M), accept Ho in all cases; Flink p-value = 0.09369 (S,S), accept Ho; 0.04653 (S,M), accept Ho; 0.09369 (M,S), accept Ho; 0.03615 (M,M), accept Ho
- Six Workers and Six Workers (Single and Multi): Beam p-value = 0.8413; Flink p-value = 0.09524; accept Ho

Maximum Container CPU Utilisation by Windowing Rate (Period/Duration) — Windowing Rate versus Resource Utilisation Metrics

Windowing Rate (Period/Duration) | Three Workers (n = 3) (%) | Six Workers (n = 6) (%) | Ten Workers (n = 10) (%)
---|---|---|---
0.1 | 93 | 92 | 86
0.2 | 85 | 50 | 72
1.0 | 41 | 40 | 44
1.5 | 32 | 45 | 42
2.0 | 24 | 26 | 40
Average | 55.00 | 50.60 | 56.80

Mann–Whitney U test results:
- Three Workers and Six Workers: p-value = 1.0000, accept Ho
- Three Workers and Ten Workers: p-value = 0.6905, accept Ho
- Six Workers and Ten Workers: p-value = 0.9166, accept Ho

Maximum Node CPU Utilisation by Node Cluster — Container Co-Location Metrics

Cluster (n = 1) (%) | Cluster (n = 2) (%) | Cluster (n = 4) (%)
---|---|---
43.00 | 83.00 | 64.00

Maximum Node CPU Utilisation by Windowing Rate and Node Cluster

Windowing Rate (Period/Duration) | Cluster (n = 1) (%) | Cluster (n = 2) (%) | Cluster (n = 4) (%)
---|---|---|---
0.1 | 43.00 | 83.00 | 63.50
0.2 | 26.17 | 69.00 | 40.00
1.0 | 11.58 | 30.00 | 17.00
1.5 | 11.93 | 13.00 | 20.00
Average | 23.17 | 48.75 | 35.13

Mann–Whitney U test results:
- Three Workers and Six Workers: p-value = 0.2000, accept Ho
- Three Workers and Ten Workers: p-value = 0.4857, accept Ho
- Six Workers and Ten Workers: p-value = 0.6857, accept Ho
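The windowing-rate tables express the rate as period/duration for sliding windows, so a rate of 0.1 means each window's start period is a tenth of its duration: every record then falls into roughly 1/0.1 = 10 overlapping windows, which is why CPU utilisation climbs sharply at low rates, while rates above 1.0 leave gaps that some records never enter. A minimal sketch of that window-assignment arithmetic follows; it is illustrative only (not the Flink/Beam implementation), and the timestamps and window sizes are arbitrary examples:

```python
import math

def windows_containing(t, duration, period):
    """Start times of sliding windows [start, start + duration) that
    contain timestamp t, with starts aligned to multiples of `period`."""
    first = math.floor((t - duration) / period) + 1  # smallest start strictly after t - duration
    last = math.floor(t / period)                    # largest start at or before t
    return [k * period for k in range(first, last + 1)]

# Rate 0.1 (period = 1, duration = 10): each record is processed in ~10 windows
print(len(windows_containing(25, 10, 1)))   # 10
# Rate 2.0 (period = 20, duration = 10): windows leave gaps between them
print(windows_containing(25, 10, 20))       # [20]
print(windows_containing(35, 10, 20))       # [] — this record falls between windows
```

This overlap factor of duration/period per record explains the near-linear growth in CPU and network cost as the windowing rate drops in the tables above.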
Average Container Memory Utilisation (MB) by Velocity — Scalability Metrics

Velocity | Three Workers (n = 3), Single | Three Workers (n = 3), Multi | Six Workers (n = 6), Single | Six Workers (n = 6), Multi
---|---|---|---|---
1 min | 824 | 865 | 790 | 812
1 sec | 955 | 976 | 933 | 1015
100 ms | 957 | 995 | 941 | 992
10 ms | 968 | 1002 | 942 | 998
1 ms | NA | NA | 964 | 1058
Average | 926.00 | 959.50 | 901.50 | 954.25

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): p-value = 0.2000, accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): p-value = 0.4217 (S,S); 0.1905 (S,M); 0.1905 (M,S); 0.5556 (M,M); accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): p-value = 0.09524, accept Ho

Average Container Memory Utilisation (MB) by Velocity — Fault Tolerance Metrics

Velocity | Three Workers (n = 3), Single | Three Workers (n = 3), Multi | Six Workers (n = 6), Single | Six Workers (n = 6), Multi
---|---|---|---|---
1 min | 837 | 758 | 882 | 684
1 sec | 839 | 842 | 717 | 758
100 ms | 833 | 829 | 879 | 776
10 ms | 699 | 849 | 897 | 904
1 ms | NA | NA | NA | NA
Average | 802.00 | 819.50 | 843.75 | 780.50

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): p-value = 0.6857, accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): p-value = 0.2000 (S,S), accept Ho; 0.05714 (S,M), accept Ho; 0.3429 (M,S), accept Ho; 0.0294 (M,M), accept Ho
- Six Workers and Six Workers (Single and Multi): p-value = 0.5614, accept Ho

Average Container Memory Utilisation (MB) by Velocity — Technology Agnosticism Metrics (Azure)

Velocity | Beam S (3 Workers) | Beam M (3 Workers) | Flink S (3 Workers) | Flink M (3 Workers) | Beam S (6 Workers) | Beam M (6 Workers) | Flink S (6 Workers) | Flink M (6 Workers)
---|---|---|---|---|---|---|---|---
1 min | 869 | 861 | 914 | 911 | 783 | 764 | 880 | 876
1 sec | 965 | 971 | 894 | 927 | 941 | 949 | 929 | 903
100 ms | 957 | 978 | 926 | 909 | 931 | 913 | 926 | 912
10 ms | 959 | 955 | 886 | 910 | 948 | 899 | 914 | 891
1 ms | 1000 | 987 | 924 | 911 | 946 | 956 | 909 | 916
Average | 950.0 | 950.4 | 908.8 | 913.6 | 909.8 | 896.2 | 911.6 | 899.6

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): Beam p-value = 1.0000; Flink p-value = 1.0000; accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): Beam p-value = 0.09524 (S,S), 0.09524 (S,M), 0.09524 (M,S), 0.1508 (M,M); Flink p-value = 0.8335 (S,S), 0.4206 (S,M), 0.9163 (M,S), 0.402 (M,M); accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): Beam p-value = 1.0000; Flink p-value = 0.3095; accept Ho

Average Container Memory Utilisation by Windowing Rate (Period/Duration) — Windowing Rate versus Resource Utilisation Metrics

Windowing Rate (Period/Duration) | Three Workers (n = 3) (MB) | Six Workers (n = 6) (MB) | Ten Workers (n = 10) (MB)
---|---|---|---
0.1 | 1008 | 986 | 980
0.2 | 974 | 1002 | 988
1.0 | 947 | 962 | 903
1.5 | 968 | 943 | 892
2.0 | 972 | 910 | 815
Average | 973.8 | 960.6 | 915.6

Mann–Whitney U test results:
- Three Workers and Six Workers: p-value = 0.5476, accept Ho
- Three Workers and Ten Workers: p-value = 0.4206, accept Ho
- Six Workers and Ten Workers: p-value = 0.3095, accept Ho

Average Memory Utilisation by Node Cluster — Container Co-Location Metrics

Cluster (n = 1) (MB) | Cluster (n = 2) (MB) | Cluster (n = 4) (MB)
---|---|---
39,301 | 39,727 | 41,799
Maximum Container Memory Utilisation (MB) by Velocity — Scalability Metrics

Velocity | Three Workers (n = 3), Single | Three Workers (n = 3), Multi | Six Workers (n = 6), Single | Six Workers (n = 6), Multi
---|---|---|---|---
1 min | 947 | 987 | 946 | 934
1 sec | 959 | 983 | 939 | 1015
100 ms | 969 | 1004 | 947 | 1025
10 ms | 977 | 1022 | 967 | 1037
1 ms | NA | NA | 990 | 1083
Average | 963.00 | 999.00 | 949.75 | 1002.75

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): p-value = 0.02857, accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): p-value = 0.4606 (S,S); 0.1905 (S,M); 0.06349 (M,S); 0.2957 (M,M); accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): p-value = 0.1508, accept Ho

Maximum Container Memory Utilisation (MB) by Velocity — Fault Tolerance Metrics

Velocity | Three Workers (n = 3), Single | Three Workers (n = 3), Multi | Six Workers (n = 6), Single | Six Workers (n = 6), Multi
---|---|---|---|---
1 min | 906 | 831 | 764 | 815
1 sec | 897 | 879 | 762 | 847
100 ms | 928 | 911 | 764 | 872
10 ms | 1003 | 876 | 825 | 1021
1 ms | NA | NA | NA | NA
Average | 933.50 | 874.25 | 778.75 | 888.75

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): p-value = 0.1143, accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): p-value = 0.0294 (S,S), accept Ho; 0.3429 (S,M), accept Ho; 0.0294 (M,S), accept Ho; 0.6857 (M,M), accept Ho
- Six Workers and Six Workers (Single and Multi): p-value = 0.05907, accept Ho

Maximum Container Memory Utilisation (MB) by Velocity — Technology Agnosticism Metrics (Azure)

Velocity | Beam S (3 Workers) | Beam M (3 Workers) | Flink S (3 Workers) | Flink M (3 Workers) | Beam S (6 Workers) | Beam M (6 Workers) | Flink S (6 Workers) | Flink M (6 Workers)
---|---|---|---|---|---|---|---|---
1 min | 974 | 1012 | 934 | 923 | 961 | 953 | 935 | 909
1 sec | 1006 | 1014 | 909 | 956 | 993 | 968 | 966 | 914
100 ms | 995 | 1001 | 941 | 939 | 973 | 969 | 961 | 935
10 ms | 992 | 982 | 900 | 934 | 1000 | 973 | 930 | 946
1 ms | 1079 | 1064 | 962 | 935 | 991 | 999 | 941 | 944
Average | 1009 | 1014 | 929 | 941 | 983 | 972 | 946 | 929

Mann–Whitney U test results:
- Three Workers and Three Workers (Single and Multi): Beam p-value = 0.5476; Flink p-value = 0.7533; accept Ho
- Three Workers and Six Workers ([S,S] x [M,M]): Beam p-value = 0.2222 (S,S), accept Ho; 0.05556 (S,M), accept Ho; 0.05556 (M,S), accept Ho; 0.01587 (M,M), reject Ho; Flink p-value = 0.3457 (S,S), 0.7533 (S,M), 0.3457 (M,S), 0.7533 (M,M), accept Ho in all cases
- Six Workers and Six Workers (Single and Multi): Beam p-value = 0.3457; Flink p-value = 0.3457; accept Ho

Maximum Container Memory Utilisation by Windowing Rate (Period/Duration) — Windowing Rate versus Resource Utilisation Metrics

Windowing Rate (Period/Duration) | Three Workers (n = 3) (MB) | Six Workers (n = 6) (MB) | Ten Workers (n = 10) (MB)
---|---|---|---
0.1 | 10,176 | 1011 | 991
0.2 | 994 | 1003 | 1011
1.0 | 999 | 1010 | 1005
1.5 | 986 | 1003 | 983
2.0 | 998 | 1011 | 952
Average | 2830 | 1007 | 988

Mann–Whitney U test results:
- Three Workers and Six Workers: p-value = 0.1412, accept Ho
- Three Workers and Ten Workers: p-value = 0.5476, accept Ho
- Six Workers and Ten Workers: p-value = 0.2031, accept Ho

Maximum Node Memory Utilisation by Windowing Rate and Node Cluster

Windowing Rate (Period/Duration) | Cluster (n = 1) (MB) | Cluster (n = 2) (MB) | Cluster (n = 4) (MB)
---|---|---|---
0.1 | 1050 | 2161 | 4474
0.2 | 1085 | 2186 | 4517
1.0 | 1043 | 2174 | 4475
1.5 | 1039 | 2171 | 4475
2.0 | 1078 | 2161 | 4441
Average | 1059 | 2170 | 4476

Mann–Whitney U test results:
- Three Workers and Six Workers: p-value = 0.01193, reject Ho
- Three Workers and Ten Workers: p-value = 0.01193, reject Ho
- Six Workers and Ten Workers: p-value = 0.01167, reject Ho

Maximum Node Memory Utilisation by Node Cluster — Container Co-Location Metrics

Cluster (n = 1) (MB) | Cluster (n = 2) (MB) | Cluster (n = 4) (MB)
---|---|---
8091 | 8537 | 8824
Total Number of GB Sent over the Network by Velocity | Scalability Metrics | ||||||||||
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results | ||||||||
Single | Multi | Single | Multi | Three Workers and Three Workers (Single and Multi) | |||||||
1 min | 0.04 | 0.03 | 0.06 | 0.05 | p-value = 0.7702, accept Ho | ||||||
1 sec | 0.06 | 0.04 | 0.09 | 0.06 | Three Workers and Six Workers ([S,S] x [M,M]) | ||||||
100 ms | 0.06 | 0.10 | 0.12 | 0.10 | p-value = 0.3832, accept Ho (S,S) p-value = 0.3832, accept Ho (S,M) p-value = 0.5556, accept Ho (M,S) p-value = 0.3532, accept Ho (M,M) | ||||||
10 ms | 0.48 | 0.47 | 0.21 | 0.49 | |||||||
1 ms | NA | NA | 0.36 | 3.30 | Six Workers and Six Workers (Single and Multi) | ||||||
Average | 0.16 | 0.16 | 0.12 | 0.12 | p-value = 0.9166, accept Ho | ||||||
Fault Tolerance Metrics
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Single | Multi | Single | Multi | Three Workers and Three Workers (Single and Multi)
1 min | 0.84 | 0.34 | 4.50 | 0.80 | p-value = 0.8845, accept Ho
1 sec | 0.76 | 1.44 | 2.62 | 0.95 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 0.84 | 0.52 | 1.48 | 1.10 | p-value = 0.0294, accept Ho (S,S); p-value = 0.1102, accept Ho (S,M); p-value = 0.02857, accept Ho (M,S); p-value = 0.2000, accept Ho (M,M)
10 ms | 0.33 | 0.63 | 1.70 | 1.88
1 ms | NA | NA | NA | NA | Six Workers and Six Workers (Single and Multi)
Average | 0.69 | 0.73 | 2.58 | 1.18 | p-value = 0.1143, accept Ho
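The "Average" rows in these tables exclude the NA cells (the 1 ms velocity runs that did not complete). As an illustrative sketch, not part of the original experiments, the following Python snippet reproduces that calculation; the helper name `mean_excluding_na` is ours:

```python
def mean_excluding_na(values):
    """Arithmetic mean over the non-NA cells, rounded to two decimal
    places, as used for the "Average" rows in the tables above."""
    nums = [v for v in values if v != "NA"]
    return round(sum(nums) / len(nums), 2)

# GB sent during the fault-tolerance runs (three workers, single cloud,
# at velocities 1 min, 1 s, 100 ms, 10 ms, 1 ms):
print(mean_excluding_na([0.84, 0.76, 0.84, 0.33, "NA"]))  # 0.69
```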
Technology Agnosticism Metrics (for Beam and Flink SDK)
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Beam | Flink | Beam | Flink | Beam | Flink
S | M | S | M | S | M | S | M | Three Workers and Three Workers (Single and Multi)
1 min | 0.14 | 0.02 | 0.04 | 0.01 | 0.07 | 0.05 | 0.05 | 0.03 | p-value = 0.2948, accept Ho | p-value = 0.5219, accept Ho
1 sec | 0.11 | 0.02 | 0.04 | 0.01 | 0.08 | 0.06 | 0.05 | 0.03 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 0.08 | 0.06 | 0.03 | 0.03 | 0.14 | 0.10 | 0.05 | 0.04 | p-value = 1.0000, accept Ho (S,S); p-value = 0.4206, accept Ho (S,M); p-value = 0.2948, accept Ho (M,S); p-value = 0.7526, accept Ho (M,M) | p-value = 0.8668, accept Ho (S,S); p-value = 0.9131, accept Ho (S,M); p-value = 0.1599, accept Ho (M,S); p-value = 0.4564, accept Ho (M,M)
10 ms | 0.37 | 0.35 | 0.04 | 0.05 | 0.48 | 0.27 | 0.08 | 0.05
1 ms | 3.52 | 3.25 | 0.17 | 0.17 | 6.26 | 3.02 | 0.28 | 0.24 | Six Workers and Six Workers (Single and Multi)
Average | 0.84 | 0.74 | 0.06 | 0.05 | 1.41 | 0.70 | 0.10 | 0.08 | p-value = 0.5476, accept Ho | p-value = 0.1599, accept Ho
Total Number of GB Sent over the Network by Windowing Rate (Period/Duration) | Windowing Rate versus Resource Utilisation Metrics
Windowing Rate (Period/Duration) | Three Workers (n = 3) | Six Workers (n = 6) | Ten Workers (n = 10) | Mann–Whitney U Test Results
0.1 | 0.49 | 0.63 | 1.02 | Three Workers and Six Workers
0.2 | 0.34 | 0.44 | 0.56 | p-value = 0.6004, accept Ho
1.0 | 0.13 | 0.14 | 0.23 | Three Workers and Ten Workers
1.5 | 0.09 | 0.11 | 0.19 | p-value = 0.3095, accept Ho
2.0 | 0.10 | 0.10 | 0.12 | Six Workers and Ten Workers
Average | 0.23 | 0.28 | 0.42 | p-value = 0.4206, accept Ho
Total Number of GB Sent over the Network by Number of Containers per Node (by Windowing Rate and Node Cluster) | Container Co-Location Metrics
Windowing Rate (Period/Duration) | Cluster (n = 1) (GB) | Cluster (n = 2) (GB) | Cluster (n = 4) (GB) | Mann–Whitney U Test Results
0.1 | 0.70 | 0.80 | 0.30 | Cluster (n = 1) and Cluster (n = 2)
0.2 | 0.30 | 0.40 | 0.30 | p-value = 0.7526, accept Ho
1.0 | 0.00 | 0.00 | 0.00 | Cluster (n = 1) and Cluster (n = 4)
1.5 | 0.20 | 0.10 | 0.10 | p-value = 0.3398, accept Ho
2.0 | 0.42 | 0.10 | 0.11 | Cluster (n = 2) and Cluster (n = 4)
Average | 0.32 | 0.28 | 0.16 | p-value = 0.9153, accept Ho
Total Number of GB Received over the Network by Velocity | Scalability Metrics
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Single | Multi | Single | Multi | Three Workers and Three Workers (Single and Multi)
1 min | 3.91 | 3.53 | 7.94 | 5.17 | p-value = 0.4857, accept Ho
1 sec | 4.80 | 4.17 | 7.57 | 5.18 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 3.32 | 3.58 | 7.38 | 4.84 | p-value = 0.01587, reject Ho (S,S); p-value = 0.02684, accept Ho (S,M); p-value = 0.01587, reject Ho (M,S); p-value = 0.01587, reject Ho (M,M)
10 ms | 4.84 | 3.36 | 5.72 | 5.73
1 ms | NA | NA | 9.64 | 12.49 | Six Workers and Six Workers (Single and Multi)
Average | 4.22 | 3.66 | 7.15 | 5.98 | p-value = 0.2222, accept Ho
Fault Tolerance Metrics
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Single | Multi | Single | Multi | Three Workers and Three Workers (Single and Multi)
1 min | 1.91 | 2.02 | 5.52 | 5.58 | p-value = 0.8857, accept Ho
1 sec | 1.66 | 1.94 | 6.34 | 4.71 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 2.95 | 1.90 | 4.64 | 4.75 | p-value = 0.02857, accept Ho (S,S); p-value = 0.02857, accept Ho (S,M); p-value = 0.02857, accept Ho (M,S); p-value = 0.02857, accept Ho (M,M)
10 ms | 1.46 | 1.62 | 5.27 | 5.64
1 ms | NA | NA | NA | NA | Six Workers and Six Workers (Single and Multi)
Average | 2.00 | 1.87 | 5.44 | 5.17 | p-value = 0.2222, accept Ho
Technology Agnosticism Metrics (for Beam and Flink SDK)
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Beam | Flink | Beam | Flink | Beam | Flink
S | M | S | M | S | M | S | M | Three Workers and Three Workers (Single and Multi)
1 min | 4.84 | 2.28 | 1.83 | 1.03 | 8.75 | 7.10 | 4.19 | 3.45 | p-value = 0.09269, accept Ho | p-value = 0.007937, reject Ho
1 sec | 4.84 | 2.28 | 2.25 | 1.09 | 9.16 | 7.23 | 4.73 | 3.27 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 4.61 | 2.31 | 2.32 | 1.16 | 9.47 | 6.51 | 4.57 | 2.92 | p-value = 0.05933, accept Ho (S,S); p-value = 0.09369, accept Ho (S,M); p-value = 0.03615, accept Ho (M,S); p-value = 0.5296, accept Ho (M,M) | p-value = 0.007937, reject Ho (S,S); p-value = 0.01587, reject Ho (S,M); p-value = 0.007937, reject Ho (M,S); p-value = 0.01587, reject Ho (M,M)
10 ms | 5.14 | 4.19 | 2.04 | 1.92 | 10.06 | 7.96 | 4.66 | 3.79
1 ms | 10.03 | 9.26 | 3.14 | 3.07 | 20.79 | 16.21 | 7.76 | 5.57 | Six Workers and Six Workers (Single and Multi)
Average | 5.89 | 4.06 | 2.32 | 1.65 | 11.65 | 9.00 | 5.18 | 3.80 | p-value = 0.09524, accept Ho | p-value = 0.09524, accept Ho
Total Number of GB Received over the Network by Windowing Rate (Period/Duration) | Windowing Rate versus Resource Utilisation Metrics
Windowing Rate (Period/Duration) | Three Workers (n = 3) | Six Workers (n = 6) | Ten Workers (n = 10) | Mann–Whitney U Test Results
0.1 | 4.46 | 7.40 | 13.49 | Three Workers and Six Workers
0.2 | 4.45 | 7.81 | 13.70 | p-value = 0.007937, reject Ho
1.0 | 4.39 | 7.57 | 13.69 | Three Workers and Ten Workers
1.5 | 4.23 | 7.56 | 12.91 | p-value = 0.007937, reject Ho
2.0 | 4.12 | 7.42 | 12.87 | Six Workers and Ten Workers
Average | 4.33 | 7.55 | 13.33 | p-value = 0.007937, reject Ho
Total Number of GB Received over the Network by Number of Containers per Node (by Windowing Rate and Node Cluster) | Container Co-Location Metrics
Windowing Rate (Period/Duration) | Cluster (n = 1) (GB) | Cluster (n = 2) (GB) | Cluster (n = 4) (GB) | Mann–Whitney U Test Results
0.1 | 9.88 | 9.96 | 8.64 | Cluster (n = 1) and Cluster (n = 2)
0.2 | 8.51 | 8.90 | 8.42 | p-value = 0.4206, accept Ho
1.0 | 9.11 | 8.97 | 7.68 | Cluster (n = 1) and Cluster (n = 4)
1.5 | 9.10 | 8.70 | 9.21 | p-value = 0.1508, accept Ho
2.0 | 9.15 | 7.91 | 8.18 | Cluster (n = 2) and Cluster (n = 4)
Average | 9.15 | 8.89 | 8.43 | p-value = 0.3095, accept Ho
Total Number of GB Received by the Job Manager over the Network | Technology Agnosticism Metrics (for Beam and Flink SDK)
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Beam | Flink | Beam | Flink | Beam | Flink
S | M | S | M | S | M | S | M | Three Workers and Three Workers (Single and Multi)
1 min | 5.87 | 4.59 | 2.46 | 2.25 | 8.55 | 10.25 | 4.60 | 5.04 | p-value = 0.1425, accept Ho | p-value = 1.0000, accept Ho
1 sec | 4.88 | 4.65 | 2.31 | 2.25 | 9.42 | 10.35 | 5.02 | 4.38 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 5.14 | 4.74 | 2.47 | 2.51 | 10.27 | 11.18 | 4.99 | 4.98 | p-value = 0.01193, reject Ho (S,S); p-value = 0.007937, reject Ho (S,M); p-value = 0.01193, reject Ho (M,S); p-value = 0.4020, accept Ho (M,M) | p-value = 0.007937, reject Ho (S,S); p-value = 0.01193, reject Ho (S,M); p-value = 0.01193, reject Ho (M,S); p-value = 0.01167, reject Ho (M,M)
10 ms | 5.17 | 4.71 | 2.30 | 2.53 | 10.27 | 10.33 | 5.52 | 4.98
1 ms | 6.98 | 7.31 | 3.80 | 3.38 | 15.78 | 15.66 | 7.68 | 7.76 | Six Workers and Six Workers (Single and Multi)
Average | 5.61 | 5.20 | 2.67 | 2.58 | 10.86 | 11.55 | 5.56 | 5.43 | p-value = 0.01167, reject Ho | p-value = 0.6572, accept Ho
Total Number of GB Sent by the Job Manager over the Network | Technology Agnosticism Metrics (for Beam and Flink SDK)
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Beam | Flink | Beam | Flink | Beam | Flink
S | M | S | M | S | M | S | M | Three Workers and Three Workers (Single and Multi)
1 min | 16.13 | 13.53 | 6.60 | 6.52 | 49.13 | 58.96 | 25.92 | 28.27 | p-value = 0.1425, accept Ho | p-value = 1.0000, accept Ho
1 sec | 13.62 | 13.56 | 6.50 | 6.52 | 54.26 | 58.98 | 28.24 | 23.59 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 14.83 | 13.59 | 7.12 | 7.13 | 59.12 | 63.85 | 28.29 | 28.23 | p-value = 0.007937, reject Ho (S,S); p-value = 0.007937, reject Ho (S,M); p-value = 0.01193, reject Ho (M,S); p-value = 0.01193, reject Ho (M,M) | p-value = 0.007937, reject Ho (S,S); p-value = 0.007937, reject Ho (S,M); p-value = 0.01193, reject Ho (M,S); p-value = 0.01193, reject Ho (M,M)
10 ms | 14.84 | 13.59 | 6.54 | 7.09 | 59.11 | 58.94 | 30.70 | 28.24
1 ms | 19.77 | 21.02 | 10.72 | 9.52 | 88.56 | 88.42 | 42.57 | 42.43 | Six Workers and Six Workers (Single and Multi)
Average | 15.84 | 15.06 | 7.50 | 7.36 | 62.04 | 65.83 | 31.14 | 30.15 | p-value = 0.8413, accept Ho | p-value = 0.4633, accept Ho
Total Number of GB Received by the Job Manager over the Network | Windowing Rate versus Resource Utilisation Metrics
Windowing Rate (Period/Duration) | Three Workers (n = 3) | Six Workers (n = 6) | Ten Workers (n = 10) | Mann–Whitney U Test Results
0.1 | 5.21 | 9.76 | 8.29 | Three Workers and Six Workers
0.2 | 4.77 | 10.51 | 18.02 | p-value = 0.01587, reject Ho
1.0 | 2.42 | 4.89 | 18.17 | Three Workers and Ten Workers
1.5 | 4.79 | 9.73 | 7.01 | p-value = 0.007937, reject Ho
2.0 | 4.87 | 8.94 | 16.56 | Six Workers and Ten Workers
Average | 4.41 | 8.77 | 13.61 | p-value = 0.4206, accept Ho
Total Number of GB Sent by the Job Manager over the Network | Windowing Rate versus Resource Utilisation Metrics
Windowing Rate (Period/Duration) | Three Workers (n = 3) | Six Workers (n = 6) | Ten Workers (n = 10) | Mann–Whitney U Test Results
0.1 | 14.90 | 54.93 | 76.20 | Three Workers and Six Workers
0.2 | 13.75 | 59.49 | 165.26 | p-value = 0.007937, reject Ho
1.0 | 6.91 | 27.45 | 165.13 | Three Workers and Ten Workers
1.5 | 13.76 | 54.94 | 63.68 | p-value = 0.007937, reject Ho
2.0 | 13.78 | 50.35 | 139.77 | Six Workers and Ten Workers
Average | 12.62 | 49.43 | 122.01 | p-value = 0.007937, reject Ho
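Many of the comparisons above report p-value = 0.007937: with five observations per group this is the smallest two-sided p-value an exact Mann–Whitney U test can produce (2/252), and it arises whenever the two samples are completely separated, as with the six- and ten-worker columns in this table. As an illustrative sketch rather than the authors' original analysis code, the snippet below reproduces it by full enumeration, assuming the common two-sided definition based on the distance of U from its null mean:

```python
from itertools import combinations
from math import comb

def exact_mannwhitney_p(x, y):
    """Exact two-sided Mann-Whitney U p-value, computed by enumerating
    all C(len(x) + len(y), len(x)) group assignments (feasible for the
    five-sample groups used in these tables)."""
    pooled = x + y
    m, n = len(x), len(y)

    def u_stat(indices):
        chosen = set(indices)
        first = [pooled[i] for i in chosen]
        rest = [pooled[i] for i in range(m + n) if i not in chosen]
        # U = number of (a, b) pairs with a > b; ties count as 1/2.
        return sum(1.0 if a > b else 0.5 if a == b else 0.0
                   for a in first for b in rest)

    observed = u_stat(range(m))
    centre = m * n / 2.0  # null mean of U
    extreme = sum(1 for idx in combinations(range(m + n), m)
                  if abs(u_stat(idx) - centre) >= abs(observed - centre))
    return extreme / comb(m + n, m)

# GB sent by the Job Manager, six vs. ten workers (table above):
six = [54.93, 59.49, 27.45, 54.94, 50.35]
ten = [76.20, 165.26, 165.13, 63.68, 139.77]
print(round(exact_mannwhitney_p(six, ten), 6))  # 0.007937
```

Because every ten-worker value exceeds every six-worker value, U = 0 and only two of the 252 possible assignments are at least as extreme, giving p = 2/252 ≈ 0.007937.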
Total Number of GB Received by the Job Manager over the Network by Number of Containers per Node | Container Co-Location Metrics
Windowing Rate (Period/Duration) | Cluster (n = 1) (GB) | Cluster (n = 2) (GB) | Cluster (n = 4) (GB) | Mann–Whitney U Test Results
0.1 | 1.67 | 1.36 | 1.64 | Cluster (n = 1) and Cluster (n = 2)
0.2 | 1.64 | 1.37 | 1.51 | p-value = 0.02733, accept Ho
1.0 | 1.53 | 1.50 | 1.36 | Cluster (n = 1) and Cluster (n = 4)
1.5 | 1.65 | 1.36 | 1.64 | p-value = 0.05701, accept Ho
2.0 | 1.84 | 1.64 | 1.50 | Cluster (n = 2) and Cluster (n = 4)
Average | 1.67 | 1.45 | 1.53 | p-value = 0.3337, accept Ho
Total Number of GB Sent by the Job Manager over the Network by Number of Containers per Node | Container Co-Location Metrics
Windowing Rate (Period/Duration) | Cluster (n = 1) (GB) | Cluster (n = 2) (GB) | Cluster (n = 4) (GB) | Mann–Whitney U Test Results
0.1 | 12.20 | 10.17 | 12.21 | Cluster (n = 1) and Cluster (n = 2)
0.2 | 12.22 | 10.16 | 11.18 | p-value = 0.01167, reject Ho
1.0 | 11.20 | 11.18 | 10.17 | Cluster (n = 1) and Cluster (n = 4)
1.5 | 12.20 | 10.17 | 12.19 | p-value = 0.09269, accept Ho
2.0 | 12.28 | 10.18 | 10.17 | Cluster (n = 2) and Cluster (n = 4)
Average | 12.02 | 10.37 | 11.18 | p-value = 0.2343, accept Ho
Total Records Processed | Fault Tolerance Metrics
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Single | Multi | Single | Multi | Three Workers and Three Workers (Single and Multi)
1 min | 50 | 50 | 50 | 50 | p-value = 1.0000, accept Ho
1 sec | 2987 | 2998 | 2990 | 2986 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 29,478 | 30,030 | 29,902 | 29,876 | p-value = 0.7715, accept Ho (S,S); p-value = 1.0000, accept Ho (S,M); p-value = 1.0000, accept Ho (M,S); p-value = 1.0000, accept Ho (M,M)
10 ms | 295,765 | 294,179 | 300,673 | 297,689
1 ms | NA | NA | NA | NA | Six Workers and Six Workers (Single and Multi)
Average | 82,070 | 81,814 | 83,403 | 82,650 | p-value = 0.7715, accept Ho
Data Loss % | Fault Tolerance Metrics
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Single % | Multi % | Single % | Multi % | Three Workers and Three Workers (Single and Multi)
1 min | 0.00 | 0.00 | 0.00 | 0.00 | p-value = 0.6573, accept Ho
1 sec | 0.44 | 0.07 | 0.33 | 0.47 | Three Workers and Six Workers ([S,S] x [M,M])
100 ms | 1.77 | 0.00 | 0.33 | 0.42 | p-value = 0.1804, accept Ho (S,S); p-value = 0.5614, accept Ho (S,M); p-value = 0.877, accept Ho (M,S); p-value = 0.6573, accept Ho (M,M)
10 ms | 1.43 | 1.98 | 0.00 | 0.78
1 ms | NA | NA | NA | NA | Six Workers and Six Workers (Single and Multi)
Average | 0.91 | 0.51 | 0.17 | 0.42 | p-value = 0.1804, accept Ho
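Data loss is the percentage of expected records that were not processed, where the expected count follows from the velocity over the 50 min run (50 records at 1 min, 3000 at 1 s, and so on; compare the "Total Records Processed" table above). A minimal illustrative sketch, assuming loss is taken against this nominal count (the published figures appear to use the actual ingested counts, so a few cells differ in the last decimal):

```python
def data_loss_pct(processed, expected):
    """Percentage of expected records that were not processed, floored
    at zero (processed can exceed expected when records are replayed
    after a simulated failure)."""
    return round(max(expected - processed, 0) / expected * 100, 2)

# A 50 min run at one record per second gives 3000 expected records:
print(data_loss_pct(2998, 3000))  # 0.07 (three workers, multi-cloud)
print(data_loss_pct(2990, 3000))  # 0.33 (six workers, single cloud)
```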
Data Ingestion (KB) by Velocity | Technology Agnosticism Metrics (for Beam and Flink SDK)
Velocity | Three Workers (n = 3) | Six Workers (n = 6) | Mann–Whitney U Test Results
Beam | Flink | Beam | Flink | Three Workers and Three Workers (Single and Multi)
S | M | S | M | S | M | S | M | p-value = 1.0000, accept Ho | p-value = 0.9166, accept Ho
1 min | 172 | 74 | 55 | 39 | 259 | 205 | 158 | 126 | Three Workers and Six Workers ([S,S] x [M,M])
1 sec | 2373 | 2328 | 98 | 90 | 4824 | 4824 | 242 | 230 | p-value = 0.6905, accept Ho (S,S); p-value = 0.6905, accept Ho (S,M); p-value = 0.8413, accept Ho (M,S); p-value = 0.8413, accept Ho (M,M) | p-value = 0.5476, accept Ho (S,S); p-value = 0.5476, accept Ho (S,M); p-value = 0.5476, accept Ho (M,S); p-value = 0.5476, accept Ho (M,M)
100 ms | 21,900 | 219,300 | 510 | 543 | 44,580 | 44,580 | 1086 | 1074
10 ms | 215,700 | 215,700 | 4680 | 4680 | 432,600 | 432,600 | 9480 | 9420
1 ms | 2,154,000 | 2,154,000 | 46,500 | 34,500 | 4,308,000 | 4,308,000 | 93,000 | 93,000 | Six Workers and Six Workers (Single and Multi)
Average | 478,829 | 518,280 | 10,368 | 7970 | 958,052 | 958,041 | 20,793 | 20,770 | p-value = 1.000, accept Ho | p-value = 0.583, accept Ho
Total KB Ingested by Windowing Rate (Period/Duration) | Windowing Rate versus Resource Utilisation Metrics
Windowing Rate (Period/Duration) | Three Workers (n = 3) | Six Workers (n = 6) | Ten Workers (n = 10) | Mann–Whitney U Test Results
0.1 | 143,700 | 143,800 | 144,000 | Three Workers and Six Workers
0.2 | 71,900 | 71,990 | 72,280 | p-value = 0.6905, accept Ho
1.0 | 14,500 | 14,650 | 14,814 | Three Workers and Ten Workers
1.5 | 9690 | 9892 | 10,052 | p-value = 0.6905, accept Ho
2.0 | 7410 | 7515 | 7638 | Six Workers and Ten Workers
Average | 49,440 | 49,569 | 49,756 | p-value = 0.6905, accept Ho
Total Number of Records Ingested by Windowing Rate (Period/Duration) | Windowing Rate versus Resource Utilisation Metrics
Windowing Rate (Period/Duration) | Three Workers (n = 3) | Six Workers (n = 6) | Ten Workers (n = 10) | Mann–Whitney U Test Results
0.1 | 600,000 | 600,000 | 600,000 | Three Workers and Six Workers
0.2 | 300,000 | 300,000 | 300,000 | p-value = 0.9161, accept Ho
1.0 | 60,000 | 60,000 | 60,000 | Three Workers and Ten Workers
1.5 | 39,945 | 40,106 | 39,920 | p-value = 0.9161, accept Ho
2.0 | 30,361 | 30,197 | 29,637 | Six Workers and Ten Workers
Average | 206,061 | 206,060 | 205,911 | p-value = 0.9161, accept Ho
Total KB Ingested by Number of Containers per Node by Windowing Rate (Period/Duration) | Container Co-Location Metrics
Windowing Rate (Period/Duration) | Cluster (n = 1) | Cluster (n = 2) | Cluster (n = 4) | Mann–Whitney U Test Results
0.1 | 144,100 | 144,000 | 143,900 | Cluster (n = 1) and Cluster (n = 2)
0.2 | 72,150 | 72,090 | 72,140 | p-value = 0.8413, accept Ho
1.0 | 14,746 | 14,724 | 14,711 | Cluster (n = 1) and Cluster (n = 4)
1.5 | 10,044 | 9935 | 9998 | p-value = 0.8413, accept Ho
2.0 | 7548 | 7569 | 7593 | Cluster (n = 2) and Cluster (n = 4)
Average | 49,717 | 49,663 | 49,668 | p-value = 1.0000, accept Ho
Total Number of Records Ingested by Number of Containers per Node by Windowing Rate (Period/Duration) | Container Co-Location Metrics
Windowing Rate (Period/Duration) | Cluster (n = 1) | Cluster (n = 2) | Cluster (n = 4) | Mann–Whitney U Test Results
0.1 | 600,000 | 600,000 | 600,000 | Cluster (n = 1) and Cluster (n = 2)
0.2 | 300,000 | 300,000 | 300,000 | p-value = 0.9161, accept Ho
1.0 | 60,000 | 60,000 | 60,000 | Cluster (n = 1) and Cluster (n = 4)
1.5 | 39,827 | 39,869 | 40,073 | p-value = 0.9161, accept Ho
2.0 | 28,881 | 29,898 | 30,037 | Cluster (n = 2) and Cluster (n = 4)
Average | 205,741 | 205,953 | 206,022 | p-value = 0.9161, accept Ho
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Vergilio, T.; Kor, A.-L.; Mullier, D. A Unified Vendor-Agnostic Solution for Big Data Stream Processing in a Multi-Cloud Environment. Appl. Sci. 2023, 13, 12635. https://doi.org/10.3390/app132312635