Article

An Auto-Scaling Framework for Analyzing Big Data in the Cloud Environment

Faculty of Science and Technology, Middlesex University, The Burroughs, London NW4 4BT, UK
* Author to whom correspondence should be addressed.
Appl. Sci. 2019, 9(7), 1417; https://doi.org/10.3390/app9071417
Submission received: 28 March 2019 / Revised: 28 March 2019 / Accepted: 29 March 2019 / Published: 4 April 2019
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Processing big data on traditional computing infrastructure is challenging because the large volume of data leads to high computational complexity. Recently, Apache Hadoop has emerged as a distributed computing infrastructure for dealing with big data. Adapting Hadoop to dynamically adjust its computing resources based on the real-time workload is itself a demanding task, so conventionally a cluster is pre-configured with adequate resources to handle the peak data load. However, this may cause considerable wastage of computing resources when usage levels are much lower than the preset load. In consideration of this, this paper investigates an auto-scaling framework in a cloud environment that aims to minimise the cost of resource use by automatically adjusting the number of virtual nodes depending on the real-time data load. A cost-effective auto-scaling (CEAS) framework is first proposed for an Amazon Web Services (AWS) Cloud environment. The proposed CEAS framework allows us to scale the computing resources of a Hadoop cluster so as to either reduce the computing resource use when the workload is low or scale up the computing resources to speed up data processing and analysis within an adequate time. To validate the effectiveness of the proposed framework, a case study of real-time sentiment analysis on universities’ tweets is provided, analysing the reviews/tweets that people post on social media. Such a dynamic scaling method offers a reference for improving Twitter data analysis in a more cost-effective and flexible way.
Keywords: big data; cloud computing; Apache Hadoop; Amazon web service; Twitter
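
The scaling behaviour summarised in the abstract (fewer nodes when the workload is low, more nodes when it is high) can be illustrated with a minimal threshold-based sketch. This is not the CEAS algorithm itself, whose details are not given in this section. The `ScalingPolicy` class, the `desired_nodes` function, and every threshold and node limit below are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    # All values are hypothetical; the abstract does not state any thresholds.
    min_nodes: int = 2              # never shrink the cluster below this size
    max_nodes: int = 10             # never grow the cluster beyond this size
    scale_up_load: float = 0.80     # per-node utilisation above which a node is added
    scale_down_load: float = 0.30   # per-node utilisation below which a node is removed


def desired_nodes(current_nodes: int, load_per_node: float, policy: ScalingPolicy) -> int:
    """Return the node count for the next interval, changing by at most one node."""
    if load_per_node > policy.scale_up_load:
        target = current_nodes + 1
    elif load_per_node < policy.scale_down_load:
        target = current_nodes - 1
    else:
        target = current_nodes
    # Clamp the result to the allowed cluster size.
    return max(policy.min_nodes, min(policy.max_nodes, target))


if __name__ == "__main__":
    policy = ScalingPolicy()
    nodes = 4
    # Simulated per-node utilisation readings, one per monitoring interval.
    for load in (0.90, 0.95, 0.50, 0.20, 0.10):
        nodes = desired_nodes(nodes, load, policy)
        print(f"load={load:.2f} -> nodes={nodes}")
```

Changing the cluster by one node per interval is a deliberately conservative choice for the sketch. A real deployment would also need a cool-down period and a way to apply the decision to the cluster, both of which are outside the scope of this illustration.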

Share and Cite

MDPI and ACS Style

Jannapureddy, R.; Vien, Q.-T.; Shah, P.; Trestian, R. An Auto-Scaling Framework for Analyzing Big Data in the Cloud Environment. Appl. Sci. 2019, 9, 1417. https://doi.org/10.3390/app9071417
