This paper is published in Volume-4, Issue-3, 2018
Area
Big Data Processing and Analytics
Author
Athira Soman, Smitha Jacob
Org/Univ
St. Joseph's College of Engineering and Technology, Palai, Kerala, India
Pub. Date
08 May, 2018
Paper ID
V4I3-1281
Publisher
Keywords
Big data processing, Batch processing, Stream processing, Lambda architecture

Citations

IEEE
Athira Soman, Smitha Jacob. A worthwhile performance framework modeling hinge on lambda architecture for batch and stream big data, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.

APA
Athira Soman, Smitha Jacob (2018). A worthwhile performance framework modeling hinge on lambda architecture for batch and stream big data. International Journal of Advance Research, Ideas and Innovations in Technology, 4(3) www.IJARIIT.com.

MLA
Athira Soman, Smitha Jacob. "A worthwhile performance framework modeling hinge on lambda architecture for batch and stream big data." International Journal of Advance Research, Ideas and Innovations in Technology 4.3 (2018). www.IJARIIT.com.

Abstract

The amount of data we now generate is astonishing. Data has also evolved dramatically in recent years in type, volume, and velocity. Emerging technologies such as smartphones and sensors present opportunities for data exploitation, with data streamed and collected from heterogeneous devices every second. Analyzing these large datasets can reveal previously unknown behaviors and help optimize approaches in many applications. However, collecting and handling these massive datasets presents challenges in how to process large data efficiently. Several frameworks are available for handling big data applications. The Lambda Architecture is a data processing framework that can handle both batch and stream processing. The batch layer is implemented using Pig and Hive, and the streaming layer is built using Spark Streaming and Spark SQL. This presents a need for developing a new framework for handling big data applications, particularly on public clouds, to minimize cost and address resource availability.
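The division of work between a batch layer and a streaming layer in the Lambda Architecture can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Python counters as stand-ins for the Pig/Hive batch layer and the Spark Streaming/Spark SQL streaming layer, and the class and method names (`LambdaSketch`, `ingest`, `run_batch`, `query`) are hypothetical, chosen only for illustration.

```python
from collections import Counter


class LambdaSketch:
    """Toy Lambda Architecture that counts events per key (illustrative only)."""

    def __init__(self):
        self.master_log = []         # append-only master dataset
        self.batch_view = Counter()  # stand-in for the batch layer's output
        self.speed_view = Counter()  # stand-in for the streaming layer's output

    def ingest(self, key):
        # Every event goes to the master dataset and to the streaming layer.
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        # The batch layer recomputes its view from the full master dataset,
        # after which the incremental streaming view can be discarded.
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key):
        # The serving side merges the batch view with the recent streaming view.
        return self.batch_view[key] + self.speed_view[key]


pipeline = LambdaSketch()
for event in ["a", "b", "a"]:
    pipeline.ingest(event)
count_before_batch = pipeline.query("a")   # answered by the streaming view only
pipeline.run_batch()
pipeline.ingest("a")
count_after_batch = pipeline.query("a")    # batch view plus one recent event
```

In this sketch, queries stay accurate between batch runs because recent events are covered by the streaming view, while the periodic batch recomputation keeps the long-term view derived from the complete dataset.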