This paper is published in Volume-5, Issue-3, 2019
Area
Computer Science Engineering
Author
Amogh Mudholkar, Akash S., Ajay C., Gowramma G. S.
Org/Univ
Don Bosco Institute of Technology, Bengaluru, Karnataka, India
Pub. Date
14 June, 2019
Paper ID
V5I3-1880
Publisher
Keywords
Machine Learning, Big Data, Statistical Analysis, ARIMA, Multiple Linear Regression

Citations

IEEE
Amogh Mudholkar, Akash S., Ajay C., Gowramma G. S. Air pollution data analysis using ARIMA model, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.

APA
Amogh Mudholkar, Akash S., Ajay C., Gowramma G. S. (2019). Air pollution data analysis using ARIMA model. International Journal of Advance Research, Ideas and Innovations in Technology, 5(3) www.IJARIIT.com.

MLA
Amogh Mudholkar, Akash S., Ajay C., Gowramma G. S. "Air pollution data analysis using ARIMA model." International Journal of Advance Research, Ideas and Innovations in Technology 5.3 (2019). www.IJARIIT.com.

Abstract

This work addresses the prediction of air pollution levels in the atmosphere. We first convert the raw pollutant data to a standard scale, the AQI (Air Quality Index). We then carry out exploratory data analysis: seasonal pollution analysis, a correlation matrix to find the relationships between the various factors contributing to air pollution, box plots to summarize individual factors, and so on. Next, we build models for prediction and forecasting. Forecasting consists of fitting a model to historical data and using it to predict future observations. We build two models for this purpose: a multiple linear regression model and an ARIMA (Auto-Regressive Integrated Moving Average) model. We use the results of the exploratory analysis to identify the variables on which AQI depends, and build a multiple linear regression model on them to predict future values. The ARIMA model is used to forecast our pollution time series, with its parameters chosen using the auto-correlation function.
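
The forecasting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the ARIMA order, so the order (p=1, d=1, q=0) is an assumption, as are the function names and the sample AQI readings, which are made up for illustration. The sketch computes a sample auto-correlation, differences the series once, fits an AR(1) term to the differences by least squares, and integrates the predicted differences back into forecasts.

```python
# Minimal ARIMA(1,1,0) sketch in pure Python (standard library only).
# Assumptions: the order (1,1,0), all function names, and the sample
# AQI readings are hypothetical and do not come from the paper.


def autocorrelation(values, lag):
    """Sample auto-correlation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0
    cov = sum((values[t] - mean) * (values[t - lag] - mean)
              for t in range(lag, n))
    return cov / var


def difference(series):
    """First-order differencing: the 'I' (d=1) step of ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares estimate of (c, phi) in d[t] = c + phi * d[t-1] + e[t]."""
    x = diffs[:-1]
    y = diffs[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx if sxx else 0.0  # constant differences: no AR signal
    c = mean_y - phi * mean_x
    return c, phi


def forecast_arima_110(series, steps):
    """Forecast `steps` future values by predicting differences and summing."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    last_value = series[-1]
    last_diff = diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # predicted next difference
        last_value = last_value + last_diff  # undo the differencing
        forecasts.append(last_value)
    return forecasts


if __name__ == "__main__":
    # Hypothetical daily AQI readings, for illustration only.
    aqi = [112.0, 118.0, 121.0, 130.0, 128.0, 135.0, 142.0, 140.0, 149.0, 155.0]
    print("lag-1 auto-correlation of differences:",
          autocorrelation(difference(aqi), 1))
    print("3-step forecast:", forecast_arima_110(aqi, 3))
```

In practice the order would be chosen by inspecting auto-correlation plots of the real series, and a library implementation such as the ARIMA class in statsmodels would likely be preferable to this hand-rolled fit, which omits the moving-average term entirely.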