This paper is published in Volume-3, Issue-2, 2017
Area
Data Mining
Author
Padmapriya .R, D. Maheswari
Org/Univ
RVS College of Arts and Science, Sulur, Tamil Nadu, India
Pub. Date
26 April, 2017
Paper ID
V3I2-1536
Publisher
Keywords
Web Mining, Data Preprocessing, Path Completion Algorithm, User Session Identification.

Citations

IEEE
Padmapriya .R, D. Maheswari. A Novel Technique for Path Completion in Web Usage Mining, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.

APA
Padmapriya .R, D. Maheswari (2017). A Novel Technique for Path Completion in Web Usage Mining. International Journal of Advance Research, Ideas and Innovations in Technology, 3(2) www.IJARIIT.com.

MLA
Padmapriya .R, D. Maheswari. "A Novel Technique for Path Completion in Web Usage Mining." International Journal of Advance Research, Ideas and Innovations in Technology 3.2 (2017). www.IJARIIT.com.

Abstract

The World Wide Web is a huge repository of web pages and links. The field of Web mining encompasses a wide array of issues, primarily aimed at deriving actionable knowledge from the Web, and draws researchers from information retrieval, database technologies, and artificial intelligence. The Web is growing tremendously, with approximately one million pages added daily. Users’ accesses are recorded in web logs. Most data used for mining is collected from Web servers, clients, proxy servers, or server databases, all of which generate noisy data. Because Web mining is sensitive to noise, data cleaning methods are necessary. Web usage mining consists of three phases: preprocessing, pattern discovery, and pattern analysis. Because web log data is usually noisy and ambiguous, preprocessing is an important step in web usage mining. Data preprocessing includes data cleaning, user identification, session identification, and path completion. The inexact data in web access logs is mainly caused by local caching and proxy servers, which improve performance and minimize network traffic but keep some page requests from ever reaching the server. The proposed method uses a path completion algorithm to preprocess the data. The proposed path completion algorithm efficiently appends the lost information and improves the consistency of access data for further web usage mining calculations.
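
To make the idea of path completion concrete, the following is a minimal sketch of one common referrer-based heuristic. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes each logged request in a session carries the requested page and its referrer, and that a referrer which differs from the previously logged page means the user went back through locally cached pages. The function name `complete_path` and the `(page, referrer)` tuple layout are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# One logged request: (requested page, referrer page or None if absent).
# This layout is an assumption made for illustration only.
Request = Tuple[str, Optional[str]]


def complete_path(session: List[Request]) -> List[str]:
    """Return the session's page sequence with inferred back-steps re-inserted."""
    path: List[str] = []
    for page, referrer in session:
        needs_completion = (
            bool(path)
            and referrer is not None
            and referrer != path[-1]   # referrer is not the page just visited
            and referrer in path       # but it was visited earlier in the session
        )
        if needs_completion:
            # Index of the most recent visit to the referrer.
            idx = len(path) - 1 - path[::-1].index(referrer)
            # Pages the user presumably stepped back through (served from
            # cache, so never logged), from the previous page to the referrer.
            backtrack = path[idx:len(path) - 1][::-1]
            path.extend(backtrack)
        path.append(page)
    return path


if __name__ == "__main__":
    logged = [("A", None), ("B", "A"), ("C", "B"), ("D", "A")]
    print(complete_path(logged))
```

For the sample session, the log records A, B, C, D, but the request for D names A as its referrer. The sketch therefore infers that the user returned from C through B to A before opening D, and yields A, B, C, B, A, D. A referrer that never appears earlier in the session is left alone here; a fuller treatment would need the site's link structure to decide what to insert.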