This paper is published in Volume-3, Issue-2, 2017
Area
Text Mining
Author
Meda Sai Kheerthana, Sushmitha K. S, Geethika .R
Org/Univ
R.V College of Engineering, India
Keywords
Document Retrieval, Text Mining, Personalization, Tf-Idf, Cosine Similarity, Personalized Search.
Citations
IEEE
Meda Sai Kheerthana, Sushmitha K. S, Geethika .R. Personalized Document Retrieval Using Text Mining, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.
APA
Meda Sai Kheerthana, Sushmitha K. S, Geethika .R (2017). Personalized Document Retrieval Using Text Mining. International Journal of Advance Research, Ideas and Innovations in Technology, 3(2) www.IJARIIT.com.
MLA
Meda Sai Kheerthana, Sushmitha K. S, Geethika .R. "Personalized Document Retrieval Using Text Mining." International Journal of Advance Research, Ideas and Innovations in Technology 3.2 (2017). www.IJARIIT.com.
Abstract
The data produced in the last two years has outweighed all the data existing up until then. Therefore, this information needs to be organized and classified so that its retrieval is relevant and smooth. This project employs text mining and machine learning techniques to address the problem. It enables a user to upload documents and to search for them. A graphical user interface is developed so that the user can upload a document and type a search query. The documents are stored in a database. A Naïve Bayesian classification algorithm is used to classify the uploaded documents into their respective categories. A novel algorithm based on tf-idf and cosine similarity is developed and used to search the database and retrieve documents relevant to the user's query, taking the user's personal interests into account.
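The abstract does not spell out the retrieval algorithm, so the following is only a minimal sketch of how tf-idf, cosine similarity, and a user's interests could be combined. The function names, the smoothed idf variant, the `interests` mapping from category to weight, and the `boost` factor are all assumptions made for illustration; they are not the authors' method.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a real system would also strip punctuation)."""
    return text.lower().split()


def build_idf(docs_tokens):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse tf-idf vector (term -> weight); terms unseen in the corpus are dropped."""
    tf = Counter(tokens)
    total = len(tokens) or 1
    return {t: (c / total) * idf[t] for t, c in tf.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, interests, boost=0.5):
    """Rank documents for a query.

    docs: list of (title, category, text) tuples; the category is assumed to
    come from a classifier that ran when the document was uploaded.
    interests: hypothetical mapping of category -> weight in [0, 1].
    Assumed personalization rule: score = cosine * (1 + boost * interest).
    """
    docs_tokens = [tokenize(text) for _, _, text in docs]
    idf = build_idf(docs_tokens)
    query_vec = tfidf_vector(tokenize(query), idf)
    ranked = []
    for (title, category, _), tokens in zip(docs, docs_tokens):
        similarity = cosine(query_vec, tfidf_vector(tokens, idf))
        if similarity > 0.0:  # skip documents that share no term with the query
            score = similarity * (1.0 + boost * interests.get(category, 0.0))
            ranked.append((score, title))
    return sorted(ranked, reverse=True)
```

With this rule, two documents that match a query equally well are separated by the user's interests: for the query "python", a user whose profile favors a programming category would see a programming document ranked above an equally similar document about snakes. A multiplicative boost keeps documents with zero similarity out of the results, which an additive boost would not.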