This paper is published in Volume-4, Issue-2, 2018
Area: Data Warehousing
Author: Kamod Kumar, Dr. Alok Ranjan Tripathy
Org/Univ: Vaishali Institute of Business and Rural Management, Muzaffarpur, Bihar, India
Pub. Date: 30 April, 2018
Paper ID: V4I2-2161
Publisher:
Keywords: Column-oriented database, Information, Customer, Column-by-column and row-by-row Storage.

Citations

IEEE
Kamod Kumar, Dr. Alok Ranjan Tripathy. Performance analysis of column-oriented database for data warehouse system, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.

APA
Kamod Kumar, Dr. Alok Ranjan Tripathy (2018). Performance analysis of column-oriented database for data warehouse system. International Journal of Advance Research, Ideas and Innovations in Technology, 4(2) www.IJARIIT.com.

MLA
Kamod Kumar, Dr. Alok Ranjan Tripathy. "Performance analysis of column-oriented database for data warehouse system." International Journal of Advance Research, Ideas and Innovations in Technology 4.2 (2018). www.IJARIIT.com.

Abstract

Column-oriented database systems, also known as column-stores, have attracted growing demand over the past few years. The basic idea is to store each database column separately, so that the values belonging to the same column are stored contiguously, compressed, and densely packed on disk. For queries that read only a few attributes, this layout allows records to be read faster than in traditional row-stores, in which rows are stored one after another on disk. Because data is stored in columnar form, these databases are well suited to data warehousing systems, where analysis must be done quickly. Indexes are also much faster in column-oriented databases, which results in faster data retrieval and hence faster data analysis. Column-stores are thus an alternative database technology to row-oriented database systems.

There are two approaches to mapping database tables onto a one-dimensional interface: store the table row-by-row or store the table column-by-column. The row-by-row approach keeps all information about an entity together. In a customer table, for example, it stores all information about the first customer, then all information about the second customer, and so on. The column-by-column approach keeps all values of an attribute together: all of the customer names are stored consecutively, then all of the customer addresses, and so on. Both approaches are reasonable designs, and the choice is typically made based on performance expectations. If the expected workload tends to access data at the granularity of an entity, then row-by-row storage is preferable, since all of the needed information is stored together. On the other hand, if the expected workload tends to read only a few attributes per query from many records, then column-by-column storage is preferable, since attributes irrelevant to a particular query do not have to be accessed.
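
The two mappings onto a one-dimensional interface can be illustrated with a minimal sketch. It is not taken from the paper: the customer table, its field names, and all function names below are hypothetical, and a plain Python list stands in for the disk. The sketch only shows where values land under each layout and which positions a query has to touch.

```python
# Hypothetical customer table; field names and values are illustrative only.
FIELDS = ["id", "name", "address"]
CUSTOMERS = [
    {"id": 1, "name": "Asha", "address": "Patna"},
    {"id": 2, "name": "Ravi", "address": "Muzaffarpur"},
    {"id": 3, "name": "Meera", "address": "Gaya"},
]


def to_row_layout(rows, fields):
    """Row-by-row: all attributes of record 0, then all of record 1, ..."""
    return [row[f] for row in rows for f in fields]


def to_column_layout(rows, fields):
    """Column-by-column: all values of field 0, then all of field 1, ..."""
    return [row[f] for f in fields for row in rows]


def column_from_row_layout(flat, fields, field, n_rows):
    """Reading one attribute from the row layout needs a strided walk
    that visits every record and skips its other attributes."""
    width = len(fields)
    offset = fields.index(field)
    return [flat[i * width + offset] for i in range(n_rows)]


def column_from_column_layout(flat, fields, field, n_rows):
    """Reading one attribute from the column layout is one contiguous slice."""
    start = fields.index(field) * n_rows
    return flat[start:start + n_rows]


def record_from_row_layout(flat, fields, i):
    """Reading one whole entity from the row layout is one contiguous slice."""
    width = len(fields)
    return flat[i * width:(i + 1) * width]


def record_from_column_layout(flat, fields, i, n_rows):
    """Reading one whole entity from the column layout needs one
    separate access per attribute."""
    return [flat[c * n_rows + i] for c in range(len(fields))]


row_flat = to_row_layout(CUSTOMERS, FIELDS)
col_flat = to_column_layout(CUSTOMERS, FIELDS)
n = len(CUSTOMERS)

names_from_rows = column_from_row_layout(row_flat, FIELDS, "name", n)
names_from_cols = column_from_column_layout(col_flat, FIELDS, "name", n)
second_from_rows = record_from_row_layout(row_flat, FIELDS, 1)
second_from_cols = record_from_column_layout(col_flat, FIELDS, 1, n)
```

Both layouts return the same answers; what differs is the access pattern. A query over all customer names is a single contiguous read in the column layout, while fetching one complete customer is a single contiguous read in the row layout, which mirrors the workload trade-off described in the abstract.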