The use of alternative data models in data warehousing environments
Abstract
Data Warehouses are increasing their data volume at an accelerated rate. High disk
space consumption, slow query response times and complex database administration are
common problems in these environments. The lack of a proper data model and of an
adequate architecture specifically targeted at these environments is the root
cause of these problems.
Stored data is managed inefficiently: values are duplicated at column level, and data
sparsity, which derives from low data density, is poorly handled. Both affect
the final size of Data Warehouses. It has been demonstrated that the Relational Model
and Relational technology are not the best techniques for managing duplicates and data
sparsity.
The novelty of this research lies in comparing several data models with respect to
their data density and their management of data sparsity, in order to optimise Data
Warehouse environments. The Binary-Relational, the Associative/Triple Store and the
Transrelational models have been investigated and, based on the research results, a
novel Alternative Data Warehouse Reference architectural configuration has been defined.
For the Transrelational model, no database implementation existed. Therefore, it was
necessary to develop an instantiation of its storage mechanism and, as far as could be
determined, this is the first instantiation of the storage mechanism for the
Transrelational model available in the public domain.