Building a Hybrid Data Warehouse Model








Article Posted On Date : Friday, May 22, 2009


As suggested by this reference implementation, in some cases blending the relational and dimensional models may be the right approach to data warehouse design.

Published April 2007

Relational and dimensional modeling are often used separately, but they can be successfully incorporated into a single design when needed. Doing so starts with a normalized relational model and then adds dimensional constructs, primarily at the physical level. The result is a single model that can provide the strengths of its parent models fairly well: it represents entities and relationships with the precision of the traditional relational model, and it processes dimensionally filtered, fact-aggregated queries with speed approaching that of the traditional dimensional model.

Real-world experience was the motivation for this analysis: on three separate data warehousing projects where I worked as programmer, architect, and manager, respectively, I found a consistent pattern of data/database behavior that lent itself far more to a hybrid combination of dimensional and relational modeling than to either one alone.

This article discusses the hybrid design and provides a fully functional reference implementation. The system runs on Oracle Database 10g. It contains all code needed to build the database schemas, generate sample data, load it into the schemas, build the indexes and materialized views, run the sample queries, capture the runtimes, and provide statistics on the runtimes.

The hybrid model is not a one-size-fits-all solution. Many projects are best served by either using only one of the traditional models or using both models separately with a feed between them. But if the objective is to create a single database that can both store data in its properly normalized form and run aggregation queries with good performance, the hybrid model is a design pattern to consider.

Sample Business Domain

The sample business domain is in the insurance industry and uses the following entities:

Entity     Description
ACCOUNT    Information about a customer and its activities with the insurance company
POLICY     An insurance contract representing a specific agreement with the customer
VEHICLE    A vehicle belonging to the customer and covered by a policy
COVERAGE   The kinds of losses that are covered for a vehicle on this policy
PREMIUM    A monthly payment from the customer for coverage on vehicles in this policy

The sample business questions used to analyze the performance of the system have some parallel with reality but also cover extremes of behavior: scanning the fact table for many rows, retrieving a tiny percentage of fact rows, restricting to only the top table, restricting to every table, restricting to only the lower tables, and so on. They are the kinds of questions business users ask of dimensional models, not the kinds of questions that are typically asked of relational models. The relational model questions are not addressed, because it is assumed that the relational model will outperform the dimensional model for questions of a relational nature, such as "Show me all the vehicles on this policy." The questions used in this analysis are the following:

ID# Business Questions of a Dimensional Nature
1 What was the total premium collected by year as far back as we can go?
2 What was the premium collected in the New England states in 2002?
3 How much premium did we get for medium catastrophe risks in Connecticut as far back as we can go?
4 How much premium did we get for time-managed plan types in California in 2001?
5 How many passenger cars had collision coverage in November 2003?
6 What was the premium for red vehicles in Vermont with primary usage that had a $1,000 deductible? Break the numbers down by per-person and accident limits.
7 What was the premium for coverages with a $1,000 deductible, a $100,000 per-person limit, and an $800,000 accident limit in 2000?
8 What was the monthly premium in 1999 for red cars with 750cc engines?

Models

The three models are presented in Figures 1, 2, and 3. The hybrid model is based on the relational model, with two changes that derive from dimensional modeling practices: (1) Create a relationship from the PREMIUM table to each table in the upper portion of the hierarchy, and (2) Add the time dimension.


Figure 1. Relational model


 


Figure 2. Dimensional model


 


Figure 3. Hybrid model
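
The two hybrid changes can be sketched in a few lines. The example below is a minimal illustration only, using Python's built-in sqlite3 module rather than Oracle. The hierarchy ACCOUNT > POLICY > VEHICLE > COVERAGE > PREMIUM, every column name, and the placement of attributes such as state and color are assumptions made for the sketch; they are not taken from the reference implementation.

```python
import sqlite3

# Hypothetical, trimmed-down hybrid schema. Table names follow the article;
# all column names and the exact hierarchy are assumptions for illustration.
DDL = """
CREATE TABLE account  (account_id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE policy   (policy_id INTEGER PRIMARY KEY,
                       account_id INTEGER REFERENCES account(account_id));
CREATE TABLE vehicle  (vehicle_id INTEGER PRIMARY KEY,
                       policy_id INTEGER REFERENCES policy(policy_id),
                       color TEXT);
CREATE TABLE coverage (coverage_id INTEGER PRIMARY KEY,
                       vehicle_id INTEGER REFERENCES vehicle(vehicle_id),
                       deductible INTEGER);
CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- coverage_id is the normal relational parent link. The hybrid adds
-- (1) direct keys to every table higher in the hierarchy and (2) a time key.
CREATE TABLE premium  (premium_id INTEGER PRIMARY KEY,
                       coverage_id INTEGER REFERENCES coverage(coverage_id),
                       vehicle_id  INTEGER REFERENCES vehicle(vehicle_id),
                       policy_id   INTEGER REFERENCES policy(policy_id),
                       account_id  INTEGER REFERENCES account(account_id),
                       time_id     INTEGER REFERENCES time_dim(time_id),
                       amount REAL);
"""

def build_demo():
    """Create an in-memory database with a handful of made-up rows."""
    con = sqlite3.connect(":memory:")
    con.executescript(DDL)
    con.executemany("INSERT INTO account VALUES (?, ?)", [(1, "CT"), (2, "VT")])
    con.executemany("INSERT INTO policy VALUES (?, ?)", [(10, 1), (20, 2)])
    con.executemany("INSERT INTO vehicle VALUES (?, ?, ?)",
                    [(100, 10, "red"), (200, 20, "blue")])
    con.executemany("INSERT INTO coverage VALUES (?, ?, ?)",
                    [(1000, 100, 1000), (2000, 200, 500)])
    con.executemany("INSERT INTO time_dim VALUES (?, ?, ?)",
                    [(1, 2002, 1), (2, 2002, 2), (3, 2003, 1)])
    con.executemany("INSERT INTO premium VALUES (?, ?, ?, ?, ?, ?, ?)", [
        (1, 1000, 100, 10, 1, 1, 50.0),
        (2, 1000, 100, 10, 1, 2, 50.0),
        (3, 1000, 100, 10, 1, 3, 55.0),
        (4, 2000, 200, 20, 2, 1, 40.0),
    ])
    return con

def premium_via_chain(con, state):
    """Relational path: walk every level of the hierarchy to reach ACCOUNT."""
    sql = """
        SELECT SUM(p.amount)
        FROM premium p
        JOIN coverage c ON c.coverage_id = p.coverage_id
        JOIN vehicle  v ON v.vehicle_id  = c.vehicle_id
        JOIN policy  po ON po.policy_id  = v.policy_id
        JOIN account  a ON a.account_id  = po.account_id
        WHERE a.state = ?"""
    return con.execute(sql, (state,)).fetchone()[0]

def premium_via_direct_key(con, state):
    """Hybrid path: PREMIUM joins straight to ACCOUNT, as a fact table would."""
    sql = """
        SELECT SUM(p.amount)
        FROM premium p
        JOIN account a ON a.account_id = p.account_id
        WHERE a.state = ?"""
    return con.execute(sql, (state,)).fetchone()[0]

def premium_by_state_and_year(con, state, year):
    """Hybrid path with the added time dimension as a second filter."""
    sql = """
        SELECT SUM(p.amount)
        FROM premium p
        JOIN account  a ON a.account_id = p.account_id
        JOIN time_dim t ON t.time_id    = p.time_id
        WHERE a.state = ? AND t.year = ?"""
    return con.execute(sql, (state, year)).fetchone()[0]

if __name__ == "__main__":
    con = build_demo()
    print(premium_via_chain(con, "CT"), premium_via_direct_key(con, "CT"))
    print(premium_by_state_and_year(con, "CT", 2002))
```

Both paths return the same total for a state, but the hybrid path needs one join per filtered table instead of a walk down the whole hierarchy, which is the shape a star join expects.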

Implementation

Largely standard techniques were used to convert the models into their physical implementation in database schemas. The relational schema was created with normalized modeling techniques, and the dimensional schema was built according to Ralph Kimball's work. Creating the hybrid meant copying the relational schema and then layering the dimensional constructs on top of it. (The "File Descriptions" sidebar lists the most important files in the implementation, including the files with the DDL, the system validation, the queries, and the automated analysis used to generate the sample code.)

Because only three nonkey attributes are used, a SIZING attribute is added to each table, with a type of CHAR(100) to make the row size more realistic.

Certain database parameters must be set so that star joins will occur and materialized views will be used. The important parameters are shown here:

NAME                           VALUE
------------------------------ --------------------
compatible                     10.2.0.1.0
optimizer_features_enable      10.2.0.1
optimizer_mode                 first_rows
pga_aggregate_target           83886080
query_rewrite_enabled          true
query_rewrite_integrity        stale_tolerated
sga_target                     167772160
star_transformation_enabled    true

Verifying that a star join is occurring is done with EXPLAIN PLAN, as detailed in the Oracle documentation.

All three schemas were loaded with the same data. The best evidence of consistent data loading is that all three schemas produce the same answers for the sample queries.
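
A consistency check of this kind can be sketched as follows. This is an illustrative stand-in and not the validation script from the reference implementation: it uses Python's sqlite3 module with two tiny in-memory databases, and the column names (paid_year, amount) and sample rows are invented for the example.

```python
import sqlite3

def make_db(script):
    """Create an in-memory SQLite database from a DDL/DML script."""
    con = sqlite3.connect(":memory:")
    con.executescript(script)
    return con

# Hypothetical relational-style schema: the year is stored on the premium row.
REL = make_db("""
CREATE TABLE premium (premium_id INTEGER PRIMARY KEY, paid_year INTEGER, amount INTEGER);
INSERT INTO premium VALUES (1, 2001, 40), (2, 2002, 50), (3, 2002, 60);
""")

# Hypothetical dimensional-style schema: the year lives in a time dimension.
DIM = make_db("""
CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE premium_fact (time_id INTEGER, amount INTEGER);
INSERT INTO time_dim VALUES (1, 2001), (2, 2002);
INSERT INTO premium_fact VALUES (1, 40), (2, 50), (2, 60);
""")

# The same business question (total premium by year) phrased for each schema.
QUESTION_1 = {
    "REL": """SELECT paid_year, SUM(amount) FROM premium
              GROUP BY paid_year ORDER BY paid_year""",
    "DIM": """SELECT t.year, SUM(f.amount)
              FROM premium_fact f JOIN time_dim t ON t.time_id = f.time_id
              GROUP BY t.year ORDER BY t.year""",
}

def mismatches(connections, question_sql):
    """Return the schema names whose answer differs from the first schema's."""
    results = {name: connections[name].execute(sql).fetchall()
               for name, sql in question_sql.items()}
    baseline = next(iter(results.values()))
    return [name for name, rows in results.items() if rows != baseline]

if __name__ == "__main__":
    print(mismatches({"REL": REL, "DIM": DIM}, QUESTION_1))
```

An empty list means every schema returned the same rows for the question; any name in the list points at a schema whose load or query needs a closer look.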

The volume of data used for the analysis is shown below.

OWNER  TABLE_NAME     NUM_ROWS AVG_ROW_LEN LAST_ANALYZED
------ ------------ ---------- ----------- -------------------
DIM    ACCOUNT_DIM        2000         128 2006-01-14:19-51-56
       COVERAGE_DIM        900          17 2006-01-14:19-51-57
       POLICY_DIM         6000         128 2006-01-14:19-51-58
       PREMIUM_FACT    1371183          23 2006-01-14:19-52-14
       TIME_DIM           3600          21 2006-01-14:19-52-39
       VEHICLE_DIM       24000         130 2006-01-14:19-52-39
HYB    ACCOUNT            2000         128 2006-01-14:19-53-42
       COVERAGE         144000          28 2006-01-14:19-53-47
       POLICY             6000         142 2006-01-14:19-53-53
       PREMIUM         1373463          49 2006-01-14:19-54-41
       TIME_DIM           3600          21 2006-01-14:19-55-08
       VEHICLE           24000         144 2006-01-14:19-55-10
REL    ACCOUNT            2000         124 2006-01-14:19-39-22
       COVERAGE         144288          27 2006-01-14:19-39-30
       POLICY             6000         138 2006-01-14:19-39-31
       PREMIUM         1389963          29 2006-01-14:19-40-08
       VEHICLE           24000         139 2006-01-14:19-40-13

The goal was to provide a sufficiently large volume to prevent the optimizer from taking shortcuts, such as reading entire tables instead of using indexes and other such optimization techniques that would undermine the analysis. According to Oracle Database Data Warehousing Guide 10g Release 2 (10.2), Schema Modeling Techniques, a star transformation might not occur if the optimizer finds "tables that are too small for the transformation to be worthwhile."

A fairly arbitrary goal of the implementation was to have at least 1 million rows in the fact table. Given that all dimensional and hybrid query plans generated by QUERIES.SQL meet the criteria of star joins, the data volume used appears to be sufficient for the current analysis.

The number of COVERAGE_DIM rows in the dimensional schema is much smaller than the number of COVERAGE rows in the other two schemas (900 versus roughly 144,000) because of the way a weak entity has to be represented in the dimensional schema.
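
The article does not spell out the mechanism, but one plausible reading is that the dimension keeps only the distinct combinations of coverage attributes, while the relational table keeps one row per vehicle coverage. The Python sketch below illustrates that reading only; the attribute names, sample rows, and the helper build_coverage_dim are all hypothetical.

```python
# Relational-style COVERAGE rows: one row per vehicle coverage (made-up data).
# Layout: (vehicle_id, deductible, per_person_limit, accident_limit)
coverages = [
    (100, 1000, 100_000, 800_000),
    (200, 1000, 100_000, 800_000),
    (300, 500, 50_000, 300_000),
    (400, 1000, 100_000, 800_000),
]

def build_coverage_dim(rows):
    """Collapse coverage rows into distinct attribute combinations.

    Returns (dim, fact_keys): dim maps each attribute tuple to a surrogate
    key, and fact_keys pairs each vehicle with the surrogate key it uses.
    """
    dim = {}
    fact_keys = []
    for vehicle_id, *attrs in rows:
        # Reuse the existing surrogate key, or assign the next one.
        key = dim.setdefault(tuple(attrs), len(dim) + 1)
        fact_keys.append((vehicle_id, key))
    return dim, fact_keys

if __name__ == "__main__":
    dim, fact_keys = build_coverage_dim(coverages)
    print(len(coverages), "coverage rows ->", len(dim), "dimension rows")
```

Under this reading, the link between a coverage and its vehicle moves out of the dimension and into the fact row, which carries both keys.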

Here is the amount of space consumed by the various schemas:

OWNER           TOTAL_SIZE
--------------- ----------------
DIM                  129,499,136
HYB                  244,056,064
REL                  130,023,424

Because the hybrid schema is a combination of the relational and the dimensional, it follows that it should be roughly the size of both, minus any common elements, and the numbers bear this out.
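
The arithmetic behind that statement can be checked directly from the figures in the table. In the short Python sketch below, the variable names are illustrative, and reading the difference as the "common elements" is an interpretation of the article's reasoning rather than a measured quantity.

```python
# TOTAL_SIZE values as reported in the table above.
dim_size = 129_499_136
hyb_size = 244_056_064
rel_size = 130_023_424

# Size if the hybrid were simply the relational and dimensional schemas added together.
combined = dim_size + rel_size

# Difference between that sum and the actual hybrid size; under the article's
# reasoning this is what the two parent schemas have in common.
implied_overlap = combined - hyb_size

# Hybrid size as a fraction of the simple sum.
ratio = hyb_size / combined

if __name__ == "__main__":
    print(f"combined:        {combined:,}")
    print(f"implied overlap: {implied_overlap:,}")
    print(f"hybrid/combined: {ratio:.3f}")
```

The hybrid comes out at about 94 percent of the simple sum, with an implied overlap of 15,466,496.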





