Advertisements
HTML clipboardAccelerating Data Warehouses
Oracle OLAP 11g brings high-performance data warehouse features to Oracle Database 11g. An option of Oracle Database 11g Enterprise Edition, Oracle OLAP 11g is a full-featured online analytical processing engine. Because Oracle OLAP 11g is embedded in Oracle Database, it benefits from database scalability, security, and manageability features.
Key new features in Oracle OLAP 11g include database-managed relational views of a cube, a cube scan row source that is used by the SQL optimizer, and cube-organized materialized views. This article focuses on enhancements in Oracle OLAP 11g, how OLAP cubes fit into the Oracle data warehouse, and how they are used within SQL-based business intelligence applications to provide enriched content with high performance.
Why OLAP?
Users' queries are often unpredictable. On different days, the same users will perform trend analysis, drill down on specific product lines, and compare a week's sales against those of the same week last year. With standard relational systems, it is difficult to optimize data structures that provide consistently good query performance for such an unpredictable query pattern.
To address this need, DBAs and designers frequently create a system of summary tables or materialized views. OLAP cubes, which provide consistently fast query performance across an entire data model, often provide a better alternative to summary management. Sophisticated calculations can be easily embedded within the cube to enhance the analytic content of applications.
These calculations often rely on data from many rows and interrow calculations. For example, an OLAP cube might include a calculation that compares the current year's sales for each region and product line with those from the same period last year and two years ago. The cube structure is optimized to handle this kind of querying and analysis.
Why Oracle OLAP?
Oracle OLAP uses an analytic workspace in the database to perform OLAP analysis. Oracle OLAP stores data in the database as multidimensional cubes, which are designed for fast incremental update and query. Cubes are organized by dimensions, which act as keys to the fact data and define the basic structure of the cube. In many ways, a cube is similar to a star schema. The cube plays the role of the fact table, and an OLAP dimension plays the role of a dimension table. Dimensions can be simple lists of members, or they can be organized into levels and hierarchies. Hierarchical dimensions enable data to be aggregated from lower levels to higher levels of summarization. They support navigation such as drill-down and certain types of calculations such as Share to Parent, Share within Ancestor, and Rank within Parent. They also support many time-series calculations such as Year to Date. These types of calculations are easy to define within the analytic workspace manager (the administrative tool of Oracle OLAP 11g) and are efficiently computed within the cube at runtime.
Oracle OLAP 11g can significantly shorten query processing times for users of SQL-based business intelligence (BI) tools such as Oracle Business Intelligence Suite Enterprise Edition and other third-party tools. Oracle OLAP 11g also makes it easy to embed computations in an OLAP cube. Oracle OLAP 11g capabilities are all provided within Oracle Database, which allows for centralized control of data, business rules, and security.
Creating Cube-Organized Materialized Views for Summary Management
Cube-organized materialized views are a new capability in Oracle OLAP 11g that enables cubes to be used as a summary management solution, often replacing relational summary tables and table-based materialized views. In this scenario, an application queries the fact table by using aggregation functions (such as SUM) and a GROUP BY clause, and the query rewrite feature of the database automatically redirects the query to the cube-organized materialized view. The cube manages summary data transparently to the querying application, and users benefit from improved query performance.
To create a cube-organized materialized view, DBAs build a cube from the fact and dimension tables that applications currently query. They then enable the cube to allow materialized view refresh and query rewrite. The materialized view can be created via an API call or, more commonly, via certain options in the analytic workspace manager, as shown in Figure 1. The cube is self-indexing and manages the entire aggregate space.
|
Figure 1: Analytic workspace manager |
Build the GLOBAL sample cube. To create and use the GLOBAL sample OLAP cube and walk through the Oracle OLAP 11g capabilities this article describes, install and set up the products, schema, and workspace as follows:
1. Install the Oracle Database 11g database server and client. Note that the server and client are separate downloads, both available on Oracle Technology Network at otn.oracle.com. From this page, select Downloads -> Database, and then click the See All link next to the name of your operating system in the Oracle Database 11g download section. When installing the database server, you can use default options. When installing the client, make sure you select the Administrator option to install the extended set of tools, which includes the analytic workspace manager.
2. Download the GLOBAL schema (global_11g_schema.zip) from otn.oracle.com/products/bi/olap. From this page, click the 11.1 link next to the Global Schema selection in the Downloads section of this page. Unzip the file contents to a directory.
3. Download otn.oracle.com/oramag/oracle/08-may/o38olap.zip, and unzip the contents, the global_oramag.xml file, to a directory.
4. Follow the directions specified in the global_11g_readme.html documentation file, which is provided in the GLOBAL schema zip file downloaded in step 2. These directions enable you to create the GLOBAL schema with the tables necessary to build an analytic workspace.
5. Launch the analytic workspace manager. From the Windows Start menu, you can find it at {Oracle11g Client Home} -> Integrated Management Tools -> Oracle OLAP Analytic Workspace Manager and Worksheet.
6. Create a connection to your instance, by right-clicking Databases and selecting Add Database to tree. . . . Enter a description for the connection and the appropriate connection string for your instance (such as localhost:1521:orcl11). Then click Create.
7. Expand the Database tree until your new connection appears, and then click + (plus sign) next to your connection to connect to the instance. Enter GLOBAL for the username, and enter the password you assigned in step 4.
8. Expand the Schemas tree until you see Analytic Workspaces. Right-click Analytic Workspaces, and select Create Analytic Workspace from Template. . . .
9. Navigate the file chooser to find the global_oramag.xml template file you downloaded in step 3. Choose this file, and click Create. Wait for a few minutes while it creates the analytic workspace.
10. Right-click the new GLOBAL analytic workspace, and select Maintain Analytic Workspace GLOBAL.
11. Click >> (Add All), and then click Finish. This step loads data into the cube and aggregates it according to the settings specified in the template file. It may take a few minutes to complete. Once you're done, click Close to close the build log.
Run the OLAP query. With the GLOBAL cube loaded and the data aggregated, now enter the following�
SELECT t.calendar_year_id time, p.class_id product, c.region_id region, SUM(f.sales) sales FROM time_dim t, product_dim p, customer_dim c, units_fact f WHERE t.month_id = f.month_id AND p.item_id = f.item_id AND c.ship_to_id = f.ship_to_id AND c.region_id = 'AMER' GROUP BY t.calendar_year_id, p.class_id, c.region_id ORDER BY t.calendar_year_id, p.class_id;
�using Oracle SQL Developer (or another SQL tool of choice). This typical OLAP query summarizes the data in the UNITS_FACT fact table by year, class, and region for only those rows that have a REGION_ID value of AMER.
If there were no summary tables, this query would have to sift through the CUSTOMER_DIM (dimension) table to find all rows with REGION_ID='AMER' and then find rows in the fact table with those SHIP_TO_ID values. Finally, it would have to summarize the SALES column in the fact table, grouping the data by year, class, and region.
With Oracle OLAP 11g, using materialized views, these totals are immediately accessible. It simply fetches the rows of summarized data and returns them to the SQL processor. The result is much faster query response, with a simple summary management system that is easy to maintain.
If you run an explain plan against the query, you'll see that the Oracle optimizer uses the CUBE_SCAN operation instead of the much less efficient HASH JOIN operation, which is required when you don't have the materialized view. To see the difference, execute the commands alter materialized view cb$units_cube disable query rewrite and alter materialized view cb$units_cube enable query rewrite and view the resulting explain plans in each case. You will also see a huge difference in performance. In my tests on the small (300,000 lowest-level rows) GLOBAL fact table, I found that with the query rewrite feature enabled (using the cube), the query returned results two to five times as fast as without the cube. You will find larger differences in performance when using larger, more-realistically sized cubes.
The summary management system is also easy to maintain. It has one materialized view, rather than a separate materialized view for each combination of levels of each dimension. You can control the extent to which the data is precomputed, but the cube always presents itself as fully solved for every combination of the levels of the dimensions involved. A materialized view log keeps track of the rows that change in the base table, and it controls how much of the cube is incrementally updated and aggregated and when.