Bloom Filter

Bloom Filter Efficiency And Cardinality Estimates

I recently came across an interesting observation that I have not seen documented yet, so I'm publishing a simple example here to demonstrate the issue.

In principle, it looks like the efficiency of Bloom Filter operations depends on the optimizer's cardinality estimates. In particular, cardinality under-estimates can make a dramatic difference in how efficiently a Bloom Filter operation based on such an estimate works at runtime. Since Bloom Filters are crucial for efficient processing, particularly when using Exadata or the In-Memory column store, this can have a significant impact on the performance of affected operations.
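To illustrate why an under-estimate could hurt, here is a minimal sketch using the textbook Bloom filter formulas. It assumes the filter is sized from the estimated number of keys, which is an assumption for illustration only; how Oracle sizes its Bloom filters internally is not described here, and the row counts and target rate below are hypothetical.

```python
import math

def bits_for(n_keys: int, target_fpr: float) -> int:
    """Textbook optimal filter size: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-n_keys * math.log(target_fpr) / (math.log(2) ** 2))

def optimal_hashes(m_bits: int, n_keys: int) -> int:
    """Textbook optimal number of hash functions: k = (m / n) * ln 2."""
    return max(1, round(m_bits / n_keys * math.log(2)))

def false_positive_rate(n_keys: int, m_bits: int, k_hashes: int) -> float:
    """Textbook approximation: p = (1 - exp(-k * n / m)) ** k."""
    return (1.0 - math.exp(-k_hashes * n_keys / m_bits)) ** k_hashes

# Hypothetical numbers: the estimate is 100 times too low.
estimated_rows = 1_000
actual_rows = 100_000

# Size the filter for the ESTIMATED cardinality and a 1% target rate.
m = bits_for(estimated_rows, 0.01)
k = optimal_hashes(m, estimated_rows)

fpr_if_estimate_correct = false_positive_rate(estimated_rows, m, k)
fpr_with_actual_rows = false_positive_rate(actual_rows, m, k)

print(f"bits={m}, hashes={k}")
print(f"false positive rate, estimate correct: {fpr_if_estimate_correct:.4f}")
print(f"false positive rate, actual rows:      {fpr_with_actual_rows:.4f}")
```

Under these assumptions the filter built for the estimate becomes almost fully saturated once the real number of keys is inserted, so it lets nearly every probed row through and filters very little.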

Data Warehouse Design: To Index, or Not to Index, that is the question

This post is part of a series that discusses some common issues in data warehouses.

When you query a star schema, you essentially have two choices:

  • bitmap index and star transformation 
  • full scan, bloom filter, and hash join
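As a rough illustration of the second option, the sketch below builds a small Bloom filter from the join keys of a filtered dimension, uses it to discard fact rows during the scan, and then hash-joins the survivors. It is a generic, simplified model and not Oracle's implementation; the `BloomFilter` class, the filter size, and the sample data are all hypothetical.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter; sizes and hashing scheme are illustrative only."""

    def __init__(self, m_bits: int, k_hashes: int):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions by salting a SHA-256 hash with the index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

# Hypothetical data: a filtered date dimension (5 keys) and a fact table.
dim = {d: f"2020-01-{d:02d}" for d in range(1, 6)}   # date_key -> date
fact = [(i % 100, i) for i in range(1000)]           # (date_key, amount)

# Build side: populate the filter from the dimension's join keys.
bf = BloomFilter(m_bits=256, k_hashes=3)
for key in dim:
    bf.add(key)

# Probe side: the full scan drops rows the filter rules out.
survivors = [row for row in fact if bf.might_contain(row[0])]

# The hash join removes any false positives that slipped through.
joined = [(dim[k], amount) for k, amount in survivors if k in dim]
expected = [(dim[k], amount) for k, amount in fact if k in dim]

print(len(fact), len(survivors), len(joined))
```

The filter never drops a row that would join, so the result matches a plain hash join; the benefit is that most non-matching rows are eliminated before they reach the join.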

Star Transformation 

Star transformation was introduced in Oracle 8 (see also Oracle Optimizer Blog: Optimizer Transformations: Star Transformation). A star transformation requires:

Reading Parallel Execution Plans With Bloom Pruning And Composite Partitioning

You’ve probably heard sayings like “sometimes things aren’t always what they seem” and “people lie”. Well, sometimes execution plans lie. It’s not really by intent, but it is sometimes difficult (or impossible) to represent everything in a query execution tree in a nice tabular format like the one dbms_xplan gives. One of the optimizations introduced back in 10gR2 was the use of bloom filters. Bloom filters can be used in two ways: 1) for filtering or 2) for partition pruning (bloom pruning) starting with 11g. Frequently the data models used in data warehousing are dimensional models (star or snowflake), and most Oracle warehouses use simple range (or interval) partitioning on the fact table date key column, as that is the filter that yields the largest I/O reduction from partition pruning (most queries in a time series star schema include a time window, right?). As a result, it is imperative that the join between the date dimension and the fact table results in partition pruning. Let’s consider a basic two table join between a date dimension and a fact table. For these examples I’m using STORE_SALES and DATE_DIM, which are TPC-DS tables (I frequently use TPC-DS for experiments as it uses a [...]