0
7.2kviews
Design star schema and the max fact records for a Super market chain.
1 Answer
0
145views

Star Schema:

  • A star schema is generally considered to be the most efficient design for two reasons.

  • A design with de-normalized tables encounters fewer join operations.

  • most optimizers are smart enough to recognize a star schema and generate access plans that use efficient "star join" operations.

  • It has been established that a "standard templateā€ data warehouse query directly maps to a star schema.

  • The goal of a star schema design is to simplify the physical data model so that RDBMS optimizers can exploit advanced indexing and join techniques in a straightforward manner, as shown in Fig enter image description here

  • Each dimension table has a primary key on its Id column, relating to one of the columns (viewed as rows in the example schema) of the Fact_Sales table's three-column (compound) primary key (Date_Id, Store_Id, Product_Id).

  • The non-primary key Units_Sold column of the fact table in this example represents a measure or metric that can be used in calculations and analysis. The non-primary key columns of the dimension tables represent additional attributes of the dimensions (such as the Year of the Dim_Date dimension).

Fact tables:

  • Fact tables record measurements or metrics for a specific event. Fact tables generally consist of numeric values, and foreign keys to dimensional data where descriptive information is kept.

  • Fact tables are designed to a low level of uniform detail (referred to as "granularity" or "grain"), meaning facts can record events at a very atomic level.

  • This can result in the accumulation of a large number of records in a fact table over time. Fact tables are defined as one of three types:

  • Transaction fact tables record facts about a specific event (e.g., sales events)

  • Snapshot fact tables record facts at a given point in time (e.g., account details at month end)
  • Accumulating snapshot tables record aggregate facts at a given point in time (e.g., total month-to-date sales for a product)

  • Fact tables are generally assigned a surrogate key to ensure each row can be uniquely identified. This key is a simple primary key.

  • A Fact table is calculated based on "lowest level granualarity". The lowest level granualarity can be determined from the dimension table (lowest level in each hierarchy).

eg:- 5 years of historical data for 40,0000 products of 300 stores can be calculated as

  (5*365)*40,000*300
Please log in to add an answer.