Star and Snowflake Schema in Data Warehousing

The star schema and the snowflake schema are two different ways of organizing a data warehouse. Both use dimension tables that describe the information contained in a fact table.

Types of Data Warehouse Schema:
  • Star Schema
  • Snowflake Schema

What is a Star Schema?

In the star schema design, the fact table sits in the middle and is connected to the surrounding dimension lookup tables, so the layout resembles a star. Each dimension is represented by a single table. The primary key of each dimension table is referenced by a foreign key in the fact table.

All measures in the fact table are related to all the dimensions that the fact table is related to. In other words, they all have the same level of granularity. A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table.
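The key relationship described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumed names: the tables `dim_date`, `dim_product`, and `fact_sales`, their columns, and the sample rows are hypothetical and do not come from the tutorial's figure. Each fact row carries one foreign key per dimension, and both measures share the same grain (one row per product per date).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical dimension tables: one table per dimension, each with a primary key.
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Hypothetical fact table: foreign keys to every dimension plus the measures.
-- Grain: one row per product per date, shared by both measures.
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    units_sold INTEGER,
    revenue    REAL
);

INSERT INTO dim_date    VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_product VALUES (10, 'Laptop', 'Computers'), (11, 'Mouse', 'Accessories');
INSERT INTO fact_sales  VALUES (1, 10, 2, 2000.0), (1, 11, 5, 100.0), (2, 10, 1, 1000.0);
""")

# A single join from the fact table to one dimension is enough to
# aggregate every measure by any attribute of that dimension.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.units_sold), SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(revenue_by_category)
```

Because both measures are recorded at the same grain, the same query shape works for `dim_date` or any other dimension by swapping the joined table.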

[Figure: Data Warehouse Star Schema]

Star Schema Best Practices

  • Every dimension in a star schema is represented by only one dimension table.
  • Each dimension table should contain the set of attributes that describe the dimension.
  • Each dimension table is joined to the fact table using a foreign key.
  • The dimension tables are not joined to each other.
  • The fact table contains keys and measures.
  • The star schema is easy to understand. Because its dimensions are denormalized, it trades some extra disk space for simpler, faster queries.
  • The dimension tables are not normalized. For instance, in the above figure, Country_ID does not have a Country lookup table as an OLTP design would have.
  • The schema is widely supported by BI tools.
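The structural rules in the list above can be checked mechanically. The sketch below is an assumption-laden illustration, not the schema from the figure: the table names `dim_dealer`, `dim_product`, and `fact_revenue` and their columns are hypothetical. It uses SQLite's `PRAGMA foreign_key_list` to confirm that only the fact table holds foreign keys, so no dimension table points at another table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical denormalized dimension: country_id is stored inline,
-- with no separate Country lookup table.
CREATE TABLE dim_dealer (
    dealer_id   INTEGER PRIMARY KEY,
    dealer_name TEXT,
    city        TEXT,
    country_id  TEXT
);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);

-- Hypothetical fact table: keys to each dimension plus the measures.
CREATE TABLE fact_revenue (
    dealer_id  INTEGER REFERENCES dim_dealer(dealer_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    units_sold INTEGER,
    revenue    REAL
);
""")


def referenced_tables(table):
    """Return the sorted names of the tables that `table` references via foreign keys."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return sorted(row[2] for row in rows)  # column 2 is the referenced table name


fk_map = {t: referenced_tables(t) for t in ("fact_revenue", "dim_dealer", "dim_product")}
print(fk_map)
```

In a star schema the dimension entries in `fk_map` stay empty; a non-empty entry for a dimension table would indicate that the design has started to snowflake.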

What is a Snowflake Schema?

The snowflake schema is an extension of the star schema in which the dimension tables, not the fact table, are normalized: each dimension is split into a series of related tables. Normalizing creates more dimension tables and requires more joins, but it reduces redundancy and the data integrity issues that come with it. However, querying a snowflake schema is more challenging, because queries must traverse more tables to reach the relevant data.
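To make the extra joins concrete, here is a minimal sketch using Python's `sqlite3` module. All table names, columns, and rows are hypothetical, chosen only for illustration: a dealer dimension is normalized into dealer, city, and country tables, so a report by country has to walk the whole chain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical snowflaked dimension: dealer -> city -> country.
CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE dim_city (
    city_id    INTEGER PRIMARY KEY,
    city_name  TEXT,
    country_id INTEGER REFERENCES dim_country(country_id)
);
CREATE TABLE dim_dealer (
    dealer_id   INTEGER PRIMARY KEY,
    dealer_name TEXT,
    city_id     INTEGER REFERENCES dim_city(city_id)
);
CREATE TABLE fact_revenue (
    dealer_id INTEGER REFERENCES dim_dealer(dealer_id),
    revenue   REAL
);

INSERT INTO dim_country  VALUES (1, 'Germany'), (2, 'France');
INSERT INTO dim_city     VALUES (1, 'Berlin', 1), (2, 'Munich', 1), (3, 'Paris', 2);
INSERT INTO dim_dealer   VALUES (1, 'Dealer A', 1), (2, 'Dealer B', 2), (3, 'Dealer C', 3);
INSERT INTO fact_revenue VALUES (1, 100.0), (2, 250.0), (3, 400.0), (1, 50.0);
""")

# Three joins are needed to reach the country name from the fact table.
# In a star schema the country would sit on the dealer row (one join).
revenue_by_country = conn.execute("""
    SELECT co.country_name, SUM(f.revenue)
    FROM fact_revenue AS f
    JOIN dim_dealer  AS d  ON d.dealer_id   = f.dealer_id
    JOIN dim_city    AS ci ON ci.city_id    = d.city_id
    JOIN dim_country AS co ON co.country_id = ci.country_id
    GROUP BY co.country_name
    ORDER BY co.country_name
""").fetchall()

print(revenue_by_country)
```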

[Figure: Data Warehouse Snowflake Schema]

Snowflake Schema Best Practices

  • Uses less disk space.
  • Query performance can be reduced because queries must join more tables, so use the snowflake schema with caution.
  • The primary challenge with the snowflake schema is the additional maintenance effort required by the larger number of lookup tables.
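The storage trade-off in the list above can be illustrated by counting how often an attribute value is stored. The sketch below is a toy comparison with hypothetical table names and synthetic data (1,000 generated dealers in two countries); real savings depend on the data and on the database's storage engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star style (hypothetical): the country name is repeated on every dealer row.
CREATE TABLE star_dim_dealer (
    dealer_id INTEGER PRIMARY KEY, dealer_name TEXT, country_name TEXT
);
-- Snowflake style (hypothetical): the country name is stored once per country.
CREATE TABLE snow_dim_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE snow_dim_dealer (
    dealer_id   INTEGER PRIMARY KEY,
    dealer_name TEXT,
    country_id  INTEGER REFERENCES snow_dim_country(country_id)
);
""")

# Synthetic data: odd dealer ids are in Germany, even ids in France.
dealers = [(i, f"Dealer {i}", "Germany" if i % 2 else "France") for i in range(1, 1001)]
conn.executemany("INSERT INTO star_dim_dealer VALUES (?, ?, ?)", dealers)
conn.executemany("INSERT INTO snow_dim_country VALUES (?, ?)", [(1, "Germany"), (2, "France")])
conn.executemany(
    "INSERT INTO snow_dim_dealer VALUES (?, ?, ?)",
    [(i, name, 1 if country == "Germany" else 2) for i, name, country in dealers],
)

# Number of stored copies of a country name in each design.
star_copies = conn.execute("SELECT COUNT(country_name) FROM star_dim_dealer").fetchone()[0]
snow_copies = conn.execute("SELECT COUNT(country_name) FROM snow_dim_country").fetchone()[0]

# Renaming a country touches many rows in the star design, one row in the snowflake design.
star_rows_updated = conn.execute(
    "UPDATE star_dim_dealer SET country_name = 'Deutschland' WHERE country_name = 'Germany'"
).rowcount
snow_rows_updated = conn.execute(
    "UPDATE snow_dim_country SET country_name = 'Deutschland' WHERE country_name = 'Germany'"
).rowcount

print(star_copies, snow_copies, star_rows_updated, snow_rows_updated)
```

The cost of the smaller footprint is visible in the schema itself: the snowflake variant has one more table to load, key, and keep consistent.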

Difference Between Star and Snowflake Schema

| Star Schema | Snowflake Schema |
|---|---|
| Hierarchies for the dimensions are stored in the dimension table. | Hierarchies are divided into separate tables. |
| Contains a fact table surrounded by dimension tables. | Contains one fact table surrounded by dimension tables, which are in turn surrounded by further dimension tables. |
| Only a single join creates the relationship between the fact table and any dimension table. | Requires many joins to fetch the data. |
| Simple DB design. | Very complex DB design. |
| Denormalized data structure; queries also run faster. | Normalized data structure. |
| High level of data redundancy. | Very low level of data redundancy. |
| A single dimension table contains the aggregated data. | Data is split into different dimension tables. |
| Cube processing is faster. | Cube processing might be slow because of the complex joins. |
| Offers higher-performing queries using star join query optimization. Tables may be connected with multiple dimensions. | Represented by a centralized fact table, which is unlikely to be connected with multiple dimensions. |
