Dimensional modelling is used in Data Warehouse to organize data effectively and assist analytical operations on huge volumes of data. The concept of Dimensional Modelling was developed by Ralph Kimball and is comprised of "fact" and "dimension" tables
Data Warehouse uses denormalized tables (flat tables). (Whereas, database or OLTP systems use Normalized tables). So, to create this denormalized model effectively you will use a modelling technique called "Dimensional Modelling". A dimensional model is a data structure technique optimized for Data warehousing
In this tutorial, you will learn
Fact table consists of the measurements, metrics or facts of a business process. It is located at the center of a star schema or a snowflake schema surrounded by dimension tables
Eg. Monthly sales volume, Average Customer Balance etc...
A Fact Table contains
A category of information. For example, the time dimension. In simple terms, they give who, what, where of a fact. In the Sales business process, for the fact quarterly sales number, dimensions would be
The Attributes are the various characteristics of the dimension.In the Location dimension, the attributes can be
Attributes are used to search, filter, or classify facts. Dimension Tables contain Attributes
A good dimensional model determines the success of your data warehouse implementation
Here are the steps in dimenstional data modellingIdentifying the actual business process a datawarehouse should cover. This could be Marketing, Sales, HR, etc. as per the data analysis needs of the organization. It is the most important step of the Data Modelling process, and a failure here would have cascading and irreparable defects
The Grain describes the level of detail for the business problem/solution. It is the process of identifying the lowest level of information for any table in your data warehouse. If a table contains sales data for every day, then it should be daily granularity. If a table contains total sales data for each month, then it has monthly granularity. During this stage, you answer questions like
Dimensions are nouns like date, store, inventory, etc. These dimensions are where all the data should be stored. For example, the date dimension may contain data like a year, month and weekday
Example of Dimensions:The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis
Dimensions: Product, Location and Time Attributes: For Product: Product key (Foreign Key), Name, Type, Specifications Hierarchies: For Location: Country, State, City, Street Address, Name
This step is co-associated with the business users of the system because this is where they get access to data stored in the data warehouse. Most of the fact table rows are numerical values like price or cost per unit, etc.
Example of Facts:The CEO at an MNC wants to find the sales for specific products in different locations on a daily basis
The fact here is Sum of Sales by product by location by time
The Database schema is designed and finalized in this step. There are two popular schemas
Star SchemaThe star schema architecture is easy to design. It is called a star schema because diagram resembles a star, with points radiating from a center. The center of the star consists of the fact table, and the points of the star is dimension tables
The fact tables in a star schema which is third normal form whereas dimensional tables are de-normalized Snowflake SchemaThe snowflake schema is an extension of the star schema. In a snowflake schema, each dimension are normalized and connected to more dimension tables