IntroductionGetting Stared with Power BIData Sources in Power BIData Modelling in Power BIDAX In Power BIPower BI VisualizationPower Bi Service
Star Schema vs Snowflake Schema
Star Schema:
- Definition: A star schema consists of a central fact table (containing metrics or key performance indicators) surrounded by dimension tables (containing descriptive attributes). The relationships are simple, with the fact table related to the dimension tables through foreign keys.
- Advantages: Star schema is easy to understand and query. It’s highly optimized for report performance because it minimizes joins.
- Use Case: Best for business intelligence applications where performance and simplicity are crucial.
Snowflake Schema:
- Definition: A snowflake schema is a more complex version of the star schema, where dimension tables are normalized into multiple related tables. This schema uses additional levels of hierarchy within the dimensions.
- Advantages: Reduces redundancy and can be more space-efficient for large datasets.
- Disadvantages: More complex to query and can require additional joins, which might impact query performance.
- Use Case: Suitable for situations where data consistency and normalization are important.
Fact and Dimension Tables
Fact Tables:
- Definition: A fact table contains quantitative data, such as sales, revenue, or quantities. It is typically the central table in a star schema.
- Use Case: Example: A “Sales” table that records each sale transaction, containing metrics like the amount, quantity, and date.
Dimension Tables:
- Definition: Dimension tables contain descriptive or categorical data related to the facts, such as customer information, product details, or time periods.
- Use Case: Example: A “Customer” dimension table might contain customer names, regions, and age groups, providing context to the fact table.
Creating a Model for Performance
- Designing for Performance: When building a data model, it’s crucial to design it for efficient querying. This can involve:
- Using star schema or normalized snowflake schema based on the data size.
- Minimizing the number of columns in large tables to improve performance.
- Using measures and aggregated tables for faster calculations instead of relying on calculated columns for large datasets.
- Ensuring proper indexing of tables in the source database.