
Why Your Power BI Reports Crawl, and How Star Schema Fixes Most of It

By Syed Hussnain Sherazi | July 1, 2025 | Power BI | Star Schema | Performance

Why star schema modelling is one of the fastest ways to improve Power BI report performance.

Most slow Power BI reports have nothing to do with capacity, network speed, or visuals. The fault sits in the data model. After looking at dozens of underperforming tenants over the past few years, the same picture keeps showing up. Analysts flatten everything into one wide table because it feels easier coming from Excel, and the storage engine quietly suffers under the weight.

This article walks through the single biggest fix you can apply to a Power BI semantic model. It covers the reasoning behind dimensional modelling, the practical steps to refactor a flat table into a clean star, and the kind of gains you should expect after the work is finished.

The Situation

A retail analytics team I worked with had a fact-like table holding 47 columns and roughly 180 million rows. Sales transactions, customer demographics, product attributes, store information, and even calendar fields were jammed into one place. Their numbers told the story.

Initial report load took 22 seconds. Slicer interactions sat at 8 to 12 seconds. The PBIX file weighed 3.4 GB. Memory at refresh time hit 14 GB. A full refresh ran for 47 minutes.

After moving to a star schema with five dimensions and a single fact table, the same workload behaved very differently. Initial load dropped below 2 seconds. Slicers responded in under half a second. The file shrank to 412 MB. Refresh memory came down to 2.1 GB. The whole refresh finished in 6 minutes.

Same data. Same visuals. Same DAX. Only the model changed.

Why VertiPaq Prefers Stars

The Power BI engine, called VertiPaq, is a columnar store that uses dictionary compression and run-length encoding. It performs best with one specific access pattern: scan a fact table, filter through small dimension lookups, then aggregate.

Flattening everything into one table breaks this pattern in three painful ways. First, repeated text values bloat the dictionary, so compression ratios collapse. Second, low-cardinality columns such as Country, Category or Status repeat across millions of rows instead of being stored once in a 200-row dimension. Third, the query optimiser cannot push filters efficiently because there is no clean propagation path from a slicer to the fact rows.

A star schema solves all three issues at once. Dimensions stay small and compress brilliantly. The fact carries only foreign keys and additive measures, with no descriptive text repeating itself. Query plans become very short. Filter the dimensions, propagate to the fact, aggregate.

There is a useful rule in the Power BI community known as Roche's Maxim, after Matthew Roche: data should be transformed as far upstream as possible, and as far downstream as necessary. Star schema modelling lives squarely in that philosophy.

A Picture of the Refactor

The Mermaid diagram below sketches the change.
flowchart LR
    subgraph Before["Before, One Big Table"]
        OBT[("Sales_Wide<br/>180M rows, 47 columns<br/>3.4 GB")]
    end

    subgraph After["After, Star Schema"]
        F[("FactSales<br/>180M rows, 8 columns<br/>380 MB")]
        D1[("DimDate<br/>3,650 rows")]
        D2[("DimCustomer<br/>2.1M rows")]
        D3[("DimProduct<br/>45K rows")]
        D4[("DimStore<br/>1,200 rows")]
        D5[("DimPromotion<br/>800 rows")]

        D1 --> F
        D2 --> F
        D3 --> F
        D4 --> F
        D5 --> F
    end

A Walkthrough You Can Replicate

Suppose your source is a flat Sales table with the columns below.

SaleID, SaleDate, CustomerID, CustomerName, CustomerCity, CustomerCountry,
ProductID, ProductName, ProductCategory, ProductSubcategory,
StoreID, StoreName, StoreRegion, PromoCode, PromoDescription,
Quantity, UnitPrice, Discount, TotalAmount

Step 1, Create a Reference Query

In Power Query Editor, right click the Sales query and pick Reference. Rename the new query FactSales. We will keep this one slim.

Step 2, Build DimCustomer

Reference Sales again, rename it DimCustomer, then apply this M code.

let
    Source = Sales,
    SelectedColumns = Table.SelectColumns(Source,
        {"CustomerID", "CustomerName", "CustomerCity", "CustomerCountry"}),
    RemoveDuplicates = Table.Distinct(SelectedColumns, {"CustomerID"}),
    SetTypes = Table.TransformColumnTypes(RemoveDuplicates, {
        {"CustomerID", Int64.Type},
        {"CustomerName", type text},
        {"CustomerCity", type text},
        {"CustomerCountry", type text}
    })
in
    SetTypes

Repeat the same recipe for DimProduct, DimStore and DimPromotion.
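For example, DimProduct uses exactly the same pattern; only the column list changes (names taken from the flat Sales layout above).

```m
let
    Source = Sales,
    // Keep only the product attributes, then deduplicate on the key
    SelectedColumns = Table.SelectColumns(Source,
        {"ProductID", "ProductName", "ProductCategory", "ProductSubcategory"}),
    RemoveDuplicates = Table.Distinct(SelectedColumns, {"ProductID"}),
    SetTypes = Table.TransformColumnTypes(RemoveDuplicates, {
        {"ProductID", Int64.Type},
        {"ProductName", type text},
        {"ProductCategory", type text},
        {"ProductSubcategory", type text}
    })
in
    SetTypes
```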

Step 3, Build a Proper DimDate

A real date dimension is non-negotiable for serious time intelligence work. Use the generator below.

let
    StartDate = #date(2018, 1, 1),
    EndDate = #date(2027, 12, 31),
    NumberOfDays = Duration.Days(EndDate - StartDate) + 1,
    Dates = List.Dates(StartDate, NumberOfDays, #duration(1,0,0,0)),
    DateTable = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
    AddYear = Table.AddColumn(DateTable, "Year", each Date.Year([Date]), Int64.Type),
    AddQuarter = Table.AddColumn(AddYear, "Quarter", each "Q" & Text.From(Date.QuarterOfYear([Date])), type text),
    AddMonth = Table.AddColumn(AddQuarter, "MonthNumber", each Date.Month([Date]), Int64.Type),
    AddMonthName = Table.AddColumn(AddMonth, "MonthName", each Date.MonthName([Date]), type text),
    AddYearMonth = Table.AddColumn(AddMonthName, "YearMonth", each [Year] * 100 + [MonthNumber], Int64.Type),
    AddDayOfWeek = Table.AddColumn(AddYearMonth, "DayOfWeek", each Date.DayOfWeekName([Date]), type text),
    // Day.Monday makes Saturday 5 and Sunday 6; the default first day of week is culture-dependent
    AddIsWeekend = Table.AddColumn(AddDayOfWeek, "IsWeekend", each Date.DayOfWeek([Date], Day.Monday) >= 5, type logical)
in
    AddIsWeekend

After loading, mark this table as a date table in Model view. Right click DimDate, pick Mark as date table, then choose the Date column.

Step 4, Slim Down FactSales

A fact table should carry only foreign keys, degenerate dimensions, and additive measures.

let
    Source = Sales,
    SelectedColumns = Table.SelectColumns(Source, {
        "SaleID", "SaleDate", "CustomerID", "ProductID",
        "StoreID", "PromoCode", "Quantity", "UnitPrice",
        "Discount", "TotalAmount"
    }),
    RenameToKey = Table.RenameColumns(SelectedColumns, {{"SaleDate", "DateKey"}})
in
    RenameToKey

Drop every descriptive text column from the fact. Those values live in dimensions from now on.

Step 5, Build the Relationships

In Model view, create one to many relationships from each dimension into the fact.

DimDate[Date] --> FactSales[DateKey]
DimCustomer[CustomerID] --> FactSales[CustomerID]
DimProduct[ProductID] --> FactSales[ProductID]
DimStore[StoreID] --> FactSales[StoreID]
DimPromotion[PromoCode] --> FactSales[PromoCode]

Keep cross filter direction set to Single on all of them unless you have a specific many to many requirement that you have already validated. Bidirectional filtering is a common source of slowness and bugs.

Step 6, Hide the Keys

In Report view, hide every ID column on both the fact and the dimensions. End users should only ever see descriptive attributes such as Customer Name, Product Category, or Store Region. Surrogate keys are plumbing.

Step 7, Replace Calculated Columns With Measures

Replace any calculated columns on the fact with explicit measures.

Total Sales =
SUMX(
    FactSales,
    FactSales[Quantity] * FactSales[UnitPrice] * (1 - FactSales[Discount])
)

Total Sales LY =
CALCULATE(
    [Total Sales],
    SAMEPERIODLASTYEAR(DimDate[Date])
)

Sales YoY % =
DIVIDE(
    [Total Sales] - [Total Sales LY],
    [Total Sales LY]
)

How to Verify the Gain

Open Performance Analyzer from the View tab and capture the timings before and after the refactor. For deeper diagnostics, install DAX Studio and run VertiPaq Analyzer against your model. The output shows per column compression, per table size, and query plans.
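With the model open in DAX Studio, you can also time a representative query directly with Server Timings switched on. A minimal sketch, assuming the tables and the Total Sales measure defined earlier:

```dax
EVALUATE
SUMMARIZECOLUMNS(
    DimDate[Year],
    DimStore[StoreRegion],
    "Sales", [Total Sales]
)
```

Run it once cold and once warm. On a healthy star, most of the elapsed time should sit in the storage engine rather than the formula engine.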

A healthy star schema usually displays a handful of recognisable signals. The fact table dominates the model size. Combined dimensions stay below five percent. High-cardinality columns are confined to fact keys. No bidirectional relationships exist outside very specific scenarios. No calculated columns sit on the fact.

Mistakes That Are Easy to Make

A common trap is keeping calculated columns on the fact table. Each one gets materialised and stored uncompressed, which inflates memory. Push that work upstream into Power Query, or even better into the source SQL.
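As a sketch of pushing that work upstream, the same net amount can be computed in Power Query during refresh instead of stored as a DAX calculated column. Column names are the ones from the fact above; NetAmount is an illustrative name.

```m
let
    Source = FactSales,
    // Computed at refresh time, so VertiPaq compresses it like any source column
    AddNetAmount = Table.AddColumn(Source, "NetAmount",
        each [Quantity] * [UnitPrice] * (1 - [Discount]), Currency.Type)
in
    AddNetAmount
```

Better still, compute the column in the source SQL view so Power Query only has to load it.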

Another trap is the snowflake schema. Splitting a dimension into smaller related tables looks tidier from a normalisation point of view, but it forces VertiPaq to traverse multiple joins for every query. Flatten subcategories and categories back into DimProduct.

A third trap is the role-playing dimension done badly. If you have OrderDate, ShipDate and DeliveryDate, do not build three date tables. Build one DimDate, keep one relationship active and the others inactive, and use USERELATIONSHIP inside measures, or create a clearly named duplicate dimension when the analysis really demands it.
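A minimal sketch of the measure approach, assuming a hypothetical FactSales[ShipDateKey] column with an inactive relationship to DimDate[Date]:

```dax
Shipped Sales =
CALCULATE(
    [Total Sales],
    USERELATIONSHIP(DimDate[Date], FactSales[ShipDateKey])
)
```

The relationship stays inactive in the model; USERELATIONSHIP activates it only while this measure evaluates.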

When the Rules Should Bend

Star schemas are a default, not a religion. Two situations justify deviation. Header-and-detail patterns, such as orders and order lines, often need two fact tables that share dimensions. Type 2 slowly changing dimensions require surrogate keys and effective-date logic that adds complexity to the simple star. Both patterns are well understood, and there is plenty of writing on each of them.

A Word on the Bigger Picture

Dimensional modelling predates Power BI by about three decades. Ralph Kimball codified the approach in the 1990s, and the principles have outlasted every BI tool that came and went. The reason is straightforward. The shape of a star schema mirrors the shape of an analytical question. Show me sales by region by quarter for top tier customers. Each clause maps onto a dimension filter, and the result rolls up against a fact.

There is also a forward-looking reason to care about modelling discipline. As more teams begin to use AI assistants and agents to build, modify, and explain reports, the quality of those agents depends on the structure they are reading. A flat table gives an AI nothing to reason about. A clean star schema gives it a vocabulary. Models that machines and humans can both understand become a competitive advantage in any analytics function.

A Practical Next Step

Pick the slowest report in your tenant and refactor it into a star schema this week. Measure before, measure after. In my experience queries are rarely less than ten times faster, and the PBIX file typically shrinks by around a factor of five. Users notice within a day. Refresh windows shrink within the week. Every enhancement you build on top of the new model is cheaper, simpler and more reliable than the one before it.

A clean star schema is one of the very few free lunches in BI engineering. Eat it.

References and Further Reading

1. Microsoft Learn, Understand star schema and the importance for Power BI (official documentation): https://learn.microsoft.com/en-us/power-bi/guidance/star-schema
2. Kimball Group, Dimensional Modeling Techniques (foundational reference): https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/
3. SQLBI, The importance of star schemas in Power BI (practitioner article): https://www.sqlbi.com/articles/the-importance-of-star-schemas-in-power-bi/
4. VertiPaq Analyzer (open source GitHub project): https://github.com/sql-bi/VertiPaq-Analyzer
5. DAX Studio (open source diagnostic tool): https://daxstudio.org/
6. Microsoft Learn, Power Query M reference (official documentation): https://learn.microsoft.com/en-us/powerquery-m/
7. Microsoft Learn, Optimization guide for Power BI (official guidance): https://learn.microsoft.com/en-us/power-bi/guidance/power-bi-optimization
8. Roche's Maxim of Data Transformation, by Matthew Roche (blog post): https://ssbipolar.com/2021/05/31/roches-maxim/
