
Power BI Governance at Enterprise Scale, and Why Agentic AI Changes the Conversation

By Syed Hussnain Sherazi | November 18, 2025 | Governance | Agentic AI | Power BI

How agentic AI can support Power BI governance without replacing the platform team.

Most large Power BI estates begin with a clean architecture and slowly drift into chaos. Workspaces multiply. Datasets duplicate. Reports diverge. The same KPI appears with different definitions in different parts of the business. Audits become painful. New people join and cannot find anything. The platform that was supposed to centralise analytics becomes a distributed mess.

Governance is the discipline that prevents this drift. It used to be a manual, mostly defensive activity, run by a small platform team that wrote policies and audited compliance. Agentic AI is changing that picture meaningfully. The same agents that build reports can monitor the estate, flag anomalies, and propose remediations at a pace no human team can match.

This article looks at what enterprise grade governance actually requires today, where AI agents fit into the picture, and how to set up a programme that scales without becoming a bureaucracy.

What Governance Has to Cover

Three concerns dominate any serious governance conversation.

Definition consistency is the first. The business should have one and only one definition of revenue, of customer, of churn rate. When the same word means different things in different reports, decisions diverge and trust erodes. Governance enforces that definitions live in shared semantic models and that report authors use them.

Access and security is the second. The platform should expose only what users are entitled to see, with row level rules that match HR records and entitlements. Misconfigured access is the source of most genuine compliance failures, and it is the area where audits focus.

Operational health is the third. Refreshes should succeed. Capacities should run within their limits. Reports should load quickly. Stale and unused content should be retired. This is the part that tends to be neglected because nothing dramatic happens when it fails. The estate just gets slower and more expensive.

A governance programme that handles all three deliberately is the difference between a platform that scales and a platform that quietly suffocates.

The Reference Operating Model

flowchart LR
    subgraph Strategy["Strategy Layer"]
        Council[Data Governance Council]
        Standards[Standards and Policies]
    end

    subgraph Platform["Platform Layer"]
        COE[Centre of Excellence]
        Pipelines[CI CD Pipelines]
        Monitoring[Monitoring Agents]
    end

    subgraph Domains["Domain Layer"]
        Finance[Finance Workspaces]
        Sales[Sales Workspaces]
        Ops[Operations Workspaces]
    end

    Council --> Standards
    Standards --> COE
    COE --> Pipelines
    COE --> Monitoring
    Pipelines --> Finance
    Pipelines --> Sales
    Pipelines --> Ops
    Monitoring --> Finance
    Monitoring --> Sales
    Monitoring --> Ops
    Finance -.metrics.-> Monitoring
    Sales -.metrics.-> Monitoring
    Ops -.metrics.-> Monitoring

The strategy layer sets policy. The platform layer enforces it through tooling. The domain layer builds and consumes content within the guardrails. This shape works because it puts policy where it belongs, in a small body of senior stakeholders, and pushes execution to the teams closest to the work.

The Manual Approach and Where It Breaks

Traditional governance relies on three artefacts. A policy document that describes the rules. A monthly review where a small team audits the estate against the policy. A ticketing process for exceptions and remediations.

This approach worked when estates were small. It collapses at scale for predictable reasons. The policy document becomes outdated as soon as it is published. The monthly review can only sample a tiny fraction of the estate. The remediation backlog grows faster than the team can clear it. By the time a problem is identified, it has often been propagated to dozens of dependent reports.

The arithmetic is unforgiving. A platform team of five people cannot manually review ten thousand reports every month. They will inevitably focus on the visible problems and leave the long tail unaddressed.

Where Agentic AI Helps

The opportunity is to automate the parts of governance that are mechanical, repetitive, and well defined. Several tasks fit this description perfectly.

Definition drift detection is the first. An agent reads the DAX of every measure in the estate, clusters measures by name, and flags pairs that share a name but have different formulas. The output is a list of likely conflicts that human reviewers triage.

Unused content identification is the second. An agent reads usage metrics and identifies datasets that have not been queried, reports that have not been opened, and workspaces with no active membership. It generates a recommendation to retire each item, complete with a list of stakeholders to notify.
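The unused content check reduces to a filter over usage records. A minimal sketch, assuming the usage metrics have already been pulled (for example from the activity log) into simple dicts; the field names `last_accessed`, `kind`, and `owner` are illustrative, not an API shape:

```python
from datetime import datetime, timedelta

def find_stale_artifacts(artifacts, now=None, threshold_days=90):
    """Flag artifacts (datasets, reports, workspaces) with no recent access.

    `artifacts` is a list of dicts with hypothetical keys:
    name, kind, last_accessed (datetime or None), owner.
    """
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=threshold_days)
    recommendations = []
    for a in artifacts:
        last = a.get("last_accessed")
        # Never-accessed content is treated the same as long-stale content.
        if last is None or last < cutoff:
            recommendations.append({
                "artifact": a["name"],
                "kind": a["kind"],
                "action": "retire",
                "notify": a.get("owner", "unknown"),
            })
    return recommendations
```

The output is exactly what the paragraph describes: a retirement recommendation per item, with a stakeholder to notify attached.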

Refresh failure analysis is the third. When a refresh fails, an agent fetches the error message, the dataset history, the source connection details, and the recent changes. It diagnoses the most likely cause and either applies a fix from a known catalogue or escalates to a human with a summary.
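The fix-or-escalate decision can be sketched as a match against a known cause catalogue. The catalogue entries, keywords, and the shape of the failure record below are all illustrative assumptions, not Power BI error codes:

```python
# Hypothetical catalogue: (keyword, optional second keyword, remedy).
KNOWN_CAUSES = [
    ("credentials", "expired", "Refresh the stored credentials for the source."),
    ("timeout", None, "Source query exceeded the timeout; consider incremental refresh."),
    ("memory", None, "Model exceeded capacity memory; review column cardinality."),
]

def diagnose_refresh_failure(failure):
    """Return (action, detail) for a failure dict carrying an 'error' message."""
    error = failure.get("error", "").lower()
    for keyword, extra, remedy in KNOWN_CAUSES:
        if keyword in error and (extra is None or extra in error):
            return ("apply_known_fix", remedy)
    # Unrecognised errors go to a human with the evidence attached.
    return ("escalate",
            f"Unrecognised error on {failure.get('dataset', '?')}: {failure.get('error')}")
```

In a real agent the catalogue would grow over time as escalated cases are resolved and their fixes codified.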

Access review acceleration is the fourth. Quarterly access reviews require someone to look at every workspace and approve every member. An agent can pre populate the review with summaries, flag unusual patterns, and recommend approve or remove for each entry.
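The pre-population step is, at its simplest, a rule over last-activity timestamps. A sketch under assumed inputs (member records with hypothetical `user` and `last_active` fields); a human still makes the final call on every row:

```python
from datetime import datetime, timedelta

def prepopulate_review(members, now=None, stale_days=180):
    """Suggest approve/remove per workspace member from activity data.

    `members` is a list of dicts with hypothetical keys:
    user, last_active (datetime or None).
    """
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=stale_days)
    review = []
    for m in members:
        last = m.get("last_active")
        # No recorded activity, or activity older than the window: suggest removal.
        suggestion = "approve" if last and last >= cutoff else "remove"
        review.append({"user": m["user"], "suggestion": suggestion})
    return review
```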

Lineage and impact analysis is the fifth. Before any breaking change to a dataset, an agent traces every downstream report and dashboard, identifies the impact, and notifies the owners. This converts a manual hour of investigation into a thirty second query.
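Once lineage is captured as a dependency map, the impact query is a plain graph traversal. A minimal sketch, assuming the map has already been built from the scanner output (the `{artifact: [dependants]}` shape is an assumption):

```python
from collections import deque

def downstream_impact(dependencies, changed_dataset):
    """Breadth-first walk of a dependency map {artifact: [dependants]}.

    Returns every artifact reachable from the changed dataset, i.e. the
    set of reports and dashboards a breaking change would touch.
    """
    impacted, queue = set(), deque([changed_dataset])
    while queue:
        node = queue.popleft()
        for dependant in dependencies.get(node, []):
            if dependant not in impacted:
                impacted.add(dependant)
                queue.append(dependant)
    return impacted
```

This is the thirty second query: build the map once, then answer impact questions instantly.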

A Tutorial, Building a Definition Drift Agent

The most valuable agent for many teams is the one that watches definition drift. Building it takes a few hours and the value compounds quickly.

Step 1, Pull All Measures from the Tenant

Use the Power BI scanner API to enumerate all datasets, then for each dataset call the metadata endpoint to extract measure definitions.

def scan_all_measures(token):
    workspaces = list_workspaces(token)
    measures = []
    for ws in workspaces:
        datasets = list_datasets(ws.id, token)
        for ds in datasets:
            try:
                meta = get_dataset_metadata(ws.id, ds.id, token)
                for measure in meta.measures:
                    measures.append({
                        "workspace": ws.name,
                        "dataset": ds.name,
                        "measure": measure.name,
                        "formula": measure.expression
                    })
            except Exception as e:
                log_warning(ws.id, ds.id, e)
    return measures

For a tenant with thousands of datasets, this scan takes a few hours and runs nicely as an overnight job.

Step 2, Embed Each Measure

Use a language model to compute a semantic embedding for each measure. The embedding captures intent rather than text similarity alone. Measures with the same intent but different wording cluster together.

def embed_measures(measures, embedder):
    for m in measures:
        text = f"Measure named {m['measure']} with formula {m['formula']}"
        m["embedding"] = embedder.embed(text)
    return measures

Step 3, Cluster and Compare

For each measure, find every other measure with the same name. For each pair, compute cosine similarity between the embeddings of their formulas. A high similarity means the two measures probably mean the same thing. A low similarity means they share a name but mean something different, which is the drift signal.

from collections import defaultdict

def find_drift(measures):
    by_name = defaultdict(list)
    for m in measures:
        by_name[m["measure"].lower()].append(m)
    drifts = []
    for name, group in by_name.items():
        if len(group) < 2:
            continue
        for i, a in enumerate(group):
            for b in group[i+1:]:
                sim = cosine(a["embedding"], b["embedding"])
                if sim < 0.85:
                    drifts.append({
                        "name": name,
                        "instance_a": a,
                        "instance_b": b,
                        "similarity": sim
                    })
    return drifts
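The `cosine` helper used above is a few lines of plain Python; the 0.85 threshold in the main loop is a starting point to tune against your own triage results, not a universal constant:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Degenerate zero vectors are treated as maximally dissimilar.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```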

Step 4, Triage With Language Model Help

For each detected drift, ask a language model to explain whether the difference is meaningful or cosmetic. Cosmetic differences (variable names, whitespace) are filtered out automatically. Meaningful differences (different filter contexts, different aggregation logic) become tickets for human review.

def triage(drift, llm):
    prompt = f"""
    Two measures named {drift['name']} have different formulas.

    Formula A from {drift['instance_a']['workspace']}:
    {drift['instance_a']['formula']}

    Formula B from {drift['instance_b']['workspace']}:
    {drift['instance_b']['formula']}

    Are these definitions semantically the same or different? If different, summarise the business impact in one sentence.
    """
    return llm.complete(prompt)

Step 5, Publish to a Backlog

The output goes into a governance backlog, prioritised by usage. Drifts on heavily used measures are tackled first. Drifts on rarely used measures are documented but deprioritised.
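The prioritisation itself is a sort over the drift findings. A sketch, assuming a usage lookup keyed by dataset name (the `usage_by_dataset` shape is hypothetical; in practice it would come from the activity log):

```python
def prioritise_drifts(drifts, usage_by_dataset):
    """Order drift findings by how heavily the affected datasets are used."""
    def score(drift):
        # A conflict matters as much as its most-used side.
        return max(
            usage_by_dataset.get(drift["instance_a"]["dataset"], 0),
            usage_by_dataset.get(drift["instance_b"]["dataset"], 0),
        )
    return sorted(drifts, key=score, reverse=True)
```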

This single agent can find issues that a manual review would never surface. In a typical large tenant, the first run uncovers dozens of unintended divergences across departments, many of which trace back to a copy paste from years earlier.

Operational Health Agents

A second agent watches operational health. It monitors capacity utilisation, refresh success rates, and report performance, then opens tickets when patterns cross thresholds.

A simple pattern that pays off quickly is the slow report agent. Every day, the agent ranks all reports by average load time and flags those above a threshold. For each flagged report, it gathers diagnostics, identifies the slow query, and proposes an optimisation. The proposal is delivered to the report owner with the diagnostic evidence, so they can apply it themselves.
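The ranking step of the slow report agent can be sketched in a few lines, assuming load-time samples have already been collected as (report, seconds) pairs; the eight second threshold is illustrative and would be tuned per tenant:

```python
def flag_slow_reports(load_samples, threshold_seconds=8.0):
    """Average per-report load times and flag those above the threshold.

    `load_samples` is a list of (report_name, seconds) tuples.
    Returns (name, average) pairs, slowest first.
    """
    totals = {}
    for name, seconds in load_samples:
        count, total = totals.get(name, (0, 0.0))
        totals[name] = (count + 1, total + seconds)
    flagged = [
        (name, total / count)
        for name, (count, total) in totals.items()
        if total / count > threshold_seconds
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)
```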

Another high value pattern is the orphaned workspace agent. The agent identifies workspaces whose only admin has left the organisation, whose last edit is more than a year old, or whose membership consists only of disabled accounts. It proposes either reassignment or retirement, and routes the proposal to the most likely current owner based on usage patterns.
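The three orphan signals the paragraph lists translate directly into checks. A sketch under assumed inputs: a workspace record with hypothetical `admins`, `last_edit`, and `members` fields, plus the set of accounts still enabled in the directory:

```python
from datetime import datetime, timedelta

def orphan_signals(workspace, active_users, now=None):
    """Return the orphan signals present for one workspace record."""
    now = now or datetime.utcnow()
    signals = []
    # Signal 1: every admin has left or been disabled.
    if not any(a in active_users for a in workspace.get("admins", [])):
        signals.append("no active admin")
    # Signal 2: no edits for more than a year.
    if workspace.get("last_edit") and now - workspace["last_edit"] > timedelta(days=365):
        signals.append("stale for over a year")
    # Signal 3: membership consists only of disabled accounts.
    if workspace.get("members") and not any(m in active_users for m in workspace["members"]):
        signals.append("all members disabled")
    return signals
```

Workspaces returning any signal become candidates for the reassignment-or-retirement proposal the agent routes onward.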

These agents do not make irreversible decisions. They surface candidates and suggest actions. Humans approve, and the agents execute approved actions through the platform APIs.

Where to Draw the Line

Agents should not be trusted with destructive actions without explicit human approval. Deleting a workspace, removing a dataset, revoking access, applying a sensitivity label that restricts sharing. These should always pass through a human checkpoint.

Agents are excellent at producing the evidence and the recommendation. They are not yet ready to be the final authority on changes that cannot easily be reversed.

A useful policy is the two key rule. Anything reversible can be done by an agent autonomously, with logging. Anything irreversible needs a human key turn before execution. This policy lets agents handle the bulk of operational work while keeping humans in the loop on the consequential decisions.
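The two key rule can be sketched as a dispatch gate in front of the platform APIs. The action names and the reversible set below are illustrative, not a Power BI API surface:

```python
# Hypothetical action classification for the two key rule.
REVERSIBLE = {"open_ticket", "notify_owner", "suggest_label"}
IRREVERSIBLE = {"delete_workspace", "remove_dataset", "revoke_access"}

def execute(action, payload, human_approved=False, audit_log=None):
    """Run reversible actions autonomously; hold irreversible ones for a human key turn."""
    audit_log = audit_log if audit_log is not None else []
    if action in REVERSIBLE:
        # Agents act alone here, but everything is logged.
        audit_log.append((action, payload, "executed"))
        return "executed"
    if action in IRREVERSIBLE:
        if human_approved:
            audit_log.append((action, payload, "executed_with_approval"))
            return "executed"
        audit_log.append((action, payload, "held_for_approval"))
        return "held"
    raise ValueError(f"Unknown action: {action}")
```

The important property is that the classification lives in policy, not in each agent: adding a new agent cannot widen what runs without approval.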

The Governance Stack of the Future

Five components define a mature, AI augmented governance stack.

A canonical metrics layer that holds the agreed business definitions, exposed to humans through a metadata catalogue and to agents through a structured API.

A continuous scanner that walks the estate daily and computes drift, freshness, and usage metrics for every artefact.

A diagnostic agent layer that produces explanations for every operational event, from a refresh failure to a sudden change in a KPI.

A remediation agent layer that proposes and, where authorised, executes fixes.

A human review surface that aggregates everything the agents have found, in priority order, so the platform team focuses on decisions rather than discovery.

Most large tenants today have parts of this stack. Few have it integrated. The teams that build the integration first will run estates that are dramatically cleaner, cheaper, and more trustworthy than their peers.

A Final Reflection

Governance has historically been the least loved part of analytics. It is the work that prevents bad outcomes rather than producing visible wins, which makes it hard to fund and easy to deprioritise. Agentic AI changes the calculation. When a single agent can do the work of an entire governance team for a fraction of the cost, the trade off shifts.

The aim is to augment the platform team, not replace it. A team that previously spent most of its time on routine audits can spend more time designing the platform, training the agents, and tackling the genuinely hard problems that no agent can yet address.

The estates that survive the next decade will be the ones whose owners realised this early. The estates that drift into chaos will be the ones whose owners kept treating governance as a manual chore. The platform itself is ready. The decision is whether to build the agents now or to be playing catch up two years from now.

References and Further Reading

1. Microsoft Learn, Power BI Adoption Roadmap (free official white paper): https://learn.microsoft.com/en-us/power-bi/guidance/powerbi-adoption-roadmap-overview
2. Microsoft Learn, Power BI tenant settings (free official documentation): https://learn.microsoft.com/en-us/power-bi/admin/service-admin-portal
3. Microsoft Learn, Power BI activity log and audit (free official documentation): https://learn.microsoft.com/en-us/power-bi/admin/service-admin-auditing
4. Microsoft Purview documentation (free official documentation): https://learn.microsoft.com/en-us/purview/
5. Microsoft Learn, Workspace governance (free official documentation): https://learn.microsoft.com/en-us/power-bi/collaborate-share/service-new-workspaces
6. Microsoft Learn, Power BI scanner API (free official documentation): https://learn.microsoft.com/en-us/power-bi/enterprise/service-admin-metadata-scanning
7. DAX Guide (free function reference): https://dax.guide/
8. Anthropic, Building Effective Agents (free engineering article): https://www.anthropic.com/engineering/building-effective-agents
