
Clear data, strong AI: data governance with Microsoft Purview


September 15, 2025

Many AI projects fail not because of the AI itself, but because of poor data quality, a lack of data governance and inadequate processes. This article looks at how governance and data quality are becoming the key success factors for AI projects - and how modern tools such as Microsoft Purview help to clear the data jungle and make data reliably usable.

Data governance: the basis for successful AI projects

The AI community likes to put data scientists and algorithm developers in the spotlight. However, the real data heroes often work in the background: they ensure that the data foundation is sound. Without clean, well-managed data, even the best AI algorithm is of little use - "garbage in, garbage out" still applies.

Studies back this up: Gartner predicts that by 2027 around 60% of companies will not realize the expected benefits of their AI use cases due to a lack of a coherent data governance framework. Before we get into the technology, let's briefly clarify what is meant by data governance and why it forms the basis of successful AI projects.

What is data governance?

Data governance refers to all processes, roles and guidelines that ensure that company data can be found, is trustworthy, of high quality and secure. It creates a framework that clarifies who manages which data, how data is defined (keyword: single source of truth) and which rules apply to its use.

This is essential for AI: if data is incorrect or nobody knows its origin and context, both the quality of the models and trust in the results suffer. Microsoft puts it in a nutshell: the reliability of data directly influences the accuracy of AI findings - without trustworthy data, there is a risk of a loss of trust in AI systems.

Good data governance also means breaking down data silos. In many organizations, data is distributed across different departments or systems. Without overarching governance, there are different data definitions, versions and quality levels. The result: contradictory reports, manual reconciliation efforts and frustration.

Data governance starts here by defining responsibilities (e.g. data managers per department), defining standards (e.g. for data quality or metadata) and making data centrally findable in a data catalog. This "data-driven order" is a must, especially for AI projects that often use data from different sources.

Graphic with 6 different interlocking building blocks that represent the motivation for data governance

Microsoft Purview: Data governance from the cloud

This is where modern tools come into play - first and foremost Microsoft Purview, a SaaS tool that supports data governance in practice. Purview (originally launched at the end of 2020 as Azure Purview and expanded in 2022) is a cloud platform for standardized data management. As part of the Microsoft Azure and Microsoft Fabric world, it integrates seamlessly into existing data platforms and bundles all important functions under one interface. Companies can use Purview immediately without major installation projects - the ramp-up is short as Microsoft operates the infrastructure. For many existing customers, the tool is often already covered by a license or can at least be easily integrated into existing contracts.

In short: Purview promises a quick start to tackling the data governance homework.

Central functions of Microsoft Purview at a glance:

  • Unified data catalog: All data assets - from database tables and data lake files to BI reports - are recorded in a central catalog. Users (from data experts to business users) can use a search function to quickly find data assets, view metadata and browse the data estate. This unified data catalog forms the foundation for making data findable and visibly overcoming silos.
  • Traceable data origin (data lineage): Purview documents the data flows. You can see which sources a dataset comes from, which transformation pipelines it has passed through and where it is ultimately used (e.g. in which BI dashboard). This transparency of data origin helps to build trust and assess the impact of changes - an important feature when AI models are regularly trained with new data.
  • Data quality management: In addition to pure cataloging, Purview offers functions to measure and monitor data quality. These include profiling (statistical analyses such as value ranges, distributions, duplicates, etc.), defining quality rules and calculating data quality scores for data sets. Data managers can use dashboards to keep an eye on the "health" of their data assets.
  • Access control and security: Purview integrates with Azure's identity and access controls. This allows you to fine-tune roles and permissions - for example, who can see or edit which data in the catalog. In addition, access guidelines can be defined centrally in Purview, which are then enforced on the connected data sources. Especially in regulated industries, such centrally managed policies are worth their weight in gold when it comes to data protection and compliance.
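The lineage idea from the list above, following a dataset from its sources to the reports and models that consume it, can be illustrated with a small graph traversal. This is a minimal sketch in plain Python and not Purview's API; the asset names and the `LINEAGE` structure are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed as its value.
LINEAGE = {
    "crm.customers": ["staging.customers_clean"],
    "erp.orders": ["staging.orders_clean"],
    "staging.customers_clean": ["mart.revenue_by_customer"],
    "staging.orders_clean": ["mart.revenue_by_customer"],
    "mart.revenue_by_customer": ["powerbi.sales_dashboard", "ml.churn_training_set"],
}

def downstream_assets(source, lineage):
    """Return every asset reachable from `source`, in breadth-first order."""
    seen = []
    queue = deque(lineage.get(source, []))
    while queue:
        asset = queue.popleft()
        if asset in seen:
            continue  # already visited via another path
        seen.append(asset)
        queue.extend(lineage.get(asset, []))
    return seen

# Which assets are affected if the structure of erp.orders changes?
print(downstream_assets("erp.orders", LINEAGE))
```

A traversal like this is what makes impact assessment possible: before a source table changes, it shows which dashboards and training sets would be affected.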

Data governance in practice: catalog, roles and domains

How does Purview specifically support data governance in an organization?

Automated data discovery through a searchable data catalog

A central element is the aforementioned data catalog. Purview automatically scans connected data sources and captures metadata: table and column names, file structures, schema information, and even automatic classifications (e.g. recognizing personal data or credit card numbers). All of this information ends up in the catalog, which users can search via a web portal. This makes data discovery child's play - a data scientist can search for "customer data turnover", for example, and find relevant data records including a description and the responsible owner.
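To give a feel for what automatic classification involves, here is a minimal sketch in plain Python. It is not how Purview's classifiers work internally; the function names, the simple e-mail pattern and the 80% match threshold are assumptions chosen for illustration. A column is labeled when most of its sampled values match a pattern, here a Luhn checksum for card numbers.

```python
import re

# Deliberately simple e-mail pattern; real classifiers are far more thorough.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def luhn_valid(text):
    """Return True if text looks like a card number and passes the Luhn checksum."""
    compact = text.replace(" ", "").replace("-", "")
    if not (compact.isascii() and compact.isdigit()) or not 13 <= len(compact) <= 19:
        return False
    total = 0
    for position, char in enumerate(reversed(compact)):
        digit = int(char)
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

# Hypothetical classification labels mapped to their matching functions.
CLASSIFIERS = {
    "credit_card_number": luhn_valid,
    "email_address": lambda text: EMAIL_RE.match(text) is not None,
}

def classify_column(sample_values, min_share=0.8):
    """Return the first label matched by at least min_share of the non-empty values."""
    non_empty = [v for v in sample_values if v]
    if not non_empty:
        return None
    for label, matches in CLASSIFIERS.items():
        share = sum(1 for v in non_empty if matches(v)) / len(non_empty)
        if share >= min_share:
            return label
    return None

print(classify_column(["4111 1111 1111 1111", "5500 0000 0000 0004"]))
print(classify_column(["a@example.com", "b@example.org", ""]))
```

The threshold tolerates a few dirty values in a column while avoiding a label based on a single accidental match.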

Role model for the visualization of responsibilities

Keyword owner: Data governance is inconceivable without the human component. Purview offers a role model to map responsibilities. Data owners and data stewards can be appointed for each data source or domain. These roles are stored in Purview so that it is clear who the contact person is for each data asset.

This makes the often abstract governance tangible: Specific people - the "data heroes" in the specialist departments - take care of "their" data. The role model also ensures that changes, approvals and quality checks take place in an orderly manner and are not left to chance.

Another governance component in Purview is the domain model. This allows companies to structure their data world along business areas or data domains. Instead of having a monolithic catalog, data is assigned to domains such as marketing, sales, production or finance. These governance domains are aligned with the business concepts and make the data organization logically comprehensible for the company. Each domain can contain its own data products (a collection of data assets for a specific purpose) and have its own owners.

Domain structure and federated governance for scalable implementation

This federated governance approach - centralized standards but decentralized responsibility in the specialist departments - combines the best of both worlds: There is a framework of uniform rules and quality standards for everyone, but at the same time the flexibility for each area to manage its data according to its own requirements. This means that data governance remains scalable, even in heterogeneous organizations. In practice, this means that IT or the CDO sets the rules of the game (e.g. which metadata is maintained, which quality criteria apply), while the specialist departments can work independently within their domain.

Purview supports this approach technically by allowing both global policies and domain-specific settings. Ultimately, a living data ecosystem is created: centrally coordinated, but lived decentrally - a decisive factor in anchoring data governance not only on paper, but also in the organizational culture.
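The interplay of central standards and domain-specific settings can be sketched as a simple merge. This is an illustration of the federated principle only, not Purview's configuration model; the policy keys, the domain names and the rule that a domain may tighten but never loosen the central minimum are assumptions.

```python
# Hypothetical central rules set by IT or the CDO.
GLOBAL_POLICY = {
    "required_metadata": {"owner", "description"},
    "min_quality_score": 0.90,
}

# Hypothetical domain-level settings maintained by the specialist departments.
DOMAIN_SETTINGS = {
    "finance": {"required_metadata": {"retention_period"}, "min_quality_score": 0.95},
    "marketing": {"min_quality_score": 0.80},  # below the central floor, so it has no effect
}

def effective_policy(domain):
    """Combine central rules with domain settings; domains can only add or tighten."""
    local = DOMAIN_SETTINGS.get(domain, {})
    return {
        "required_metadata": GLOBAL_POLICY["required_metadata"]
        | local.get("required_metadata", set()),
        "min_quality_score": max(
            GLOBAL_POLICY["min_quality_score"],
            local.get("min_quality_score", 0.0),
        ),
    }

for name in ("finance", "marketing", "production"):
    print(name, effective_policy(name))
```

The design choice worth noting is the direction of the merge: the central framework acts as a floor, so a domain can be stricter than the organization requires but not more lenient.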

Data quality under control: rules, scorecards and monitoring

Solid governance creates the prerequisite, but only high data quality gives AI projects the necessary fuel. Purview addresses this issue with a whole package of data quality functions.

Data profiling - quickly record the current situation

First of all, data profiling helps to understand the current state of the data: with one click, a data manager can, for example, have a table profiled - Purview then automatically determines key figures such as value distributions, minimum/maximum, average, number of unique values, empty values, etc. This reveals outliers or anomalies (such as a date field with an unusually high number of null values) at an early stage.
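The kind of key figures a profile contains can be reproduced in a few lines. The following is a minimal sketch in plain Python, not Purview's profiler; the function name and the sample column are hypothetical.

```python
from statistics import mean

def profile_column(values):
    """Compute basic profile figures for one numeric column (None marks a missing value)."""
    present = [v for v in values if v is not None]
    profile = {
        "row_count": len(values),
        "null_count": len(values) - len(present),
        "distinct_count": len(set(present)),
    }
    if present:  # min/max/mean are only defined when at least one value exists
        profile["min"] = min(present)
        profile["max"] = max(present)
        profile["mean"] = mean(present)
    return profile

order_amounts = [120.0, 80.0, None, 80.0, 300.0, None]
print(profile_column(order_amounts))
```

Here two of six rows are missing a value, which is exactly the sort of anomaly a profile is meant to surface before a model is trained on the column.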

Define rules and check them automatically

Data quality rules can be defined on this basis. Purview provides predefined, out-of-the-box rules for industry-standard quality dimensions, for example completeness, consistency, conformity to standards, accuracy, timeliness and uniqueness of data. These rules can be applied without programming and can be adapted or supplemented with your own rules as required. Simple examples: "Field X must not be empty", "Value Y must be within the range 0-100", "Attribute Z must have a valid date format" etc. - through to more complex cross-field checks.
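Expressed as code, the example rules look roughly as follows. In Purview such rules are configured without programming, so this is only a sketch of the underlying logic in plain Python; the field names, rule names and the ISO date format are assumptions.

```python
from datetime import datetime

def not_empty(value):
    return value is not None and str(value).strip() != ""

def in_range(value, low, high):
    return isinstance(value, (int, float)) and low <= value <= high

def parse_date(value, fmt="%Y-%m-%d"):
    """Return a date if value matches fmt, otherwise None."""
    try:
        return datetime.strptime(str(value), fmt).date()
    except ValueError:
        return None

def shipped_after_ordered(record):
    """Cross-field check: both dates must be valid and in a plausible order."""
    ordered = parse_date(record.get("order_date"))
    shipped = parse_date(record.get("ship_date"))
    return ordered is not None and shipped is not None and shipped >= ordered

# Hypothetical rule set: rule name -> check applied to one record.
RULES = {
    "customer_id_not_empty": lambda r: not_empty(r.get("customer_id")),
    "discount_in_range": lambda r: in_range(r.get("discount_pct"), 0, 100),
    "order_date_valid": lambda r: parse_date(r.get("order_date")) is not None,
    "shipped_after_ordered": shipped_after_ordered,
}

def failed_rules(record):
    """Return the names of all rules the record violates."""
    return [name for name, rule in RULES.items() if not rule(record)]

good = {"customer_id": "C-1", "discount_pct": 15,
        "order_date": "2025-03-01", "ship_date": "2025-03-04"}
bad = {"customer_id": " ", "discount_pct": 120,
       "order_date": "01.03.2025", "ship_date": "2025-03-04"}
print(failed_rules(good))
print(failed_rules(bad))
```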

Scorecards, aggregation and monitoring for continuous improvement

A threshold value can be assigned to each rule (e.g. at least 95% of the data records must fulfill the rule). If a data quality scan is now carried out, Purview evaluates all selected data columns against the defined rules. The result is a data quality score - usually as a percentage - which shows at a glance how "well" the data meets the set requirements.

Purview aggregates these scores across different levels: Scores are obtained at column and table level, but can also be summarized per data product or domain. For example, the "Finance" domain could have a total score of 88%, which indicates that on average 88% of all quality checks in this area are fulfilled. Such quantified metrics are helpful in making progress measurable and identifying problem areas.
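The arithmetic behind thresholds and aggregated scores can be made concrete with a short sketch. How Purview weights individual rules and columns when aggregating is not described here, so the unweighted average over all checks below is an assumption, as are the rule names and figures.

```python
def rule_status(results, threshold=0.95):
    """results: one boolean per checked record. Compare the pass rate to the threshold."""
    rate = sum(results) / len(results)
    return {"pass_rate": rate, "meets_threshold": rate >= threshold}

def aggregate_score(rule_results):
    """Unweighted share of fulfilled checks across all rules of a domain."""
    checks = [ok for results in rule_results.values() for ok in results]
    return sum(checks) / len(checks)

# Hypothetical scan results for a "Finance" domain, 100 records per rule.
finance = {
    "invoice_id_not_empty": [True] * 98 + [False] * 2,
    "amount_in_range": [True] * 90 + [False] * 10,
    "booking_date_valid": [True] * 76 + [False] * 24,
}

for name, results in finance.items():
    print(name, rule_status(results))
print(f"Finance domain score: {aggregate_score(finance):.0%}")
```

With these figures, 264 of 300 checks are fulfilled, which yields the 88% from the example. Note that only the first rule meets its 95% threshold, so an acceptable-looking domain score can still hide rules that need attention.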

A data steward can, for example, call up an overview page in Purview that displays the quality of all critical data elements, with traffic light colors or trend arrows. Automatic notifications can be triggered in the event of deterioration (e.g. because a source system suddenly delivers an increased number of incorrect entries). This makes data quality a continuous task - deviations are detected at an early stage instead of only in the finished AI report.
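The monitoring logic described above amounts to comparing each scan with the previous one. A minimal sketch in plain Python follows; it does not use Purview's notification features, and the function name, the 5-point tolerance and the score history are hypothetical.

```python
def detect_deterioration(history, max_drop=0.05):
    """history: list of (scan_date, score) tuples, oldest first.

    Returns one alert message for every scan whose score fell by more
    than max_drop compared with the previous scan.
    """
    alerts = []
    for (_, previous), (scan_date, score) in zip(history, history[1:]):
        if previous - score > max_drop:
            alerts.append(f"{scan_date}: score fell from {previous:.0%} to {score:.0%}")
    return alerts

# Hypothetical weekly quality scores for one data product.
history = [("2025-06-01", 0.96), ("2025-06-08", 0.95), ("2025-06-15", 0.84)]
for alert in detect_deterioration(history):
    print(alert)
```

Small fluctuations stay below the tolerance and do not trigger anything, while the sharp drop in the third scan produces an alert that can be routed to the responsible data steward.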

Microsoft describes this approach as follows: "Through systematic data quality management, organizations can effectively measure, monitor and improve the quality of their data, which strengthens the reliability of AI-based insights and promotes confidence in data-driven decisions."

In other words, high-quality data means more reliable AI results - which has a direct impact on the success of the project.

Advantages of Microsoft Purview at a glance

Of course, there are various data governance solutions on the market - but Purview has some tangible advantages:

  • Quick start: As a cloud service, Purview can be used without any major installation effort. The first data sources can be connected and results (e.g. a populated catalog, first quality metrics) achieved within just a few days. This short time-to-value is a plus, especially for companies that want to see their first governance successes quickly.
  • Integration into the Microsoft world: Purview demonstrates its strengths in Microsoft-centric data landscapes in particular. It integrates seamlessly with important Azure services such as Azure Data Lake, Azure SQL, Synapse Analytics or Power BI. For organizations that already rely heavily on Microsoft, Purview is therefore a logical extension - many components (authentication via Microsoft Entra ID, formerly Azure AD; user management; security and compliance features) interlock cleanly. This also makes operation familiar for users and significantly reduces the implementation effort.
  • Microsoft Fabric + Purview - a powerful combination: The synergy between Microsoft Purview and the Microsoft Fabric data platform is particularly noteworthy. Fabric bundles central analytics and AI services from Microsoft in an integrated environment. Purview complements Fabric by providing governance functions such as data discovery, lineage, sensitivity classification and data quality management end to end for the entire data value chain mapped in Fabric. This allows users to benefit from a consistent user experience, access data quickly and ensure that all data assets are used within clear governance guidelines.
  • A wide range of functions under one roof: As described above, Purview covers a great many aspects - from catalog to lineage to data quality. This "one-stop shop" approach prevents isolated solutions. Microsoft is also rapidly developing Purview: in 2024, for example, the new catalog interface, expanded data lineage tools and the comprehensive data quality module were added. For users, this means that Purview is a future-proof platform that grows with their requirements.
  • Licensing and costs: For many existing customers, Purview is (partially) included in existing Microsoft license packages or can be financed via Azure credits. For example, companies with Microsoft 365 E5 already use Purview components for compliance, and Purview Data Governance in Azure follows a pay-as-you-go model, where you pay per data asset actually managed. This means that there are often no high entry costs - an important point for decision-makers.

Limits and challenges

Of course, it's not all sunshine and rainbows - Microsoft Purview also has its limits, which should be viewed realistically:

  • Limited connectors beyond the MS world: Purview shines above all with Azure and Microsoft data sources. The connection of third-party systems (other clouds or on-premises databases) is possible, but still limited in some cases. Purview supports Snowflake databases, for example, and Google BigQuery can also be integrated - although (as of 06/2025) only in a preview version and with a limited range of functions. For very heterogeneous data landscapes that rely heavily on non-Microsoft technologies, it must be checked whether all important sources can be integrated or whether workarounds are necessary.
  • Rapid development - increasing complexity: On the one hand, it is positive that Microsoft Purview delivers new features quickly. On the other hand, the rapid pace of change also means that the interface and functions change frequently. Users report that some modules still seem immature or were initially unstable. For example, the entire catalog interface was overhauled in 2024, which initially required training and adjustment. For users, this means that Purview is not a static tool, but rather a journey. You should plan a certain reserve for updates and training, as "living" tools naturally bring movement into processes.
  • Range of functions vs. specialization: Purview covers a lot of ground, but may not go as deep as specialized tools in certain areas. For example, the business glossary in Purview is useful, but a dedicated metadata management tool could offer more sophistication here. The same applies to Master Data Management (MDM), which Purview does not cover directly (Microsoft relies on partner integrations here, e.g. Profisee). It is important to recognize that Purview is primarily a governance layer - supplementary tools may remain useful for operational data quality assurance or MDM.
  • Dealing with failed data quality scans: Purview does not currently allow detailed information at the level of individual faulty data records (e.g. affected IDs or specific rows) to be viewed directly from the interface. This makes it considerably more difficult to initiate concrete measures for rectification - for example, by directly forwarding a ticket for correction to a responsible team, including all the necessary details. Although such a scenario can theoretically be realized with Purview's workflow function, it requires additional manual configuration and integration (e.g. connection to a ticket system such as Jira or ServiceNow). Compared to specialized data quality tools that already provide such workflows as standard, this aspect is not yet optimally solved in Purview.

Despite these points, the added value that Purview offers outweighs them for many companies. The challenges mentioned are usually manageable, especially as Microsoft is in close contact with the community and responds to feedback. For example, new connectors are constantly being added and teething problems are being fixed.

Anyone opting for Purview should be aware of the dynamics - but it is precisely these dynamics that ensure that Purview keeps its finger on the pulse (keyword: integration of AI into data management, which Microsoft is already hinting at).

Conclusion: Data governance as a starting point for the successful use of AI

AI projects have enormous potential - but their fate depends largely on the underlying data. Poor data quality, unclear responsibilities and silo thinking act like sand in the gears and cause many an ambitious AI project to stall. Conversely, it is clear that investments in data governance and data quality management pay off directly in the speed, quality and acceptance of AI solutions. Tools such as Microsoft Purview offer a practical way to quickly bring order to the data chaos and create a consistent view of the data - from its origin to its use and quality.

People and organizational culture as a success factor

In the end, however, it is not the tools alone that make the difference, but the people who use them. The true data heroes are those decision-makers and employees who have recognized that data-driven success starts at the grassroots level. They establish a culture in their companies that treats data as a valuable asset - with clear rules and responsibilities. Microsoft Purview can serve as an enabler: it simplifies the path by covering many governance tasks out of the box.

But organizations have to take this step themselves. For decision-makers with an interest in the subject, who may not be deep technology specialists, the key message is: AI success is no coincidence, it can be planned - if you lay the right foundations. Instead of just chasing after the next machine learning algorithm, it's worth asking the less glamorous questions: Is our data good enough? Do we know where it comes from? Who is responsible for it? Answering these questions with the help of data governance lays the foundation for your data scientists to shine.

Data governance as a long-term investment

In a world where data is seen as "the new oil", we need people who can refine this oil - data heroes. They ensure that raw data is turned into trustworthy information. And in the end, it is precisely this information that provides the power to make AI projects a success. With a solution like Microsoft Purview, they have all the tools they need to create value from data. It's up to us to use them.

The bottom line: If you want to introduce AI tomorrow, you should start with data governance today. Because the winners of tomorrow will be those who already have their data under control today. The data heroes.

Portrait of Richard Madsack from Dataciders

About the author

Richard Madsack is Lead Data Strategy & Data Governance and has been working in IT for over 15 years, including 6 years focusing on data governance and data excellence. His focus is on data governance, data quality and data catalogs - with the aim of enabling organizations to view data holistically and derive real added value from it.

