Cloud Data Governance and Catalog

Digital transformation initiatives are driving business value for organizations. Yet despite sustained investment, some organizations have struggled to reap the benefits of this transformation because they focus too little on one of their most valuable assets: data. Successful digital transformation depends heavily on providing trusted data to data consumers; however, complex and fragmented data landscapes have made this task increasingly challenging.

To effectively leverage enterprise data for use cases such as enhancing customer experience, enabling innovation, and ensuring greater regulatory compliance, data consumers must be able to trust the data and have visibility into it. Comprehensive data intelligence is essential for organizations aspiring to accelerate their digital transformation journeys.

Cloud Data Governance and Catalog: Predictive Data Intelligence for Data and Analytics Governance

Informatica® Cloud Data Governance and Catalog, a service of Informatica Intelligent Data Management Cloud™ (IDMC), combines data governance, data catalog and data quality capabilities into a single tool for automating data intelligence insights. This IDMC service is built for organizations that want to maximize their investments by deriving value from their vast data assets.

Cloud Data Governance and Catalog delivers predictive data intelligence powered by the Informatica CLAIRE® AI and ML engine.

Organizations that want to drive business value from trusted data will appreciate its automated and recommendation-driven data classification, bulk data curation, relationship discovery and sensitive data discovery. Just as importantly, they can provide data consumers with the business context they need. The IDMC service enables efficient self-service analytics and data governance by unifying the capabilities of data discovery, data lineage, data profiling, data quality, business glossary creation, stakeholder and policy management, and the ability to document and govern AI models and their implementations.

Cloud Data Governance and Catalog integrates into your existing data landscape and scans hybrid sources, including cloud data lakes and warehouses, analytics/BI systems, databases, ETL tools, and other enterprise systems. The IDMC service is cloud-native, meaning you can deploy it into your existing infrastructure almost immediately and at the scale needed.

Key Capabilities

Broad and Deep Metadata Connectivity

Cloud Data Governance and Catalog offers broad and deep metadata connectivity that spans multi-cloud and on-premises environments, allowing you to extract metadata across:

  • Cloud platforms

  • BI tools

  • Databases

  • Multi-vendor ETL

  • Data science tools

  • Various enterprise applications

  • File formats

  • SQL dialects

  • Stored procedures

With universal metadata connectivity to nearly all your data sources and runtime options to run serverless, on-premises or within your virtual private cloud, the IDMC service provides a centralized, comprehensive view of your data.

Inspect scripts, procedures and processes to fully understand logic and internal data flow. Obtain complete column-level data lineage, including an inventory of potential lineage sources with rich details. Scan static and dynamic code and perform language parsing for automated data lineage across the enterprise.
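To make the idea of column-level lineage from parsed code concrete, here is a minimal sketch. It is not the product's scanner: it assumes a single, simple `INSERT INTO ... SELECT ... FROM` statement with one source table and a positional mapping of select expressions to target columns. The function name `extract_lineage` and the sample table names are hypothetical.

```python
import re

# Hypothetical, deliberately naive pattern: one target table with an explicit
# column list and a single source table. Real SQL needs a full parser.
INSERT_SELECT = re.compile(
    r"INSERT\s+INTO\s+(?P<target>[\w.]+)\s*\((?P<tcols>[^)]*)\)\s*"
    r"SELECT\s+(?P<exprs>.*?)\s+FROM\s+(?P<source>[\w.]+)",
    re.IGNORECASE | re.DOTALL,
)

# Identifiers that are not immediately followed by "(" (i.e. not function names).
COLUMN_NAME = re.compile(r"\b[A-Za-z_]\w*\b(?!\s*\()")


def extract_lineage(sql):
    """Return (source_column, target_column) pairs for a simple INSERT ... SELECT."""
    match = INSERT_SELECT.search(sql)
    if match is None:
        return []
    target = match.group("target")
    source = match.group("source")
    target_cols = [c.strip() for c in match.group("tcols").split(",")]
    # Splitting on commas breaks on multi-argument function calls and string
    # literals; that limitation is accepted here to keep the sketch short.
    exprs = [e.strip() for e in match.group("exprs").split(",")]
    edges = []
    for expr, target_col in zip(exprs, target_cols):
        for source_col in COLUMN_NAME.findall(expr):
            edges.append((f"{source}.{source_col}", f"{target}.{target_col}"))
    return edges


sample_sql = """
INSERT INTO mart.customer (customer_id, full_name)
SELECT id, UPPER(name) FROM staging.customers
"""
for source_col, target_col in extract_lineage(sample_sql):
    print(source_col, "->", target_col)
```

For the sample statement, the sketch links `staging.customers.id` to `mart.customer.customer_id` and `staging.customers.name` to `mart.customer.full_name`. Production-grade lineage requires dialect-aware parsing of joins, subqueries, aliases and dynamic code, which this example does not attempt.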

With the Cloud Data Governance and Catalog custom metadata framework, you can use simple Excel files to ingest custom metadata and derive data lineage and relationship links from critical systems where automated scanners are unavailable. Model virtually any data source or data lineage across systems.

Data sources supported include:

  • Informatica: PowerCenter, Data Integration, Multidomain Master Data Management and Business 360 applications

  • Cloud Platforms: Amazon Web Services (AWS) S3, AWS Redshift, AWS RDS (Oracle, MS SQL Server, PostgreSQL, MySQL), DynamoDB, Azure SQL DB, Azure Synapse, Azure ADLS Gen 2, Azure Blob, Google Cloud Storage, Google BigQuery, Snowflake, Databricks Delta Tables, Oracle Cloud Storage, Oracle ADB

  • On-Premises: Oracle, IBM Db2, Netezza, SQL Server, Teradata, JDBC, MySQL, SAP HANA DB, Postgres, MongoDB, Local/Shared Filesystem

  • Database Scripts: DB2 LUW SQL, Microsoft SQL Server SQL, Oracle SQL, Snowflake SQL, Teradata BTEQ

  • BI and Analytics Platforms: Tableau, Microsoft Power BI, QlikView, Qlik Sense, Microsoft SSRS, Cognos, Google Looker

  • Other ETL and Data Science Platforms: Azure Data Factory, Databricks Notebooks, Databricks Unity Catalog, Microsoft SSIS, Microsoft SSAS, Talend

  • Enterprise Applications: Salesforce, Kafka, Workday, Marketo, SAP BW, SAP BW/4HANA, SAP ECC, SAP S/4HANA, SAP BusinessObjects, Dynamics CRM, Microsoft OneDrive, Microsoft SharePoint

  • File Formats: CSV, Delimited, JSON, Avro, Parquet, SFTP, XMI

Contact Informatica for the most current list of supported data sources.

AI-Powered CLAIRE Engine to Drive Insights from Metadata

Automation is critical to managing and governing large data estates. Cloud Data Governance and Catalog uses intelligent data element and entity classification to help automate metadata management and extraction from heterogeneous sources. You can also automate data profiling and classification across data assets at the field, column and table levels.

The solution offers an automated approach to data discovery and classification to reduce the time and effort spent on tedious manual processes that do not scale. Data stewards can review and curate (accept/reject) from more than 215 out-of-the-box automated data classification associations recommended by CLAIRE. Users can modify and extend these classifications or add new ones as needed.
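As an illustration of how a rule-based classification recommendation might be produced for a steward to accept or reject, consider the sketch below. It does not describe how CLAIRE works internally. The three regular-expression rules, the 0.8 match threshold and the function name `recommend_classifications` are all assumptions made for this example.

```python
import re

# Hypothetical classification rules: label -> pattern a value must fully match.
RULES = {
    "Email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "US SSN": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "IPv4 Address": re.compile(r"(\d{1,3}\.){3}\d{1,3}"),
}


def recommend_classifications(column_values, threshold=0.8):
    """Return {label: match_ratio} for every rule that matches at least
    `threshold` of the non-empty sampled values in a column."""
    values = [v.strip() for v in column_values if v and v.strip()]
    if not values:
        return {}
    recommendations = {}
    for label, pattern in RULES.items():
        hits = sum(1 for v in values if pattern.fullmatch(v))
        ratio = hits / len(values)
        if ratio >= threshold:
            recommendations[label] = ratio
    return recommendations


sampled = ["a@example.com", "b@example.org", "n/a", "c@example.net", "d@example.com"]
print(recommend_classifications(sampled))  # {'Email': 0.8}
```

Each returned label is only a recommendation. In the workflow described above, a data steward would still review it and accept or reject the association.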

Cloud Data Governance and Catalog also learns from associations and can auto-tag similar fields and columns across the enterprise using rule-based and AI-based methodologies. The IDMC service can also automatically associate glossary terms with data and infer relationships, such as joins among datasets, using AI/ML capabilities, including schema matching.
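One simple way to think about inferring join candidates is to compare sampled column values across datasets and use column-name similarity as a tiebreaker. The sketch below uses Jaccard overlap and Python's `difflib.SequenceMatcher` as a stand-in technique; it is not the product's AI/ML method. The `suggest_joins` function, the 0.5 overlap threshold and the sample tables are hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive similarity between two column names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def value_overlap(left_values, right_values):
    """Jaccard overlap of the distinct values in two columns."""
    left, right = set(left_values), set(right_values)
    if not left or not right:
        return 0.0
    return len(left & right) / len(left | right)


def suggest_joins(left_table, right_table, min_overlap=0.5):
    """Tables are dicts of column name -> sampled values. Returns tuples of
    (left_column, right_column, overlap, name_similarity), best match first."""
    candidates = []
    for left_col, left_vals in left_table.items():
        for right_col, right_vals in right_table.items():
            overlap = value_overlap(left_vals, right_vals)
            if overlap >= min_overlap:
                similarity = name_similarity(left_col, right_col)
                candidates.append((left_col, right_col, overlap, similarity))
    # Rank by value overlap first, then by how alike the column names are.
    candidates.sort(key=lambda c: (c[2], c[3]), reverse=True)
    return candidates


orders = {"order_id": [1, 2, 3, 4], "cust_id": [10, 11, 12, 10]}
customers = {"customer_id": [10, 11, 12, 13], "name": ["Ann", "Bo", "Cy", "Di"]}
for left_col, right_col, overlap, _ in suggest_joins(orders, customers):
    print(f"{left_col} <-> {right_col} (overlap {overlap:.2f})")
```

With these samples, only `cust_id` and `customer_id` share enough values (three of four distinct values) to be suggested as a join. On real data, sampling bias and coincidental overlap between unrelated numeric columns would need additional safeguards.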

With the CLAIRE activity page, users can view analytics related to automated glossary and classification associations, including metrics on accepted, pending and declined associations. The page provides a central location to identify and act on pending curation actions. These insights help drive the usage of automated associations powered by CLAIRE and can be used to calculate the time saved on curation activities across the organization.
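The kind of calculation described here can be sketched as follows. This is an assumption-laden illustration, not the product's formula: it treats each accepted automated association as saving a fixed number of minutes compared with manual curation, and the default of 3 minutes, the status strings and the function name `curation_summary` are hypothetical.

```python
from collections import Counter


def curation_summary(statuses, minutes_saved_per_accepted=3.0):
    """Summarize association statuses ('accepted', 'pending', 'declined') and
    estimate time saved, assuming a fixed saving per accepted association."""
    counts = Counter(statuses)
    accepted = counts.get("accepted", 0)
    declined = counts.get("declined", 0)
    pending = counts.get("pending", 0)
    reviewed = accepted + declined
    return {
        "accepted": accepted,
        "declined": declined,
        "pending": pending,
        # Share of reviewed recommendations that stewards accepted.
        "acceptance_rate": accepted / reviewed if reviewed else 0.0,
        "estimated_minutes_saved": accepted * minutes_saved_per_accepted,
    }


statuses = ["accepted"] * 6 + ["declined"] * 2 + ["pending"] * 2
summary = curation_summary(statuses)
print(summary["acceptance_rate"], summary["estimated_minutes_saved"])  # 0.75 18.0
```

The per-association saving is the key assumption. An organization would substitute its own measured figure for manual curation effort.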