Friday, June 24, 2016

Informatica Data Validation Option

Informatica Power Center Data Validation Option enables you to test and validate data. Data Validation Option requires the installation of Informatica PowerCenter which carries out majority of work. DVO tool provides complete and effective data validation and testing—without having to write complicated, error-prone SQL statements or use Excel spreadsheets. This option provides reliable and independent method to verify that moves and transformations have not compromised the integrity of your data.
DVO expedite the process for Developers and business analyst to create rules that test the data being transformed during the data integration process. A typical workflow for data validation consists of multiple tasks.


  1. Data Validation Option reads one or more PowerCenter metadata repositories.
  2. Define the validation rules in Data Validation Option.
  3. Run the rules to ensure the data conforms to the validation rules. When you do this, Data Validation Option performs the following tasks:
    • Creates and executes all tests through PowerCenter.
    • Loads results into the Data Validation Option results database and displays them in the Data Validation Option Client.
  4. Examine the results and identify sources of inconsistencies in the ETL process or the source systems.
  5. Repeat this process for new records.
Figure 1 shows DVO architecture. DVO client reads Source and target data table and file definitions from PowerCenter repositories. The test metadata is stored in the Data Validation Option repository. During test execution, Data Validation Option communicates with PowerCenter through an API to create appropriate mappings, sessions, and workflows, and execute them.


Figure 1 Data Validation Option Architecture
Traditional methods of data validation like writing complex SQL queries excel sheet comparisons are error prone, complicated and time consuming. DVO reduces the time required for data validation and production data auditing and verification.
Data Validation Option stores all test definitions and test results in the Data Validation Option repository. It provides three types of reporting types – Summary of Testing Activities, Table Pair Summary and Detailed Test Results. All Data Validation Option reports run against database views that are set up as part of the installation process so you can write custom reports against the results database.
Key Features:
  • Robust Data Validation Environment
  • Packaged Testing Rules and Complete Expressions Library
  • Support for Production Validation Testing
Benefits:
  • Ensure Data’s Accuracy and Trustworthiness,
  • Reduce Data Testing and Validation Time and Resources by 50 to 90 Percent.
  • Make Business Decisions Knowing That Production Systems Are Accurate.
  • Eliminate the manual errors caused while writing complex SQL queries while data testing.
Limitations:
  • Requires basic PowerCenter knowledge. For troubleshooting errors users need to open monitor/designer or workflow and locate error. The tool redirects for more errors refer to workflow log. It requires basic Informatica workflow/session log debugging.
  • Invalid test case creation will result in invalid mapping thus show of no result or run error.
  • Any modifications in source/target metadata require a folder refresh which usually time consuming.
  • Report generation is slow specially when the report you generate contains hundreds of tests or test runs.
Use Case:
Following are some of the use case examples:
  • Folder Comparison/Automatic Test Generation
  • Compare Individual Records between Sources and Targets
  • Constraint based data validation
  • Expression based tests, Aggregate Functions tests
  • Join View, Validity of the lookup logic, SQL View
In entire process of data integration DVO fits typically in phase of data testing and validation. Developers/ Testers can leverage the tool while unit testing, integration testing, migration testing etc. Easy-to-use GUI includes prebuilt operators—no programming skills are needed to create validation tests. Business decisions are based on the available trusted data, it makes more important to leverage DVO capabilities to eliminate risk of introducing bad data and errors—move errors, transformation errors, or operational errors.

8 comments: