HOW TO: Implement SCD Type-3 Mapping in Informatica Cloud (IICS)

Spread the love

What is Slowly Changing Dimension (SCD) Type-3?

A Slowly Changing Dimension (SCD) Type 3 is used to maintain partial history of a record in the dimension table. The record stores the previous value of a chosen attribute along with the current attribute value in SCD Type-3 table.

The implementation of the SCD Type-3 mapping in Informatica Cloud involves

  • Identifying the new record and inserting it into dimension table.
  • Identifying the existing record in dimension table and update it with latest data if there is a change in the data. Also maintain history by storing the previous value of chosen attribute.
  • Identifying the existing record and dropping it in the mapping if there is no change in data.

Setting up Source and Target data objects for Demonstration

For the demonstration purpose consider a flat file employees.csv as a source which provides employees information.

EMP_ID,EMP_NAME,EMP_SALARY
100,Jennifer,6000
101,Micheal,8000

The data should be loaded into the target table Dim_Employees. The structure of the table is as follows.

CREATE TABLE Dim_Employees
   (       
    Employee_Key     Number(6,0),
    Employee_Id        Number(6,0),
    Name                    Varchar2(20),
    Salary                   Number(8,2),
    Prev_Salary          Number(8,2)
    )
;
  • Employee_Key is the surrogate key which increments the value plus one for each record inserted into the table.
  • Prev_Salary field stores the previous salary amount of an employee.
    • If the employee record is inserted for the first time, the Prev_Salary amount would be null.
    • The salary amount of current record becomes the value of Prev_Salary field when an updated record of employee is received.

Steps to Create SCD Type-3 Mapping

The below mapping illustrates the design of SCD Type-3 implementation in Informatica Cloud Data Integration.

SCD Type-3 Mapping in Informatica Cloud

Follow below steps to create a SCD Type-1 mapping in Informatica Cloud.

1. Select the Source Object

In the source transformation, select the employees.csv as the source object. Under Formatting Options, select the delimiter and other flat file related properties. Verify if the source fields are properly read under Fields section.

Source Transformation
Source Transformation

2. Create Output Fields using Expression Transformation

Pass the data from source transformation to the expression transformation.

  • All the fields read from flat file will be of type String. These fields should be converted to the respective data types defined as in the target dimension table.
Creating output fields
Creating output fields

3. Look up on Target Object based on Natural Key

Pass the data to the Lookup transformation. Under Lookup object tab, write a query reading the required fields from the Dim_Employees table as shown below.

SELECT
    Employee_Key   AS    Lkp_Employee_Key ,
    Employee_Id      AS    Lkp_Employee_Id ,
    Salary                 AS    Lkp_Salary
FROM Dim_Employees

Under Lookup Condition tab, select the condition based on employee id fields from source and lookup objects as shown below.

Lookup Transformation
Lookup Transformation

4. Flag data based on output from Lookup

Based on output from lookup transformation, the data needs to be flagged for either Insert or Update.

  • If there is no matching record found for the employee_id read from source in the lookup object, flag the record for Insert.
  • If there is a matching record found for the employee_id read from source in the lookup object and the data is changed, flag the record for Update.
  • If there is a matching record found for the employee_id read from source in the lookup object and the data is unchanged, ignore it.

Pass the data from Lookup to an Expression transformation and create an output field Flag and assign the below field expression to flag the records.

IIF( ISNULL(LKP_EMPLOYEE_KEY), 'I',
    IIF(O_SALARY != LKP_SALARY, 'U')
)

5. Route data to different targets based on Flag

Create two output groups in the Router transformation to route the date to two different targets based on the flag value for Insert (Flag=’I’) and Update (Flag=’U’) operations as shown below.

Router Transformation
Router Transformation

Route the data to two different expression transformation from each of the output group before mapping to target.

6. Configure Sequence Generator for generating Surrogate Keys

A Sequence Generator transformation could be used for generating surrogate keys for the target dimension table. Pass the data from sequence generator to the expression transformation linked to the Insert group of the Router transformation.

  • Create an output field O_EMPLOYEE_KEY of type bigint and assign the field value as NEXTVAL coming from sequence generator.
Creating output surrogate key field using Sequence Generator
Creating output surrogate key field using Sequence Generator

If the surrogate key value generation is handled at the database level, there is no need to use the Sequence generator transformation.

7. Configure Target for Insert Operations

Pass the data from expression to a target transformation. Select the dimension table Dim_Employees as target object with operation defined as Insert.

Target transformation Inserting data into table
Target transformation Inserting data into table

Under Field Mapping section of the target transformation, map the output fields created in the mapping to the respective target fields. The prev_salary will not be mapped with any output field as value is expected to be null for insert operations.

Field Mapping of Target transformation Inserting data into table
Field Mapping of Target transformation Inserting data into table

8. Configure Target for Update Operations

Pass the data from Update group of the router transformation to an expression transformation.

  • Create an output field O_EMPLOYEE_KEY and assign the LKP_EMPLOYEE_KEY field as the value which is read from lookup. This field is used to identify the existing record in the dimension table to update the information.
  • Create an output field O_PREV_SALARY and assign the LKP_SALARY field as the value which is read from lookup.
Creating output fields from data read from Lookup transformation
Creating output fields from data read from Lookup transformation

Pass the data from expression to a target transformation. Select the dimension table Dim_Employees as target object with operation defined as Update. Select the EMPLOYEE_KEY as the Update column. It is the column based on which record is identified in the target and updated.

Target transformation Updating data into table
Target transformation Updating data into table

Under Field Mapping section of the target transformation, map the output fields created in the mapping to the respective target fields.

Field Mapping of Target transformation Updating data into table
Field Mapping of Target transformation Updating data into table

Validate, Save and trigger the mapping.

Verifying data in dimension table after the initial run

The data in the Dim_Employees table after the source data is processed is as below.

Dim_Employees table after the initial data load
Dim_Employees table after the initial data load

Since there is no data in the dimension table during the initial run, both the records are flagged for Insert and the data is inserted.The prev_salary value is null as expected.

Verifying data in dimension table after the second run with modified source data

Consider the salary of employee with id 100 is modified to 9000 from 6000. The other employee data remains unchanged.

EMP_ID,EMP_NAME,EMP_SALARY
100,Jennifer,9000
101,Micheal,8000

After the data is processed by Informatica, the data in the dimension table is as below.

Dim_Employees table after the second load with modified data
Dim_Employees table after the second load with modified data

The salary information of employee with Id 100 is now updated to 9000 and also the previous salary amount of 6000 is stored in prev_salary column. The information of employee with id 101 remains unchanged as there is no change in data.

Conclusion

Please note the below best practices to be followed while developing the mapping.

  • Follow proper naming convention while naming transformation and fields. Observe that all the output fields in the mapping demonstrated are prefixed with ‘O_’ and the lookup fields with ‘LKP_’
  • If there are multiple fields in the dimension table for which change in data needs to be tracked, make sure to include a checksum field in the table as it makes easy to track the changes in the data.
  • In the target transformation, limit the incoming fields by only selecting only the fields with prefix ‘O_’ using Field Rules. This eases the process of field mapping by allowing Automap feature as only required fields are allowed into the target.

Subscribe to our Newsletter !!

Related Articles:

  • HOW TO: Implement SCD Type-2 using Dynamic Lookup in Informatica Cloud (IICS)

    SCD Type-2 Implementation using Dynamic Lookup transformation in Informatica Cloud Data Integration

    READ MORE

  • HOW TO: Implement SCD Type-1 Mapping in Informatica Cloud (IICS)

    A complete guide to implementing SCD Type-1 mapping in Informatica Cloud Data Integration

    READ MORE

  • HOW TO: Implement SCD Type-2 Mapping in Informatica Cloud (IICS)

    A complete guide to implementing SCD Type-2 mapping in Informatica Cloud Data Integration

    READ MORE

Leave a Comment

Related Posts