HOW TO: Denormalize data in Informatica Cloud (IICS)?

Spread the love

Chapter-5

1. Introduction

The Normalizer transformation is used to normalize data i.e. convert one incoming row into multiple output rows. But what if you need to do the exact opposite?

Denormalizing data in Informatica is exact opposite what Normalizer transformation does where data is read in multiple incoming rows, aggregated and converted into columns. In order to Denormalize the data, there is no pre-built transformation available in Informatica Cloud and it needs to be implemented using a mapping logic.

2. Problem Statement

Consider below Student data as the source table

Source Data:

IDNameSubjectScore
1RamMaths50
1RamPhysics65
1RamChemistry70
2NeenaMaths88
2NeenaPhysics90
2NeenaChemistry65
3JohnMaths100
3JohnPhysics45
3JohnChemistry52

You want to denormalize the data into below structure

Target Data:

IDNameMathsPhysicsChemistry
1Ram506570
2Neena889065
3John1004552

3. Solution

To accomplish this follow below steps

1 Select the object with above mentioned data as source and pass the data to an expression transformation.

2. Make sure the data is sorted else sort data using Sorter transformation before passing it to expression transformation.

3. In the expression create an output field for each subject in the source data and assign the value as Score when the record belongs to that particular subject, else assign value as null.

O_MathsIIF(Subject = ‘Maths’, Score, NULL)
O_PhysicsIIF(Subject = ‘Physics’, Score, NULL)
O_ChemistryIIF(Subject = ‘Chemistry’, Score, NULL)

The output from expression will be as below

IDNameO_MathsO_PhysicsO_Chemistry
1Ram50  
1Ram 65 
1Ram  70
2Neena88  
2Neena 90 
2Neena  65
3John100  
3John 45 
3John  52

4. In the next step pass the data from expression transformation to an Aggregator transformation.

5. In the Group By tab of aggregator, select ID and Name fields as Group by fields.

6. In the Aggregate tab of aggregator, create below output fields.

MathsMAX(O_Maths)
PhysicsMAX(O_Physics)
ChemistryMAX(O_Chemistry)

7. Enable Sorted Input in Advanced tab for a better performance in aggregator.

8. Pass the data from aggregator transformation to a target transformation and configure the target.

4. Approach

  • When we compare the source data and the target data, it is evident that some of the fields in target (Maths, Physics, and Chemistry) are not present in the source transformation. So it is clear that these fields’ needs to be created using an expression transformation and the values to be assigned as per the subject.
  • The output from expression transformation gives same number of rows as received from source with newly created fields where Score for each subject is present in only one row with rest of the subject scores as NULL.
  • The introduction of Aggregator transformation is used to combine these records of each student into one record. Hence they are grouped by ID and Name fields. Depending on source data, either Max() or Sum() functions can be used to aggregate the data.

5. How to Denormalize data in Oracle using a SQL query?

The same approach that we applied in Informatica can be applied to denormalize the data in Oracle using a SQL query.

Consider the same source data with table name as Students. The students data can be Denormalized using below SQL query.

SELECT 
   a.ID, 
   a.Name, 
   MAX(a.Maths) AS Maths, 
   MAX(a.Physics) AS Physics, 
   MAX(a.Chemistry) AS Chemistry 
FROM(
SELECT 
	ID, 
	NAME,
	CASE WHEN Subject = 'Maths' THEN Score ELSE null END AS Maths,
	CASE WHEN Subject = 'Physics' THEN Score Else null END AS Physics,
	CASE WHEN Subject = 'Chemistry' THEN Score Else null END AS Chemistry
FROM Students) a 
GROUP BY ID, Name

6. Denormalizing data using PIVOT function

Alternatively you can use PIVOT function in Oracle to denormalize the data to convert the rows into the columns. The PIVOT functions takes data in separate rows, aggregates it and converts it into columns.

The SQL query using PIVOT function to denormalize the data in Oracle is as below

SELECT * FROM Students 
PIVOT (MAX(Score) FOR Subject IN ('Maths' AS Maths, 'Physics' AS Physics,'Chemistry' AS Chemistry))
ORDER BY ID

7. Conclusion

There is another way of viewing the denormalizing data where all the column values are concatenated using a delimiter and stored in a single column. This can be achieved by comparing current record with previous record and the final record for each ID will have the concatenated values of all the records with the same ID. This final record can be later filtered out using Aggregator. The method we discussed here in this article is different as we are converting the row values into separate columns instead of loading the concatenated field values into a single column.

2 thoughts on “HOW TO: Denormalize data in Informatica Cloud (IICS)?”

  1. The same solution can be implemented in a little shorter way:

    For Informatica mapping:
    1. No need to introduce Expression transformation, because Aggregator transformation supports conditional aggregation.
    2. Create following 3 ports at Aggregator transformation (select ID and Name fields as Group by fields):
    Maths —-> MAX(Score, Subject=‘Maths’)
    Physics —> MAX(Score, Subject=‘Physics’)
    Chemistry —> MAX(Score, Subject=‘Chemistry’)
    And map these ports to the Target columns.

    Similarly, in SQL query:
    SELECT
    ID,
    Name,
    MAX(CASE WHEN Subject = ‘Maths’ THEN Score END) AS Maths,
    MAX(CASE WHEN Subject = ‘Physics’ THEN Score END) AS Physics,
    MAX(CASE WHEN Subject = ‘Chemistry’ THEN Score END) AS Chemistry
    FROM Students
    GROUP BY ID, Name

    Reply
    • Hi Sachin,

      Your solution is correct too. But think from the perspective of explaining some one new to concept.
      Breaking down each step using expression transformation helps visualize data flow and understand better.
      Once the concept is understood, it is up to you to optimize the code.

      Reply

Leave a Comment

Related Posts