HOW TO: Denormalize data in Informatica Cloud (IICS)?

Spread the love

Chapter-5

Contents hide

1. Introduction

2. Problem Statement

3. Solution

4. Approach

5. How to Denormalize data in Oracle using a SQL query?

6. Denormalizing data using PIVOT function

7. Conclusion

1. Introduction

The Normalizer transformation is used to normalize data i.e. convert one incoming row into multiple output rows. But what if you need to do the exact opposite?

Denormalizing data in Informatica is exact opposite what Normalizer transformation does where data is read in multiple incoming rows, aggregated and converted into columns. In order to Denormalize the data, there is no pre-built transformation available in Informatica Cloud and it needs to be implemented using a mapping logic.

2. Problem Statement

Consider below Student data as the source table

Source Data:

ID	Name	Subject	Score
1	Ram	Maths	50
1	Ram	Physics	65
1	Ram	Chemistry	70
2	Neena	Maths	88
2	Neena	Physics	90
2	Neena	Chemistry	65
3	John	Maths	100
3	John	Physics	45
3	John	Chemistry	52

You want to denormalize the data into below structure

Target Data:

ID	Name	Maths	Physics	Chemistry
1	Ram	50	65	70
2	Neena	88	90	65
3	John	100	45	52

3. Solution

To accomplish this follow below steps

1 Select the object with above mentioned data as source and pass the data to an expression transformation.

2. Make sure the data is sorted else sort data using Sorter transformation before passing it to expression transformation.

3. In the expression create an output field for each subject in the source data and assign the value as Score when the record belongs to that particular subject, else assign value as null.

O_Maths	IIF(Subject = ‘Maths’, Score, NULL)
O_Physics	IIF(Subject = ‘Physics’, Score, NULL)
O_Chemistry	IIF(Subject = ‘Chemistry’, Score, NULL)

The output from expression will be as below

ID	Name	O_Maths	O_Physics	O_Chemistry
1	Ram	50
1	Ram		65
1	Ram			70
2	Neena	88
2	Neena		90
2	Neena			65
3	John	100
3	John		45
3	John			52

4. In the next step pass the data from expression transformation to an Aggregator transformation.

5. In the Group By tab of aggregator, select ID and Name fields as Group by fields.

6. In the Aggregate tab of aggregator, create below output fields.

Maths	MAX(O_Maths)
Physics	MAX(O_Physics)
Chemistry	MAX(O_Chemistry)

7. Enable Sorted Input in Advanced tab for a better performance in aggregator.

8. Pass the data from aggregator transformation to a target transformation and configure the target.

4. Approach

When we compare the source data and the target data, it is evident that some of the fields in target (Maths, Physics, and Chemistry) are not present in the source transformation. So it is clear that these fields’ needs to be created using an expression transformation and the values to be assigned as per the subject.
The output from expression transformation gives same number of rows as received from source with newly created fields where Score for each subject is present in only one row with rest of the subject scores as NULL.
The introduction of Aggregator transformation is used to combine these records of each student into one record. Hence they are grouped by ID and Name fields. Depending on source data, either Max() or Sum() functions can be used to aggregate the data.

5. How to Denormalize data in Oracle using a SQL query?

The same approach that we applied in Informatica can be applied to denormalize the data in Oracle using a SQL query.

Consider the same source data with table name as Students. The students data can be Denormalized using below SQL query.

SELECT 
   a.ID, 
   a.Name, 
   MAX(a.Maths) AS Maths, 
   MAX(a.Physics) AS Physics, 
   MAX(a.Chemistry) AS Chemistry 
FROM(
SELECT 
	ID, 
	NAME,
	CASE WHEN Subject = 'Maths' THEN Score ELSE null END AS Maths,
	CASE WHEN Subject = 'Physics' THEN Score Else null END AS Physics,
	CASE WHEN Subject = 'Chemistry' THEN Score Else null END AS Chemistry
FROM Students) a 
GROUP BY ID, Name

6. Denormalizing data using PIVOT function

Alternatively you can use PIVOT function in Oracle to denormalize the data to convert the rows into the columns. The PIVOT functions takes data in separate rows, aggregates it and converts it into columns.

The SQL query using PIVOT function to denormalize the data in Oracle is as below

SELECT * FROM Students 
PIVOT (MAX(Score) FOR Subject IN ('Maths' AS Maths, 'Physics' AS Physics,'Chemistry' AS Chemistry))
ORDER BY ID

7. Conclusion

There is another way of viewing the denormalizing data where all the column values are concatenated using a delimiter and stored in a single column. This can be achieved by comparing current record with previous record and the final record for each ID will have the concatenated values of all the records with the same ID. This final record can be later filtered out using Aggregator. The method we discussed here in this article is different as we are converting the row values into separate columns instead of loading the concatenated field values into a single column.

Sachin Kamal
May 22, 2022 at 10:30 pm
The same solution can be implemented in a little shorter way:
For Informatica mapping:
1. No need to introduce Expression transformation, because Aggregator transformation supports conditional aggregation.
2. Create following 3 ports at Aggregator transformation (select ID and Name fields as Group by fields):
Maths —-> MAX(Score, Subject=‘Maths’)
Physics —> MAX(Score, Subject=‘Physics’)
Chemistry —> MAX(Score, Subject=‘Chemistry’)
And map these ports to the Target columns.
Similarly, in SQL query:
SELECT
ID,
Name,
MAX(CASE WHEN Subject = ‘Maths’ THEN Score END) AS Maths,
MAX(CASE WHEN Subject = ‘Physics’ THEN Score END) AS Physics,
MAX(CASE WHEN Subject = ‘Chemistry’ THEN Score END) AS Chemistry
FROM Students
GROUP BY ID, Name
- ThinkETL
  May 30, 2022 at 1:02 pm
  Hi Sachin,
  Your solution is correct too. But think from the perspective of explaining some one new to concept.
  Breaking down each step using expression transformation helps visualize data flow and understand better.
  Once the concept is understood, it is up to you to optimize the code.
  - Prajwal Kalal
    September 8, 2023 at 1:13 am
    Thats exactly y ThinkETL is best
    Best wishes !!
Abhijit
February 11, 2023 at 2:24 am
Thanks ThinkETL team for providing the detailed information.
- ThinkETL
  February 22, 2023 at 11:27 pm
  Thank Abhijit. Glad it helped!!