The Normalizer transformation is used to normalize data i.e. convert one incoming row into multiple output rows. But what if you need to do the exact opposite?
Denormalizing data in Informatica is exact opposite what Normalizer transformation does where data is read in multiple incoming rows, aggregated and converted into columns. In order to Denormalize the data, there is no pre-built transformation available in Informatica Cloud and it needs to be implemented using a mapping logic.
2. Problem Statement
Consider below Student data as the source table
You want to denormalize the data into below structure
To accomplish this follow below steps
1 Select the object with above mentioned data as source and pass the data to an expression transformation.
2. Make sure the data is sorted else sort data using Sorter transformation before passing it to expression transformation.
3. In the expression create an output field for each subject in the source data and assign the value as Score when the record belongs to that particular subject, else assign value as null.
|O_Maths||IIF(Subject = ‘Maths’, Score, NULL)|
|O_Physics||IIF(Subject = ‘Physics’, Score, NULL)|
|O_Chemistry||IIF(Subject = ‘Chemistry’, Score, NULL)|
The output from expression will be as below
4. In the next step pass the data from expression transformation to an Aggregator transformation.
5. In the Group By tab of aggregator, select ID and Name fields as Group by fields.
6. In the Aggregate tab of aggregator, create below output fields.
7. Enable Sorted Input in Advanced tab for a better performance in aggregator.
8. Pass the data from aggregator transformation to a target transformation and configure the target.
- When we compare the source data and the target data, it is evident that some of the fields in target (Maths, Physics, and Chemistry) are not present in the source transformation. So it is clear that these fields’ needs to be created using an expression transformation and the values to be assigned as per the subject.
- The output from expression transformation gives same number of rows as received from source with newly created fields where Score for each subject is present in only one row with rest of the subject scores as NULL.
- The introduction of Aggregator transformation is used to combine these records of each student into one record. Hence they are grouped by ID and Name fields. Depending on source data, either Max() or Sum() functions can be used to aggregate the data.
5. How to Denormalize data in Oracle using a SQL query?
The same approach that we applied in Informatica can be applied to denormalize the data in Oracle using a SQL query.
Consider the same source data with table name as Students. The students data can be Denormalized using below SQL query.
SELECT a.ID, a.Name, MAX(a.Maths) AS Maths, MAX(a.Physics) AS Physics, MAX(a.Chemistry) AS Chemistry FROM( SELECT ID, NAME, CASE WHEN Subject = 'Maths' THEN Score ELSE null END AS Maths, CASE WHEN Subject = 'Physics' THEN Score Else null END AS Physics, CASE WHEN Subject = 'Chemistry' THEN Score Else null END AS Chemistry FROM Students) a GROUP BY ID, Name
6. Denormalizing data using PIVOT function
Alternatively you can use PIVOT function in Oracle to denormalize the data to convert the rows into the columns. The PIVOT functions takes data in separate rows, aggregates it and converts it into columns.
The SQL query using PIVOT function to denormalize the data in Oracle is as below
SELECT * FROM Students PIVOT (MAX(Score) FOR Subject IN ('Maths' AS Maths, 'Physics' AS Physics,'Chemistry' AS Chemistry)) ORDER BY ID
There is another way of viewing the denormalizing data where all the column values are concatenated using a delimiter and stored in a single column. This can be achieved by comparing current record with previous record and the final record for each ID will have the concatenated values of all the records with the same ID. This final record can be later filtered out using Aggregator. The method we discussed here in this article is different as we are converting the row values into separate columns instead of loading the concatenated field values into a single column.