Pushdown Optimization in Informatica Cloud (IICS)

Spread the love

1. Introduction

There are many performance tuning and optimization techniques offered by Informatica. Pushdown Optimization in Informatica Cloud is one such performance tuning technique which increase data processing performance extremely.

In this article let us discuss in detail about Pushdown Optimization technique, its limitations and how to overcome them.

2. Pushdown Optimization

Pushdown Optimization is a performance tuning technique where the transformation logic is converted into SQL and pushed towards either source database or target database or both.

Processing data on database level is much faster and efficient compared to processing the data in Informatica and Pushdown Optimization helps in achieving this in Informatica Cloud.

The amount of data transformation logic that can be pushed to the database depends on the database type, transformation logic, mapping task configuration. Informatica processes all transformation logic at Informatica level that it cannot push to a database.

3. Types of Pushdown Optimization

There are three different types in which Pushdown Optimization can be configured.

  • Source-side Pushdown Optimization
  • Target-side Pushdown Optimization
  • Full Pushdown Optimization

Before we get into understanding each type, consider below mapping as an example and we will discuss how the mapping behaves with each type enabled.

  • The above mapping is a simple mapping which reads Oracle table EMPLOYEES table as source.
  • Next we are applying a filter to read only details of employees with salary greater than 10000.
  • Next the data is sorted based on EMPLOYEE_ID before loading to target table.

When the mapping is triggered with no Pushdown optimization enabled, the entire records from source table are read and then Informatica applies filter on top of that data at Informatica level. Then the filtered data is sorted based on employee id before loading into target.

We have read 107 records from source and after transformation logic finally 15 records are loaded into target.

Let us now understand how the data is processed when each Pushdown Optimization method is enabled.

3.1 Source-side Pushdown optimization

In this method Informatica pushes as much transformation logic as possible to the Source database. The Data Integration Service generates and executes a SELECT statement based on the transformation logic for each transformation it can push to the database.

To enable the Source side Pushdown Optimization, go to Schedule tab of the Mapping task assigned to the mapping > Advanced Session Properties > click Add > from the drop down select Pushdown Optimization as Session Property Name and To Source as Session Property Value.

When the mapping completes, the Source/Target Results are as below

In the previous run entire 107 source records are read. In the current run with Source-side Pushdown Optimization enabled, only required 15 source records are read by Informatica.

In the session log we can see the Pushdown optimizer generated SQL Query as

SELECT 
HR.EMPLOYEES.EMPLOYEE_ID, 
HR.EMPLOYEES.FIRST_NAME, 
HR.EMPLOYEES.SALARY,
HR.EMPLOYEES.DEPARTMENT_ID 
FROM HR.EMPLOYEES 
WHERE (HR.EMPLOYEES.SALARY > 10000) 
ORDER BY HR.EMPLOYEES.EMPLOYEE_ID

The filter condition and sorting logic is pushed towards Source database there by reducing the number of records read from source and increasing the performance by avoiding processing the unnecessary data.

3.2 Target-side Pushdown optimization

In this method Informatica pushes as much transformation logic as possible to the target database. The Data Integration Service generates an INSERT, DELETE, or UPDATE statement based on the transformation logic for each transformation it can push to the target database.

To enable the Target side Pushdown Optimization, select Pushdown Optimization as Session Property Name and To Source as Session Property Value under Advanced Session Properties.

When the mapping completes, the Source/Target Results are as below

The results are identical to the initial run with no Pushdown Optimization enabled.

This is because there is no additional transformation logic that could be pushed to target database. Hence from the source end all the records are read again and the transformation logic is applied on them.

3.3 Full Pushdown optimization

In this method Informatica pushes as much transformation logic as possible to the target database. If the entire logic cannot be pushed to target database, it performs source-side pushdown and rest of the intermediate transformation logic which cannot be pushed to any database is processed at Informatica level.

In order to take advantage of Full Pushdown Optimization method, the source and target databases connections should be same.

To enable the Target side Pushdown Optimization, select Pushdown Optimization as Session Property Name and Full as Session Property Value under Advanced Session Properties.

When the mapping completes, the Source/Target Results are as below

There is only target database details mentioned. This is because the entire logic is pushed completely to target database.

In the session log we can see the Pushdown optimizer generated SQL Query as

INSERT INTO HR.EMP(EMPLOYEE_ID, NAME, SALARY, DEPARTMENT_ID) SELECT
HR.EMPLOYEES.EMPLOYEE_ID, 
CAST(HR.EMPLOYEES.FIRST_NAME AS VARCHAR(20)), 
HR.EMPLOYEES.SALARY, 
HR.EMPLOYEES.DEPARTMENT_ID 
FROM HR.EMPLOYEES 
WHERE (HR.EMPLOYEES.SALARY > 10000) 
ORDER BY HR.EMPLOYEES.EMPLOYEE_ID

A direct INSERT statement is applied on target table selecting the data from source table with all the other transformation logic of filter and sorting applied.

Though the data we used for testing is not significantly high to come down to any conclusion, I could see the run duration as below for each Pushdown method

PushDown MethodDuration
None15 Seconds
Source13 seconds
Target15 Seconds
Full10 Seconds

3.4 User-defined Pushdown optimization

Pushdown Optimization can also be configured using a user-defined parameter value defined in a parameter file.

To configure user-defined Pushdown Optimization, select Pushdown Optimization as Session Property Name and $$PushdownConfig as Session Property Value under Advanced Session Properties.

Configure a parameter file to use the user-defined parameter $$PushdownConfig in the below format.

$$PushdownConfig=<Pushdown optimization type>

Use the Pushdown optimization type as Source, Target, Full or None.

4. Limitations of Pushdown Optimization

  1. Source, Target and Lookup databases should be in the same relational database management system to use Full Pushdown Optimization.
  2. Lookup transformation must use Report Error for Multiple Match Policy. Using any other option for Multiple Matches makes the PDO stop at Lookup transformation.
  3. All the Informatica functions may not be supported by databases to convert as a SQL. If the function is not supported by database, Pushdown Optimization cannot be applied. The solution is to use custom query in Source transformation.
  4. Using variable ports in Expression transformation is not supported by Pushdown Optimization. The solution is to use to handle the logic in source query.
  5. In order to use SQL override query in source transformation, enable advanced session property ‘Allow Temporary View for Pushdown’ for PDO to work.
  6. In order to use Sequence generator transformation in the mapping, enable advanced session property ‘Allow Temporary Sequence for pushdown‘ for PDO to work.
  7. Informatica Powercenter provides access to Pushdown Optimization viewer which shows how the transformation logic is pushed down to database based on the PDO method you selected. In Informatica Cloud there is no access to Pushdown Optimization viewer. So you will not know how the Informatica pushes the transformation logic to database until you run the job and check the session log.

5. Pushdown Optimization Advanced Options

Informatica offers advanced options for Pushdown Optimization to overcome some of the limitations.

5.1 Allow Temporary View for Pushdown

When you use a SQL override query in the mapping, advanced session property Allow Temporary View for Pushdown should be enabled to use Pushdown Optimization. This enables the data integration service to create temporary views in database when a SQL override query is used in source and lookup transformations.

To enable, select Allow Temporary View for Pushdown as Advanced Session Property Name and select Yes as Session Property value in Schedule tab of mapping configuration task.

Let us understand how this works using an example.

In the above example we have discussed for understanding various Pushdown Optimization methods, following changes were made.

A SQL override query is used in source transformation instead of reading the EMPLOYEES object directly and the filter and sorter transformation logic is also covered in the same query as well. The query used in source transformation is as below.

SELECT 
HR.EMPLOYEES.EMPLOYEE_ID, 
HR.EMPLOYEES.SALARY, 
HR.EMPLOYEES.FIRST_NAME AS NAME,
HR.EMPLOYEES.DEPARTMENT_ID
FROM HR.EMPLOYEES
WHERE HR.EMPLOYEES.SALARY > 10000
ORDER BY HR.EMPLOYEES.EMPLOYEE_ID

A Lookup transformation is used to read DEPARTMENTS table based on Department_ID and fetch Department_Name. The Multiple Matches option is set as Report error.

The data integration service first creates a temporary view using the source SQL override query as shown below

CREATE VIEW PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE (EMPLOYEE_ID, SALARY, NAME, DEPARTMENT_ID) AS SELECT 
HR.EMPLOYEES.EMPLOYEE_ID, 
HR.EMPLOYEES.SALARY, 
HR.EMPLOYEES.FIRST_NAME AS NAME,
HR.EMPLOYEES.DEPARTMENT_ID
FROM HR.EMPLOYEES
WHERE HR.EMPLOYEES.SALARY > 10000
ORDER BY HR.EMPLOYEES.EMPLOYEE_ID;

Then the entire transformation logic is pushed to target using the temporary view created as below.

INSERT INTO HR.EMP(EMPLOYEE_ID, NAME, SALARY, DEPARTMENT_ID, DEPARTMENT_NAME) SELECT 
PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE.EMPLOYEE_ID, 
PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE.NAME, 
PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE.SALARY, 
PM_ALookup_1.DEPARTMENT_ID, 
PM_ALookup_1.DEPARTMENT_NAME FROM 
(PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE LEFT OUTER JOIN "HR"."DEPARTMENTS" PM_ALookup_1 ON (PM_ALookup_1.DEPARTMENT_ID = PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE.DEPARTMENT_ID)) 
WHERE ((PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE.DEPARTMENT_ID = (SELECT PM_ALookup_1.DEPARTMENT_ID FROM "HR"."DEPARTMENTS" PM_ALookup_1 WHERE (PM_ALookup_1.DEPARTMENT_ID = PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE.DEPARTMENT_ID))) OR (0=0))

Once the data is loaded into the target, the temporary view created will be dropped.

DROP VIEW PM_V3R6KSJXGZFYM3HZ7MKR3CJ3DCE;

5.2 Allow Temporary Sequence for Pushdown

When a sequence generator transformation is used in the mapping, advanced session property Allow Temporary Sequence for Pushdown should be enabled to use Pushdown Optimization. This enables the data integration service to create temporary sequences in database.

To enable, select Allow Temporary Sequence for Pushdown as Advanced Session Property Name and select Yes as Session Property value in Schedule tab of mapping configuration task

  • Like we discussed in above example, data integration service creates a temporary sequence with the Sequence generator properties in the database.
  • The temporary sequence is then used in the SQL query pushed by data integration server to database.
  • When the data load is completed into target, the data integration service drops the temporary sequence created in the database.

5.3 Enable cross-schema pushdown optimization

When the source and target tables belong to two different schemas with in the same database, enable cross-schema pushdown optimization.

To use cross-schema pushdown optimization, create a separate connection for each schema. The username, password and rest of the details of the connection should be same except the schema details.

To enable cross-schema pushdown optimization, select Full Pushdown Optimization from Advance Session Properties and Select the Enable cross-schema pushdown optimization check box in Schedule tab of mapping configuration task.

Let’s break this down to understand how this works.

Active Database: During pushdown optimization, the Data Integration Service pushes the transformation logic to one database, which is called the active database.

Idle Database: A database that does not process transformation logic is called an Idle database.

During full Pushdown Optimization, the entire transformation logic is pushed towards target in below format in the examples we discussed.

INSERT INTO ACTIVE_DATABASE_TABLE( FieldNames) SELECT FieldNames from IDLE_DATABASE_TABLES

If you have observed, I have used tables from same schema HR as source, lookup and target in the examples discussed so far. What if I use a target table from a different schema SCOTT?

The SCOTT user cannot access the tables in HR schema, hence the pushdown optimization stops at target stating below reason and the transformation logic is just pushed towards source.

Pushdown optimization stops at transformation [EMP] because the connections are not pushdown compatible.

In order to implement full pushdown optimization, I have created two separate connections again with “system” username which has access to both HR and SCOTT schemas. I just used different schema names HR and SCOTT in each newly created connection.

5.4 Allow Pushdown for User Incompatible Connections

Enable Allow Pushdown for User Incompatible Connections when the username and passwords of Active and Idle database users are different but compatible.

When multiple connections are used in a mapping, the data integration service selects only one connection as active connection. In this case the user of the Active database must have read access on all the idle database tables to pushdown SQL query.

To enable, select Allow Pushdown for User Incompatible Connections as Advanced Session Property Name and select Yes as Session Property value in Schedule tab of mapping configuration task.

To see if Allow Pushdown for User Incompatible Connections is applicable for your connector, see help and instructions related to your connector.

5.5 Allow Pushdown across Databases

Enable Allow Pushdown across Databases to configure cross database pushdown optimization. The source and target connections must be different and pointing to two different databases.

To enable, select Allow Pushdown across Databases as Advanced Session Property Name and select Yes as Session Property value in Schedule tab of mapping configuration task.

To see if Allow Pushdown across databases is applicable for your connector, see help and instructions related to your connector.

6. Conclusion

Pushdown Optimization increases the mapping performance when you deal with large volumes of data. It avoids reading large volumes of data to and from between the database and data integration service across the network there by saving on network throughput time.

As much as Pushdown Optimization is efficient, it is not supported for all the connectors. You need to check if the connector you are going to use fall under the supported list.

Also lack of Pushdown Optimization viewer is a big minus in Informatica Cloud. So it needs a bit of trial and error while designing the mapping to see if a full advantage of Pushdown Optimization can be taken.

10 thoughts on “Pushdown Optimization in Informatica Cloud (IICS)”

  1. Very good content, requesting you if you can add few example for input and inout parameters that would be really helpful.

    Reply

Leave a Comment

Related Posts