Sharing Knowledge: How to optimize Informatica Aggregator Transformation

Tuesday, June 3, 2014

How to optimize Informatica Aggregator Transformation

In Informatica Power Center, Aggregator transformation is one of the heavily used transformation and one of expensive transformation in terms of the performance, especially when you are having large data set for processing. Now question is how to optimize or tune the aggregator transformation. There are multiple ways to optimize aggregator transformation in Informatica. I have listed out some of them –

Group by simple columns: Aggregator transformation performs better when it’s grouped by simple columns. If possible for grouping use number field instead of string or date.
Use sorted input: It will reduce the aggregator cache which will improve performance.
Use incremental aggregation: it will only cache and process the changed data rather than entire set of data.

Despite of applying above tuning method you may not get the expected performance. Now let us try to see if we can avoid the aggregator transformation completely by using the analytical functions in Source Qualifier. This will be possible only if all the data elements in aggregator transformation are from same source database. This won’t work if source is flat file or if all the data elements are not from same source database. I have given examples for couple of analytical functions that can be used in Source Qualifier.

Let’s say you would like to get the average age of the employees reporting to the same manager at the same time you need employee name and their age as well. Without analytical function you would need to have two queries joined one would get the average age another would get then employee name and their age. By using analytical function you can get this in one query like this -

SELECT MANAGER_NAME, EMP_NAME, AGE,

AVG(AGE) OVER (PARTITION BY MANAGER_NAME) AS AVG_AGE

FROM FACT_EMP;

MANAGER_NAME	EMP_NAME	AGE	AVG_AGE
Mike	John	30	40.5
Mike	David	34	40.5
Mike	Sam	50	40.5
Mike	Bob	48	40.5

Similarly if you like to get the summation of salary for each manager then you can write a query like this –

SELECT MANAGER_NAME, EMP_NAME, SALARY,

SUM(SALARY) OVER (PARTITION BY MANAGER_NAME) AS SAL_BY_MANAGER

FROM FACT_EMP;

MANAGER_NAME	EMP_NAME	SALARY	SAL_BY_MANAGER
Mike	John	80000	315000
Mike	David	75000	315000
Mike	Sam	60000	315000
Mike	Bob	100000	315000

There are so many other analytical functions, like AVG, COUNT, DENSE_RANK, LAG, LEAD, MAX, MIN, RANK, ROW_NUMBER, SUM, etc.

For detailed explanation about analytical function and full list of analytical functions please refer to your database manual.

I would be glad to see your comment or feedback about this topic.

2 comments:

UnknownOctober 27, 2015 at 9:22 PM
I am reading your post from the beginning, it was so interesting to read & I feel thanks to you for posting such a good blog, keep updates regularly
informatica online training
ReplyDelete
Replies
AltınGözler96October 15, 2023 at 10:29 PM
https://titandijital.com.tr/
kars parça eşya taşıma
konya parça eşya taşıma
çankırı parça eşya taşıma
yalova parça eşya taşıma
3SXUZ
ReplyDelete
Replies

Add comment