Friday, June 24, 2016

Automation of New Column Addition in Informatica

In my project I see many recurring requirements of new column addition in existing tables hence Informatica mapping modification again and again so thought of automating this activity and reduce the overall implementation time hence cost benefits to the customer. 
In this blog I will explain the framework that we implemented for this automation. Here I am taking source as a relational object and target as a Netezza table. Source data should be inserted or updated based on availability of data in target. This framework can easily be customized for any relational source/target & flat file.
1. First step is to add new dummy columns as Strings in Source Qualifier and precision should be on higher side.
2. Create the Target as a flat file, copy the existing attributes from Netezza target table and add new dummy columns same as Source Qualifier.

3. Create the parameters as String and precision should be on higher side. Number of parameters should be same as the new dummy columns that you added in above source & target. Initial value should be NULL.
4. Create the Transformation parameter with “IsExpression Variable” as TRUE. This is for to add any transformation logic for new columns that you will add in future.
5. Now modify the Source Qualifier query. Add the parameters in query that you created in above steps like below
SELECT distinct case when t.emp_id is null then ‘I’ else ‘U’ end as flag,
s.emp_id, s.first_name, s.last_name,
$$COLUMN1 as NEWFIELD,
$$COLUMN2 as NEWFIELD1,
$$COLUMN3 as NEWFIELD2,
$$COLUMN4 as NEWFIELD3,
$$COLUMN5 as NEWFIELD4,
$$COLUMN6 as NEWFIELD5
FROM
src_newcolumnaddition s left join
tgt_newcolumnaddition t on (s.emp_id = t.emp_id)
6. In expression transformation, select the NEWFIELDs as input and create Variable/Output Ports and assign the Transformation Parameters in Expression, like below –
7. Now connect the ports appropriately to the target. Mapping creation is completed here and create the session & workflows.
8. In workflow provide the parameter file name. When there is no new column addition then no need to pass any parameter value, by default it will take as a NULL value.
9. Now let’s say you have the requirement of new column addition in source as “Date of Birth” and in Target as “Age”. This is how parameter file should look like
[folder_name.WF:wkf_test_NewColumnAddition.ST:s_test_NewColumnAddition]
$$COLUMN1=DOB
$$TRNSFRM1=date_diff(SYSDATE, to_date(NEWFIELD,’yyyy-mm-dd’), ‘YY’)
10. Using above steps you would be to generate Target flat file with new columns as mentioned in parameter file.
11. After this step we need to use script to load the data in target table. I have used python script to automate the load process but this can be implemented in shell script as well.
In Netezza, nzload script throws an error if your file has more columns than the table. To resolve this script will get the list of target table columns first, remove the extra NULL columns from flat file and execute the insert & update queries in database. These insert & update queries will get generated dynamically based on the number of columns in table.
In this script, following are the parameters -
-f : Source file name which is generated by informatica
-df: Delimiter Character
-t: Target table name
-d: Database/Schema where above target table exists
If you like this idea and would like to implement then please let me know I would be very happy to assist you.
Taking this idea to next level is create template to do the data integration. In next blog I will show you the template creation and execution for different source/target objet
Please let me know your thoughts/comment on this. 

7 comments:

  1. Hi admin, this steps are really helpful and I have learned to add new column addition in Informatica. Keep sharing more like this.
    Informatica Training in Chennai | Informatica Training institutes in Chennai | Best Informatica Training Institute In Chennai

    ReplyDelete
  2. Your website is really cool and this is a great inspiring article. Thank you so much. https://view.ly/v/EPuUGO2Zqek0

    ReplyDelete
  3. I think Informatica is amongst the best tools to analyse,visualise and integrate data.

    Informatica Read JSON

    ReplyDelete