This is my sample file
#%cty_id1,#%ccy_id2,#%cty_src,#%cty_cd3,#%cty_nm4,#%cty_reg5,#%cty_natnl6,#%cty_bus7,#%cty_data8
690,ALL2,,AL,ALBALODMNIA,,,,
90,ALL2,,
Your best bet here may be to use the tSchemaComplianceCheck
component in Talend.
If you read the file in with a tFileInputDelimited
component and then check it with the tSchemaComplianceCheck
where you set cty_cd
to not nullable then it will reject your Antarctica row simply for the null where you expect no nulls.
From here you can use a tMap and simply map the fields to the one above.
You should be able to easily tweak this as necessary, potentially with further tSchemaComplianceCheck
s down the reject lines and mapping to suit. This method is a lot more self explanatory and you don't have to deal with complicated regex's that need complicated management when you want to accommodate different variations of your file structure with the benefit that you will always capture all of the well formatted rows.