How to split a column which has data in XML form to different rows of new Database as KEY VALUE in TALEND

纵饮孤独 提交于 2019-12-24 20:17:35

问题


In old DB i have a data in one column as

<ADDRESS>
<CITY>ABC</CITY>
<STATE>PQR</SERVICE>
</ADDRESS>

In my new DB i want this data to be stored in KEY VALUE fashion like:

USER_ID            KEY                VALUE
1                  CITY                ABC
1                  STATE               PQR

Someone please help me how to migrate this kind of data using TALEND tool.


回答1:


Design job like below.

tOracleInput---tExtractXMLFiled---output.

  1. tOracleInput component you can select XML column and make datatype as String.
  2. tExtractXmlFiled component pass this XML column as " XML Filed" and set the Loop xpath Expression as "/ADDRESS"
  3. Add new two Columns in output Schema of tExtractXmlFiled for city & STATE
  4. Set XPath Query in Mapping for city "/ADDRESS/CITY" and for STATE "/ADDRESS/STATE"
  5. Now you have both the values in output.

See the image for more details.

as I explain in your previous post you can follow the same approach for making Key value pair.
how-to-split-one-row-in-different-rows-in-talend

Or you can use tUnpivot component as you did here.

As you said source data has Special character then use below expression to replace it.

Steps: after oracle input add tMap and use this code for replacement of special symbol

row24.XMLField.replaceAll("&", "<![CDATA["+"&"+"]]>")   

once that is done execute the job and see the result it should work.




回答2:


I'd use tJavaFlex.

Component Settings:

tJavaFlex schema:

In the begin part, use

String input = ((String)globalMap.get("row2.xmlField")); // get the xml Fields value
String firstTag = input.substring(input.indexOf("<")+1,input.indexOf(">"));
input = input.replace("<"+firstTag+">","").replace("</"+firstTag+">","");  
int tagCount = input.length() - input.replace("</", "<").length();
int closeTagFinish = -1;
for (int i = 0; i<tagCount ; i++) { 

in the main part, parse the XML tag name and value, and have the output schema contain that 2 additional column. MAIN part will be like:

/*set up the output columns */
output.user_id = ((String)globalMap.get("row2.user_id"));
output.user_first_name = ((String)globalMap.get("row2.user_first_name"));
output.user_last_name = ((String)globalMap.get("row2.user_last_name"));

Then we can calculate the key-value pairs for the XML, without knowing the KEY values.

/*calculate columns out of XML */
int openTagStart = input.indexOf("<",closeTagFinish+1);
int openTagFinish = input.indexOf(">",openTagStart);
int closeTagStart = input.indexOf("<",openTagFinish);
closeTagFinish = input.indexOf(">",closeTagStart);
output.xmlKey = input.substring(openTagStart+1,openTagFinish);
output.xmlValue = input.substring(openTagFinish+1,closeTagStart);

tJavaFlex End part:

}

Output looks like:

.-------+---------------+--------------+------+--------.
|                      tLogRow_2                       |
|=------+---------------+--------------+------+-------=|
|user_id|user_first_name|user_last_name|xmlKey|xmlValue|
|=------+---------------+--------------+------+-------=|
|1      |foo            |bar           |CITY  |ABC     |
|1      |foo            |bar           |STATE |PQR     |
'-------+---------------+--------------+------+--------'


来源:https://stackoverflow.com/questions/27837015/how-to-split-a-column-which-has-data-in-xml-form-to-different-rows-of-new-databa

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!