pentaho-data-integration

Make a DB INSERT based on Text File Input metadata

此生再无相见时 submitted on 2019-12-12 04:54:28
Question: I'm developing an ETL and need some routines to monitor it. At the beginning, I must make an INSERT on the DB to create a record with the filename and the process start datetime. This query returns the record's PK, which must be stored. When the ETL of that file finishes, I must update that record to mark the ETL as finished successfully and set its process end datetime. I use Text File Input to look for files that match its regex, and add its "Additional output fields" to the stream. But
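
The insert-then-update bookkeeping described in this question can be sketched outside PDI. This is a minimal illustration only, assuming an SQLite database as a stand-in and a hypothetical `etl_log` table; the table name, column names, and the helpers `make_log_table`, `start_run`, and `finish_run` are not from the original question.

```python
import sqlite3
from datetime import datetime, timezone


def make_log_table(conn):
    """Create the hypothetical monitoring table (names are assumptions)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS etl_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "filename TEXT NOT NULL, "
        "started_at TEXT NOT NULL, "
        "finished_at TEXT, "
        "status TEXT NOT NULL)"
    )
    conn.commit()


def start_run(conn, filename):
    """Insert the monitoring record and return its generated primary key."""
    cur = conn.execute(
        "INSERT INTO etl_log (filename, started_at, status) VALUES (?, ?, ?)",
        (filename, datetime.now(timezone.utc).isoformat(), "RUNNING"),
    )
    conn.commit()
    return cur.lastrowid  # the PK that must be kept until the file is done


def finish_run(conn, run_id, status="SUCCESS"):
    """Mark the record as finished and store the end datetime."""
    conn.execute(
        "UPDATE etl_log SET finished_at = ?, status = ? WHERE id = ?",
        (datetime.now(timezone.utc).isoformat(), status, run_id),
    )
    conn.commit()
```

A caller would keep the value returned by `start_run` for the duration of the file's processing and pass it to `finish_run` at the end. Other databases expose the generated key differently (for example through a `RETURNING` clause), so the `lastrowid` call is specific to this stand-in.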

How to get a resultset from a dynamic select-based sql query in Pentaho-Kettle?

送分小仙女□ submitted on 2019-12-11 19:52:55
Question: I want to execute a set of SELECT-based SQL queries, derived from node elements within an XML file, and write the values of the corresponding result sets to a CSV file. To clarify: no field of the SQL query is parameterized; the full SQL query itself is the parameter. Getting the full SQL query works as expected, but I don't know how to launch the query so that I can get the corresponding result set and work with it later. What I've tried until
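
The flow described here (read whole queries from XML, run each one, write the result set to CSV) can be sketched in plain Python. This is an illustration under assumptions, not the PDI solution: the `<query>` element name, the SQLite stand-in connection, and the function `run_queries_from_xml` are all hypothetical.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET


def run_queries_from_xml(conn, xml_text, out):
    """Run every SELECT found in <query> elements and write the rows as CSV.

    Returns the number of queries that were executed.
    """
    writer = csv.writer(out)
    executed = 0
    for node in ET.fromstring(xml_text).iter("query"):
        sql = (node.text or "").strip()
        # The whole statement comes from the file, so only SELECTs are run.
        if not sql.lower().startswith("select"):
            continue
        cur = conn.execute(sql)
        # Column names come from the cursor, since each query may differ.
        writer.writerow([col[0] for col in cur.description])
        writer.writerows(cur)
        executed += 1
    return executed
```

Because the SQL text is taken verbatim from the file, the sketch only guards against non-SELECT statements; it assumes the XML source is trusted.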

PDI - Read CSV Files, if missing field/data then move to the next file

核能气质少年 submitted on 2019-12-11 17:49:07
Question: I'm new to PDI and still learning it. I'm trying to create a transformation that reads all the CSV files from one folder, checks that each file's data is correct (no rows with missing, erroneous, or wrongly formatted values), and then stores it in a database. What I have tried: use Text File Input to access the CSV file over FTP using Apache Commons VFS; validate the data in the CSV (the filename, whether the fields exist) with a condition in Filter Rows; output into a PostgreSQL table using
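
The "validate the whole file first, skip it if anything is missing" rule can be sketched as follows. This is a minimal Python illustration assuming local files and a caller-supplied list of required columns; `validate_csv` and `load_valid_files` are hypothetical helpers, and the FTP access and the PostgreSQL load from the question are left out.

```python
import csv
from pathlib import Path


def validate_csv(path, required_fields):
    """Return the parsed rows, or None if any required field is missing."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        if any(field not in header for field in required_fields):
            return None  # a required column is absent from the header
        rows = []
        for row in reader:
            if any(not (row.get(field) or "").strip() for field in required_fields):
                return None  # one bad row rejects the whole file
            rows.append(row)
        return rows


def load_valid_files(folder, required_fields):
    """Split the CSV files of a folder into accepted rows and skipped names."""
    accepted, skipped = {}, []
    for path in sorted(Path(folder).glob("*.csv")):
        rows = validate_csv(path, required_fields)
        if rows is None:
            skipped.append(path.name)  # move on to the next file
        else:
            accepted[path.name] = rows
    return accepted, skipped
```

Reading the full file before loading anything means a bad row near the end still prevents a partial load, at the cost of holding one file's rows in memory.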

How to run different sql to get data according to the previous input data in pentaho kettle

﹥>﹥吖頭↗ submitted on 2019-12-11 17:08:34
Question: I use Pentaho Kettle 8.2 on Windows 10 with an Oracle database. I have a requirement and don't know how to implement it. The requirement is: step 1: get data 1 from the DB; step 2: get data 2 from a different table (SQL) depending on a field of step 1's data 1; step 3: update another DB according to data 2 from step 2. Step 1 is easy, since it only reads data from one DB. In step 2, I try to get data based on step 1's output: I use Switch/case to evaluate step 1's result and then use a different SQL script ,
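
The three steps in this question can be sketched as a lookup from a field value to a SQL statement. This is a Python illustration under assumptions, using two SQLite connections as stand-ins for the Oracle source and the target database; the tables `step1`, `table_a`, `table_b`, and `result`, the `kind` column, and the function `sync` are all hypothetical.

```python
import sqlite3

# Hypothetical mapping: the value of step 1's field selects the step 2 query.
QUERY_BY_KIND = {
    "A": "SELECT value FROM table_a WHERE ref_id = ?",
    "B": "SELECT value FROM table_b WHERE ref_id = ?",
}


def sync(source, target):
    """Run steps 1 to 3 and return the number of rows updated in the target."""
    updated = 0
    # Step 1: read the driving rows from the source database.
    step1_rows = source.execute("SELECT id, kind FROM step1 ORDER BY id").fetchall()
    for ref_id, kind in step1_rows:
        # Step 2: choose the query from the field value (the Switch/case role).
        sql = QUERY_BY_KIND.get(kind)
        if sql is None:
            continue  # no query configured for this value
        for (value,) in source.execute(sql, (ref_id,)).fetchall():
            # Step 3: update the other database with the looked-up value.
            cur = target.execute(
                "UPDATE result SET value = ? WHERE ref_id = ?", (value, ref_id)
            )
            updated += cur.rowcount
    target.commit()
    return updated
```

Keeping the queries in a dictionary means a new case only needs a new entry, and rows whose field value has no entry are skipped rather than failing the run.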