kettle

Java code for creating Pentaho reports that accept a Data Integration (.ktr) file as input

南笙酒味 submitted on 2020-01-07 04:57:13
Question: I am looking for Java code to display Pentaho reports in HTML/PDF format, with a Data Integration (.ktr) file as the input to the reports.

Answer 1: First of all, there is most probably official documentation for the API; see the chapter "Embedding the Reporting Engine Into a Java Application". The code is pretty simple: create the report with the visual tool, assign the .ktr file as its data source, and then code it as in the tutorial example. You might only need to customize the path to the *.ktr file, which can be done via …
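
A minimal sketch of what such embedding code typically looks like, assuming a report built in Report Designer and saved as report.prpt with the .ktr file already assigned as its data source (file names and paths here are placeholders, not from the original answer):

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import org.pentaho.reporting.engine.classic.core.ClassicEngineBoot;
    import org.pentaho.reporting.engine.classic.core.MasterReport;
    import org.pentaho.reporting.engine.classic.core.modules.output.pageable.pdf.PdfReportUtil;
    import org.pentaho.reporting.libraries.resourceloader.ResourceManager;

    public class ReportRunner {
        public static void main(String[] args) throws Exception {
            ClassicEngineBoot.getInstance().start();   // boot the reporting engine once per JVM
            ResourceManager manager = new ResourceManager();
            manager.registerDefaults();
            // report.prpt was designed visually; its data source points at the .ktr file
            MasterReport report = (MasterReport) manager
                .createDirectly(new File("report.prpt"), MasterReport.class)
                .getResource();
            try (OutputStream out = new FileOutputStream("report.pdf")) {
                PdfReportUtil.createPDF(report, out);  // HtmlReportUtil works the same way for HTML
            }
        }
    }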

Extract data from large Excel files

风流意气都作罢 submitted on 2020-01-06 19:31:53
Question: I'm using Pentaho Data Integration to create a transformation from xlsx files to MySQL, but I can't import data from large files with the Excel 2007 xlsx (Apache POI Streaming) reader. It gives me out-of-memory errors.

Answer 1: Did you try this option? Advanced settings -> Generation mode -> Less memory consumed for large excel (Event mode). (You need to check "Read excel2007 file format" first.)

Answer 2: I would recommend you to increase the JVM memory allocation before running the transformation. By default, Pentaho …
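
Regarding Answer 2, one way to control the heap is to launch the transformation from your own JVM started with a larger -Xmx (for example java -Xmx4096m). A minimal sketch using the Kettle API, with the .ktr file name as a placeholder:

    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.trans.Trans;
    import org.pentaho.di.trans.TransMeta;

    public class RunLargeExcelImport {
        public static void main(String[] args) throws Exception {
            KettleEnvironment.init();                            // boot the Kettle engine
            TransMeta meta = new TransMeta("xlsx-to-mysql.ktr"); // placeholder path
            Trans trans = new Trans(meta);
            trans.execute(null);                                 // no command-line arguments
            trans.waitUntilFinished();
            if (trans.getErrors() > 0) {
                throw new RuntimeException("Transformation finished with errors");
            }
        }
    }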

Error running Spoon on Ubuntu 14.04 64-bit

心不动则不痛 submitted on 2020-01-06 08:47:25
Question: I have been using the Spoon tool of Pentaho Data Integration for a long time, and it was working fine on my system. But since I moved it to /opt I am unable to run it again. I have Oracle Java 8 installed on my system, and each time I try to run it I end up with the following exception:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/eclipse/swt/widgets/Composite
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2688)
        at java.lang.Class…

Components of the ETL tool Kettle: Generate Rows

喜欢而已 submitted on 2020-01-05 22:27:43
Today I will introduce a fairly practical Kettle component: Generate Rows. When we want to turn a piece of fixed text data into data rows, with each field becoming one column of a row, we can use this component. Its location in the step palette is shown in the original post's screenshot. Double-click to open it and configure it according to your actual needs. Once it is set up, you can click Preview; the [Limit] option at the top is the number of rows to generate. Three options are required: name, type, and value.

Source: 51CTO. Author: 夜七夜. Link: https://blog.51cto.com/13602563/2170365

Kettle's [Blocking Step], [Block this step until steps finish], and [Execute SQL script]

白昼怎懂夜的黑 submitted on 2020-01-05 22:27:28
The steps in a Kettle transformation run in parallel, while the entries in a job run in sequence. So you may run into a situation where you want a step to run only after another step has finished. What can you do then? This is where the [Blocking Step] and [Block this step until steps finish] components come in.

[Blocking Step]: this component lets only the last row from the previous step pass through; it is often used together with the [Execute SQL script] component.

[Block this step until steps finish]: this component holds back all rows of the step being blocked; only when the steps it waits for have finished does it push all of the blocked step's rows on to the next step.

Note: even though the data is blocked, the steps after the block are still running, i.e. their run time keeps ticking.

[Execute SQL script]: this is a relatively complex component, and within a transformation it runs with a very high priority. If the {Execute for each row} option is not ticked, the component executes only once; if it is ticked, it executes once per row coming from the previous step. If you want the component to execute exactly once, as the last thing in the flow, you can set it up as shown in the original post's screenshot. Note: don't forget to tick {Execute for each row} in that setup. The logic of the flow is: when [Table output] finishes, its last row is pushed through [Blocking Step], and then [Execute SQL script] runs; because {Execute for each row} is ticked and [Blocking Step] emits only one row, [Execute SQL script] executes exactly once.

[Execute SQL script] also has some other options, such as {Variable substitution}.

1. To use variables you must tick the {Variable substitution} option. If only this option is ticked, variables can only be referenced in the ${variable_name} form (for example, DELETE FROM stage_table WHERE load_id = ${LOAD_ID}), which …

A few small plugins of the ETL tool Kettle (Replace in string, Select values, Set field value to a constant)

こ雲淡風輕ζ submitted on 2020-01-05 22:07:17
Let me continue and introduce a few more small components.

1. Replace in string

This works much like Oracle's replace function: it replaces certain characters of a field with characters we specify. First, choose the [In stream field]; name the [Out stream field] yourself (it holds the processed result and may be the same as the in-stream field). You can tick [Use RegEx]. [Search] is the character or string you want replaced, and it can be a regular expression (see the small sketch at the end of this post). [Replace with] is the value you want substituted for the matched part. [Set empty string] replaces the matched part with an empty string. [Replace with field value] lets you replace the matched part with the value of an existing field. Set the last two according to your own needs.

2. Select values

This component has three functions, as shown in the original post's screenshot. Select & Alter can rename fields and, for numeric values, set the precision. Remove takes a field out of the stream (a function I find rather redundant). Meta-data changes types, e.g. turning a varchar into a date, and can also change the character set, etc.

3. Set field value to a constant

This simply sets a field to a constant value; it is straightforward, so I won't go into detail.

Source: 51CTO. Author: 夜七夜. Link: https://blog.51cto.com/13602563/2169750
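
As a rough aside to item 1 (not part of the original post): with [Use RegEx] ticked, the step behaves much like Java's String.replaceAll; the pattern and replacement below are made-up examples.

    public class ReplaceInStringDemo {
        public static void main(String[] args) {
            String in = "order-2020-01-05";
            // Search pattern "\d{4}-\d{2}-\d{2}", replacement "DATE" -- roughly
            // what [Replace in string] does when [Use RegEx] is ticked.
            String out = in.replaceAll("\\d{4}-\\d{2}-\\d{2}", "DATE");
            System.out.println(out);  // prints "order-DATE"
        }
    }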

Running PDI Kettle from Java - MongoDB step missing plugins

China☆狼群 submitted on 2020-01-05 14:11:30
Question: I am trying to run a transformation which includes a MongoDB input step from a Java app, but it always fails with this error:

    org.pentaho.di.core.exception.KettleMissingPluginsException:
    Missing plugins found while loading a transformation
    Step : MongoDbInput
        at org.pentaho.di.trans.TransMeta.loadXML(TransMeta.java:2931)
        at org.pentaho.di.trans.TransMeta.<init>(TransMeta.java:2813)
        at org.pentaho.di.trans.TransMeta.<init>(TransMeta.java:2774)
        at org.pentaho.di.trans.TransMeta.<init>…
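
The usual cause is that the embedding JVM does not know where the PDI plugins folder is. A commonly suggested fix, sketched below with the plugin path and file name as placeholders, is to point Kettle at the plugins directory before initializing the environment:

    import org.pentaho.di.core.KettleEnvironment;
    import org.pentaho.di.trans.Trans;
    import org.pentaho.di.trans.TransMeta;

    public class RunMongoTrans {
        public static void main(String[] args) throws Exception {
            // Example path: point this at your PDI installation's plugins
            // directory, which contains the MongoDB plugin among others.
            System.setProperty("KETTLE_PLUGIN_BASE_FOLDERS", "/opt/pdi/plugins");
            KettleEnvironment.init();
            TransMeta meta = new TransMeta("mongo-input.ktr");  // placeholder file name
            Trans trans = new Trans(meta);
            trans.execute(null);
            trans.waitUntilFinished();
        }
    }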

How to continuously read JMS messages in a thread and acknowledge them based on their JMSMessageID in another thread?

Deadly submitted on 2020-01-05 04:13:28
Question: I've written a continuous JMS message receiver. Here, I'm using CLIENT_ACKNOWLEDGE because I don't want this thread to acknowledge the messages. (...)

    connection.start();
    // the transacted flag must be false here, otherwise CLIENT_ACKNOWLEDGE is ignored
    session = connection.createQueueSession(false, Session.CLIENT_ACKNOWLEDGE);
    queue = session.createQueue(QueueId);
    receiver = session.createReceiver(queue);
    while (true) {
        message = receiver.receive(1000);
        if (message != null) {
            // NB: I can only pass Strings to the other thread
            sendMessageToOtherThread(message.getText()…
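
The question is cut off here. One pattern that fits its constraints (a sketch, not from the original post; sendMessageToOtherThread is the question's own helper, everything else is invented here): keep the received Message objects in the receiving thread, keyed by JMSMessageID, and let the worker thread hand back plain-String IDs, so the JMS session is only ever touched by its owning thread. Note that in plain JMS, Message.acknowledge() acknowledges every message the session has delivered so far, not just one; true per-message acknowledgement needs a provider extension such as ActiveMQ's INDIVIDUAL_ACKNOWLEDGE.

    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.LinkedBlockingQueue;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.TextMessage;

    public class AckByIdReceiver {
        private final Map<String, Message> pending = new ConcurrentHashMap<>();
        private final BlockingQueue<String> idsToAck = new LinkedBlockingQueue<>();

        // Called from the receive loop, in the session's own thread.
        void onMessage(TextMessage message) throws JMSException {
            pending.put(message.getJMSMessageID(), message);
            sendMessageToOtherThread(message.getText());  // strings only, as in the question
            // Drain IDs the worker has finished with and acknowledge them here,
            // so the session is never used from another thread.
            String id;
            while ((id = idsToAck.poll()) != null) {
                Message done = pending.remove(id);
                if (done != null) {
                    done.acknowledge();  // acks all delivered messages of this session so far
                }
            }
        }

        // Called from the worker thread when processing succeeds.
        void markProcessed(String jmsMessageId) {
            idsToAck.add(jmsMessageId);
        }

        private void sendMessageToOtherThread(String text) { /* as in the question */ }
    }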

How to connect Pentaho Data Integration with Amazon RDS

假装没事ソ submitted on 2020-01-04 13:56:29
Question: I'm having difficulty creating a new connection in Pentaho Data Integration (Kettle) to Amazon RDS. Amazon needs a CA cert, and I don't know how to supply it to the connection. Can someone help me? Thanks.

Answer 1: Establish a secure connection (SSL) to AWS RDS Aurora/MySQL from Pentaho (PDI Kettle):

1. Create a new user id and grant it SSL rights, so that this user id can connect to Aurora/MySQL only over a secured connection:

    GRANT USAGE ON *.* TO 'admin'@'%' REQUIRE SSL

2. Download …
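
The answer is cut off at step 2 (downloading the RDS CA bundle). For context, a hedged sketch of what the finished JDBC setup can look like with MySQL Connector/J 5.1, assuming the downloaded CA has been imported into a JKS truststore (e.g. with keytool -importcert); host, paths, and passwords below are placeholders. In Spoon, the same SSL properties would go into the connection's Options panel rather than Java code.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class RdsSslConnect {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("user", "admin");
            props.setProperty("password", "secret");  // placeholder
            // Connector/J SSL settings: require SSL and verify the server
            // certificate against the truststore holding the RDS CA.
            props.setProperty("useSSL", "true");
            props.setProperty("requireSSL", "true");
            props.setProperty("verifyServerCertificate", "true");
            props.setProperty("trustCertificateKeyStoreUrl", "file:/opt/certs/rds-truststore.jks");
            props.setProperty("trustCertificateKeyStorePassword", "changeit");

            String url = "jdbc:mysql://mydb.xxxxxx.us-east-1.rds.amazonaws.com:3306/mydb";
            try (Connection conn = DriverManager.getConnection(url, props)) {
                System.out.println("Connected over SSL: " + conn.isValid(5));
            }
        }
    }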

How to deploy scheduled Kettle jobs on Pentaho BI server v6 CE

好久不见. submitted on 2020-01-02 14:41:21
Question: I have a server running Pentaho BI Server v6 Community Edition. We've developed a Kettle job to extract from one database to another, exported as a KJB file. I would like to run this job every 12 or so hours. I noticed that the BI server already includes Kettle and has the ability to upload and schedule jobs. Do I need to install the DI server if the BI server already has Kettle installed? If not, how can I publish the KJB file to the BI server? I'd like to use a file system repository. If …