pentaho

kettle_error_KarafLifecycleListener

Submitted by 北战南征 on 2020-02-09 10:40:01
While batch-executing jobs from the command line with Kettle 6.1, I noticed that the occasional job would run several minutes slower than usual. The logs showed that the affected job reported an error right at startup, and only continued with the jobs that followed after the error had been thrown. This did not affect the final result, but it did delay the batch run. A search of the Pentaho forums turned up two suggested fixes:

1. (version 6.0) Edit the two files kettle-lifecycle-listeners.xml and kettle-registry-extensions.xml under the /classes directory so that they look like this:

kettle-lifecycle-listeners.xml
<listeners>
</listeners>

kettle-registry-extensions.xml
<registry-extensions>
</registry-extensions>

2. (version 6.1) Edit /system/karaf/etc/org.apache.karaf.features.cfg, changing
featuresBoot=config,pentaho-client,pentaho-metaverse,pdi-dataservice,pdi-data-refinery
to
featuresBoot=config,pentaho-client,pentaho
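The featuresBoot line above is truncated in the source. A plausible reading, offered here only as an assumption, is that the 6.1 fix simply trims the extra boot features:

    # /system/karaf/etc/org.apache.karaf.features.cfg
    # before
    featuresBoot=config,pentaho-client,pentaho-metaverse,pdi-dataservice,pdi-data-refinery
    # after (assumed completion of the truncated line)
    featuresBoot=config,pentaho-client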

Pentaho text file input step crashing (out of memory)

Submitted by a 夏天 on 2020-02-07 17:15:55
Question: I am using Pentaho to read a very large file, 11 GB. The process sometimes crashes with an out-of-memory exception, and sometimes it just says the process was killed. I am running the job on a machine with 12 GB of RAM and giving the process 8 GB. Is there a way to run the Text File Input step with some configuration so that it uses less memory, maybe using the disk more? Thanks!

Answer 1: Open up spoon.sh/bat or pan/kettle .sh or .bat and change the -Xmx figure. Search for JAVAMAXMEM. Even though you have spare
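For reference, in PDI 6.x the heap setting in spoon.sh typically looks like the block below; older scripts expose it through a JAVAMAXMEM variable instead, and the exact lines vary by version (the values here are illustrative, not a recommendation):

    # from spoon.sh -- raise -Xmx to give PDI a larger heap
    if [ -z "$PENTAHO_DI_JAVA_OPTIONS" ]; then
      PENTAHO_DI_JAVA_OPTIONS="-Xms1024m -Xmx2048m -XX:MaxPermSize=256m"
    fi

Because of the -z guard, the same options can also be supplied as an environment variable before launching, without editing the script.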

Pentaho Report Designer installation and development

Submitted by 六月ゝ 毕业季﹏ on 2020-02-07 08:45:11
First, install a Java runtime environment. Pentaho is written in Java, so installing a JDK is required. Pentaho used to be fully open source; it has since been acquired by Hitachi Vantara.

Windows environment:

Right-click "My Computer" -> "Properties" -> "Advanced system settings" -> "Advanced" -> "Environment Variables".
Under system variables, create a new variable JAVA_HOME with the value C:\Program Files\Java\jdk1.8.0_60 (the JDK installation path).
Create a new system variable classpath with the value .;%JAVA_HOME%\lib;%JAVA_HOME%\lib\tools.jar
Find the existing path variable (no need to create it) and append %JAVA_HOME%\bin;%JAVA_HOME%\jre\bin
Note: values are separated by ";". Check whether the existing Path value already ends with a semicolon; if it does not, type ";" first and then the new value.

Linux environment:

Download the JDK, extract it, and configure the environment:

cd /usr/lib
sudo tar xzf ~/Downloads/jdk-8u101-linux-x64.tar.gz

Configure the environment variables: vim ~/.profile

JAVA_HOME=/usr/lib/jdk1.8.0_101
CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
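The excerpt is cut off at this point. The remainder of this pattern is the standard JDK profile setup rather than the article's exact continuation; it usually amounts to:

    export JAVA_HOME=/usr/lib/jdk1.8.0_101
    export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
    export PATH=$JAVA_HOME/bin:$PATH

Then reload the profile and verify the install with: source ~/.profile && java -version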

Big data case studies

Submitted by 可紊 on 2020-02-06 19:56:52
Excerpted from https://www.cnblogs.com/ShaYeBlog/p/5872113.html

I. Commercial applications of big data analytics

1. Sports event prediction. During the World Cup, companies including Google, Baidu, Microsoft, and Goldman Sachs all launched platforms for predicting match results. Baidu's results were the most impressive: across all 64 matches its accuracy was 67%, rising to 94% once the knockout stage began. Internet companies stepping in for Paul the Octopus to try their hand at match prediction suggests that future sporting events will be shaped by big-data forecasting.

"In Baidu's World Cup predictions we considered five factors in total: team strength, home advantage, recent form, overall World Cup record, and bookmakers' odds. The sources of these data were essentially all on the internet. We then used a machine-learning model designed by our search experts to aggregate and analyze the data and produce the predictions." --- 张桐, head of Baidu's Beijing Big Data Lab

2. Stock market prediction. Last year, researchers at Warwick Business School in the UK and the physics department of Boston University in the US found that the financial keywords users search for on Google may predict the direction of financial markets, with the corresponding investment strategy returning as much as 326%. Before that, experts had tried to predict stock market swings from the sentiment of Twitter posts.

In theory, stock market prediction suits the US better. China's stock market cannot be traded profitably in both directions: you can only profit when prices rise, which attracts speculative capital that exploits information asymmetries to artificially distort market patterns. Without relatively stable patterns the Chinese market is hard to predict, and some variables that decisively influence outcomes cannot be monitored at all.

Today, many US hedge funds already invest using big-data techniques, and have profited handsomely

Amazon Redshift to Mysql using Pentaho Data Integration

Submitted by 寵の児 on 2020-01-30 11:29:26
Question: We are using Amazon Redshift, whose database engine is PostgreSQL; the data sits in the Amazon cloud. We need to load data from Amazon Redshift into MySQL using Pentaho Data Integration. Could you please tell us how to connect to Redshift via Pentaho?

Answer 1: I'll try to help you. The Redshift connection needs the PostgreSQL JDBC driver in the lib folder of your Pentaho data-integration install. But the one that comes with Pentaho has some issues with Redshift, which may be solved by removing the existent
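As a quick sanity check outside PDI, the endpoint and driver can be tested with plain JDBC. A minimal sketch; the cluster endpoint, database name, and credentials below are placeholders, and 5439 is Redshift's default port:

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class RedshiftConnectionCheck {
        public static void main(String[] args) throws Exception {
            // Placeholder endpoint: replace with your cluster's endpoint and database.
            String url = "jdbc:postgresql://my-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/mydb";
            try (Connection conn = DriverManager.getConnection(url, "myuser", "mypassword")) {
                System.out.println("Connected to: " + conn.getMetaData().getDatabaseProductVersion());
            }
        }
    }

This uses the same PostgreSQL JDBC driver that PDI would load from its lib folder.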

Java Pentaho Exception MongoDB

Submitted by 痴心易碎 on 2020-01-26 00:17:47
Question: I designed a transformation in the Pentaho Data Integration UI tool and wrote Java code to execute it. I followed the linked resources exactly:

try {
    /**
     * Initialize the Kettle environment.
     */
    KettleEnvironment.init();
    /**
     * Create a Trans object to properly assign the ktr metadata.
     *
     * @filedb: the ktr file path to be executed.
     */
    TransMeta metadata = new TransMeta("Districts.ktr");
    Trans trans = new Trans(metadata);
    // Execute the transformation
    trans.execute(null);
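The digest cuts the snippet off here. In the Kettle API, execution is normally followed by waiting for the transformation to finish and checking its error count, roughly like this (a sketch of the usual pattern, not necessarily the poster's exact code):

    // Block until the transformation has finished running.
    trans.waitUntilFinished();
    if (trans.getErrors() > 0) {
        throw new RuntimeException("There were errors during transformation execution.");
    }
} catch (KettleException e) {
    // KettleEnvironment.init() and the TransMeta constructor can throw KettleException.
    e.printStackTrace();
}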

Generating PDF from Pentaho .prpt report file in Java - dependencies confusion

Submitted by 余生颓废 on 2020-01-24 09:25:06
Question: Can anyone help me get started generating PDFs from Pentaho .prpt files using Java in a Maven environment? I have the Pentaho Reporting 3.5 for Java Developers book, and I'm trying out an example from there, essentially:

ResourceManager manager = new ResourceManager();
manager.registerDefaults();
Resource resource = manager.createDirectly(reportURL, MasterReport.class);
MasterReport report = (MasterReport) resource.getResource();
PdfReportUtil.createPDF(report, outputStream);

My problem is
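For orientation, a self-contained version of that example needs the classic reporting engine booted before any report is loaded. A minimal sketch, assuming the Pentaho Reporting 3.x class names used above; the report.prpt and report.pdf paths are hypothetical placeholders:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import org.pentaho.reporting.engine.classic.core.ClassicEngineBoot;
    import org.pentaho.reporting.engine.classic.core.MasterReport;
    import org.pentaho.reporting.engine.classic.core.modules.output.pageable.pdf.PdfReportUtil;
    import org.pentaho.reporting.libraries.resourceloader.Resource;
    import org.pentaho.reporting.libraries.resourceloader.ResourceManager;

    public class PrptToPdf {
        public static void main(String[] args) throws Exception {
            // Boot the reporting engine once per JVM before loading any report.
            ClassicEngineBoot.getInstance().start();

            // Load the .prpt report definition through the resource loader.
            ResourceManager manager = new ResourceManager();
            manager.registerDefaults();
            Resource resource = manager.createDirectly(new File("report.prpt"), MasterReport.class);
            MasterReport report = (MasterReport) resource.getResource();

            // Render the report to PDF on any OutputStream.
            try (OutputStream out = new FileOutputStream("report.pdf")) {
                PdfReportUtil.createPDF(report, out);
            }
        }
    }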