greenplum

Postgresql intarray error: undefined symbol: pfree

时光毁灭记忆、已成空白 submitted on 2019-12-18 09:46:15
Question: I'm trying to install the PostgreSQL (8.2.15) additional supplied modules intarray and intagg for my Greenplum database 4.2.1.0. The installation seems successful; I followed the tutorial here and all the files are copied into the greenplumlib-db-4.2.1.0/lib/postgresql share/postgresql directory. But when I tried to execute my Java code, it throws an "undefined symbol" error: org.postgresql.util.PSQLException: ERROR: could not load library "/usr/local/greenplum-db-4.2.1.0/lib/postgresql/_int.so":

GreenPlum and TiDB performance comparison

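随声附和 submitted on 2019-12-15 21:25:54
Main requirements: for OLAP statistical queries over large tables, we need a stable, high-performance big-data database. Specifically: data can be written and queried in real time, and concurrent TPS is not very high; we are building a data warehouse whose schemas are mainly star schema, snowflake schema, or wide tables; front-end presentation falls into 3 categories: Saiku, Grafana, and custom C# development; data volume: fact tables hold 300 to 500 million rows and the larger dimension tables about 5 million rows; data integration: it must integrate seamlessly with Kettle, which we use today.
Based on these requirements we initially used TiDB, but its OLAP query performance on large tables was not good. Offline computation with TiSpark was acceptable, but it was too slow to meet the system's needs. We then learned about Greenplum and its MPP architecture, so we ran a simple preliminary comparison.
Test table: a wide order table with about 300 columns.
Basic test results (concurrency tests not included). Cluster configuration: Greenplum: 4 machines, 8 cores and 56 GB each, 9 segments; tables: column-oriented storage, no indexes. TiDB: 6 machines, 8 cores and 56 GB each, SSD.
TPC-DS TPC-H Other tests --
Summary: for OLAP queries, Greenplum's analytical and statistical performance is better than TiDB's. Without indexes, Greenplum's point queries are considerably slower than TiDB's; after adding the corresponding indexes the performance is about the same, but using indexes is not recommended in Greenplum. With column-oriented storage in Greenplum, the number of columns queried has a large effect on performance.
Next steps to verify: 1. Performance under a star schema, with a fact table of 300 million rows and a dimension table of 5 million rows, 2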

psql: error while loading shared libraries: libpq.so.5: cannot open shared object file: No such file or directory

岁酱吖の submitted on 2019-12-14 04:09:54
Question: I have the psql path in my PATH variable. The crontab path is usr:/usr/bin:/bin. I add all the user directories and then execute the cron job. Default PATH for usr:/usr/bin:/bin user is:gpadmin PATH for usr:/usr/bin:/bin:/usr/local/greenplum-db/./bin:/usr/local/greenplum-db/./ext/python/bin:/home/gpadmin/anaconda3/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/home/gpadmin/bin My script is as below (executed by crontab): #!/bin/bash # Usage : bash RR_load.sh echo "Loading the data into

rodbc character encoding error with PostgreSQL

二次信任 submitted on 2019-12-12 09:38:05
Question: I'm getting a new error which I've never gotten before when connecting from R to a GreenPlum PostgreSQL database using RODBC. I've gotten the error using both EMACS/ESS and RStudio, and the RODBC call has worked as is in the past. library(RODBC) gp <- odbcConnect("greenplum", believeNRows = FALSE) data <- sqlQuery(gp, "select * from mytable") > data [1] "22P05 7 ERROR: character 0xc280 of encoding \"UTF8\" has no equivalent in \"WIN1252\";\nError while executing the query" [2] "[RODBC] ERROR:
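The error message in this question can be reproduced outside the database. The following is a minimal Python sketch, assuming the bytes 0xC2 0x80 named in the message are what is stored in the table (the question does not say where that character came from). Decoded as UTF-8 they give the control character U+0080, which Windows-1252 has no slot for, so a conversion to WIN1252 cannot succeed. The helper name can_convert is hypothetical, and Python's cp1252 codec stands in for PostgreSQL's WIN1252.

```python
def can_convert(raw: bytes, target: str = "cp1252") -> bool:
    """Return True if the UTF-8 bytes `raw` can be re-encoded in `target`."""
    text = raw.decode("utf-8")
    try:
        text.encode(target)
    except UnicodeEncodeError:
        return False
    return True


# The byte sequence named in the error message (assumed to be the stored data).
offending = bytes([0xC2, 0x80])

print(repr(offending.decode("utf-8")))   # the code point behind 0xc280
print(can_convert(offending))            # U+0080 has no cp1252 equivalent
print(can_convert("é".encode("utf-8")))  # an ordinary Latin-1 letter converts
```

A check like this only locates characters that cannot be converted; whether to clean the data or change the client encoding is a separate decision that the excerpt does not settle.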

Two questions for formatting timestamp and number using postgresql

六月ゝ 毕业季﹏ submitted on 2019-12-12 04:16:05
Question: I am selecting a date column which is in the format "YYYY-MM-DD". I want to cast it to a timestamp such that it will be "YYYY-MM-DD HH:MM:SS:MS". I attempted: select CAST(mycolumn as timestamp) from mytable; but this resulted in the format YYYY-MM-DD HH:MM:SS. I also tried select TO_TIMESTAMP(mycolumn,YYYY-MM-DD HH:MM:SS:MS) from mytable; but this did not work either. I cannot seem to figure out the correct way to format this. Note that I only want the first digit of the milliseconds. /////////
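To make the requested output format concrete, here is a small Python sketch (not Greenplum SQL, and not an answer from the original thread). It assumes the desired layout is the date, the time, and then a colon followed by only the first digit of the milliseconds, as the question describes. The function name format_with_tenths is hypothetical.

```python
from datetime import datetime


def format_with_tenths(value: datetime) -> str:
    """Format as 'YYYY-MM-DD HH:MM:SS:d', where d is the first millisecond digit."""
    first_ms_digit = value.microsecond // 100_000
    return f"{value:%Y-%m-%d %H:%M:%S}:{first_ms_digit}"


# A bare date carries no time part, so every time field comes out as zero.
print(format_with_tenths(datetime.strptime("2019-12-12", "%Y-%m-%d")))

# A value with 678 ms keeps only the leading 6.
print(format_with_tenths(datetime(2019, 12, 12, 4, 16, 5, 678000)))
```

The first call illustrates why casting a plain date can never produce a non-zero fractional part: the information is simply not in the source column.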

Merge Operation Fails -gpload utility greenplum

守給你的承諾、 submitted on 2019-12-12 03:44:57
Question: We would like to describe our problem below. We have a small gpdb cluster, in which we are trying to do data integration using the Talend tool. We are trying to load the incremental data from one table to another table, quite simple... I thought... The job data flow is tgreenplumconnection | tmssqlinput--->thdfsoutput-->tmap-->tgreenplumgpload--tgreenplumcommit. We are getting the error: Exception in thread "Thread-1" java.lang.RuntimeException: Cannot run program "gpload": CreateProcess error=2, The system cannot find

ERROR: protocol “gphdfs” does not exist

六月ゝ 毕业季﹏ submitted on 2019-12-12 03:29:17
Question: When I run postgres=# CREATE EXTERNAL TABLE csv_hdfs_lineitem (like a) LOCATION ( 'gphdfs://xxxxx/gptest/lineitem.csv' ) FORMAT 'text' (delimiter E'|' null E'\\N' escape E'off' fill missing fields) ENCODING 'UTF8' ; it shows ERROR: protocol "gphdfs" does not exist. I want to know how to configure Greenplum to support the gphdfs protocol. Answer 1: You need to install the Hadoop client on all gpdb nodes and add class_path, then set up 2 GUCs, gp_hadoop_target_version and gp_hadoop_home, pointing to the hadoop

greenplum hang forever when doing any search or insert actions with psql and centos7

三世轮回 submitted on 2019-12-11 16:53:11
Question: The Greenplum version is 5.3.0, on CentOS 7. The problem is as described in the title. The following is the result of gplogfilter: SELECT pg_catalog.quote_ident(n.nspname) || '.' FROM pg_catalog.pg_namespace n WHERE substring(pg_catalog.quote_ident(n.nspname) || '.',1,7)='test_vb' AND (SELECT pg_catalog.count(*) FROM pg_catalog.pg_namespace WHERE substring(pg_catalog.quote_ident(nspname) || '.',1,7) = substring('test_vb',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)) > 1 UNION SELECT pg_catalog.quote_ident(n.nspname) || '.' ||

Rolling (moving) median in Greenplum

早过忘川 submitted on 2019-12-11 04:59:51
Question: I would like to calculate the rolling median for a column in Greenplum, i.e. as below:
| x | rolling_median_x |
| -- + ---------------- |
| 4 | 4 |
| 1 | 2.5 |
| 3 | 3 |
| 2 | 2.5 |
| 1 | 2 |
| 6 | 2.5 |
| 9 | 3 |
x is an integer and for each row rolling_median_x shows the median of x for the current and preceding rows. E.g. for the third row rolling_median_x = median(4, 1, 3) = 3. Things I've found out so far: the median function can't be used as a framed window function, i.e. median(x)
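The definition in this question (the median of the current row and all preceding rows) can be checked against the example table with a short Python sketch. This is only a reference computation on the seven x values from the question, not a Greenplum solution; the function name rolling_median is hypothetical, and the rows are assumed to be in the order shown.

```python
from statistics import median


def rolling_median(values):
    """Median of the current value and all preceding values, one result per row."""
    return [median(values[: i + 1]) for i in range(len(values))]


# The x column from the question, in the order shown there.
xs = [4, 1, 3, 2, 1, 6, 9]

for x, m in zip(xs, rolling_median(xs)):
    print(x, m)
```

The sketch recomputes each median from scratch, so it is quadratic in the number of rows; it is meant as a way to validate the output of a SQL approach on small samples rather than as something to run at warehouse scale.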

gpload utility in greenplum

别来无恙 submitted on 2019-12-11 04:22:40
Question: Can anyone help me with the gpload utility in Greenplum? I am doing this for the first time. I have created the control file as per the help command, but I am getting a gpfdist connection error. Is there any reference material on the gpload utility, or some examples I can refer to? I am getting the error below while using gpload: 2013-05-21 09:34:20|INFO|started gpfdist -p 9096 -P 9097 -f "gpload.test.txt" -t 30 "2013-05-21 09:34:23|ERROR|ERROR: connection with gpfdist failed for gpfdist://<host-ip>:9096