虚拟机 GreenPlum 数据仓库搭建好以后,可以执行建库,建表,数据插入语句,就是不能执行”select”操作报以下ERROR,这个问题折腾了好几天,后来在德哥的帮忙下,终于解决了,原来自己在一个配置上疏忽了。
问题现象
不能执行查询SQL,如下: 1.1创建测试表,并插入数据
1 2 3 4 5 6 7 8 9 10 11 12 13
warehouse=# create table test_1 (id integer , name varchar(32 )) distributed by (id); CREATE TABLEwarehouse=# /d test_1 Table "public.test_1" Column | Type | Modifiers --------+-----------------------+----------- id | integer | name | character varying(32 ) | Distributed by : (id) warehouse=# insert into test_1 values (1 ,'francs' ); INSERT 0 1 warehouse=# insert into test_1 values (2 ,'fpZhou' ); INSERT 0 1
1.2查询报错
1 2 3
warehouse =# select * From test_1;ERROR: Interconnect timeout: Unable to complete setup of all connections within time limit. DETAIL: Completed 0 of 2 incoming and 0 of 0 outgoing connections. gp_interconnect_setup_timeout = 20 seconds.
处理过程
2.1检查Master节点,未发现明显异常
2.2检查Segment节点,发现以下ERROR
1
2011-04-14 21:14:05.535064 CST,"warehouse" ,"warehouse" ,p20619,th-1208420672,"192.168.1.50" ,"41829" ,2011-04-14 21:08:26 CST,1279,con6,cmd16,seg0,slice1,,x1279,sx1,"LOG" ,"58M01" ,Interconnect could not connect to seg-1 127.0.0.1:61572 pid =9845; will retry. Connection refused (connect errno 111)",,,,,," select * From test_1
从上面的信息来看“seg-1 127.0.0.1” , seg-1 指的是 master 节点,怎么解析成 “127.0.0.1”了, 其实这里我早应该发现,这是关键信息。
2.3尝试调整参数 gp_interconnect_setup_timeout 将参数 gp_interconnect_setup_timeout 调大到 120s 后,依然无效。
2.4 GP整体状态检查,未发现异常
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
[greenplum@gpmaster opt]$ gpstate -s 20110414 : 20 : 12 : 43 :gpstate :gpmaster :greenplum- [INFO]:-Obtaining GPDB array type, [Brief], please wait...20110414 : 20 : 13 : 01 :gpstate :gpmaster :greenplum- [INFO]:-Obtaining GPDB array type, [Brief], please wait...20110414 : 20 : 13 : 20 :gpstate :gpmaster :greenplum- [INFO]:-Spawning parallel processes batch [1 ], please wait....... 20110414 : 20 : 13 : 40 :gpstate :gpmaster :greenplum- [INFO]:-Waiting for parallel processes batch [1 ], please wait.................. 20110414 : 20 : 14 : 03 :gpstate :gpmaster :greenplum- [INFO]:-Master Configuration & Status20110414 : 20 : 14 : 03 :gpstate :gpmaster :greenplum- [INFO]:------------------------------ 20110414 : 20 : 14 : 04 :gpstate :gpmaster :greenplum- [INFO]:-Master host = gpmaster20110414 : 20 : 14 : 05 :gpstate :gpmaster :greenplum- [INFO]:-Master postgres process ID = 20992 20110414 : 20 : 14 : 06 :gpstate :gpmaster :greenplum- [INFO]:-Master data directory = /opt/gp _data/gp-1 20110414 : 20 : 14 : 06 :gpstate :gpmaster :greenplum- [INFO]:-Database name = template1 20110414 : 20 : 14 : 07 :gpstate :gpmaster :greenplum- [INFO]:-Master port = 5432 20110414 : 20 : 14 : 07 :gpstate :gpmaster :greenplum- [INFO]:-Master current role = dispatch20110414 : 20 : 14 : 08:gpstate :gpmaster :greenplum- [INFO]:-Greenplum array configuration type = Standard20110414 : 20 : 14 : 09:gpstate :gpmaster :greenplum- [INFO]:-Greenplum initsystem version = 2 |3.3 .6.1 build 1 20110414 : 20 : 14 : 10 :gpstate :gpmaster :greenplum- [INFO]:-Greenplum current version = PostgreSQL 8.2 .13 (Greenplum Database 3.3 .6.1 build 1 ) on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1 .2 20080704 (Red Hat 4.1 .2 -44 ) compiled on Apr 2 2010 16 : 39 : 49 20110414 : 20 : 14 : 10 :gpstate :gpmaster :greenplum- [INFO]:-Postgres version = 8.2 .13 20110414 : 20 : 14 : 11 :gpstate :gpmaster :greenplum- [INFO]:-Greenplum fault mode = readonly20110414 : 20 : 14 : 11 :gpstate :gpmaster :greenplum- [INFO]:-Greenplum mirroring status = on20110414 : 20 : 14 : 12 :gpstate :gpmaster :greenplum- [INFO]:-Greenplum master standby = No master standby configured20110414 : 20 : 14 : 12 :gpstate :gpmaster :greenplum- [INFO]:-Greenplum standby master state = 20110414 : 20 : 14 : 13 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance status20110414 : 20 : 14 : 13 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 14 :gpstate :gpmaster :greenplum- [INFO]:-Parallel process exit status20110414 : 20 : 14 : 14 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 15 :gpstate :gpmaster :greenplum- [INFO]:-Total processes marked as completed = 4 20110414 : 20 : 14 : 15 :gpstate :gpmaster :greenplum- [INFO]:-Total processes marked as killed = 0 20110414 : 20 : 14 : 15 :gpstate :gpmaster :greenplum- [INFO]:-Total processes marked as failed = 0 20110414 : 20 : 14 : 16 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 17 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 17 :gpstate :gpmaster :greenplum- [INFO]:-Segment Instance Status Report20110414 : 20 : 14 : 17 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 18 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 18 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance hostname = gpnode120110414 : 20 : 14 : 19 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance port = 50001 20110414 : 20 : 14 : 19 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance datadir = /opt/gp _data/data/gp0 Type = Primary20110414 : 20 : 14 : 20 :gpstate :gpmaster :greenplum- [INFO]:-Master reports status as = Valid20110414 : 20 : 14 : 20 :gpstate :gpmaster :greenplum- [INFO]:-Total errors (remote) = 0 20110414 : 20 : 14 : 20 :gpstate :gpmaster :greenplum- [INFO]:-File postmaster.pid = Found20110414 : 20 : 14 : 21 :gpstate :gpmaster :greenplum- [INFO]:-PID from postmaster.pid file = 18851 20110414 : 20 : 14 : 21 :gpstate :gpmaster :greenplum- [INFO]:-Lock files in /tmp status = Found20110414 : 20 : 14 : 22 :gpstate :gpmaster :greenplum- [INFO]:-Active PID = 18851 20110414 : 20 : 14 : 22 :gpstate :gpmaster :greenplum- [INFO]:-Instance status = Active20110414 : 20 : 14 : 23 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 23 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance hostname = gpnode120110414 : 20 : 14 : 24 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance port = 60001 20110414 : 20 : 14 : 24 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance datadir = /opt/gp _data/mdata/gp1 Type = Mirror20110414 : 20 : 14 : 24 :gpstate :gpmaster :greenplum- [INFO]:-Master reports status as = Valid20110414 : 20 : 14 : 25 :gpstate :gpmaster :greenplum- [INFO]:-Total errors (remote) = 0 20110414 : 20 : 14 : 25 :gpstate :gpmaster :greenplum- [INFO]:-File postmaster.pid = Found20110414 : 20 : 14 : 26 :gpstate :gpmaster :greenplum- [INFO]:-PID from postmaster.pid file = 18865 20110414 : 20 : 14 : 26 :gpstate :gpmaster :greenplum- [INFO]:-Lock files in /tmp status = Found20110414 : 20 : 14 : 26 :gpstate :gpmaster :greenplum- [INFO]:-Active PID = 18865 20110414 : 20 : 14 : 27 :gpstate :gpmaster :greenplum- [INFO]:-Mirror status = Passive20110414 : 20 : 14 : 27 :gpstate :gpmaster :greenplum- [INFO]:-Instance status = Valid20110414 : 20 : 14 : 28 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 29 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance hostname = gpnode220110414 : 20 : 14 : 29 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance port = 50001 20110414 : 20 : 14 : 29 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance datadir = /opt/gp _data/data/gp1 Type = Primary20110414 : 20 : 14 : 30 :gpstate :gpmaster :greenplum- [INFO]:-Master reports status as = Valid20110414 : 20 : 14 : 30 :gpstate :gpmaster :greenplum- [INFO]:-Total errors (remote) = 0 20110414 : 20 : 14 : 31 :gpstate :gpmaster :greenplum- [INFO]:-File postmaster.pid = Found20110414 : 20 : 14 : 31 :gpstate :gpmaster :greenplum- [INFO]:-PID from postmaster.pid file = 17141 20110414 : 20 : 14 : 32 :gpstate :gpmaster :greenplum- [INFO]:-Lock files in /tmp status = Found20110414 : 20 : 14 : 32 :gpstate :gpmaster :greenplum- [INFO]:-Active PID = 17141 20110414 : 20 : 14 : 32 :gpstate :gpmaster :greenplum- [INFO]:-Instance status = Active20110414 : 20 : 14 : 33 :gpstate :gpmaster :greenplum- [INFO]:------------------------------------------------ 20110414 : 20 : 14 : 33 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance hostname = gpnode220110414 : 20 : 14 : 34 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance port = 60001 20110414 : 20 : 14 : 34 :gpstate :gpmaster :greenplum- [INFO]:-Segment instance datadir = /opt/gp _data/mdata/gp0 Type = Mirror20110414 : 20 : 14 : 35 :gpstate :gpmaster :greenplum- [INFO]:-Master reports status as = Valid20110414 : 20 : 14 : 35 :gpstate :gpmaster :greenplum- [INFO]:-Total errors (remote) = 0 20110414 : 20 : 14 : 36 :gpstate :gpmaster :greenplum- [INFO]:-File postmaster.pid = Found20110414 : 20 : 14 : 36 :gpstate :gpmaster :greenplum- [INFO]:-PID from postmaster.pid file = 17153 20110414 : 20 : 14 : 36 :gpstate :gpmaster :greenplum- [INFO]:-Lock files in /tmp status = Found20110414 : 20 : 14 : 37 :gpstate :gpmaster :greenplum- [INFO]:-Active PID = 17153 20110414 : 20 : 14 : 38 :gpstate :gpmaster :greenplum- [INFO]:-Mirror status = Passive20110414 : 20 : 14 : 38 :gpstate :gpmaster :greenplum- [INFO]:-Instance status = Valid
2.5网络检查,发现网络不好
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
[greenplum@gpmaster ~]$ gpchecknet -f all_host_ file -d test/ /opt/greenplum-db/./bin/gpchecknet -f all_host_ file -d test/ ------------------- -- NETPERF TEST------------------- ==================== == RESULT ==================== Netperf bisection bandwidth test gpnode1 -> gpnode2 = 6.090000 gpnode2 -> gpmaster = 7.730000 gpnode2 -> gpnode1 = 6.960000 gpmaster -> gpnode2 = 6.530000 Summary: sum = 27.31 MB/sec min = 6.09 MB/sec max = 7.73 MB/sec avg = 6.83 MB/sec median = 6.96 MB/sec [Warning] connection between gpnode1 and gpnode2 is no good [Warning] connection between gpmaster and gpnode2 is no good
2.6虚拟机用的是 HOST-ONLY 网络连接,尝试将网络连接改成桥接,依然无效 2.7后来实在搞不定,在德哥的帮助下,终于确定了以下问题 2.8ping gpmaster 主机发现问题
1 2 3 4 5 6 7
[root@gpmaster ~]# ping gpmaster PING gpmaster (127.0.0.1) 56(84) bytes of data. 64 bytes from gpmaster (127.0.0.1): icmp_seq =1 ttl =64 time =2.55 ms 64 bytes from gpmaster (127.0.0.1): icmp_seq =2 ttl =64 time =0.073 ms 64 bytes from gpmaster (127.0.0.1): icmp_seq =3 ttl =64 time =0.040 ms 64 bytes from gpmaster (127.0.0.1): icmp_seq =4 ttl =64 time =0.049 ms 64 bytes from gpmaster (127.0.0.1): icmp_seq =5 ttl =64 time =0.039 ms
ping gpmaster 发现,gpmaster 解析为 127.0.0.1, 应该是 192.168.1.50 才对 2.9 检查 /etc/hosts
2.9.1 原来的 /etc/hosts内容
1 2 3 4 5
# that require network functionality will fail . 127.0 .0 .1 gpmaster localhost .localdomain localhost 192.168 .1 .50 gpmaster 192.168 .1 .51 gpnode1 192.168 .1 .52 gpnode2
发现,gpmaster 解析成 127.0.0.1 了,先读的第一行。
2.9.2修改后的 /etc/hosts
1 2 3 4 5 6
# Do not remove the following line , or various programs # that require network functionality will fail . 127.0 .0 .1 localhost .localdomain localhost 192.168 .1 .50 gpmaster 192.168 .1 .51 gpnode1 192.168 .1 .52 gpnode2
2.9.3再次 ping gpmaster
1 2 3 4 5 6 7
[root@gpmaster ~]# ping gpmaster PING gpmaster (192.168.1.50) 56(84) bytes of data. 64 bytes from gpmaster (192.168.1.50): icmp_seq =1 ttl =64 time =2.59 ms 64 bytes from gpmaster (192.168.1.50): icmp_seq =2 ttl =64 time =0.066 ms 64 bytes from gpmaster (192.168.1.50): icmp_seq =3 ttl =64 time =0.038 ms 64 bytes from gpmaster (192.168.1.50): icmp_seq =4 ttl =64 time =0.051 ms 64 bytes from gpmaster (192.168.1.50): icmp_seq =5 ttl =64 time =0.044 ms
这时,gpmaster 的IP解析为 “192.168.1.50”。
2.9.4退出 session 后,重新测试
1 2 3 4 5 6
warehouse=# select * From test_1; id | name ----+-------- 1 | francs 2 | fpZhou (2 rows)
这次终于OK了,数据可以查询了。
总结
这个问题折腾了好几天,结果问题发生在 /etc/hosts上,还是在高手帮忙的情况下解决的,应该好好反思下,OS层面的知识比较薄弱,对基本知识应该加强。
/etc/hosts
解析是从上到下解析的,当解析文件 /etc/hosts 时,第一次匹配成功则返回。
原创文章,作者:506227337,如若转载,请注明出处:https://blog.ytso.com/236389.html