• When checking the cluster status with gcadmin, the gcware service on one node is shown as CLOSE, and no fevent has been generated:
[gbase@node37 log]$ gcadmin
CLUSTER STATE: ACTIVE
VIRTUAL CLUSTER MODE: NORMAL
==============================================================
|           GBASE COORDINATOR CLUSTER INFORMATION            |
==============================================================
|   NodeName   |  IpAddress  | gcware | gcluster | DataState |
--------------------------------------------------------------
| coordinator1 | 10.10.55.37 | CLOSE  |   OPEN   |     0     |
--------------------------------------------------------------
| coordinator2 | 10.10.55.38 |  OPEN  |   OPEN   |     0     |
--------------------------------------------------------------
| coordinator3 | 10.10.55.39 |  OPEN  |   OPEN   |     0     |
--------------------------------------------------------------
| coordinator4 | 10.10.55.40 |  OPEN  |   OPEN   |     0     |
--------------------------------------------------------------
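=========================================================================================================
|                                    GBASE DATA CLUSTER INFORMATION                                     |
=========================================================================================================
| NodeName |                IpAddress                 | DistributionId | gnode | syncserver | DataState |
---------------------------------------------------------------------------------------------------------
|  node1   |               10.10.55.37                |       1        | OPEN  |    OPEN    |     0     |
---------------------------------------------------------------------------------------------------------
|  node2   |               10.10.55.38                |       1        | OPEN  |    OPEN    |     0     |
---------------------------------------------------------------------------------------------------------
|  node3   |               10.10.55.39                |       1        | OPEN  |    OPEN    |     0     |
---------------------------------------------------------------------------------------------------------
|  node4   |               10.10.55.40                |       1        | OPEN  |    OPEN    |     0     |
---------------------------------------------------------------------------------------------------------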
• Checking gcware.log shows the following error:
Oct 20 10:34:16.316037 ERROR [MAIN ] raft node init fail
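If the IP addresses of the cluster nodes have not changed, the most likely cause is that the REDOLOG or SNAPSHOT under $GCWARE_BASE/data/gcware is corrupted. To recover, delete $GCWARE_BASE/data/gcware on the affected node, copy the $GCWARE_BASE/data/gcware directory over from a healthy node, and restart the service.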
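The recovery steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name restore_gcware_data and the /tmp staging path are made up for illustration, and the script moves the damaged directory aside as a timestamped backup rather than deleting it outright, which is a more cautious variant of the step described above. Stopping and restarting gcware is left to whatever service script your installation provides.

```shell
#!/usr/bin/env bash
# Sketch: replace a damaged gcware data directory with a copy taken from a
# healthy node. Names and paths here are illustrative assumptions.
set -euo pipefail

# restore_gcware_data <damaged_dir> <healthy_copy_dir>
# Moves the damaged directory aside (instead of deleting it) and puts a
# copy of the healthy directory in its place.
restore_gcware_data() {
    local damaged_dir="$1"
    local healthy_copy="$2"
    local backup_dir
    backup_dir="${damaged_dir}.bak.$(date +%Y%m%d%H%M%S)"

    # Refuse to touch anything if the healthy copy is not available.
    if [ ! -d "$healthy_copy" ]; then
        echo "healthy copy not found: $healthy_copy" >&2
        return 1
    fi

    # Keep the damaged data as a backup in case it is needed for analysis.
    if [ -d "$damaged_dir" ]; then
        mv "$damaged_dir" "$backup_dir"
    fi

    # -a preserves ownership, permissions and timestamps where possible.
    cp -a "$healthy_copy" "$damaged_dir"
}

# Example usage on the failed node, with gcware stopped beforehand
# (assumes $GCWARE_BASE is the same path on both nodes):
#   scp -r gbase@10.10.55.38:"$GCWARE_BASE/data/gcware" /tmp/gcware_from_node38
#   restore_gcware_data "$GCWARE_BASE/data/gcware" /tmp/gcware_from_node38
# Then restart gcware and confirm with gcadmin that the node reports OPEN.
```

Once gcadmin shows the node as healthy again, the .bak directory can be removed.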
Original article by kirin. If you repost it, please credit the source: https://blog.ytso.com/tech/bigdata/317939.html