最近准备研究下 Pgpool-II(后续简称pgpool) ，之前很早就了解 pgpool 具有连接池，复制，负载均衡，并行查询，以及 HA 功能，今天打算测试下 pgpool 的 HA 功能，暂不考虑其它。

环境信息

基本信息
硬件：笔记本两台虚拟机
db1: 192.168.2.37
db2: 192.168.2.38
pgpool: 192.168.2.200 ( pgpool 使用 VIP, VIP 可以在 db1 , db2 上漂移 )
系统: RHEL 6.2
PostgreSQL 版本： 9.3.5
备注：PostgreSQL 流复制环境搭建略，也可参考 PostgreSQL：使用 pg_basebackup 搭建流复制环境。

Pgpool 实验环境架构图

PostgreSQL 流复制 + Pgpool-II 实现高可用 HA

pgpool 部署架构规划图

备注: watchdog可以避免 pgpool 的单点故障。

编译安装 Pgpool ( 两节点安装)

下载pgpool软件

1	http://www.pgpool.net/download.php?f=pgpool-II-3.3.4.tar.gz

安装 pgpool

[root@db1 soft_bak]# tar xvf pgpool-II-3.3.4.tar.gz   
[root@db1 pgpool-II-3.3.4]# mkdir -p /opt/pgpool  
[root@db1 pgpool-II-3.3.4]# source /home/pg93/.bash_profile  

[root@db1 pgpool-II-3.3.4]#./configure --prefix=/opt/pgpool -with-pgsql=path -with-pgsql=/opt/pgsql_9.3.3 --with-openssl   
[root@db1 pgpool-II-3.3.4]# make   
[root@db1 pgpool-II-3.3.4]# make install
  
configure 错误  
checking for  PQexecPrepared  in  -lpq...  no  
configure: error: libpq is  not installed or libpq is old

备注：如果出现这个错误，说明找不到 PostgreSQL lib 库， configure 加上 -with-pgsql 选项即可。

安装 Pgpool 相关函数

pgpool 函数虽然不是必需安装，但建议安装 pgpool_regclass, pgpool_recovery 函数。

安装 pgpool_regclass, pgpool_recovery (recommended)

[root@db1 sql]# cd /opt/soft_bak/pgpool-II-3.3.4/sql  
[root@db1 sql]# source /home/pg93/.bash_profile  
[root@db1 sql]# make   
[root@db1 sql]# make install

备注：之后在 $PGHOME/share/extension 目录下生成以下文件。

pgpool 相关文件

[pg93@db1 extension]$ ll /opt/pgsql_9.3.3/share/extension/pgpool*    
-rw-r--r--  1 root root 791  Sep  30  15:56  /opt/pgsql_9.3.3/share/extension/pgpool_recovery--1.0.sql  
-rw-r--r--  1 root root 160  Sep  30  15:56  /opt/pgsql_9.3.3/share/extension/pgpool_recovery.control  
-rw-r--r--  1 root root 551  Sep  30  15:56  /opt/pgsql_9.3.3/share/extension/pgpool-recovery.sql  
-rw-r--r--  1 root root 283  Sep  30  15:56  /opt/pgsql_9.3.3/share/extension/pgpool_regclass--1.0.sql  
-rw-r--r--  1 root root 152  Sep  30  15:56  /opt/pgsql_9.3.3/share/extension/pgpool_regclass.control  
-rw-r--r--  1 root root 142  Sep  30  15:56  /opt/pgsql_9.3.3/share/extension/pgpool-regclass.sql

在模板库 template1 创建 pgpool_regclass

[pg93@db1 extension]$ psql template1  
psql (9.3.3)  
Type  "help"  for help.  
  
template1=# create extension pgpool_regclass;   
CREATE EXTENSION  
  
template1=# CREATE EXTENSION pgpool_recovery;  
CREATE EXTENSION

查看新增加的recovery 管理函数

template1=# /df  
 List of functions  
Schema  | Name |  Result data type | Argument data types | Type   
--------+---------------------+------------------+----------------------------------------------------------------+--------  
public  | pgpool_pgctl |  boolean | action text, stop_mode text | normal  
public  | pgpool_recovery |  boolean | script_name text, remote_host text, remote_data_directory text | normal  
public  | pgpool_remote_start |  boolean | remote_host text, remote_data_directory text | normal  
public  | pgpool_switch_xlog | text | arcive_dir text | normal  
(4 rows)

备注：每个库都需要安装 pgpool_regclass，为了方便在 template1 上安装 pgpool_regclass，以后新建的库都以 template1 为模板库。

数据库配置

recovery.conf 配置

1
2
3

recovery_target_timeline =  'latest'  
standby_mode = on  
primary_conninfo =  'host=192.168.2.38 port=1921 user=repuser'

备注: primary_conninfo 的 host 分别配置对端 host IP.

.pgpass文件配置

1
2
3

[pg93@db1 pg_root]$ cat ~/.pgpass  
192.168.2.37:1921:replication:repuser:rep123us345er  
192.168.2.38:1921:replication:repuser:rep123us345er

postgresql.conf 配置略

配置 Pgpool-II ( 两节点操作)

配置 pcp.conf

[root@db1 etc]# cd /opt/pgpool/etc  
[root@db1 etc]# cp pcp.conf.sample pcp.conf  
  
[pg93@db1 etc]$ pg_md5 -u pgpool -p   
password:    
ba777e4c2f15c11ea8ac3be7e0440aa0

备注: pgpool 提供 pcp 接口，可以查看，管理 pgpool 的状态，并且可以远程操作 pgpool ， pcp.conf 用来对 pcp 相关命令认证的文件，格式为 USERID:MD5PASSWD。

编写 pcp.conf 文件，写入以下

1 2	# USERID:MD5PASSWD pgpool:ba777e4c2f15c11ea8ac3be7e0440aa0

配置 /etc/hosts

[root@db1 ~]# cat /etc/hosts  
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4  
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6  
  
192.168.2.37 db1  
192.168.2.38 db2

配置 ifconfig, arping 执行权限

1
2
3

[root@db1 pgpool]# chmod u+s /sbin/ifconfig  
[root@db1 pgpool]# chmod u+s /usr/sbin/
备注: 以理普通用户能够执行以上命令， failover_command 命令要用到。

配置两节点信任关系 , 例如 db1 上的 pg93 配置信任关系，免密码登录 db2.

pg93@db1 ~]$ ssh-keygen   
[pg93@db1 ~]$ ssh-copy-id pg93@db2  
[pg93@db1 ~]$ ssh pg93@db2  
[pg93@db2 ~]$ exit

备注: 此操作在 db2 上也操作一次, failover_command 命令要用到。

配置 pgpool.conf

1 2	[root@db1 etc]# cd /opt/pgpool/etc [root@db1 etc]# cp pgpool.conf.sample pgpool.conf

开启日志
在日志 /etc/rsyslog.conf 加入以下行

1 2	# pgpool local0.* /var/log/pgpool.log

重启 rsyslog 服务

1
2
3

[root@db1 ~]# /etc/init.d/rsyslog restart  
Shutting down system logger: [OK ]  
Starting system logger: [OK ]

pgpool.conf 配置以下行

1	log_destination = 'syslog'

备注：这里使用 syslog。

主节点的 pgpool.conf

[pg93@db1 etc]$ grep ^[a-z] pgpool.conf  
listen_addresses =  '*'  
port =  9999  
socket_dir =  '/tmp'  
pcp_port =  9898  
pcp_socket_dir =  '/tmp'  
backend_hostname0 =  'db1' ##配置数据节点 db1  
backend_port0 =  1921  
backend_weight0 =  1  
backend_flag0 =  'ALLOW_TO_FAILOVER'  
backend_hostname1 =  'db2' ##配置数据节点 db2  
backend_port1 =  1921  
backend_weight1 =  1  
backend_flag1 =  'ALLOW_TO_FAILOVER'  
enable_pool_hba = on  
pool_passwd =  'pool_passwd'  
authentication_timeout =  60  
ssl = off  
num_init_children =  32  
max_pool =  4  
child_life_time =  300  
child_max_connections =  0  
connection_life_time =  0  
client_idle_limit =  0  
log_destination =  'syslog'  
print_timestamp = on  
log_connections = on  
log_hostname = on  
log_statement = on  
log_per_node_statement = off  
log_standby_delay =  'none'  
syslog_facility =  'LOCAL0'  
syslog_ident =  'pgpool'  
debug_level =  0  
pid_file_name =  '/opt/pgpool/pgpool.pid'  
logdir =  '/tmp'  
connection_cache = on  
reset_query_list =  'ABORT; DISCARD ALL'  
replication_mode = off  
replicate_select = off  
insert_lock = on  
lobj_lock_table =  ''  
replication_stop_on_mismatch = off  
failover_if_affected_tuples_mismatch = off  
load_balance_mode = on  
ignore_leading_white_space = on  
white_function_list =  ''  
black_function_list =  'nextval,setval'  
master_slave_mode = on # 设置流复制模式  
master_slave_sub_mode =  'stream'  # 设置流复制模式  
sr_check_period =  5  
sr_check_user =  'repuser'  
sr_check_password =  'rep123us345er'  
delay_threshold =  16000  
follow_master_command =  ''  
parallel_mode = off  
pgpool2_hostname =  ''  
system_db_hostname =  'localhost'  
system_db_port =  5432  
system_db_dbname =  'pgpool'  
system_db_schema =  'pgpool_catalog'  
system_db_user =  'pgpool'  
system_db_password =  ''  
health_check_period =  5  
health_check_timeout =  20  
health_check_user =  'repuser'  
health_check_password =  'rep123us345er'  
health_check_max_retries =  3  
health_check_retry_delay =  1  
failover_command =  '/opt/pgpool/failover_stream.sh %H ' ## 配置 failover 脚本，脚本内容下面会贴出。  
failback_command =  ''  
fail_over_on_backend_error = on  
search_primary_node_timeout =  10  
recovery_user =  'nobody'  
recovery_password =  ''  
recovery_1st_stage_command =  ''  
recovery_2nd_stage_command =  ''  
recovery_timeout =  90  
client_idle_limit_in_recovery =  0  
use_watchdog = on  
trusted_servers =  ''  
ping_path =  '/bin'  
wd_hostname =  'db1'  
wd_port =  9000  
wd_authkey =  ''  
delegate_IP =  '192.168.2.200' ## 配置 pgpool 的 VIP，避免 pgpool 的单点故障  
ifconfig_path =  '/sbin' ## 以下几个网卡命令不需要修改。  
if_up_cmd =  'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'  
if_down_cmd =  'ifconfig eth0:0 down'  
arping_path =  '/usr/sbin' # arping command path  
arping_cmd =  'arping -U $_IP_$ -w 1'  
clear_memqcache_on_escalation = on  
wd_escalation_command =  ''  
wd_lifecheck_method =  'heartbeat'  
wd_interval =  10  
wd_heartbeat_port =  9694  
wd_heartbeat_keepalive =  2  
wd_heartbeat_deadtime =  30  
heartbeat_destination0 =  'db2' # 配置对端的 hostname  
heartbeat_destination_port0 =  9694    
heartbeat_device0 =  'eth0'    
wd_life_point =  3  
wd_lifecheck_query =  'SELECT 1'  
wd_lifecheck_dbname =  'template1'  
wd_lifecheck_user =  'nobody'  
wd_lifecheck_password =  ''  
other_pgpool_hostname0 =  'db2' ## 配置对端的 pgpool  
other_pgpool_port0 =  9999  
other_wd_port0 =  9000  
relcache_expire =  0  
relcache_size =  256  
check_temp_table = on  
memory_cache_enabled = off  
memqcache_method =  'shmem'  
memqcache_memcached_host =  'localhost'  
memqcache_memcached_port =  11211  
memqcache_total_size =  67108864  
memqcache_max_num_cache =  1000000  
memqcache_expire =  0  
memqcache_auto_cache_invalidation = on  
memqcache_maxcache =  409600  
memqcache_cache_block_size =  1048576  
memqcache_oiddir =  '/var/log/pgpool/oiddir'  
white_memqcache_table_list =  ''  
black_memqcache_table_list =  ''

备节点的 pgpool.conf

[pg93@db2 etc]$ grep ^[a-z] pgpool.conf  
listen_addresses =  '*'  
port =  9999  
socket_dir =  '/tmp'  
pcp_port =  9898  
pcp_socket_dir =  '/tmp'  
backend_hostname0 =  'db1'  
backend_port0 =  1921  
backend_weight0 =  1  
backend_flag0 =  'ALLOW_TO_FAILOVER'  
backend_hostname1 =  'db2'  
backend_port1 =  1921  
backend_weight1 =  1  
backend_flag1 =  'ALLOW_TO_FAILOVER'  
enable_pool_hba = on  
pool_passwd =  'pool_passwd'  
authentication_timeout =  60  
ssl = off  
num_init_children =  32  
max_pool =  4  
child_life_time =  300  
child_max_connections =  0  
connection_life_time =  0  
client_idle_limit =  0  
log_destination =  'syslog'  
print_timestamp = on  
log_connections = on  
log_hostname = on  
log_statement = on  
log_per_node_statement = off  
log_standby_delay =  'none'  
syslog_facility =  'LOCAL0'  
syslog_ident =  'pgpool'  
debug_level =  0  
pid_file_name =  '/opt/pgpool/pgpool.pid'  
logdir =  '/tmp'  
connection_cache = on  
reset_query_list =  'ABORT; DISCARD ALL'  
replication_mode = off  
replicate_select = off  
insert_lock = on  
lobj_lock_table =  ''  
replication_stop_on_mismatch = off  
failover_if_affected_tuples_mismatch = off  
load_balance_mode = on  
ignore_leading_white_space = on  
white_function_list =  ''  
black_function_list =  'nextval,setval'  
master_slave_mode = on  
master_slave_sub_mode =  'stream'  
sr_check_period =  0  
sr_check_user =  'repuser'  
sr_check_password =  'rep123us345er'  
delay_threshold =  16000  
follow_master_command =  ''  
parallel_mode = off  
pgpool2_hostname =  ''  
system_db_hostname =  'localhost'  
system_db_port =  5432  
system_db_dbname =  'pgpool'  
system_db_schema =  'pgpool_catalog'  
system_db_user =  'pgpool'  
system_db_password =  ''  
health_check_period =  0  
health_check_timeout =  20  
health_check_user =  'nobody'  
health_check_password =  ''  
health_check_max_retries =  0  
health_check_retry_delay =  1  
failover_command =  '/opt/pgpool/failover_stream.sh %H '  
failback_command =  ''  
fail_over_on_backend_error = on  
search_primary_node_timeout =  10  
recovery_user =  'nobody'  
recovery_password =  ''  
recovery_1st_stage_command =  ''  
recovery_2nd_stage_command =  ''  
recovery_timeout =  90  
client_idle_limit_in_recovery =  0  
use_watchdog = on  
trusted_servers =  ''  
ping_path =  '/bin'  
wd_hostname =  'db2'  
wd_port =  9000  
wd_authkey =  ''  
delegate_IP =  '192.168.2.200'  
ifconfig_path =  '/sbin'  
if_up_cmd =  'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'  
if_down_cmd =  'ifconfig eth0:0 down'  
arping_path =  '/usr/sbin' # arping command path  
arping_cmd =  'arping -U $_IP_$ -w 1'  
clear_memqcache_on_escalation = on  
wd_escalation_command =  ''  
wd_lifecheck_method =  'heartbeat'  
wd_interval =  10  
wd_heartbeat_port =  9694  
wd_heartbeat_keepalive =  2  
wd_heartbeat_deadtime =  30  
heartbeat_destination0 =  'db1'  
heartbeat_destination_port0 =  9694    
heartbeat_device0 =  'eth0'  
wd_life_point =  3  
wd_lifecheck_query =  'SELECT 1'  
wd_lifecheck_dbname =  'template1'  
wd_lifecheck_user =  'nobody'  
wd_lifecheck_password =  ''  
other_pgpool_hostname0 =  'db1'  
other_pgpool_port0 =  9999  
other_wd_port0 =  9000  
relcache_expire =  0  
relcache_size =  256  
check_temp_table = on  
memory_cache_enabled = off  
memqcache_method =  'shmem'  
memqcache_memcached_host =  'localhost'  
memqcache_memcached_port =  11211  
memqcache_total_size =  67108864  
memqcache_max_num_cache =  1000000  
memqcache_expire =  0  
memqcache_auto_cache_invalidation = on  
memqcache_maxcache =  409600  
memqcache_cache_block_size =  1048576  
memqcache_oiddir =  '/var/log/pgpool/oiddir'  
white_memqcache_table_list =  ''  
black_memqcache_table_list =  ''

/opt/pgpool/failover_stream.sh 脚本内容

[pg93@db1 etc]$ cat /opt/pgpool/failover_stream.sh   
#! /bin/sh  
# Failover command for streaming replication.  
# Arguments: $1: new master hostname.   
  
new_master=$1  
trigger_command="$PGHOME/bin/pg_ctl promote -D $PGDATA"  
  
# Prompte standby database.  
/usr/bin/ssh -T $new_master $trigger_command   
  
exit  0;

备注：我这里定义的 failover 脚本和 pgpool 手册上的脚本不同，这里使用了 pg_ctl promote 的切换方式，一方面以文件触发的形式个人觉得不是很好。另一方面：当以 trigger file 形式实现 HA 时会遇到不能来回切换的问题。

如果仔细看，可以看到这个切换脚本并不严谨，每当有节点离线时，它都会触发一次，也就是说如果当前掉线的是备节点，它也会到对端主库执行一次 failover_command 命令，不过没关系，并不影响。

启动 pgpool

1	[pg93@db2 etc]$ pgpool

备注：此时可以查看 /var/log/pgpool.log 日志了，注意两节点都启动。

pgpool 关闭命令

1	[pg93@db2 etc]$ pgpool -m fast stop

pgpool reload 命令

1	[pg93@db1 etc]$ pgpool reload

关于连接异常问题
PostgreSQL 流复制 + Pgpool-II 实现高可用 HA
备注：这个表格很好地描述了各种连接异常问题，对照表格很容易找到原因。

在本机 ubuntu 查看 pgpool 状态

francs@francs:~$ lsb_release -a  
No LSB modules are available.  
Distributor ID:Ubuntu  
Description:Ubuntu  14.04.1 LTS  
Release:14.04  
Codename: trusty  
  
francs@francs:~$ psql -h 192.168.2.200  -p 9999 postgres postgres  
Password  for user postgres:    
psql (9.3.5, server 9.3.3)  
Type  "help"  for help.  
  
postgres=# show pool_nodes;  
node_id | hostname | port | status | lb_weight | role   
---------+----------+------+--------+-----------+---------  
0 | db1 |  1921  |  2 |  0.500000 | standby  
1 | db2 |  1921  |  2 |  0.500000 | primary

备注: db2 为 primary, db1 为 standby。 status 为 2 ，表示正常连接; 关于 status 状态，如下

0 – This state is only used during the initialization. PCP will never display it.
1 – Node is up. No connections yet.
2 – Node is up. Connections are pooled.
3 – Node is down.

HA 验证

根据上面 show_nodes 输出，此时 db2 节点为主库， db1 节点为备库，并且 pgpool 目前跑在 db1 上，下面分别测试以下情形:

关 db2 上的数据库

db2 关闭数据库

1
2
3

[pg93@db2 etc]$ pg_ctl stop -m fast  
waiting for server to shut down.......  done  
server stopped

db1 上测试

[pg93@db1 etc]$ pg_controldata | grep cluster  
Database cluster state: in production  
[pg93@db1 etc]$   
  
[pg93@db1 etc]$ ll /database/pg93/pg_root/recovery.done    
-rw-r--r--  1 pg93 pg93 4.8K  Oct 2  09:05  /database/pg93/pg_root/recovery.done

备注：可以看到db1 节点已完成从 standby 角色切换到 primary ，并且 $PGDATA/recovery.conf 文件变成 recovery.done 。

查看 pgpool 状态

francs@francs:~$ psql -h 192.168.2.200  -p 9999 postgres postgres  
Password  for user postgres:    
psql (9.3.5, server 9.3.3)  
Type  "help"  for help.  
  
postgres=# show pool_nodes;  
node_id | hostname | port | status | lb_weight | role   
---------+----------+------+--------+-----------+---------  
0 | db1 |  1921  |  2 |  0.500000 | primary  
1 | db2 |  1921  |  3 |  0.500000 | standby  
(2 rows)

备注:db1 已转换成primary 角色。 db2 状态为 3 ，表示 down 状态。

接下来以 standby 身份启动 db2

1 2	pg93@db2 pg_root]$ mv recovery.done recovery.conf [pg93@db2 pg_root]$ pg_ctl start

db1 上操作，添加 db2 节点信息

[pg93@db1 etc]$ pcp_attach_node -d 5 db2 9898 pgpool pgpool 1  
DEBUG: send: tos="R", len=44  
DEBUG: recv: tos="r", len=21, data=AuthenticationOK  
DEBUG: send: tos="D", len=6  
DEBUG: recv: tos="c", len=20, data=CommandComplete  
DEBUG: send: tos="X", len=4

db2 状态恢复

[pg93@db1 etc]$ psql -p 9999  -U postgres  
Password  for user postgres:    
psql (9.3.3)  
Type  "help"  for help.  
  
postgres=# show pool_nodes;  
node_id | hostname | port | status | lb_weight | role   
---------+----------+------+--------+-----------+---------  
0 | db1 |  1921  |  2 |  0.500000 | primary  
1 | db2 |  1921  |  2 |  0.500000 | standby  
(2 rows)

备注：切换成功

关 pgpool master

pgpool VIP :192.168.2.200 从 db1 飘到 db2，注意此时仅 pgpool 的 VIP 飘移， db1,db2 上的数据库角色不变, pgpool 的切换非常容易，关闭 pgpool 的脚本如下：

1	[pg93@db2 etc]$ pgpool -m fast stop

db1 掉电 power off

这里通过 vmwaer 执行 power off 模拟断电的情况，在 db1(现在的 Primary 节点) 执行 power off ，发现 pgpool VIP 能切换到 db1 ，同时 db1 上的数据库切换成 primary ，切换成功。

第二种，第三种情况就不贴详细日志了。

参考

原创文章，作者：ItWorker，如若转载，请注明出处：https://blog.ytso.com/239586.html