PostgreSQL 流复制 + Pgpool-II 实现高可用 HA

最近准备研究下 Pgpool-II(后续简称pgpool) ,之前很早就了解 pgpool 具有连接池,复制,负载均衡,并行查询,以及 HA 功能,今天打算测试下 pgpool 的 HA 功能,暂不考虑其它。

环境信息

基本信息
硬件:笔记本两台虚拟机
db1: 192.168.2.37
db2: 192.168.2.38
pgpool: 192.168.2.200 ( pgpool 使用 VIP, VIP 可以在 db1 , db2 上漂移 )
系统: RHEL 6.2
PostgreSQL 版本: 9.3.5
备注:PostgreSQL 流复制环境搭建略,也可参考 PostgreSQL:使用 pg_basebackup 搭建流复制环境

Pgpool 实验环境架构图

PostgreSQL 流复制 + Pgpool-II 实现高可用 HA

pgpool 部署架构规划图

备注: watchdog可以避免 pgpool 的单点故障。

编译安装 Pgpool ( 两节点安装)

下载pgpool软件

1
http://www.pgpool.net/download.php?f=pgpool-II-3.3.4.tar.gz

安装 pgpool

1
2
3
4
5
6
7
8
9
10
11
[root@db1 soft_bak]# tar xvf pgpool-II-3.3.4.tar.gz   
[root@db1 pgpool-II-3.3.4]# mkdir -p /opt/pgpool
[root@db1 pgpool-II-3.3.4]# source /home/pg93/.bash_profile

[root@db1 pgpool-II-3.3.4]#./configure --prefix=/opt/pgpool -with-pgsql=path -with-pgsql=/opt/pgsql_9.3.3 --with-openssl
[root@db1 pgpool-II-3.3.4]# make
[root@db1 pgpool-II-3.3.4]# make install

configure 错误
checking for PQexecPrepared in -lpq... no
configure: error: libpq is not installed or libpq is old

备注:如果出现这个错误,说明找不到 PostgreSQL lib 库, configure 加上 -with-pgsql 选项即可。

安装 Pgpool 相关函数

pgpool 函数虽然不是必需安装,但建议安装 pgpool_regclass, pgpool_recovery 函数。

安装 pgpool_regclass, pgpool_recovery (recommended)

1
2
3
4
[root@db1 sql]# cd /opt/soft_bak/pgpool-II-3.3.4/sql  
[root@db1 sql]# source /home/pg93/.bash_profile
[root@db1 sql]# make
[root@db1 sql]# make install

备注:之后在 $PGHOME/share/extension 目录下生成以下文件。

pgpool 相关文件

1
2
3
4
5
6
7
[pg93@db1 extension]$ ll /opt/pgsql_9.3.3/share/extension/pgpool*    
-rw-r--r-- 1 root root 791 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool_recovery--1.0.sql
-rw-r--r-- 1 root root 160 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool_recovery.control
-rw-r--r-- 1 root root 551 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool-recovery.sql
-rw-r--r-- 1 root root 283 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool_regclass--1.0.sql
-rw-r--r-- 1 root root 152 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool_regclass.control
-rw-r--r-- 1 root root 142 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool-regclass.sql

在模板库 template1 创建 pgpool_regclass

1
2
3
4
5
6
7
8
9
[pg93@db1 extension]$ psql template1  
psql (9.3.3)
Type "help" for help.

template1=# create extension pgpool_regclass;
CREATE EXTENSION

template1=# CREATE EXTENSION pgpool_recovery;
CREATE EXTENSION

查看新增加的recovery 管理函数

1
2
3
4
5
6
7
8
9
template1=# /df  
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+---------------------+------------------+----------------------------------------------------------------+--------
public | pgpool_pgctl | boolean | action text, stop_mode text | normal
public | pgpool_recovery | boolean | script_name text, remote_host text, remote_data_directory text | normal
public | pgpool_remote_start | boolean | remote_host text, remote_data_directory text | normal
public | pgpool_switch_xlog | text | arcive_dir text | normal
(4 rows)

备注: 每个库都需要安装 pgpool_regclass,为了方便在 template1 上安装 pgpool_regclass,以后新建的库都以 template1 为模板库。

数据库配置

recovery.conf 配置

1
2
3
recovery_target_timeline =  'latest'  
standby_mode = on
primary_conninfo = 'host=192.168.2.38 port=1921 user=repuser'

备注: primary_conninfo 的 host 分别配置对端 host IP.

.pgpass文件配置

1
2
3
[pg93@db1 pg_root]$ cat ~/.pgpass  
192.168.2.37:1921:replication:repuser:rep123us345er
192.168.2.38:1921:replication:repuser:rep123us345er

postgresql.conf 配置略

配置 Pgpool-II ( 两节点操作)

配置 pcp.conf

1
2
3
4
5
6
[root@db1 etc]# cd /opt/pgpool/etc  
[root@db1 etc]# cp pcp.conf.sample pcp.conf

[pg93@db1 etc]$ pg_md5 -u pgpool -p
password:
ba777e4c2f15c11ea8ac3be7e0440aa0

备注: pgpool 提供 pcp 接口,可以查看,管理 pgpool 的状态,并且可以远程操作 pgpool , pcp.conf 用来对 pcp 相关命令认证的文件,格式为 USERID:MD5PASSWD。

编写 pcp.conf 文件,写入以下

1
2
# USERID:MD5PASSWD  
pgpool:ba777e4c2f15c11ea8ac3be7e0440aa0

配置 /etc/hosts

1
2
3
4
5
6
[root@db1 ~]# cat /etc/hosts  
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.2.37 db1
192.168.2.38 db2

配置 ifconfig, arping 执行权限

1
2
3
[root@db1 pgpool]# chmod u+s /sbin/ifconfig  
[root@db1 pgpool]# chmod u+s /usr/sbin/
备注: 以理普通用户能够执行以上命令, failover_command 命令要用到。

配置两节点信任关系 , 例如 db1 上的 pg93 配置信任关系,免密码登录 db2.

1
2
3
4
pg93@db1 ~]$ ssh-keygen   
[pg93@db1 ~]$ ssh-copy-id pg93@db2
[pg93@db1 ~]$ ssh pg93@db2
[pg93@db2 ~]$ exit

备注: 此操作在 db2 上也操作一次, failover_command 命令要用到。

配置 pgpool.conf

1
2
[root@db1 etc]# cd /opt/pgpool/etc   
[root@db1 etc]# cp pgpool.conf.sample pgpool.conf

开启日志
在日志 /etc/rsyslog.conf 加入以下行

1
2
# pgpool   
local0.* /var/log/pgpool.log

重启 rsyslog 服务

1
2
3
[root@db1 ~]# /etc/init.d/rsyslog restart  
Shutting down system logger: [OK ]
Starting system logger: [OK ]

pgpool.conf 配置以下行

1
log_destination =  'syslog'

备注: 这里使用 syslog。

主节点的 pgpool.conf

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
[pg93@db1 etc]$ grep ^[a-z] pgpool.conf  
listen_addresses = '*'
port = 9999
socket_dir = '/tmp'
pcp_port = 9898
pcp_socket_dir = '/tmp'
backend_hostname0 = 'db1' ##配置数据节点 db1
backend_port0 = 1921
backend_weight0 = 1
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'db2' ##配置数据节点 db2
backend_port1 = 1921
backend_weight1 = 1
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0
log_destination = 'syslog'
print_timestamp = on
log_connections = on
log_hostname = on
log_statement = on
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 0
pid_file_name = '/opt/pgpool/pgpool.pid'
logdir = '/tmp'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = on
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = on
ignore_leading_white_space = on
white_function_list = ''
black_function_list = 'nextval,setval'
master_slave_mode = on # 设置流复制模式
master_slave_sub_mode = 'stream' # 设置流复制模式
sr_check_period = 5
sr_check_user = 'repuser'
sr_check_password = 'rep123us345er'
delay_threshold = 16000
follow_master_command = ''
parallel_mode = off
pgpool2_hostname = ''
system_db_hostname = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''
health_check_period = 5
health_check_timeout = 20
health_check_user = 'repuser'
health_check_password = 'rep123us345er'
health_check_max_retries = 3
health_check_retry_delay = 1
failover_command = '/opt/pgpool/failover_stream.sh %H ' ## 配置 failover 脚本,脚本内容下面会贴出。
failback_command = ''
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'nobody'
recovery_password = ''
recovery_1st_stage_command = ''
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
trusted_servers = ''
ping_path = '/bin'
wd_hostname = 'db1'
wd_port = 9000
wd_authkey = ''
delegate_IP = '192.168.2.200' ## 配置 pgpool 的 VIP,避免 pgpool 的单点故障
ifconfig_path = '/sbin' ## 以下几个网卡命令不需要修改。
if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig eth0:0 down'
arping_path = '/usr/sbin' # arping command path
arping_cmd = 'arping -U $_IP_$ -w 1'
clear_memqcache_on_escalation = on
wd_escalation_command = ''
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'db2' # 配置对端的 hostname
heartbeat_destination_port0 = 9694
heartbeat_device0 = 'eth0'
wd_life_point = 3
wd_lifecheck_query = 'SELECT 1'
wd_lifecheck_dbname = 'template1'
wd_lifecheck_user = 'nobody'
wd_lifecheck_password = ''
other_pgpool_hostname0 = 'db2' ## 配置对端的 pgpool
other_pgpool_port0 = 9999
other_wd_port0 = 9000
relcache_expire = 0
relcache_size = 256
check_temp_table = on
memory_cache_enabled = off
memqcache_method = 'shmem'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
memqcache_total_size = 67108864
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600
memqcache_cache_block_size = 1048576
memqcache_oiddir = '/var/log/pgpool/oiddir'
white_memqcache_table_list = ''
black_memqcache_table_list = ''

备节点的 pgpool.conf

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
[pg93@db2 etc]$ grep ^[a-z] pgpool.conf  
listen_addresses = '*'
port = 9999
socket_dir = '/tmp'
pcp_port = 9898
pcp_socket_dir = '/tmp'
backend_hostname0 = 'db1'
backend_port0 = 1921
backend_weight0 = 1
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'db2'
backend_port1 = 1921
backend_weight1 = 1
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0
log_destination = 'syslog'
print_timestamp = on
log_connections = on
log_hostname = on
log_statement = on
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 0
pid_file_name = '/opt/pgpool/pgpool.pid'
logdir = '/tmp'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = on
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = on
ignore_leading_white_space = on
white_function_list = ''
black_function_list = 'nextval,setval'
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 0
sr_check_user = 'repuser'
sr_check_password = 'rep123us345er'
delay_threshold = 16000
follow_master_command = ''
parallel_mode = off
pgpool2_hostname = ''
system_db_hostname = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''
health_check_period = 0
health_check_timeout = 20
health_check_user = 'nobody'
health_check_password = ''
health_check_max_retries = 0
health_check_retry_delay = 1
failover_command = '/opt/pgpool/failover_stream.sh %H '
failback_command = ''
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'nobody'
recovery_password = ''
recovery_1st_stage_command = ''
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
trusted_servers = ''
ping_path = '/bin'
wd_hostname = 'db2'
wd_port = 9000
wd_authkey = ''
delegate_IP = '192.168.2.200'
ifconfig_path = '/sbin'
if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig eth0:0 down'
arping_path = '/usr/sbin' # arping command path
arping_cmd = 'arping -U $_IP_$ -w 1'
clear_memqcache_on_escalation = on
wd_escalation_command = ''
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'db1'
heartbeat_destination_port0 = 9694
heartbeat_device0 = 'eth0'
wd_life_point = 3
wd_lifecheck_query = 'SELECT 1'
wd_lifecheck_dbname = 'template1'
wd_lifecheck_user = 'nobody'
wd_lifecheck_password = ''
other_pgpool_hostname0 = 'db1'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
relcache_expire = 0
relcache_size = 256
check_temp_table = on
memory_cache_enabled = off
memqcache_method = 'shmem'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
memqcache_total_size = 67108864
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600
memqcache_cache_block_size = 1048576
memqcache_oiddir = '/var/log/pgpool/oiddir'
white_memqcache_table_list = ''
black_memqcache_table_list = ''

/opt/pgpool/failover_stream.sh 脚本内容

1
2
3
4
5
6
7
8
9
10
11
12
[pg93@db1 etc]$ cat /opt/pgpool/failover_stream.sh   
#! /bin/sh
# Failover command for streaming replication.
# Arguments: $1: new master hostname.

new_master=$1
trigger_command="$PGHOME/bin/pg_ctl promote -D $PGDATA"

# Prompte standby database.
/usr/bin/ssh -T $new_master $trigger_command

exit 0;

备注: 我这里定义的 failover 脚本和 pgpool 手册上的脚本不同,这里使用了 pg_ctl promote 的切换方式,一方面以文件触发的形式个人觉得不是很好。另一方面:当以 trigger file 形式实现 HA 时会遇到不能来回切换的问题。

如果仔细看,可以看到这个切换脚本并不严谨,每当有节点离线时,它都会触发一次,也就是说如果当前掉线的是备节点,它也会到对端主库执行一次 failover_command 命令,不过没关系,并不影响。

启动 pgpool

1
[pg93@db2 etc]$ pgpool

备注: 此时可以查看 /var/log/pgpool.log 日志了,注意两节点都启动。

pgpool 关闭命令

1
[pg93@db2 etc]$ pgpool -m fast stop

pgpool reload 命令

1
[pg93@db1 etc]$ pgpool reload

关于连接异常问题
PostgreSQL 流复制 + Pgpool-II 实现高可用 HA
备注: 这个表格很好地描述了各种连接异常问题,对照表格很容易找到原因。

在本机 ubuntu 查看 pgpool 状态

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
francs@francs:~$ lsb_release -a  
No LSB modules are available.
Distributor ID:Ubuntu
Description:Ubuntu 14.04.1 LTS
Release:14.04
Codename: trusty

francs@francs:~$ psql -h 192.168.2.200 -p 9999 postgres postgres
Password for user postgres:
psql (9.3.5, server 9.3.3)
Type "help" for help.

postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role
---------+----------+------+--------+-----------+---------
0 | db1 | 1921 | 2 | 0.500000 | standby
1 | db2 | 1921 | 2 | 0.500000 | primary

备注: db2 为 primary, db1 为 standby。 status 为 2 ,表示正常连接; 关于 status 状态,如下

  • 0 – This state is only used during the initialization. PCP will never display it.
  • 1 – Node is up. No connections yet.
  • 2 – Node is up. Connections are pooled.
  • 3 – Node is down.

HA 验证

根据上面 show_nodes 输出, 此时 db2 节点为主库, db1 节点为备库,并且 pgpool 目前跑在 db1 上,下面分别测试以下情形:

关 db2 上的数据库

db2 关闭数据库

1
2
3
[pg93@db2 etc]$ pg_ctl stop -m fast  
waiting for server to shut down....... done
server stopped

db1 上测试

1
2
3
4
5
6
[pg93@db1 etc]$ pg_controldata | grep cluster  
Database cluster state: in production
[pg93@db1 etc]$

[pg93@db1 etc]$ ll /database/pg93/pg_root/recovery.done
-rw-r--r-- 1 pg93 pg93 4.8K Oct 2 09:05 /database/pg93/pg_root/recovery.done

备注:可以看到db1 节点已完成从 standby 角色切换到 primary ,并且 $PGDATA/recovery.conf 文件变成 recovery.done 。

查看 pgpool 状态

1
2
3
4
5
6
7
8
9
10
11
francs@francs:~$ psql -h 192.168.2.200  -p 9999 postgres postgres  
Password for user postgres:
psql (9.3.5, server 9.3.3)
Type "help" for help.

postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role
---------+----------+------+--------+-----------+---------
0 | db1 | 1921 | 2 | 0.500000 | primary
1 | db2 | 1921 | 3 | 0.500000 | standby
(2 rows)

备注:db1 已转换成primary 角色。 db2 状态为 3 ,表示 down 状态。

接下来以 standby 身份启动 db2

1
2
pg93@db2 pg_root]$ mv recovery.done recovery.conf  
[pg93@db2 pg_root]$ pg_ctl start

db1 上操作,添加 db2 节点信息

1
2
3
4
5
6
[pg93@db1 etc]$ pcp_attach_node -d 5 db2 9898 pgpool pgpool 1  
DEBUG: send: tos="R", len=44
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
DEBUG: recv: tos="c", len=20, data=CommandComplete
DEBUG: send: tos="X", len=4

db2 状态恢复

1
2
3
4
5
6
7
8
9
10
11
[pg93@db1 etc]$ psql -p 9999  -U postgres  
Password for user postgres:
psql (9.3.3)
Type "help" for help.

postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role
---------+----------+------+--------+-----------+---------
0 | db1 | 1921 | 2 | 0.500000 | primary
1 | db2 | 1921 | 2 | 0.500000 | standby
(2 rows)

备注:切换成功

关 pgpool master

pgpool VIP :192.168.2.200 从 db1 飘到 db2,注意此时仅 pgpool 的 VIP 飘移, db1,db2 上的数据库角色不变, pgpool 的切换非常容易,关闭 pgpool 的脚本如下:

1
[pg93@db2 etc]$ pgpool -m fast stop

db1 掉电 power off

这里通过 vmwaer 执行 power off 模拟断电的情况, 在 db1(现在的 Primary 节点) 执行 power off ,发现 pgpool VIP 能切换到 db1 ,同时 db1 上的数据库切换成 primary ,切换成功。

第二种,第三种情况就不贴详细日志了。

参考

原创文章,作者:ItWorker,如若转载,请注明出处:https://blog.ytso.com/tech/database/239586.html

(0)
上一篇 2022年2月12日 13:52
下一篇 2022年2月12日 13:52

相关推荐

发表回复

登录后才能评论