最近准备研究下 Pgpool-II(后续简称pgpool) ,之前很早就了解 pgpool 具有连接池,复制,负载均衡,并行查询,以及 HA 功能,今天打算测试下 pgpool 的 HA 功能,暂不考虑其它。
环境信息
基本信息 硬件:笔记本两台虚拟机 db1: 192.168.2.37 db2: 192.168.2.38 pgpool: 192.168.2.200 ( pgpool 使用 VIP, VIP 可以在 db1 , db2 上漂移 ) 系统: RHEL 6.2 PostgreSQL 版本: 9.3.5 备注:PostgreSQL 流复制环境搭建略,也可参考 PostgreSQL:使用 pg_basebackup 搭建流复制环境 。
Pgpool 实验环境架构图
pgpool 部署架构规划图
备注: watchdog可以避免 pgpool 的单点故障。
编译安装 Pgpool ( 两节点安装)
下载pgpool软件
安装 pgpool
1 2 3 4 5 6 7 8 9 10 11
[root@db1 soft_bak] [root@db1 pgpool-II-3.3 .4 ] [root@db1 pgpool-II-3.3 .4 ] [root@db1 pgpool-II-3.3 .4 ] [root@db1 pgpool-II-3.3 .4 ] [root@db1 pgpool-II-3.3 .4 ] configure 错误 checking for PQexecPrepared in -lpq... no configure: error: libpq is not installed or libpq is old
备注:如果出现这个错误,说明找不到 PostgreSQL lib 库, configure 加上 -with-pgsql 选项即可。
安装 Pgpool 相关函数
pgpool 函数虽然不是必需安装,但建议安装 pgpool_regclass, pgpool_recovery 函数。
安装 pgpool_regclass, pgpool_recovery (recommended)
1 2 3 4
[root@db1 sql]# cd /opt/soft_bak/pgpool-II-3.3.4/sql [root@db1 sql]# source /home/pg93/.bash_profile [root@db1 sql]# make [root@db1 sql]# make install
备注:之后在 $PGHOME/share/extension 目录下生成以下文件。
pgpool 相关文件
1 2 3 4 5 6 7
[pg93@db1 extension]$ ll /opt/pgsql_9.3.3/share/extension/pgpool* -rw-r--r-- 1 root root 791 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool_recovery--1.0.sql -rw-r--r-- 1 root root 160 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool_recovery.control -rw-r--r-- 1 root root 551 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool-recovery.sql -rw-r--r-- 1 root root 283 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool_regclass--1.0.sql -rw-r--r-- 1 root root 152 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool_regclass.control -rw-r--r-- 1 root root 142 Sep 30 15:56 /opt/pgsql_9.3.3/share/extension/pgpool-regclass.sql
在模板库 template1 创建 pgpool_regclass
1 2 3 4 5 6 7 8 9
[pg93@db1 extension]$ psql template1 psql (9.3.3) Type "help" for help. template1 =# create extension pgpool_regclass; CREATE EXTENSION template1 =# CREATE EXTENSION pgpool_recovery; CREATE EXTENSION
查看新增加的recovery 管理函数
1 2 3 4 5 6 7 8 9
template1 =# /df List of functions Schema | Name | Result data type | Argument data types | Type --------+---------------------+------------------+----------------------------------------------------------------+-------- public | pgpool_pgctl | boolean | action text, stop_mode text | normal public | pgpool_recovery | boolean | script_name text, remote_host text, remote_data_directory text | normal public | pgpool_remote_start | boolean | remote_host text, remote_data_directory text | normal public | pgpool_switch_xlog | text | arcive_dir text | normal (4 rows)
备注: 每个库都需要安装 pgpool_regclass,为了方便在 template1 上安装 pgpool_regclass,以后新建的库都以 template1 为模板库。
数据库配置
recovery.conf 配置
1 2 3
recovery_target_timeline = 'latest' standby_mode = on primary_conninfo = 'host=192.168.2.38 port=1921 user=repuser'
备注: primary_conninfo 的 host 分别配置对端 host IP.
.pgpass文件配置
1 2 3
[pg93@db 1 pg_root]$ cat ~/.pgpass 192.168 .2 .37 :1921 :replication: repuser: rep123us345er 192.168 .2 .38 :1921 :replication: repuser: rep123us345er
postgresql.conf 配置略
配置 Pgpool-II ( 两节点操作)
配置 pcp.conf
1 2 3 4 5 6
[root@db1 etc] [root@db1 etc] [pg93@db1 etc]$ pg_md5 -u pgpool -p password: ba777e4c2f15c11ea8ac3be7e0440aa0
备注: pgpool 提供 pcp 接口,可以查看,管理 pgpool 的状态,并且可以远程操作 pgpool , pcp.conf 用来对 pcp 相关命令认证的文件,格式为 USERID:MD5PASSWD。
编写 pcp.conf 文件,写入以下
1 2
# USERID:MD5PASSWD pgpool: ba777e4c2f15c11ea8ac3be7e0440aa0
配置 /etc/hosts
1 2 3 4 5 6
[root@db1 ~]# cat /etc/hosts 127.0 .0 .1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 192.168 .2 .37 db1 192.168 .2 .38 db2
配置 ifconfig, arping 执行权限
1 2 3
[root@db1 pgpool]# chmod u+s /sbin/ifconfig [root@db1 pgpool]# chmod u+s /usr/sbin/ 备注: 以理普通用户能够执行以上命令, failover_command 命令要用到。
配置两节点信任关系 , 例如 db1 上的 pg93 配置信任关系,免密码登录 db2.
1 2 3 4
pg93@db1 ~]$ ssh-keygen [pg93@db1 ~]$ ssh-copy-id pg93@db2 [pg93@db1 ~]$ ssh pg93@db2 [pg93@db2 ~]$ exit
备注: 此操作在 db2 上也操作一次, failover_command 命令要用到。
配置 pgpool.conf
1 2
[root@db1 etc]# cd /opt /pgpool/etc [root@db1 etc]# cp pgpool.conf .sample pgpool.conf
开启日志 在日志 /etc/rsyslog.conf 加入以下行
1 2
# pgpool local0.* /var/log /pgpool.log
重启 rsyslog 服务
1 2 3
[root@db1 ~]# /etc/init.d/rsyslog restart Shutting down system logger: [OK ] Starting system logger: [OK ]
pgpool.conf 配置以下行
1
log_destination = 'syslog'
备注: 这里使用 syslog。
主节点的 pgpool.conf
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
[pg93@db1 etc]$ grep ^[a-z] pgpool.conf listen_addresses = '*' port = 9999 socket_dir = '/tmp' pcp_port = 9898 pcp_socket_dir = '/tmp' backend_hostname0 = 'db1' backend_port0 = 1921 backend_weight0 = 1 backend_flag0 = 'ALLOW_TO_FAILOVER' backend_hostname1 = 'db2' backend_port1 = 1921 backend_weight1 = 1 backend_flag1 = 'ALLOW_TO_FAILOVER' enable_pool_hba = on pool_passwd = 'pool_passwd' authentication_timeout = 60 ssl = off num_init_children = 32 max_pool = 4 child_life_time = 300 child_max_connections = 0 connection_life_time = 0 client_idle_limit = 0 log_destination = 'syslog' print_timestamp = on log_connections = on log_hostname = on log_statement = on log_per_node_statement = off log_standby_delay = 'none' syslog_facility = 'LOCAL0' syslog_ident = 'pgpool' debug_level = 0 pid_file_name = '/opt/pgpool/pgpool.pid' logdir = '/tmp' connection_cache = on reset_query_list = 'ABORT; DISCARD ALL' replication_mode = off replicate_select = off insert_lock = on lobj_lock_table = '' replication_stop_on_mismatch = off failover_if_affected_tuples_mismatch = off load_balance_mode = on ignore_leading_white_space = on white_function_list = '' black_function_list = 'nextval,setval' master_slave_mode = on master_slave_sub_mode = 'stream' sr_check_period = 5 sr_check_user = 'repuser' sr_check_password = 'rep123us345er' delay_threshold = 16000 follow_master_command = '' parallel_mode = off pgpool2_hostname = '' system_db_hostname = 'localhost' system_db_port = 5432 system_db_dbname = 'pgpool' system_db_schema = 'pgpool_catalog' system_db_user = 'pgpool' system_db_password = '' health_check_period = 5 health_check_timeout = 20 health_check_user = 'repuser' health_check_password = 'rep123us345er' health_check_max_retries = 3 health_check_retry_delay = 1 failover_command = '/opt/pgpool/failover_stream.sh %H ' failback_command = '' fail_over_on_backend_error = on search_primary_node_timeout = 10 recovery_user = 'nobody' recovery_password = '' recovery_1st_stage_command = '' recovery_2nd_stage_command = '' recovery_timeout = 90 client_idle_limit_in_recovery = 0 use_watchdog = on trusted_servers = '' ping_path = '/bin' wd_hostname = 'db1' wd_port = 9000 wd_authkey = '' delegate_IP = '192.168.2.200' ifconfig_path = '/sbin' if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0' if_down_cmd = 'ifconfig eth0:0 down' arping_path = '/usr/sbin' arping_cmd = 'arping -U $_IP_$ -w 1' clear_memqcache_on_escalation = on wd_escalation_command = '' wd_lifecheck_method = 'heartbeat' wd_interval = 10 wd_heartbeat_port = 9694 wd_heartbeat_keepalive = 2 wd_heartbeat_deadtime = 30 heartbeat_destination0 = 'db2' heartbeat_destination_port0 = 9694 heartbeat_device0 = 'eth0' wd_life_point = 3 wd_lifecheck_query = 'SELECT 1' wd_lifecheck_dbname = 'template1' wd_lifecheck_user = 'nobody' wd_lifecheck_password = '' other_pgpool_hostname0 = 'db2' other_pgpool_port0 = 9999 other_wd_port0 = 9000 relcache_expire = 0 relcache_size = 256 check_temp_table = on memory_cache_enabled = off memqcache_method = 'shmem' memqcache_memcached_host = 'localhost' memqcache_memcached_port = 11211 memqcache_total_size = 67108864 memqcache_max_num_cache = 1000000 memqcache_expire = 0 memqcache_auto_cache_invalidation = on memqcache_maxcache = 409600 memqcache_cache_block_size = 1048576 memqcache_oiddir = '/var/log/pgpool/oiddir' white_memqcache_table_list = '' black_memqcache_table_list = ''
备节点的 pgpool.conf
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
[pg93@db2 etc]$ grep ^[a-z] pgpool.conf listen_addresses = '*' port = 9999 socket_dir = '/tmp' pcp_port = 9898 pcp_socket_dir = '/tmp' backend_hostname0 = 'db1' backend_port0 = 1921 backend_weight0 = 1 backend_flag0 = 'ALLOW_TO_FAILOVER' backend_hostname1 = 'db2' backend_port1 = 1921 backend_weight1 = 1 backend_flag1 = 'ALLOW_TO_FAILOVER' enable_pool_hba = on pool_passwd = 'pool_passwd' authentication_timeout = 60 ssl = off num_init_children = 32 max_pool = 4 child_life_time = 300 child_max_connections = 0 connection_life_time = 0 client_idle_limit = 0 log_destination = 'syslog' print_timestamp = on log_connections = on log_hostname = on log_statement = on log_per_node_statement = off log_standby_delay = 'none' syslog_facility = 'LOCAL0' syslog_ident = 'pgpool' debug_level = 0 pid_file_name = '/opt/pgpool/pgpool.pid' logdir = '/tmp' connection_cache = on reset_query_list = 'ABORT; DISCARD ALL' replication_mode = off replicate_select = off insert_lock = on lobj_lock_table = '' replication_stop_on_mismatch = off failover_if_affected_tuples_mismatch = off load_balance_mode = on ignore_leading_white_space = on white_function_list = '' black_function_list = 'nextval,setval' master_slave_mode = on master_slave_sub_mode = 'stream' sr_check_period = 0 sr_check_user = 'repuser' sr_check_password = 'rep123us345er' delay_threshold = 16000 follow_master_command = '' parallel_mode = off pgpool2_hostname = '' system_db_hostname = 'localhost' system_db_port = 5432 system_db_dbname = 'pgpool' system_db_schema = 'pgpool_catalog' system_db_user = 'pgpool' system_db_password = '' health_check_period = 0 health_check_timeout = 20 health_check_user = 'nobody' health_check_password = '' health_check_max_retries = 0 health_check_retry_delay = 1 failover_command = '/opt/pgpool/failover_stream.sh %H ' failback_command = '' fail_over_on_backend_error = on search_primary_node_timeout = 10 recovery_user = 'nobody' recovery_password = '' recovery_1st_stage_command = '' recovery_2nd_stage_command = '' recovery_timeout = 90 client_idle_limit_in_recovery = 0 use_watchdog = on trusted_servers = '' ping_path = '/bin' wd_hostname = 'db2' wd_port = 9000 wd_authkey = '' delegate_IP = '192.168.2.200' ifconfig_path = '/sbin' if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0' if_down_cmd = 'ifconfig eth0:0 down' arping_path = '/usr/sbin' arping_cmd = 'arping -U $_IP_$ -w 1' clear_memqcache_on_escalation = on wd_escalation_command = '' wd_lifecheck_method = 'heartbeat' wd_interval = 10 wd_heartbeat_port = 9694 wd_heartbeat_keepalive = 2 wd_heartbeat_deadtime = 30 heartbeat_destination0 = 'db1' heartbeat_destination_port0 = 9694 heartbeat_device0 = 'eth0' wd_life_point = 3 wd_lifecheck_query = 'SELECT 1' wd_lifecheck_dbname = 'template1' wd_lifecheck_user = 'nobody' wd_lifecheck_password = '' other_pgpool_hostname0 = 'db1' other_pgpool_port0 = 9999 other_wd_port0 = 9000 relcache_expire = 0 relcache_size = 256 check_temp_table = on memory_cache_enabled = off memqcache_method = 'shmem' memqcache_memcached_host = 'localhost' memqcache_memcached_port = 11211 memqcache_total_size = 67108864 memqcache_max_num_cache = 1000000 memqcache_expire = 0 memqcache_auto_cache_invalidation = on memqcache_maxcache = 409600 memqcache_cache_block_size = 1048576 memqcache_oiddir = '/var/log/pgpool/oiddir' white_memqcache_table_list = '' black_memqcache_table_list = ''
/opt/pgpool/failover_stream.sh 脚本内容
1 2 3 4 5 6 7 8 9 10 11 12
[pg93@db1 etc]$ cat /opt/pgpool/failover_stream.sh #! /bin/sh new_master=$1 trigger_command="$PGHOME /bin/pg_ctl promote -D $PGDATA " /usr/bin/ssh -T $new_master $trigger_command exit 0;
备注: 我这里定义的 failover 脚本和 pgpool 手册上的脚本不同,这里使用了 pg_ctl promote
的切换方式,一方面以文件触发的形式个人觉得不是很好。另一方面:当以 trigger file 形式实现 HA 时会遇到不能来回切换的问题。
如果仔细看,可以看到这个切换脚本并不严谨,每当有节点离线时,它都会触发一次,也就是说如果当前掉线的是备节点,它也会到对端主库执行一次 failover_command 命令,不过没关系,并不影响。
启动 pgpool
备注: 此时可以查看 /var/log/pgpool.log 日志了,注意两节点都启动。
pgpool 关闭命令
1
[pg93@db2 etc]$ pgpool -m fast stop
pgpool reload 命令
1
[pg93@db1 etc]$ pgpool reload
关于连接异常问题 备注: 这个表格很好地描述了各种连接异常问题,对照表格很容易找到原因。
在本机 ubuntu 查看 pgpool 状态
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
francs@francs:~$ lsb_release -a No LSB modules are available. Distributor ID:Ubuntu Description:Ubuntu 14.04.1 LTS Release:14.04 Codename: trusty francs@francs:~$ psql -h 192.168.2.200 -p 9999 postgres postgres Password for user postgres: psql (9.3.5, server 9.3.3) Type "help" for help. postgres =# show pool_nodes; node_id | hostname | port | status | lb_weight | role ---------+----------+------+--------+-----------+--------- 0 | db1 | 1921 | 2 | 0.500000 | standby 1 | db2 | 1921 | 2 | 0.500000 | primary
备注: db2 为 primary, db1 为 standby。 status 为 2 ,表示正常连接; 关于 status 状态,如下
0 – This state is only used during the initialization. PCP will never display it.
1 – Node is up. No connections yet.
2 – Node is up. Connections are pooled.
3 – Node is down.
HA 验证
根据上面 show_nodes 输出, 此时 db2 节点为主库, db1 节点为备库,并且 pgpool 目前跑在 db1 上,下面分别测试以下情形:
关 db2 上的数据库
db2 关闭数据库
1 2 3
[pg93@db2 etc]$ pg_ctl stop -m fast waiting for server to shut down.. .. .. . done server stopped
db1 上测试
1 2 3 4 5 6
[pg93@db 1 etc]$ pg_controldata | grep cluster Database cluster state: in production [pg93@db 1 etc]$ [pg93@db 1 etc]$ ll /database/ pg93/pg_root/ recovery.done -rw-r--r-- 1 pg93 pg93 4.8 K Oct 2 09 :05 /database/ pg93/pg_root/ recovery.done
备注:可以看到db1 节点已完成从 standby 角色切换到 primary ,并且 $PGDATA/recovery.conf 文件变成 recovery.done 。
查看 pgpool 状态
1 2 3 4 5 6 7 8 9 10 11
francs@francs:~$ psql -h 192.168.2.200 -p 9999 postgres postgres Password for user postgres: psql (9.3.5, server 9.3.3) Type "help" for help. postgres =# show pool_nodes; node_id | hostname | port | status | lb_weight | role ---------+----------+------+--------+-----------+--------- 0 | db1 | 1921 | 2 | 0.500000 | primary 1 | db2 | 1921 | 3 | 0.500000 | standby (2 rows)
备注:db1 已转换成primary 角色。 db2 状态为 3 ,表示 down 状态。
接下来以 standby 身份启动 db2
1 2
pg93@db2 pg_root]$ mv recovery.done recovery.conf [pg93@db2 pg_root]$ pg_ctl start
db1 上操作,添加 db2 节点信息
1 2 3 4 5 6
[pg93@db1 etc]$ pcp_attach_node -d 5 db2 9898 pgpool pgpool 1 DEBUG: send: tos ="R" , len =44 DEBUG: recv: tos ="r" , len =21, data =AuthenticationOK DEBUG: send: tos ="D" , len =6 DEBUG: recv: tos ="c" , len =20, data =CommandComplete DEBUG: send: tos ="X" , len =4
db2 状态恢复
1 2 3 4 5 6 7 8 9 10 11
[pg93@db1 etc]$ psql -p 9999 -U postgres Password for user postgres: psql (9.3.3) Type "help" for help. postgres =# show pool_nodes; node_id | hostname | port | status | lb_weight | role ---------+----------+------+--------+-----------+--------- 0 | db1 | 1921 | 2 | 0.500000 | primary 1 | db2 | 1921 | 2 | 0.500000 | standby (2 rows)
备注:切换成功
关 pgpool master
pgpool VIP :192.168.2.200 从 db1 飘到 db2,注意此时仅 pgpool 的 VIP 飘移, db1,db2 上的数据库角色不变, pgpool 的切换非常容易,关闭 pgpool 的脚本如下:
1
[pg93@db2 etc]$ pgpool -m fast stop
db1 掉电 power off
这里通过 vmwaer 执行 power off 模拟断电的情况, 在 db1(现在的 Primary 节点) 执行 power off ,发现 pgpool VIP 能切换到 db1 ,同时 db1 上的数据库切换成 primary ,切换成功。
第二种,第三种情况就不贴详细日志了。
参考
原创文章,作者:ItWorker,如若转载,请注明出处:https://blog.ytso.com/tech/database/239586.html