PG Hot Standby 从库宕机案例

今天在一台主机上创建数据库时犯了一个错误,导致 standby 库 PG Server 宕掉,具体如下。

背景

今天业务需求,需要在原有主机上新增一个业务库,并分配新的表空间。已经有的主机上已经做了 PostgreSQL Hot standby 高可用。当时由于疏忽,忘记了重要的操作,下面是具体过程。

主库上创始建用户

1
2
CREATE ROLE mydb LOGIN ENCRYPTED PASSWORD '*'  
nosuperuser noinherit nocreatedb nocreaterole ;

主库上创建表空间

1
2
mkdir -p /database/pgdata/pg_tbs/tbs_mydb  
create tablespace tbs_mydb owner mydb LOCATION '/database/pgdata/pg_tbs/tbs_mydb';

主库上创建数据库

1
2
3
4
5
CREATE DATABASE mydb  
WITH OWNER = mydb
TEMPLATE = template0
ENCODING = 'UTF8'
TABLESPACE = tbs_mydb;

从库上验证

在主库上创建好数据库后, 准备到从库上观察新建的库是否已经同步到备库。此时,发现PostgreSQL已经宕机了,当时还以为是谁停掉的呢,接下来查看日志。

从库CSV日志

1
2
3
2011-02-23 11:10:25.909 CST,,,6591,,4d35195a.19bf,12,,2011-01-18 12:38:50 CST,1/0,0,FATAL,58P01,"directory ""/database/pgdata/pg_tbs/tbs_mydb"" does not exist",,,,,"xlog redo create ts: 3046144 ""/database/pgdata/pg_tbs/tbs_mydb""",,,,""  
2011-02-23 11:10:26.050 CST,,,6589,,4d351959.19bd,2,,2011-01-18 12:38:49 CST,,0,LOG,00000,"startup process (PID 6591) exited with exit code 1",,,,,,,,,""
2011-02-23 11:10:26.050 CST,,,6589,,4d351959.19bd,3,,2011-01-18 12:38:49 CST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,""

重要信息/database/pgdata/pg_tbs/tbs_mydb, 原来是自己在主库上创建表空间目录时,忘记在备机上创建相同的目录了,这么重要的一点居然忘记,汗,大意了, 想到库刚停不久,新建好这个目录,并启动备库,备库应该能赶得上主库。

解决方法

在从库上创建目录 /database/pgdata/pg_tbs/tbs_mydb, 并启动 PG Server。

再次查看 CSV日志

1
2
3
4
5
6
7
8
9
10
11
12
13
2011-02-23 11:38:09.981 CST,,,15156,,4d648121.3b34,1,,2011-02-23 11:38:09 CST,,0,LOG,00000,"database system was interrupted while in recovery at log time 2011-02-23 10:59:21 CST",,"If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.",,,,,,,""  
2011-02-23 11:38:09.981 CST,,,15156,,4d648121.3b34,2,,2011-02-23 11:38:09 CST,,0,LOG,00000,"could not open tablespace directory ""pg_tblspc/16386/PG_9.0_201004261"": No such file or directory",,,,,,,,,""
2011-02-23 11:38:09.981 CST,,,15156,,4d648121.3b34,3,,2011-02-23 11:38:09 CST,,0,LOG,00000,"could not open tablespace directory ""pg_tblspc/16394/PG_9.0_201004261"": No such file or directory",,,,,,,,,""
2011-02-23 11:38:09.982 CST,,,15156,,4d648121.3b34,4,,2011-02-23 11:38:09 CST,,0,LOG,00000,"could not open tablespace directory ""pg_tblspc/19507/PG_9.0_201004261"": No such file or directory",,,,,,,,,""
2011-02-23 11:38:09.982 CST,,,15156,,4d648121.3b34,5,,2011-02-23 11:38:09 CST,,0,LOG,00000,"could not open tablespace directory ""pg_tblspc/16392/PG_9.0_201004261"": No such file or directory",,,,,,,,,""
2011-02-23 11:38:09.982 CST,,,15156,,4d648121.3b34,6,,2011-02-23 11:38:09 CST,,0,LOG,00000,"could not open tablespace directory ""pg_tblspc/16396/PG_9.0_201004261"": No such file or directory",,,,,,,,,""
2011-02-23 11:38:09.982 CST,,,15156,,4d648121.3b34,7,,2011-02-23 11:38:09 CST,,0,LOG,00000,"could not open tablespace directory ""pg_tblspc/17048/PG_9.0_201004261"": No such file or directory",,,,,,,,,""
2011-02-23 11:38:09.982 CST,,,15156,,4d648121.3b34,8,,2011-02-23 11:38:09 CST,,0,LOG,00000,"entering standby mode",,,,,,,,,""
2011-02-23 11:38:10.054 CST,,,15156,,4d648121.3b34,9,,2011-02-23 11:38:09 CST,1/0,0,LOG,00000,"redo starts at 60/ADC26678",,,,,,,,,""
2011-02-23 11:38:10.347 CST,,,15156,,4d648121.3b34,10,,2011-02-23 11:38:09 CST,1/0,0,LOG,00000,"consistent recovery state reached at 60/AF50F140",,,,,,,,,""
2011-02-23 11:38:10.347 CST,,,15154,,4d648121.3b32,1,,2011-02-23 11:38:09 CST,,0,LOG,00000,"database system is ready to accept read only connections",,,,,,,,,""
2011-02-23 11:38:10.387 CST,,,15156,,4d648121.3b34,11,,2011-02-23 11:38:09 CST,1/0,0,LOG,00000,"invalid record length at 60/AF51E888",,,,,,,,,""
2011-02-23 11:38:10.393 CST,,,15160,,4d648122.3b38,1,,2011-02-23 11:38:10 CST,,0,LOG,00000,"streaming replication successfully connected to primary",,,,,,,,,""

信息streaming replication successfully connected to primary说明库已经恢复正常。

这次还好是备库菪机,对业务没有影响,而且发现及时,如果发现晚了,从库可能要重做,好险。生产库上操作一定要谨慎,谨慎,再谨慎。

查看同步情况

1
2
3
4
5
6
7
8
9
postgres=# /l  
List of databases
Name | Owner | Encoding | Collation | Ctype | Access privileges
--------------+--------------+----------+-----------+-------+-----------------------------
postgres | postgres | UTF8 | C | C |
template0 | postgres | UTF8 | C | C | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | C | C | =c/postgres +
mydb | mydb | UTF8 | C | C |

总结

在PG主从模式环境下,新增业务库的方法如下

  1. 在主库上创建用户。
  2. 在主库主机上创建表空间目录。
  3. 在从库主机上创建与主库上相同表空间目录。( 非常重要 )
  4. 在主库上创建表空间
  5. 在主库上创建数据库。

原创文章,作者:bd101bd101,如若转载,请注明出处:https://blog.ytso.com/236367.html

(0)
上一篇 2022年1月24日
下一篇 2022年1月24日

相关推荐

发表回复

登录后才能评论