PostgreSQL: Oldest xmin is far in the past 处理一例

快过年了,今天对数据库进行健康检查,发现海外一数据库中出现大量以下日志,而且已经报了十几天,CSV 日志如下:

数据库 CSV 日志

1
2
3
4
2012-01-18 15:22:39.098 CST,,,8871,,4f16733e.22a7,1,,2012-01-18 15:22:38 CST,81/42114625,0,WARNING,01000,"oldest xmin is far in the past",,"Close open transactions soon to avoid wraparound problems.",,,,,,  
2012-01-18 15:22:39.109 CST,,,8871,,4f16733e.22a7,2,,2012-01-18 15:22:38 CST,81/42114627,0,WARNING,01000,"oldest xmin is far in the past",,"Close open transactions soon to avoid wraparound problems.",,,,,,
2012-01-18 15:22:39.119 CST,,,8871,,4f16733e.22a7,3,,2012-01-18 15:22:38 CST,81/42114629,0,WARNING,01000,"oldest xmin is far in the past",,"Close open transactions soon to avoid wraparound problems.",,,,,,
2012-01-18 15:22:39.119 CST,,,8871,,4f16733e.22a7,4,,2012-01-18 15:22:38 CST,81/42114631,0,WARNING,01000,"oldest xmin is far in the past",,"Close open transactions soon to avoid wraparound problems.",,,,,,

备注:根据提示信息, 猜测有尚未提交的事务,导致 autovacuum 进程不能 vacuum。

查询数据库对像年龄

1
2
3
4
5
6
7
8
9
10
11
12
13
14
postgres=# select age(relfrozenxid) from pg_class where relkind='r' order by 1 desc limit 10;  
age
-----------
341105277
341105277
341105277
341105277
341105277
341105277
341105277
315987817
315987814
315987811
(10 rows)

备注:数据库对像年龄已经达到 3 亿。

发现异常进程

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
postgres=# select procpid, datname,usename,query_start from pg_stat_activity where current_query !='<IDLE>'  
order by query_start;
procpid | datname | usename | query_start
---------+------------+----------+-------------------------------
27694 | db_oversea | oversea | 2011-09-12 07:35:56.615883+08
5019 | db_oversea | oversea | 2012-01-18 15:40:47.063949+08
5927 | db_oversea | oversea | 2012-01-18 15:40:47.290211+08
1552 | db_oversea | oversea | 2012-01-18 15:40:47.32765+08
8227 | db_oversea | oversea | 2012-01-18 15:40:47.381145+08
7758 | db_oversea | oversea | 2012-01-18 15:40:47.391125+08
7759 | db_oversea | oversea | 2012-01-18 15:40:47.45002+08
8094 | db_oversea | oversea | 2012-01-18 15:40:47.812721+08
4804 | db_oversea | oversea | 2012-01-18 15:40:47.821056+08
6689 | db_oversea | oversea | 2012-01-18 15:40:47.90369+08
8506 | db_oversea | oversea | 2012-01-18 15:40:47.915079+08
9039 | postgres | postgres | 2012-01-18 15:43:47.834318+08
(12 rows)

postgres=# select procpid,datname,usename,current_query,query_start from pg_stat_activity where procpid=27694;
procpid | datname | usename | current_query | query_start
---------+------------+---------+----------------------------------------------------+-------------------------------
27694 | skytf | skytf | SELECT | 2011-09-12 07:35:56.615883+08
: mrp.*,res.name as displayname
: FROM
: skytf.tbl_app as mrp,tbl_app_resource as res
: where mrp.appid=res.appid and
: mrp.status='1' and res.status='1'
: order by id desc

备注:上面发现进程 27694 是去年的,至今仍未提交,经和应用人员确认后,可以 kill。

Kill 进程 27694

1
2
3
4
5
postgres=# select pg_terminate_backend(27694);  
pg_terminate_backend
----------------------
t
(1 row)

发现 Autovacuum 进程

1
2
3
4
5
6
7
8
9
10
[postgres@skytf-db](mailto:postgres@skytf-db)-> ps -ef | grep auto  
root 3523 1 0 2010 ? 00:00:01 automount
postgres 4002 3949 0 2010 ? 00:10:54 postgres: autovacuum launcher process
postgres 10001 3949 0 16:16 ? 00:00:00 postgres: autovacuum worker process skytf
postgres 10002 3949 0 16:16 ? 00:00:00 postgres: autovacuum worker process skytf
postgres 10003 3949 0 16:16 ? 00:00:00 postgres: autovacuum worker process skytf
postgres 10004 3949 0 16:16 ? 00:00:00 postgres: autovacuum worker process skytf
postgres 10005 3949 0 16:16 ? 00:00:00 postgres: autovacuum worker process skytf
postgres 10006 3949 0 16:16 ? 00:00:00 postgres: autovacuum worker process skytf
postgres 10007 3949 0 16:16 ? 00:00:00 postgres: autovacuum worker process skytf

进程 kill 后再次查看年龄

1
2
3
4
5
6
7
8
9
10
11
12
13
14
postgres=# select age(relfrozenxid) from pg_class where relkind='r' order by 1 desc limit 10;  
age
----------
50003395
50003393
50003382
50003381
50003381
50003381
50003381
50003374
50003374
50003374
(10 rows)

备注:进程 kill 后再次查看年龄,数据库对像年龄已经循环,当前最大值为 50003395,并且日志也不再报错。

参考

http://www.postgresql.org/docs/9.0/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

原创文章,作者:6024010,如若转载,请注明出处:https://blog.ytso.com/237819.html

(0)
上一篇 2022年1月29日
下一篇 2022年1月29日

相关推荐

发表回复

登录后才能评论