PostgreSQL : 慎用 db_link

今天由于业务需要,需要在生产库上创建一个 db_link, 每天取源端库一张表一天数据,表大小为2G左右,考虑到这个数据每天只取一次,而且只取一天数据,于是准备在库上创建 db_link, 但在测试过程中发现性能问题。 在创建完 db_link 后,在库上执行 select * from view_xxx limit 1 时,发现十分钟都没有执行完, 而且数据库负载很高,达到 20, 同时 IO 等侍也很高,达到 20, 下面在测试环境下模拟下这种情形。

环境准备

主机: 8核 8G
源库 mpc_db_statistic
源库表 mpc_log_register(表大小: 193 M)
测试库 skytf
视图 view_mpc_log_register

准备在数据库 skytf 创建视图 view_mpc_log_register 取源库 mpc_db_statistic 的数据 mpc_log_register。

mpc_log_register 表结构

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
Table "mpc_db_statistic.mpc_log_register"  
Column | Type | Modifiers
---------------+--------------------------------+----------------------------------------------------------------------
record_id | integer | not null default nextval('mpc_log_register_record_id_seq'::regclass)
client_ip | character varying(64) | not null
client_type | integer | default 0
reg_time | timestamp(0) without time zone |
reg_account | character varying(32) |
reg_userid | integer | default 0
update_time | timestamp(0) without time zone | default now()
imsi | character varying |
imei | character varying |
manufacturers | character varying |
model | character varying |
appid | integer |
scr_width | integer |
scr_height | integer |
sms_center | character varying |
mem_size | integer |
access_ip | character varying(32) |
Indexes:
"mpc_log_register_pkey" PRIMARY KEY, btree (record_id)
"log_register_client_id_index" btree (client_ip)

创建视图

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
CREATE OR REPLACE VIEW view_mpc_log_register AS  
SELECT *
FROM dblink('dbname=mpc_db_statistic host=localhost port=1921 user=mpc_db_statistic password=mpc_db_statistic' ,
'select * from mpc_log_register')
as t1 (
record_id integer ,
client_ip character varying(64) ,
client_type integer ,
reg_time timestamp(0) without time zone ,
reg_account character varying(32) ,
reg_userid integer ,
update_time timestamp(0) without time zone ,
imsi character varying ,
imei character varying ,
manufacturers character varying ,
model character varying ,
appid integer ,
scr_width integer ,
scr_height integer ,
sms_center character varying ,
mem_size integer ,
access_ip character varying(32)
);

配置 pg_hba.conf

1
host mpc_db_statistic mpc_db_statistic 127.0.0.1/32 md5

备注:这行需要加到 pg_hba.conf ;另外 # IPv4 local connections: 部分第一行,如果不加这行,那么在查询视图时可能会报以下ERROR:

1
2
3
ERROR: password is required  
DETAIL: Non-superuser cannot connect if the server does not request a password.
HINT: Target servers authentication method must be changed.

查询视图测试

1
2
3
4
5
6
7
skytf=> select record_id,client_ip from view_mpc_log_register limit 1;  
NOTICE: identifier "dbname=mpc_db_statistic host=localhost port=1921 user=mpc_db_statistic password=mpc_db_statistic" will be truncated to "dbname=mpc_db_statistic host=localhost port=1921 user=mpc_db_st"
record_id | client_ip
-----------+---------------------
235215 | 117.136.25.45:30119
(1 row)
Time: 16697.482 ms

查询当前进程

1
2
3
4
5
6
postgres=# select datname,procpid,current_query from pg_stat_activity where current_query !='<IDLE>';  
datname | procpid | current_query
------------------+---------+--------------------------------------------------------------------------------------------
skytf | 29265 | select record_id,client_ip from view_mpc_log_register where record_id=1 limit 1;
postgres | 29853 | select datname,procpid,current_query from pg_stat_activity where current_query !='<IDLE>';
mpc_db_statistic | 29946 | select * from mpc_log_register

备注:一开始查询,可以看到有两个进程,一个是 skytf 库查询视图进程,另一个是 mpc_db_statistic库的全表扫描进程, 奇怪了,进程 29265 明明是根据索引字段 record_id 来查询的,为什么到了目标库就是全表扫描 select * from mpc_log_register

继续查询当前进程

1
2
3
4
5
6
postgres=# select datname,procpid,current_query from pg_stat_activity where current_query !='<IDLE>';  
datname | procpid | current_query
----------+---------+--------------------------------------------------------------------------------------------
skytf | 29265 | select record_id,client_ip from view_mpc_log_register where record_id=1 limit 1;
postgres | 29853 | select datname,procpid,current_query from pg_stat_activity where current_query !='<IDLE>';
(2 rows)

备注:过一会儿,发现目标库进程 29946 执行完毕,此时还有进程 29265 在跑。

查看内存使用

SQL执行过程中内存使用情况

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
[root@ ~]# free -m  
total used free shared buffers cached
Mem: 16053 656 15396 0 8 501
-/+ buffers/cache: 146 15906
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 792 15260 0 8 501
-/+ buffers/cache: 282 15771
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 849 15203 0 8 501
-/+ buffers/cache: 339 15714
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1012 15041 0 8 501
-/+ buffers/cache: 501 15551
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1126 14926 0 8 501
-/+ buffers/cache: 616 15436
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1166 14886 0 8 501
-/+ buffers/cache: 656 15396
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1209 14843 0 8 501
-/+ buffers/cache: 699 15354
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1318 14734 0 8 514
-/+ buffers/cache: 795 15257
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1495 14558 0 8 546
-/+ buffers/cache: 940 15113
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1585 14467 0 8 563
-/+ buffers/cache: 1014 15039
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1672 14380 0 8 578
-/+ buffers/cache: 1085 14968
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1762 14290 0 8 595
-/+ buffers/cache: 1158 14895
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1858 14194 0 8 614
-/+ buffers/cache: 1235 14817
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 1952 14100 0 8 632
-/+ buffers/cache: 1311 14741
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 2044 14008 0 8 650
-/+ buffers/cache: 1385 14668
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 2140 13912 0 8 669
-/+ buffers/cache: 1462 14590
Swap: 16386 31 16355
[root@ ~]# free -m
total used free shared buffers cached
Mem: 16053 658 15394 0 8 501
-/+ buffers/cache: 148 15904
Swap: 16386 31 16355

备注:发现内存 used 值开始由 656M 一直增加到 2140 M, 当 SQL执行完后,又再次回到 658 M, 推测SQL通过 db_link 查询远端数据时,会将远端表先取到本地库内存里,这个步骤非常耗时,耗IO,同时本地库负载也比异常高。

结果出来了

1
2
3
4
5
6
7
skytf=> select record_id,client_ip from view_mpc_log_register limit 1;  
NOTICE: identifier "dbname=mpc_db_statistic host=localhost port=1921 user=mpc_db_statistic password=mpc_db_statistic" will be truncated to "dbname=mpc_db_statistic host=localhost port=1921 user=mpc_db_st"
record_id | client_ip
-----------+---------------------
235215 | 117.136.25.45:30119
(1 row)
Time: 16697.482 ms

备注:这张表不太大,只有193M,都花了 16 秒左右。

执行计划

1
2
3
4
5
6
7
skytf=> explain select record_id,client_ip from view_mpc_log_register where record_id=1 limit 1;  
QUERY PLAN
-----------------------------------------------------------------------
Limit (cost=0.00..2.50 rows=1 width=150)
-> Function Scan on dblink t1 (cost=0.00..12.50 rows=5 width=150)
Filter: (record_id = 1)
(3 rows)

备注: 走的是函数索引。

大表测试

由于前面源表大小才 193M,效果不明显,后来拿了张2G大小的表做实验,通过视图取一条数据,后来数据库负载直接升到40左右,连数据库主机都 SSH 不上了。汗, PostgreSQL 的 db_link 得慎用。

总结

  1. PostgreSQL 的 db_link 要慎用,没有特殊的需求不要使用,如果业务需要可以通过其它方式代替;
  2. PostgreSQL 的 db_link 在使用过程中会先把对端表数据全表扫描到本地库,尽管查询视图时用到了索引字段,但依然不会走索引扫描,而是走 “Function Scan” ,从而带来严重的性能问题 (负载高,IO高)。

原创文章,作者:254126420,如若转载,请注明出处:https://blog.ytso.com/236434.html

(0)
上一篇 2022年1月24日
下一篇 2022年1月24日

相关推荐

发表回复

登录后才能评论