之前已经写了篇 Blog 简单介绍了 PostgreSQL9.2 新增的 RANGE 数据类型使用，链接如 https://postgres.fun/20120524163556.html , 今天 RANGE 类型的使用情况源于一开发人员咨询 IP 地址库查询的 SQL 效率问题，后来转换成 inet 网络地址类型后，效率提升并不明显，之后经过德哥建议，使用 RANGE 类型，效率提升迅速，下面是具体过程：起初，开发人员咨询的 SQL 如下

优化前的 SQL

francs=> select version();  
 version  
---------------------------------------------------------------------------------------------------------------  
PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-51), 64-bit  
(1 row)
francs=> explain analyze SELECT PROVINCE,CITY  
francs-> FROM ip_address_range  
francs-> WHERE start_ip <= 3708713472  
francs-> and end_ip>=3708713472 LIMIT 1 ;  
 QUERY PLAN  

-----------------------------------------------------------------------------------------------------------------------------------  
Limit (cost=0.00..0.32 rows=1 width=20) (actual time=3.138..3.138 rows=1 loops=1)  
 -> Index Scan using idx_start_end_ip on ip_address_range (cost=0.00..908.16 rows=2856 width=20) (actual time=3.136..3.136 rows=1 loops=1)  
 Index Cond: ((start_ip <= 3708713472::bigint) AND (end_ip >= 3708713472::bigint))  
Total runtime: 3.168 ms  
(4 rows)

备注：开发环境 PostgreSQL 9.1.3，执行时间为 3 ms 左右，其实这个速度也不算非常慢，当然我们希望能够将它优化，当压力测试时力求达到更多的 tps。

表信息

表结构

francs=> /d ip_address_range  
 Table "francs.ip_address_range"  
 Column | Type | Modifiers  
-----------+-------------------+-----------  
id | numeric | not null  
start_ip | bigint |  
end_ip | bigint |  
province | character varying |  
city | character varying |  
isp | character varying |  
start_ip1 | character varying |  
end_ip1 | character varying |  
Indexes:  
 "ip_address_range_pkey" PRIMARY KEY, btree (id)  
 "idx_start_end_ip" btree (start_ip, end_ip)

表数据

francs=> select * from ip_address_range limit 1;  
id | start_ip | end_ip | province | city | isp | start_ip1 | end_ip1  
----+------------+------------+----------+--------+------+--------------+----------------  
 1 | 3708713472 | 3708715007 | 河南省 | 信阳市 | 联通 | 221.14.122.0 | 221.14.127.255  
(1 row)

francs=> select count(*) from ip_address_range;  
count  
-------  
32807  
(1 row)

备注：最初的想法是将 start_ip,end_ip 字段转换成一个 inet 字段，然后通过操作符 >>= 进行判断，遗憾的是，进行一番折腾，操作符 >>= 不走索引，在德哥的提示下，建议使用 Range 类型。而 RANGE 是 PostgreSQL 9.2 的新特性，于是准备将这个库导到 9.2 版本环境下测试。数据库

9.2 版本测试

增加 Range 字段

francs=> select version();  
 version  
------------------------------------------------------------------------------------------------------------------  
PostgreSQL 9.2beta4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-51), 64-bit  
(1 row)

francs=> alter table ip_address_range add column ip_range int8range;ALTER TABLE

francs=> update ip_address_range set ip_range= int8range(start_ip,end_ip,'(]');  
UPDATE 32807

francs=> select * From ip_address_range limit 1;  
id | start_ip | end_ip | province | city | isp | start_ip1 | end_ip1 | ip_range  
----+------------+------------+----------+--------+------+--------------+----------------+-------------------------  
 1 | 3708713472 | 3708715007 | 河南省 | 信阳市 | 联通 | 221.14.122.0 | 221.14.127.255 | [3708713473,3708715008)  
(1 row)

备注：在表 ip_address_range 上增加一个 int8range 字段。

创建GIST索引

1 2	francs=> create index idx_ip_address_range_ip_range on ip_address_range using gist ( ip_range); CREATE INDEX

备注： GIST 索引可以用于操作符 =, &&, <@, @>, <<, >>, -|-, &<, and &>等，加速查询。

性能测试1

francs=> explain analyze SELECT PROVINCE,CITY  
francs-> FROM ip_address_range  
francs-> WHERE start_ip <= 3708713472  
francs-> and end_ip>=3708713472 LIMIT 1 ;  
 QUERY PLAN  
-----------------------------------------------------------------------------------------------------------------  
Limit (cost=0.00..0.53 rows=1 width=19) (actual time=0.559..0.559 rows=1 loops=1)  
 -> Seq Scan on ip_address_range (cost=0.00..1481.11 rows=2779 width=19) (actual time=0.558..0.558 rows=1 loops=1)  
 Filter: ((start_ip <= 3708713472::bigint) AND (end_ip >= 3708713472::bigint))  
Total runtime: 0.591 ms  
(4 rows)

备注：迁移到 9.2 版本后，走的是 Seq Scan，但执行时间由9.1 版本的3ms 左右降低到了 0.59 ms 左右，这在一定程度上验证了 9.2 版本查询性能提升的说法( 9.2 版本 release note 说明中有描述)，当然这不是今天的重点。

性能测试2

francs=> explain analyze select * From ip_address_range where ip_range @> 3708713472::int8;  
 QUERY PLAN  
  
-------------------------------------------------------------------------------------------------------------------------------------  
Index Scan using idx_ip_address_range_ip_range on ip_address_range (cost=0.00..34.85 rows=33 width=106) (actual time=0.055..0.056 rows=1 loops=1)  
 Index Cond: (ip_range @> 3708713472::bigint)  
Total runtime: 0.085 ms  
(3 rows)

备注：性能测试1 根据的是起始IP查询，性能测试2 根据 range 字段查询，执行时间在 0.085 ms 左右，这个时间比性能测试1 提高 6 倍左右。

总结

本文中的整型字段 IP，是经过了一种的算法将普通 IP 转换成的，本文略。
RANGE 类型对 IP 地址定位的应用效率非常高，我们最终将IP 地址定位城市的 SQL 从原来的 3 ms 左右优化到 0.08 ms 左右。

参考

http://www.postgresql.org/docs/9.2/static/rangetypes.html
http://www.postgresql.org/docs/9.2/static/functions-range.html#RANGE-OPERATORS-TABLE

原创文章，作者：kepupublish，如若转载，请注明出处：https://blog.ytso.com/237897.html

PostgreSQL9.2：Range Data Type Using Case

优化前的 SQL

表信息

9.2 版本测试

创建GIST索引

性能测试1

性能测试2

总结

参考

发表回复

PostgreSQL9.2：Range Data Type Using Case

优化前的 SQL

表信息

9.2 版本测试

创建GIST索引

性能测试1

性能测试2

总结

参考

相关推荐

发表回复