浅谈 PostgreSQL 的柱状图

和Oracle 一样,PostgreSQL 的表也有柱状图, 用于保存表的列的统计信息。可以通过查询系统视图 pg_stats.histogram_bounds 来查看列的柱状图。

官网关于柱状图的描述

Histogram_bounds

A list of values that divide the column is values into groups of approximately
equal population. The values in most_common_vals, if present, are omitted from this
histogram calculation. (This column is NULL if the column data type does not have
a < operator or if the most_common_vals list accounts for the entire population.)

上面这段英文对英文基础不好的人来说,可能有点费力;偶读了半天,也似懂非懂,但第一句话提供了重要信息, 说柱状图是指一系列的值将表的列值分成近似相同的组;这句话可能翻译得不太好,等下通过实验来了解一下,就明白了。histogram_bounds 的值受参数 default_statistics_target 控制。default_statistics_target 默认值为100,default_statistics_target参数值越大,那么PG搜集的列上的统计信息就越精确,当然在表做 Analyze 操作时花费的时间也稍长些。

参数 default_statistics_target 的描述

default_statistics_target (integer)
Sets the default statistics target for table columns without a column-specific
target set via ALTER TABLE SET STATISTICS. Larger values increase the time needed to
do ANALYZE, butmight improve the quality of the planner is estimates. The default is

  1. For moreinformation on the use of statistics by the PostgreSQL query planner,
    refer to Section 14.2.

场景一: attstattarget 1000 时的场景

pg_attribute.attstattarget 是指列上统计信息搜集指标,默认值为系统 default_statistics_target ,当attstattarget 值为 -1 时, 表示使用系统的 “default_statistics_target”值,关于这个参数详见官网文档。

查看默认参数值

1
2
3
4
5
skytf=> show default_statistics_target;  
default_statistics_target
---------------------------
1000
(1 row)

查看表 test_11 的表 id上的 attstattarget 参数

1
2
3
4
5
6
7
8
9
10
11
skytf=> select oid,relname from pg_class where relname='test_11';  
oid | relname
----------+---------
14205208 | test_11
(1 row)

skytf=> select attrelid,attname,attstattarget from pg_attribute where attrelid=14205208 and attname=('id');
attrelid | attname | attstattarget
----------+---------+---------------
14205208 | id | -1
(1 row)

备注:从上面看出,表 test_11 的列 id上的 attstattarget 参数为 -1 ,即使用了系统的参数 default_statistics_target 值 100.

创建测试表并测试数据

1
2
3
4
skytf=> create table test_11 (id integer ,name varchar(32));  
CREATE TABLE
skytf=> insert into test_11 select generate_series(1,1000),'francs';
INSERT 0 1000

表分析

1
2
skytf=> analyze test_11;  
ANALYZE

查看表 test_11 列 id 上的柱状图

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
skytf=> select histogram_bounds from pg_stats where tablename='test_11' and attname='id';  

histogram_bounds
--------------------------------------------------------------------
{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,4
7,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,9
1,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,12
6,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,15
9,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,19
2,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,22
5,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,25
8,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,29
1,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,32
4,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,35
7,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,39
0,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,42
3,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,45
6,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,48
9,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,52
2,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,55
5,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,58
8,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,62
1,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643,644,645,646,647,648,649,650,651,652,653,65
4,655,656,657,658,659,660,661,662,663,664,665,666,667,668,669,670,671,672,673,674,675,676,677,678,679,680,681,682,683,684,685,686,68
7,688,689,690,691,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715,716,717,718,719,72
0,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748,749,750,751,752,75
3,754,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781,782,783,784,785,78
6,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814,815,816,817,818,81
9,820,821,822,823,824,825,826,827,828,829,830,831,832,833,834,835,836,837,838,839,840,841,842,843,844,845,846,847,848,849,850,851,85
2,853,854,855,856,857,858,859,860,861,862,863,864,865,866,867,868,869,870,871,872,873,874,875,876,877,878,879,880,881,882,883,884,88
5,886,887,888,889,890,891,892,893,894,895,896,897,898,899,900,901,902,903,904,905,906,907,908,909,910,911,912,913,914,915,916,917,91
8,919,920,921,922,923,924,925,926,927,928,929,930,931,932,933,934,935,936,937,938,939,940,941,942,943,944,945,946,947,948,949,950,95
1,952,953,954,955,956,957,958,959,960,961,962,963,964,965,966,967,968,969,970,971,972,973,974,975,976,977,978,979,980,981,982,983,98
4,985,986,987,988,989,990,991,992,993,994,995,996,997,998,999,1000}
(1 row)

备注:当default_statistics_target为 1000时,柱状图被分为 1000组。

场景二: attstattarget 100 时的场景

更改表列上的attstattarget参数

1
2
skytf=> alter table test_11 alter column id set STATISTICS 100;  
ALTER TABLE

验证下,attstattarget值已经改成100了

1
2
3
4
5
skytf=> select attrelid,attname,attstattarget from pg_attribute where attrelid=14205208 and attname=('id');  
attrelid | attname | attstattarget
----------+---------+---------------
14205208 | id | 100
(1 row)

表分析,当执行 analyze 命令后,看下结果

1
2
skytf=> analyze test_11;  
ANALYZE

再次查看柱状图

1
2
3
4
5
6
7
8
9
10
skytf=> select histogram_bounds from pg_stats where tablename='test_11' and attname='id';  

histogram_bounds

------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
{1,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,290,300,310,320,330,340,3
50,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,630,640,650,660,670,6
80,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,960,970,980,990,1000}
(1 row)

备注:表”test_11” 的列ID上的 histogram_bounds 已经被分成了100组。

场景三: attstattarget 10 时的场景

更改表列上的attstattarget参数

1
2
skytf=> alter table test_11 alter column id set STATISTICS 10;  
ALTER TABLE

验证下,attstattarget值已经改成10了

1
2
3
4
5
skytf=> select attrelid,attname,attstattarget from pg_attribute where attrelid=14205208 and attname=('id');  
attrelid | attname | attstattarget
----------+---------+---------------
14205208 | id | 10
(1 row)

表分析,当执行 analyze 命令后,看下结果

1
2
skytf=> analyze test_11;  
ANALYZE

再次查看柱状图

1
2
3
4
5
skytf=> select histogram_bounds from pg_stats where tablename='test_11' and attname='id';  
histogram_bounds
----------------------------------------------
{1,100,200,300,400,500,600,700,800,900,1000}
(1 row)

备注:表”test_11” 的列ID上的 histogram_bounds 已经被分成了10 组。

结论

  1. 列上的柱状图信息可以通过 “alter table set STATISTICS “来更改,并通过视图 pg_attribute.attstattarget来查看列上的当前 statistics 值。
  2. 列上的柱状图被近似均匀的分割成 “pg_attribute.attstattarget” 组。即
    当 pg_attribute.attstattarget 值为 1000时,则列上的柱状图分为 1000组,
    当 pg_attribute.attstattarget 值为 100时, 则列上的柱状图分为 100组,
    当 pg_attribute.attstattarget 值为 10时, 则列上的柱状图分为 10组。

  3. 列上 STATISTICS 值越高,则柱状图分割的组越多,柱状图信息越详细 ,但做表分析时,所花的时间也稍长。

原创文章,作者:bd101bd101,如若转载,请注明出处:https://blog.ytso.com/236371.html

(0)
上一篇 2022年1月24日
下一篇 2022年1月24日

相关推荐

发表回复

登录后才能评论