因为我们使用的是关系型数据库,每张表表示的都是独立的单元(对象),而该单元(对象)所涉及到的其他信息通常都存储在其他表中,例如:
MariaDB [world]> DESC city; +-------------+----------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------+----------+------+-----+---------+----------------+ | ID | int(11) | NO | PRI | NULL | auto_increment | | Name | char(35) | NO | | | | | CountryCode | char(3) | NO | MUL | | | | District | char(20) | NO | | | | | Population | int(11) | NO | | 0 | | +-------------+----------+------+-----+---------+----------------+ 5 rows in set (0.61 sec) MariaDB [world]> DESC countrylanguage; +-------------+---------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------------+---------------+------+-----+---------+-------+ | CountryCode | char(3) | NO | PRI | | | | Language | char(30) | NO | PRI | | | | IsOfficial | enum('T','F') | NO | | F | | | Percentage | float(4,1) | NO | | 0.0 | | +-------------+---------------+------+-----+---------+-------+ 4 rows in set (0.06 sec)
比如其上两张表,我们想知道某一城市所使用的语言,就可以分为两个步骤:
1.在City表中查询该城市的CountryCode。
2.使用查询到的这个CountryCode在CountryLanguage表中查询该国家所使用的语言。
虽然,可以分两步完成,但是,需要两次查询和两次传输,在带宽和性能的对比下,我们更希望让Mysql(MariaDB)来帮助我们完成这件事不是吗?
连接(JOIN):也叫连结,是指将两张表按照一定规则连成一张表,将两张表中不同的数据(行)连成一行来看待。
又可以将连接分为如下几类:
- 内连接
-
外连接
- 左外连接
- 右外连接
- 交叉连接
在连接查询中,一个列可能出现在多张表中,为了避免引起歧义,通常在列名前面加上表名或表别名作为前缀(例:s.sid、x.sid)—使用表别名作为前缀,可以使得SQL代码较短,使用的内存更少(例:stu s,xuanke as x)。
内连接语法如下:
SELECT tb1_name.column,tb2_name.column FROM tb1 INNER JOIN tb2 ON 约束条件; SELECT tb1_name.column,tb2_name.column FROM tb1,tb2 WHERE 约束条件;
查询每一个城市可能使用的语言有哪些:
MariaDB [world]> SELECT Name,District,Language FROM city,countrylanguage WHERE city.CountryCode = countrylanguage.CountryCode LIMIT 10; +----------+----------+------------+ | Name | District | Language | +----------+----------+------------+ | Kabul | Kabol | Balochi | | Kabul | Kabol | Dari | | Kabul | Kabol | Pashto | | Kabul | Kabol | Turkmenian | | Kabul | Kabol | Uzbek | | Qandahar | Qandahar | Balochi | | Qandahar | Qandahar | Dari | | Qandahar | Qandahar | Pashto | | Qandahar | Qandahar | Turkmenian | | Qandahar | Qandahar | Uzbek | +----------+----------+------------+ 10 rows in set (0.00 sec)
内连接是怎样工作的
我们来看一下,这些数据是怎么连接起来的,具体可以看如下这张图(放大看):
所以所谓内连接,就是仅将多表中符合条件的行进行连接且返回结果。
比如这样,就将三张表连接了起来:
MariaDB [world]> SELECT * FROM city INNER JOIN countrylanguage INNER JOIN country ON city.CountryCode = countrylanguage.CountryCode AND city.CountryCode = country.Code WHERE city.Name='Kabul'/G; *************************** 1. row *************************** ID: 1 Name: Kabul CountryCode: AFG District: Kabol Population: 1780000 CountryCode: AFG Language: Balochi IsOfficial: F Percentage: 0.9 Code: AFG Name: Afghanistan Continent: Asia Region: Southern and Central Asia SurfaceArea: 652090.00 IndepYear: 1919 Population: 22720000 LifeExpectancy: 45.9 GNP: 5976.00 GNPOld: NULL LocalName: Afganistan/Afqanestan GovernmentForm: Islamic Emirate HeadOfState: Mohammad Omar Capital: 1 Code2: AF ....仅截取了第一条记录 5 rows in set (0.01 sec)
这里比较推荐SQL的标准写法,也就是如下格式:
SELECT tb1_name.column,tb2_name.column FROM tb1 INNER JOIN tb2 ON 约束条件;
为什么呢?因为在ON子句后还可以跟WHERE子句多连接出来的表进行过滤呀,且此语法结构更清晰不是吗?
使用内连接会将多表中符合条件的行连接到一起,而不符合条件的行则忽略,而外连接则会将一些不符合条件的行也输出出来。
例如,我们有如下数据:
MariaDB [world]> SELECT * FROM user; +----+-------+----------+---------------------+--------+ | id | name | password | regtime | deptid | +----+-------+----------+---------------------+--------+ | 1 | test | test | 2018-03-05 17:25:26 | 1 | | 2 | test1 | test1 | 2018-03-05 17:25:26 | 1 | | 3 | lucy | lucy | 2018-03-05 17:25:26 | 2 | | 4 | mars | mars | 2018-03-05 17:25:26 | 3 | | 5 | mark | mark | 2018-03-05 17:26:05 | NULL | +----+-------+----------+---------------------+--------+ 5 rows in set (0.01 sec) MariaDB [world]> SELECT * FROM department; +----+------------+---------+----------+ | id | name | comment | adminids | +----+------------+---------+----------+ | 1 | Sales | NULL | NULL | | 2 | Tech | NULL | NULL | | 3 | administra | NULL | NULL | | 4 | Secretaria | NULL | NULL | +----+------------+---------+----------+ 4 rows in set (0.01 sec) //其中deptid是用户所属部门的编号
我们有如下需求,显示用户及用户所在部门名称,根据我们上面所说的内连接,我们可以写出如下语句:
MariaDB [world]> SELECT user.id,user.name,department.name FROM user INNER JOIN department ON user.deptid = department.id; +----+-------+------------+ | id | name | name | +----+-------+------------+ | 1 | test | Sales | | 2 | test1 | Sales | | 3 | lucy | Tech | | 4 | mars | administra | +----+-------+------------+ 4 rows in set (0.14 sec)
但是,结果对吗?虽说我们的mark先生还没有被分到任何部门,但是也不能不显示人家了吧?
这时候,外连接就派上用场了:
在JOIN左面的表叫左表,而在右面的表叫右表
左外连接,FROM tb1_name LEFT OUTER JOIN tb2_name
除将符合条件的行显示出来,还显示左表的全部行,而右表的字段拼接过去全为NULL。如下所示:
MariaDB [world]> SELECT * FROM user LEFT OUTER JOIN department ON user.deptid = department.id; +----+-------+----------+---------------------+--------+------+------------+---------+----------+ | id | name | password | regtime | deptid | id | name | comment | adminids | +----+-------+----------+---------------------+--------+------+------------+---------+----------+ | 1 | test | test | 2018-03-05 17:25:26 | 1 | 1 | Sales | NULL | NULL | | 2 | test1 | test1 | 2018-03-05 17:25:26 | 1 | 1 | Sales | NULL | NULL | | 3 | lucy | lucy | 2018-03-05 17:25:26 | 2 | 2 | Tech | NULL | NULL | | 4 | mars | mars | 2018-03-05 17:25:26 | 3 | 3 | administra | NULL | NULL | | 5 | mark | mark | 2018-03-05 17:26:05 | NULL | NULL | NULL | NULL | NULL | +----+-------+----------+---------------------+--------+------+------------+---------+----------+ 5 rows in set (0.00 sec)
右外连接,FROM tb1_name RIGHT OUTER JOIN tb2_name
顾名思义,就是显示右表的所有行,而未符合连接条件的行,左表字段全为NULL,如下所示:
MariaDB [world]> SELECT * FROM user RIGHT OUTER JOIN department ON user.deptid = department.id; +------+-------+----------+---------------------+--------+----+------------+---------+----------+ | id | name | password | regtime | deptid | id | name | comment | adminids | +------+-------+----------+---------------------+--------+----+------------+---------+----------+ | 1 | test | test | 2018-03-05 17:25:26 | 1 | 1 | Sales | NULL | NULL | | 2 | test1 | test1 | 2018-03-05 17:25:26 | 1 | 1 | Sales | NULL | NULL | | 3 | lucy | lucy | 2018-03-05 17:25:26 | 2 | 2 | Tech | NULL | NULL | | 4 | mars | mars | 2018-03-05 17:25:26 | 3 | 3 | administra | NULL | NULL | | NULL | NULL | NULL | NULL | NULL | 4 | Secretaria | NULL | NULL | +------+-------+----------+---------------------+--------+----+------------+---------+----------+ 5 rows in set (0.00 sec)
当没有连接条件的表进行连接的结果为笛卡儿积,检索出的行的数目将是第一个表中的行数乘以第二个表中的行数,如下图所示:
如果有使用笛卡尔积的必要时,可以使用交叉连接(CROSS JOIN)如下例所示:
MariaDB [world]> SELECT user.Name,department.name FROM user CROSS JOIN department; +-------+------------+ | Name | name | +-------+------------+ | test | Sales | | test | Tech | | test | administra | | test | Secretaria | | test1 | Sales | | test1 | Tech | | test1 | administra | | test1 | Secretaria | | lucy | Sales | | lucy | Tech | | lucy | administra | | lucy | Secretaria | | mars | Sales | | mars | Tech | | mars | administra | | mars | Secretaria | | mark | Sales | | mark | Tech | | mark | administra | | mark | Secretaria | +-------+------------+ 20 rows in set (0.00 sec)
当我们的想要过滤多表连接查询结果时,我们可以将过滤条件放在ON子句或者WHERE子句,ON子句和WHERE子句得到的结果可能会不太一样。
过滤条件放ON子句:使用AND逻辑与操作将过滤条件放在连接条件前或后->在连接前进行条件过滤。
过滤条件放WHERE子句:使用单独的WHERE子句进行数据过滤->在连接后进行条件过滤。
对于内连接而言,过滤条件放在ON子句或WHERE子句是相同的,比较推荐在ON子句过滤。
而对于外连接而言,有以下情况参考:
//过滤条件放连接条件前或后 MariaDB [world]> SELECT user.name,department.name FROM user LEFT OUTER JOIN department ON user.name='mars' AND user.deptid = department.id; MariaDB [world]> SELECT user.name,department.name FROM user LEFT OUTER JOIN department ON user.deptid = department.id AND user.name='mars'; +-------+------------+ | name | name | +-------+------------+ | test | NULL | | test1 | NULL | | lucy | NULL | | mars | administra | | mark | NULL | +-------+------------+ 5 rows in set (0.00 sec) //因为ON user.name='mars'会将左表变为一条数据,但AND要求第二个表达式也为真,user.deptid = department.id;这条又仅过滤了mars的deptid和其部门表中对应的id,但左连接又要求左表显示所有数据,所以右表字段为NULL //过滤条件放WHERE子句,因为是连接后进行过滤,就是说对连接生成的这个新表过滤,所以只会显示符合条件的这条数据。 MariaDB [world]> SELECT user.name,department.name FROM user LEFT OUTER JOIN department ON user.deptid = department.id WHERE user.name = 'mars'; +------+------------+ | name | name | +------+------------+ | mars | administra | +------+------------+ 1 row in set (0.00 sec)
在多表连接查询时,通常会对表进行重命名操作,与列的重命名一样使用AS关键字,对表重命名主要是引用表时使用方便。
如下所示,对user表重命名为U,对department重命名为D:
MariaDB [world]> SELECT U.name,D.name FROM user AS U LEFT OUTER JOIN department AS D ON U.deptid = D.id; +-------+------------+ | name | name | +-------+------------+ | test | Sales | | test1 | Sales | | lucy | Tech | | mars | administra | | mark | NULL | +-------+------------+ 5 rows in set (0.01 sec)
多表连接查询说白了就是产生一张临时的新表,所以使用分组和聚合函数就像平常一样简单,参考如下例子:
统计每个部门的人数:
MariaDB [world]> SELECT D.name,COUNT(U.name) FROM user AS U LEFT OUTER JOIN department AS D ON U.deptid = D.id GROUP BY D.name; +------------+---------------+ | name | COUNT(U.name) | +------------+---------------+ | NULL | 1 | | administra | 1 | | Sales | 2 | | Tech | 1 | +------------+---------------+ 4 rows in set (0.00 sec)
统计每个城市所能说的官方语言的数量:
MariaDB [world]> SELECT C.Name,COUNT(CL.Language) FROM city AS C INNER JOIN countrylanguage AS CL ON C.CountryCode = CL.CountryCode AND CL.IsOfficial = 'T' GROUP BY C.Name; +-------------------------+--------------------+ | Name | COUNT(CL.Language) | +-------------------------+--------------------+ | A Coruña (La Coruña) | 1 | | Aachen | 1 | ................................................ | Alicante [Alacant] | 1 | | Aligarh | 1 | +-------------------------+--------------------+
原创文章,作者:ItWorker,如若转载,请注明出处:https://blog.ytso.com/tech/linux/118739.html