Category: MySQL/PostgreSQL
2008-09-19 17:50:46
I. Single-column and composite indexes
II. Index types
III. Single-column versus composite indexes
Introduction to indexes
In a database table, an index can greatly speed up queries.
All storage engines support at least 16 indexes per
table and a total index length of at least 256 bytes. Most storage engines have
higher limits.
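Suppose we create a table named testIndex:
CREATE TABLE testIndex (i_testID INT NOT NULL, vc_Name VARCHAR(16) NOT NULL);
We insert 1000 rows at random, one of which is:
i_testID vc_Name
555 erquan
Now look up the row with vc_Name='erquan':
SELECT * FROM testIndex WHERE vc_Name='erquan';
If an index exists on vc_Name, MySQL can locate the row through the index by examining only a handful of entries. Without one, MySQL has to scan the whole table and check all 1000 rows. The larger the table, the bigger the gain from the index.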
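The difference between a full scan and an index lookup can be observed from a query plan. The sketch below is only an illustration: it uses SQLite (bundled with Python) as a stand-in for MySQL, so the planner output is SQLite's, not MySQL's, and the index name idx_vc_name and the filler names are made up for the example.

```python
import sqlite3

# SQLite stands in for MySQL here; the two planners differ in detail.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testIndex (i_testID INT NOT NULL, vc_Name VARCHAR(16) NOT NULL)"
)
# 1000 rows; row 555 is the one from the text, the rest are filler names.
rows = [(i, "erquan" if i == 555 else f"name{i}") for i in range(1, 1001)]
conn.executemany("INSERT INTO testIndex VALUES (?, ?)", rows)

QUERY = "SELECT * FROM testIndex WHERE vc_Name = 'erquan'"


def plan(sql):
    """Return SQLite's description of how it would execute `sql`."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_without_index = plan(QUERY)  # expected to describe a full-table scan

# idx_vc_name is a hypothetical index name chosen for this sketch.
conn.execute("CREATE INDEX idx_vc_name ON testIndex (vc_Name)")
plan_with_index = plan(QUERY)  # expected to describe a search via idx_vc_name

print(plan_without_index)
print(plan_with_index)
```

In MySQL itself, the equivalent check would be to prefix the SELECT with EXPLAIN and compare the reported key before and after creating the index.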
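I. Single-column and composite indexes
Single-column index: an index that covers one column. A table may have several single-column indexes, but together they do not form a composite index.
Composite index: a single index that covers multiple columns.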
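II. Index types
1. Ordinary index
This is the most basic kind of index and carries no restrictions. It can be created in the following ways:
(1) Create the index directly: CREATE INDEX indexName ON tableName (tableColumns(length)); For CHAR and VARCHAR columns, length may be smaller than the declared column length; for BLOB and TEXT columns, length must be specified. The same applies below.
(2) Alter the table: ALTER TABLE tableName ADD INDEX [indexName] (tableColumns(length));
(3) Specify it when creating the table: CREATE TABLE tableName ( [...], INDEX [indexName] (tableColumns(length)) );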
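2. Unique index
This is like the ordinary index above, except that the values in the indexed column must be unique, although NULL values are allowed. For a composite index, the combination of column values must be unique. It can be created in the following ways:
(1) Create the index directly: CREATE UNIQUE INDEX indexName ON tableName (tableColumns(length));
(2) Alter the table: ALTER TABLE tableName ADD UNIQUE [indexName] (tableColumns(length));
(3) Specify it when creating the table: CREATE TABLE tableName ( [...], UNIQUE [indexName] (tableColumns(length)) );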
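3. Primary key index
This is a special unique index that does not allow NULL values. It is usually created together with the table: CREATE TABLE testIndex (i_testID INT NOT NULL AUTO_INCREMENT, vc_Name VARCHAR(16) NOT NULL, PRIMARY KEY (i_testID)); It can also be added later with ALTER TABLE.
Remember: a table can have only one primary key.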
4. Full-text index
MySQL has supported full-text indexing and searching since version 3.23.23. It is not covered here.
Syntax for dropping an index: DROP INDEX index_name ON tableName;
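The unique-index rule from item 2 (values must be distinct, but NULL is allowed) is easy to try out. The following is a minimal sketch that uses SQLite as a stand-in for MySQL; the members table, the vc_Email column, and the index name uq_email are hypothetical and exist only for this illustration.

```python
import sqlite3

# SQLite stands in for MySQL; members, vc_Email and uq_email are made-up names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (i_id INTEGER PRIMARY KEY, vc_Email VARCHAR(50))")
conn.execute("CREATE UNIQUE INDEX uq_email ON members (vc_Email)")

conn.execute("INSERT INTO members (vc_Email) VALUES ('a@example.com')")
conn.execute("INSERT INTO members (vc_Email) VALUES (NULL)")
conn.execute("INSERT INTO members (vc_Email) VALUES (NULL)")  # a second NULL is accepted

try:
    # Same non-NULL value again: the unique index should reject it.
    conn.execute("INSERT INTO members (vc_Email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
print(duplicate_rejected, row_count)
```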
III. Single-column versus composite indexes
To compare the two concretely, create another table:
CREATE TABLE myIndex ( i_testID INT NOT NULL AUTO_INCREMENT, vc_Name VARCHAR(50) NOT NULL, vc_City VARCHAR(50) NOT NULL, i_Age INT NOT NULL, i_SchoolID INT NOT NULL, PRIMARY KEY (i_testID) );
Suppose the table holds 10000 rows, among which 5 rows with vc_Name='erquan' are scattered, each with a different combination of city, age, and school.
Consider this SQL statement:
SELECT i_testID FROM myIndex WHERE vc_Name='erquan' AND vc_City='郑州' AND i_Age=25;
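First consider a single-column index:
Suppose an index exists on the vc_Name column. When the statement runs, MySQL quickly narrows the search to the 5 rows with vc_Name='erquan' and places them in an intermediate result set. From that set it first discards rows whose vc_City is not '郑州', then rows whose i_Age is not 25, leaving the one row that satisfies all conditions.
With the index on vc_Name, MySQL no longer scans the whole table, so the query is faster, but it still falls short of what we want. A single-column index on vc_City or on i_Age alone would perform similarly.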
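To get more out of MySQL, consider a composite index that puts vc_Name, vc_City, and i_Age into one index:
ALTER TABLE myIndex ADD INDEX name_city_age (vc_Name(10), vc_City, i_Age);
Note that vc_Name was declared with length 50, yet the index uses only the first 10 characters. Names rarely exceed 10 characters, so the shorter prefix speeds up index lookups, shrinks the index file, and makes INSERT faster.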
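When the statement runs, MySQL can locate the single matching row directly through the index, without examining any non-matching rows.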
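Someone will surely ask: if vc_Name, vc_City, and i_Age each get their own single-column index, so that the table has three of them, will the query be as efficient as with the composite index above? Not at all; it is much slower. Although three indexes exist, MySQL generally uses only the one it judges to be the most efficient.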
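Creating this composite index is effectively equivalent to creating three indexes on:
vc_Name, vc_City, i_Age
vc_Name, vc_City
vc_Name
Why is there no combination such as vc_City, i_Age? This follows from MySQL's "leftmost prefix" rule for composite indexes: only column combinations that start from the leftmost column can use the index. So not every query that involves these three columns will use the composite index. The following statements will use it: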
SELECT * FROM myIndex WHERE vc_Name='erquan' AND vc_City='郑州';
SELECT * FROM myIndex WHERE vc_Name='erquan';
while the following will not:
SELECT * FROM myIndex WHERE i_Age=20 AND vc_City='郑州';
SELECT * FROM myIndex WHERE vc_City='郑州';
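The leftmost-prefix behaviour of the four statements above can be checked against a query planner. This is a hedged sketch using SQLite in place of MySQL: SQLite has no prefix lengths, so the whole vc_Name column is indexed here, and the planner decisions shown are SQLite's. The helper uses_composite_index is a made-up name for this example.

```python
import sqlite3

# SQLite stands in for MySQL; it follows the same leftmost-prefix idea.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myIndex ("
    " i_testID INTEGER PRIMARY KEY, vc_Name VARCHAR(50) NOT NULL,"
    " vc_City VARCHAR(50) NOT NULL, i_Age INT NOT NULL, i_SchoolID INT NOT NULL)"
)
# No prefix length in SQLite, so vc_Name is indexed in full.
conn.execute("CREATE INDEX name_city_age ON myIndex (vc_Name, vc_City, i_Age)")


def uses_composite_index(where_clause):
    """True if the plan for SELECT * ... WHERE <where_clause> mentions the index."""
    sql = "EXPLAIN QUERY PLAN SELECT * FROM myIndex WHERE " + where_clause
    return any("name_city_age" in row[3] for row in conn.execute(sql))


where_clauses = [
    "vc_Name='erquan' AND vc_City='郑州'",  # leftmost two columns
    "vc_Name='erquan'",                     # leftmost column only
    "i_Age=20 AND vc_City='郑州'",          # skips vc_Name
    "vc_City='郑州'",                       # skips vc_Name
]
results = {w: uses_composite_index(w) for w in where_clauses}
for clause, used in results.items():
    print(used, clause)
```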
Another example:
Suppose that a table has the following specification:
CREATE TABLE test (
    id         INT NOT NULL,
    last_name  CHAR(30) NOT NULL,
    first_name CHAR(30) NOT NULL,
    PRIMARY KEY (id),
    INDEX name (last_name,first_name)
);
The name index is an index over the last_name and first_name columns. The index can be used for queries that specify
values in a known range for last_name, or for both last_name and first_name. Therefore, the name index is used in
the following queries:
SELECT * FROM test WHERE last_name='Widenius';
SELECT * FROM test
  WHERE last_name='Widenius' AND first_name='Michael';
SELECT * FROM test
  WHERE last_name='Widenius'
  AND (first_name='Michael' OR first_name='Monty');
SELECT * FROM test
  WHERE last_name='Widenius'
  AND first_name >='M' AND first_name < 'N';
However, the name index is not used in the following queries:
SELECT * FROM test WHERE first_name='Michael';
SELECT * FROM test
  WHERE last_name='Widenius' OR first_name='Michael';
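Why the name index helps some of these queries and not others becomes clearer with a toy model. The sketch below is not how MySQL implements its indexes; it simply keeps (last_name, first_name, id) tuples in sorted order, which is the ordering a B-tree index maintains, and searches them with bisect. All rows other than the two Widenius entries are invented sample data.

```python
from bisect import bisect_left

# Sample rows (id, last_name, first_name); only the Widenius names come from the text.
ROWS = [
    (1, "Widenius", "Michael"),
    (2, "Widenius", "Monty"),
    (3, "Axmark", "David"),
    (4, "Larsson", "Allan"),
    (5, "Smith", "Michael"),
]

# Toy stand-in for INDEX name (last_name, first_name): entries sorted by
# last_name first, then first_name, each pointing back at a row id.
name_index = sorted((last, first, row_id) for row_id, last, first in ROWS)


def ids_in_range(lo_key, hi_key):
    """Ids of entries with lo_key <= (last_name, first_name) < hi_key."""
    start = bisect_left(name_index, lo_key)
    end = bisect_left(name_index, hi_key)
    return [row_id for _, _, row_id in name_index[start:end]]


# last_name='Widenius' AND first_name >= 'M' AND first_name < 'N':
# the matches form one contiguous slice, found with two binary searches.
widenius_m = ids_in_range(("Widenius", "M"), ("Widenius", "N"))

# first_name='Michael' alone: matches are scattered through the sorted
# entries, so every entry has to be checked.
michaels = [row_id for _, first, row_id in name_index if first == "Michael"]

print(widenius_m, michaels)
```

The same reasoning explains the OR query: rows matching first_name='Michael' can sit anywhere in the ordering, so the sorted entries cannot narrow the search.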
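By now you should know how to create and use indexes. But when is an index needed? Generally, columns that appear in WHERE and JOIN clauses should be indexed, though not always, because MySQL uses an index only for <, <=, =, >, >=, BETWEEN, IN, and in some cases LIKE (explained below). For example, in
SELECT t.vc_Name FROM testIndex t LEFT JOIN myIndex m ON t.vc_Name=m.vc_Name WHERE m.i_Age=20 AND m.vc_City='郑州';
the vc_City and i_Age columns of myIndex need an index, and because vc_Name of testIndex appears in the JOIN clause, it is worth indexing as well.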
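As mentioned, LIKE can use an index only in some cases. When the pattern starts with the wildcard % or _, MySQL does not use the index. For example,
SELECT * FROM myIndex WHERE vc_Name LIKE 'erquan%';
uses the index, whereas
SELECT * FROM myIndex WHERE vc_Name LIKE '%erquan';
does not.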
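The reason a leading wildcard defeats the index can be shown with a sorted list. This is a rough model, not MySQL's implementation: a pattern like 'erquan%' corresponds to one contiguous range of sorted keys, while '%erquan' does not. The sample names and both helper functions are made up for the illustration.

```python
from bisect import bisect_left

# Sorted keys play the role of an index on vc_Name (sample data, invented).
names = sorted(["erquan", "erquan2", "erquanzi", "erqu", "bob", "xerquan", "zhang"])


def prefix_matches(sorted_keys, prefix):
    """Keys starting with `prefix`, located as one contiguous slice."""
    lo = bisect_left(sorted_keys, prefix)
    # Smallest string greater than every string that starts with `prefix`.
    # Assumes a non-empty prefix whose last character is not the maximum code point.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]


def suffix_matches(keys, suffix):
    """Keys ending with `suffix`; sort order gives no shortcut, so all are checked."""
    return [k for k in keys if k.endswith(suffix)]


like_prefix = prefix_matches(names, "erquan")   # models LIKE 'erquan%'
like_suffix = suffix_matches(names, "erquan")   # models LIKE '%erquan'
print(like_prefix, like_suffix)
```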
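V. Drawbacks of indexes
After so much praise for indexes, are they really as good as claimed? They do have drawbacks.
1. Although indexes greatly speed up queries, they slow down writes to the table such as INSERT, UPDATE, and DELETE, because on every write MySQL must save not only the data but also the index file.
2. Index files take up disk space. Usually this is not a serious problem, but if you create many composite indexes on a large table, the index files grow quickly.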
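CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name
    [index_type]
    ON tbl_name (index_col_name,...)
    [index_type]
index_col_name:
    col_name [(length)] [ASC | DESC]
index_type:
    USING {BTREE | HASH | RTREE}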
CREATE INDEX is mapped to an ALTER TABLE statement to create indexes. See Section 12.1.2, “ALTER TABLE Syntax”. CREATE INDEX cannot be used to create a PRIMARY KEY; use ALTER TABLE instead. For more information about indexes, see Section 7.4.5, “How MySQL Uses Indexes”.
Normally, you create all indexes on a table at the time the table itself is created with CREATE TABLE. See Section 12.1.5, “CREATE TABLE Syntax”. CREATE INDEX enables you to add indexes to existing tables.
A column list of the form (col1,col2,...) creates a multiple-column index. Index values are formed by concatenating the values of the given columns.
Indexes can be created that use only the leading part of column values, using col_name(length) syntax to specify an index prefix length:
· Prefixes can be specified for CHAR, VARCHAR, BINARY, and VARBINARY columns.
· BLOB and TEXT columns also can be indexed, but a prefix length must be given.
· Prefix lengths are given in characters for non-binary string types and in bytes for binary string types. That is, index entries consist of the first length characters of each column value for CHAR, VARCHAR, and TEXT columns, and the first length bytes of each column value for BINARY, VARBINARY, and BLOB columns.
· For spatial columns, prefix values can be given as described later in this section.
The statement shown here creates an index using the first 10 characters of the name column:
CREATE INDEX part_of_name ON customer (name(10));
If names in the column usually differ in the first 10 characters, this index should not be much slower than an index created from the entire name column. Also, using column prefixes for indexes can make the index file much smaller, which could save a lot of disk space and might also speed up INSERT operations.
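One way to judge whether a prefix length is long enough is to compare how many distinct values survive truncation. The function below is a small sketch of that idea under the assumption that a sample of column values is available in memory; prefix_selectivity and the sample names are hypothetical, not part of MySQL.

```python
def prefix_selectivity(values, length):
    """Share of distinct values that stay distinct when cut to `length` characters.

    1.0 means the prefix distinguishes values as well as the full column does.
    """
    distinct_full = len(set(values))
    distinct_prefix = len({v[:length] for v in values})
    return distinct_prefix / distinct_full


# Invented sample of name values for illustration.
sample_names = [
    "alexander", "alexandra", "alexandria",
    "benjamin", "benedict", "charlotte", "charles",
]

for length in (4, 10):
    print(length, round(prefix_selectivity(sample_names, length), 3))
```

With this sample, a 4-character prefix collapses several names into one entry, while a 10-character prefix keeps them all apart, which matches the condition stated above for a prefix index to perform close to a full-column index.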
Prefix lengths are storage engine-dependent (for example, a prefix can be up to 1000 bytes long for MyISAM tables, 767 bytes for InnoDB tables). Note that prefix limits are measured in bytes, whereas the prefix length in CREATE INDEX statements is interpreted as number of characters for non-binary data types (CHAR, VARCHAR, TEXT). Take this into account when specifying a prefix length for a column that uses a multi-byte character set. For example, utf8 columns require up to three index bytes per character.
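The byte-versus-character distinction turns into simple arithmetic. The sketch below uses only the figures quoted above (1000 and 767 bytes, three bytes per utf8 character); the function name max_prefix_chars is made up for the example.

```python
def max_prefix_chars(byte_limit, max_bytes_per_char):
    """Largest prefix length in characters that always fits in `byte_limit` bytes."""
    return byte_limit // max_bytes_per_char


# Byte limits quoted in the text above.
LIMITS = {"MyISAM": 1000, "InnoDB": 767}

# utf8 columns may need up to 3 index bytes per character.
utf8_max = {engine: max_prefix_chars(limit, 3) for engine, limit in LIMITS.items()}
print(utf8_max)

# A two-character value can occupy six bytes once encoded.
city_bytes = len("郑州".encode("utf-8"))
print(city_bytes)
```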
A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. This constraint does not apply to NULL values except for the BDB storage engine. For other engines, a UNIQUE index allows multiple NULL values for columns that can contain NULL. If you specify a prefix value for a column in a UNIQUE index, the column values must be unique within the prefix.
FULLTEXT indexes are supported only for MyISAM tables and can include only CHAR, VARCHAR, and TEXT columns. Indexing always happens over the entire column; column prefix indexing is not supported and any prefix length is ignored if specified. See Section 11.8, “Full-Text Search Functions”, for details of operation.
The MyISAM, InnoDB, NDB, BDB, and ARCHIVE storage engines support spatial columns such as POINT and GEOMETRY. (Chapter 20, Spatial Extensions, describes the spatial data types.) However, support for spatial column indexing varies among engines. Spatial and non-spatial indexes are available according to the following rules.
Spatial indexes (created using SPATIAL INDEX):
· Available only for MyISAM tables. Specifying a SPATIAL INDEX for other storage engines results in an error.
· Indexed columns must be NOT NULL.
· In MySQL 5.0, the full width of each column is indexed by default, but column prefix lengths are allowed. However, as of MySQL 5.0.40, the length is not displayed in SHOW CREATE TABLE output. mysqldump uses that statement. As of that version, if a table with SPATIAL indexes containing prefixed columns is dumped and reloaded, the index is created with no prefixes. (The full column width of each column is indexed.)
Non-spatial indexes (created with INDEX, UNIQUE, or PRIMARY KEY):
· Allowed for any storage engine that supports spatial columns except ARCHIVE.
· Columns can be NULL unless the index is a primary key.
· For each spatial column in a non-SPATIAL index except POINT columns, a column prefix length must be specified. (This is the same requirement as for indexed BLOB columns.) The prefix length is given in bytes.
· The index type for a non-SPATIAL index depends on the storage engine. Currently, B-tree is used.
In MySQL 5.0:
· You can add an index on a column that can have NULL values only if you are using the MyISAM, InnoDB, BDB, or MEMORY storage engine.
· You can add an index on a BLOB or TEXT column only if you are using the MyISAM, BDB, or InnoDB storage engine.
An index_col_name specification can end with ASC or DESC. These keywords are allowed for future extensions for specifying ascending or descending index value storage. Currently, they are parsed but ignored; index values are always stored in ascending order.
Some storage engines allow you to specify an index type when creating an index. The allowable index type values supported by different storage engines are shown in the following table. Where multiple index types are listed, the first one is the default when no index type specifier is given.
Storage Engine | Allowable Index Types
MyISAM | BTREE, RTREE
InnoDB | BTREE
MEMORY/HEAP | HASH, BTREE
NDB | HASH, BTREE (see note in text)
BTREE indexes are implemented by the NDBCLUSTER storage engine as T-tree indexes.
For indexes on NDBCLUSTER table columns, the USING clause can be specified only for a unique index or primary key. In such cases, the USING HASH clause prevents the creation of an implicit ordered index. Without USING HASH, a statement defining a unique index or primary key automatically results in the creation of a HASH index in addition to the ordered index, both of which index the same set of columns.
The RTREE index type is allowable only for SPATIAL indexes.
If you specify an index type that is not legal for a given storage engine, but there is another index type available that the engine can use without affecting query results, the engine uses the available type.
Examples:
CREATE TABLE lookup (id INT) ENGINE = MEMORY;
CREATE INDEX id_index USING BTREE ON lookup (id);
TYPE type_name is recognized as a synonym for USING type_name. However, USING is the preferred form.
Before MySQL 5.0.60, the index_type option can be given only before the ON tbl_name clause. Use of the option in this position is deprecated as of 5.0.60; support for it is to be dropped in a future MySQL release. As of 5.0.60, the option should be given following the index column list. If an index_type option is given in both the earlier and later positions, the final option applies.