BLOB和TEXT的不同之处在于BLOB存储的是二进制的数据,不像TEXT存储的是字符数据。BLOB没有字符集和排序规则的概念,而TEXT有字符集和排序规则的概念。Blob和Text在排序上
和其他的数据类型不同,它不会对该字段的所有值进行排序,只对前max_sort_length个byte的数值进排序。如果你只需要对该属性的列的很少一部分字符来进行排序,你可以适当减小服务
器参数max_sort_length的值,或者使用order by substring(column,length)函数.
MYSQL不能对BLOB和TEXT的全部字段来进行索引,同样也不能使用索引来做排序。在这里要特别注意,当我们通过索引来优化BLOB字段来做排序,可能会于事无补。
因为MEMORY引擎不支持BLOB和TEXT类型,使用BLOB和TEXT列时,即遍只需要很少的行,也会使用到隐式的临时表,将会使用到disk MyISAM的临时表,从而带来性能的降低,这将会带来
严重的性能开销。所以最好的解决方案就是避免使用BLOB和TEXT类型,除非你真的需要使用到它们。当然,如果你不能避免,你或许可以在需要使用到BLOB和TEXT的地方,使用order by
SUBSTRING(column,length)方式,将该值转换成字符串的类型,从而能够使用in-memory临时表来进行操作。当然在这个过程中要特别注意substring返回值要尽量的短,不要大于max_heap_size
或者tmp_table_size,否则MySQL将还会使用on-disk MyISAM表.
以下是high performance mysql中一段关于使用BLOB和TEXT的注意细节部分:
On-Disk Temporary Tables and Sort Files
Because the Memory storage engine doesn’t support the BLOB and TEXT types, queries that use BLOB or TEXT columns and need an implicit temporary table will have to use ondisk
MyISAM temporary tables, even for only a few rows. (Percona Server’s Memory storage engine supports the BLOB and TEXT types, but at the time of writing, it doesn’t yet prevent
on-disk tables from being used.) This can result in a serious performance overhead. Even if you configure MySQL to store temporary tables on a RAM disk, many expensive operating
system calls will be required.
The best solution is to avoid using the BLOB and TEXT types unless you really need them. If you can’t avoid them, you may be able to use the SUBSTRING(column, length) trick
everywhere a BLOB column is mentioned (including in the ORDER BY clause) to convert the values to character strings, which will permit in-memory temporary tables. Just be sure
that you’re using a short enough substring that the temporary table doesn’t grow larger than max_heap_table_size or tmp_table_size, or MySQL will convert the table to an on-disk
MyISAM table. The worst-case length allocation also applies to sorting of values, so this trick can help with both kinds of problems: creating large temporary tables and sort files, and
creating them on disk. Here’s an example. Suppose you have a table with 10 million rows, which uses a couple of gigabytes on disk. It has a VARCHAR(1000) column with the utf8
character set. This can use up to 3 bytes per character,for a worst-case size of 3,000 bytes. If you mention this column in your ORDER BY clause, a query against the whole table
can require over 30 GB of temporary space just for the sort files! If the Extra column of EXPLAIN contains “Using temporary,” the query uses an implicit temporary table.
从上面的分析中,我们可以总结一下几点:
[1] BLOB和TEXT在使用order by的情况下,都只会使用服务器参数max_sort_length定义的字段长度(前max_sort_length)部分来进行排序。
[2] 如果可以避免使用BLOB和TEXT类型,尽量不使用该类型,以免造成隐式磁盘临时表的使用,从而带来性能开销。
[3] 尽量使用substring(column,length)来做排序,将blob或者text转换成字符类型,从而能够使用in-memory临时表来进行排序操作。
阅读(2068) | 评论(0) | 转发(0) |