What does SELECT do in the background - kinfinger - ChinaUnix blog

kinfingerasage.blog.chinaunix.net

What does SELECT do in the background

Category: MySQL/PostgreSQL

2012-11-26 16:02:13

Hi. I am new to the MariaDB (and RDBMS) world. I am writing an API layer "on top" of MariaDB and Sphinx, to help port applications written for another (rather specific and non-standard) backend. To make this API layer work reasonably efficiently with MariaDB, I need some understanding of what happens in the background when one does a SELECT. Basically, the question is: if I execute the following statement:

SELECT col1,col2,col3,col4,...,coln FROM table1 WHERE (condition) ..

and it happens that 1000 rows match the (condition), does MariaDB immediately retrieve the corresponding row data and build a full representation of the result set in memory? Or does it wait (in some kind of "lazy" mode) until I actually start retrieving the row contents before it reads the data from the table?

In other words, imagine that the table has a lot of columns and some of them are very large (such as MEDIUMTEXT or LONGTEXT). Would it be more efficient to first do a SELECT with only one column and then, once I am sure of the result set, do the "real" SELECT with the columns actually needed?

Or is this a stupid question in the context of MariaDB and RDBMS (and why)?

Answer:

Your SQL query is handed to a thread that lives for the duration of your connection to the database server.

That thread parses the query, computes an execution plan (query plan, QP), and runs it.

The planner asks the underlying tables and indexes for statistics in order to decide (by brute force) which plan costs the least.

You can see the plan by running:

EXPLAIN SELECT col1,col2,col3,col4,...,coln FROM table1 WHERE (condition) ..

You need to make sure that your conditions are covered by an index, so that the optimizer picks a plan that uses the index to access those 1000 records directly.

Without the index, the plan pushes all table records to the SQL thread, which rejects the records that don't match your conditions. DBAs call this a full table scan and try to limit it with good indexing.
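As a rough sketch of the difference between the two plans, the statements below compare EXPLAIN output before and after adding an index. The filter `col1 = 42` and the index name `idx_table1_col1` are assumptions made up for illustration; the original query only says `(condition)`.

```sql
-- Hypothetical condition: assume the WHERE clause filters on col1.
-- Without a usable index, EXPLAIN typically reports type = ALL
-- (a full table scan) and key = NULL.
EXPLAIN SELECT col1, col2, col3 FROM table1 WHERE col1 = 42;

-- Add an index on the filtered column (the index name is illustrative).
CREATE INDEX idx_table1_col1 ON table1 (col1);

-- With the index in place the optimizer can choose it; EXPLAIN would then
-- show the index name in the key column and a type such as ref or range.
EXPLAIN SELECT col1, col2, col3 FROM table1 WHERE col1 = 42;
```

The `rows` column in the EXPLAIN output is only an estimate of how many rows will be examined, but it should drop from roughly the table size to roughly the number of matching rows once the index is used.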

The result is put into a network buffer and sent to your client. This costs CPU as well, so the thread does not really stop working until the plan has finished.

In many cases the result set needs to be sorted or grouped (ORDER BY, GROUP BY, DISTINCT) in a way that is not covered by an index. In that case the result is fully buffered into an in-memory temporary table, or into a disk-based table (MyISAM engine type) when it cannot be held in memory for size or data type reasons (BLOBs do not get along with memory tables, and vice versa).
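One way to check whether a given query is buffered like this is sketched below. The GROUP BY on `col2` is a hypothetical query, and it assumes `col2` has no index; the status counters are standard server counters.

```sql
-- Hypothetical query: group on a column that is assumed to have no index.
-- "Using temporary" and/or "Using filesort" in the Extra column indicate
-- that the result is buffered in a temporary table and/or sorted separately.
EXPLAIN SELECT col2, COUNT(*) FROM table1 GROUP BY col2;

-- Counters for implicit temporary tables created by this session.
-- Created_tmp_disk_tables growing relative to Created_tmp_tables suggests
-- that temporary tables are spilling to disk.
SHOW SESSION STATUS LIKE 'Created_tmp%';
```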

If you want to force the server to buffer the result set into a temporary table, just add SQL_BUFFER_RESULT after the SELECT. This frees the table lock while the result set is being sent to the client.

In MariaDB, this variable is OFF by default:

*************************** 4. row ***************************

Variable_name: sql_buffer_result

Value: OFF

That is, the result set is not buffered by default.
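A minimal sketch of both ways to turn the buffering on, per statement and per session. The column list and the filter `col1 = 42` are placeholders, not taken from the original query.

```sql
-- Per statement: buffer this one result set in a temporary table.
SELECT SQL_BUFFER_RESULT col1, col2 FROM table1 WHERE col1 = 42;

-- Per session: inspect the current setting, then enable it for the
-- SELECT statements that follow on this connection.
SHOW VARIABLES LIKE 'sql_buffer_result';
SET SESSION sql_buffer_result = ON;
```

The per-statement modifier is usually the safer choice, since the extra temporary table costs memory and time and is only worth it for result sets that take long to send.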

Some API calls also enable this same server buffering feature.

In MySQL, your thread lives in what is called the SQL layer, while your data and indexes are managed by what is called the storage engine layer. Each storage engine uses memory buffers to cache indexes and table data, so the first execution of your query loads data from disk, and later executions retrieve it from memory if the whole dataset fits into those buffers. Monitoring the hit ratio of those buffers is a requirement for overall performance, and setting them correctly is a must: read about innodb_buffer_pool_size and key_buffer_size. These are mainly the only variables that need to be changed according to the memory of your hardware.

This passage is not easy to follow; what the author means is that you tune these parameters according to the hit ratio.

Variable_name: innodb_buffer_pool_size

Value: 415236096

*************************** 1. row ***************************

Variable_name: key_buffer_size

Value: 134217728

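These are, respectively, the size of the InnoDB buffer pool (which caches InnoDB data and indexes) and the size of the MyISAM key buffer (which caches MyISAM index blocks).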
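The hit ratio mentioned above can be estimated from the server's status counters. The statements below are a sketch of where to look; the formulas in the comments are the commonly used approximations, not something the original answer spells out.

```sql
-- Current buffer sizes, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'key_buffer_size';

-- InnoDB: Innodb_buffer_pool_read_requests counts logical reads,
-- Innodb_buffer_pool_reads counts reads that had to go to disk.
-- hit ratio ~ 1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- MyISAM key buffer: Key_read_requests counts index block requests,
-- Key_reads counts those that had to be read from disk.
-- hit ratio ~ 1 - Key_reads / Key_read_requests
SHOW GLOBAL STATUS LIKE 'Key_read%';
```

The counters are cumulative since server start, so a ratio taken right after a restart mostly reflects the cold cache rather than steady-state behavior.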

There is another cache called the query cache (QC). It lives in the SQL layer and stores the result set of your query until the underlying tables change. If the same query comes in again, the result is served from the QC and the query is not executed again.
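To see whether the query cache is in play on a given server, something like the following can be used. The SELECT and its filter `col1 = 42` are placeholders; note that this applies to MariaDB and older MySQL versions, since MySQL 8.0 removed the query cache.

```sql
-- Configuration: query_cache_type and query_cache_size show whether the
-- cache is enabled and how much memory it may use.
SHOW VARIABLES LIKE 'query_cache%';

-- Activity: Qcache_hits counts queries answered from the cache,
-- Qcache_inserts counts result sets added to it.
SHOW GLOBAL STATUS LIKE 'Qcache%';

-- Bypass the cache for one statement, e.g. when timing the real execution.
SELECT SQL_NO_CACHE col1, col2 FROM table1 WHERE col1 = 42;
```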
