分类: LINUX
2009-03-05 11:44:59
In one of the last commits I added a SQL Tokenizer which understands the basic tokens of (My)SQL:
With this basic understanding we can normalize Queries and build statistics over similar queries.
The idea is simple and already implemented in mysqldumpslow
:
/* login.php:37 */SELECT * FROM tbl WHERE id = 1243 AND name = "jan"
is turned into
SELECT * FROM `tbl` WHERE `id` = ? AND `name` = ?
The queries look like prepared statements now and can be used the characterize queries of the same kind.
Taking the famous world-db and executing some simple queries like:
root@127.0.0.1:4040 [world]> select * from City where Id = 1; +----+-------+-------------+----------+------------+ | ID | Name | CountryCode | District | Population | +----+-------+-------------+----------+------------+ | 1 | Kabul | AFG | Kabol | 1780000 | +----+-------+-------------+----------+------------+
Now you can fetch the stats from the proxy:
root@127.0.0.1:4040 [world]> select * from histogram.tables; +------------+-------+--------+ | table | reads | writes | +------------+-------+--------+ | world.City | 5 | 0 | +------------+-------+--------+ 1 row in set (0.00 sec) root@127.0.0.1:4040 [world]> select * from histogram.queries; +--------------------------------------+-------+----------------+----------------+ | query | count | max_query_time | avg_query_time | +--------------------------------------+-------+----------------+----------------+ | SELECT DATABASE ( ) | 1 | 432 | 432 | | SHOW `tables` | 1 | 589 | 589 | | SELECT * FROM `City` WHERE `Id` = ? | 5 | 147410 | 30063.6 | +--------------------------------------+-------+----------------+----------------+
max_query_time
and avg_query_time
are both in micro-seconds.
With this data you can now start analyzing the Queries which took most of the time or try to find the tables which would make sense to cache, because they have a high read/write ratio.
There are some more statements already implemented to maintain those stat-counters:
delete from histogram.tables
to clean the statsset GLOBAL histogram.tables = 0
to disable the tracingAll those statements are passed to the proxy and answered by it directly.
As side-effect of this query-stats I started to write some lua-libraries which will make handling the packets a lot easier. But that's for another article.