WHERE Clause Optimization WHERE Clause Optimization WHERE Clause Optimization

From MySQL官网

This section discusses optimizations that can be made for processing WHERE clauses. The examples use SELECT statements, but the same optimizations apply for WHERE clauses in DELETE and UPDATE statements.
这个章节讨论了可以用来处理WHERE子句的优化。 这些例子使用的是SELECT语句,但是这相同的优化同样适用于DELETE和UPDATE中的WHERE子句。

Because work on the MySQL optimizer is ongoing, not all of the optimizations that MySQL performs are documented here. 注意 因为MySQL优化的工作还在进行中,不是所有的MySQL执行的优化都会在这篇文档中。

You might be tempted to rewrite your queries to make arithmetic operations faster, while sacrificing readability. Because MySQL does similar optimizations automatically, you can often avoid this work, and leave the query in a more understandable and maintainable form. Some of the optimizations performed by MySQL follow:

你或许有兴趣重写你的查询来使算术操作更快些,然而牺牲的是可读性。 因为MySQL自动做了类似的优化,你可以经常避免这些个工作,并行使查询语言保持一个更加容易理解的容易维护的形式。 一些MySQL执行的优化如下:

  • Removal of unnecessary parentheses:
((a AND b) AND c OR (((a AND b) AND (c AND d))))  
-> (a AND b AND c) OR (a AND b AND c AND d)  
  • Constant folding:
 (a<b AND b=c) AND a=5
-> b>5 AND b=c AND a=5
  • Constant condition removal (needed because of constant folding):
   (B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6)
-> B=5 OR B=6
  • Constant expressions used by indexes are evaluated only once.

  • COUNT() on a single table without a WHERE is retrieved directly from the table information for MyISAM and MEMORY tables. This is also done for any NOT NULL expression when used with only one table. COUNT() 用在单表上并且不用WHERE的时候,从table infomation中直接获取对于 MyISAM 和 MEMORY 表。 对于任何 NOT NULL expression 非空表达式用在单表的时候,这种情况仍然适用。 (实事求是第二句不是太理解)

  • Early detection of invalid constant expressions. MySQL quickly detects that some SELECT statements are impossible and returns no rows.

  • HAVING is merged with WHERE if you do not use GROUP BY or aggregate functions (COUNT(), MIN(), and so on).
    HAVING 和 WHERE 合并,当你在不使用 GROUP BY 或者聚合函数(COUNT(), MIN(), and so on)的时候

  • For each table in a join, a simpler WHERE is constructed to get a fast WHERE evaluation for the table and also to skip rows as soon as possible.
    对于join的每一个表,越简单的 WHERE 语句,评估速度越快对于该表,并且尽可能快的跳过行。

  • All constant tables are read first before any other tables in the query. A constant table is any of the following:

    • An empty table or a table with one row.

    • A table that is used with a WHERE clause on a PRIMARY KEY or a UNIQUE index, where all index parts are compared to constant expressions and are defined as NOT NULL.
      一个表,被用在和 WHERE 字句作用于 主键或者唯一的索引上, 并且所有的索引部分是跟一个常量相比较的表达式,并且这些索引都定义了非空。

    All of the following tables are used as constant tables:

SELECT * FROM t WHERE primary_key=1;  
SELECT * FROM t1,t2  
  WHERE t1.primary_key=1 AND t2.primary_key=t1.id;
  • The best join combination for joining the tables is found by trying all possibilities. If all columns in ORDER BY and GROUP BY clauses come from the same table, that table is preferred first when joining.
    对于表联合最好的join组合是通过尝试所有的可能才被发现的。 如果所有的在ORDER BY 和 GROUP BY 的列来自于相同的表,这个表在联合时被优选。

  • If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.
    如果有一个 ORDER BY 子句和一个不同的 GROUP BY 子句, 或者 如果 ORDER BY 或者 GROUP BY 中 包含有不是join队列里的第一个表的列,一个临时表就会被创建。

  • If you use the SQLSMALLRESULT modifier, MySQL uses an in-memory temporary table.
    如果你使用 SQLSMALLRESULT 修饰符, MySQL 使用一个内存临时表。

  • Each table index is queried, and the best index is used unless the optimizer believes that it is more efficient to use a table scan. At one time, a scan was used based on whether the best index spanned more than 30% of the table, but a fixed percentage no longer determines the choice between using an index or a scan. The optimizer now is more complex and bases its estimate on additional factors such as table size, number of rows, and I/O block size.
    每一个表索引被查询, 并且最好的索引被使用 除非优化器相信这比使用表扫描更有效率。 曾经,一个扫描被使用 是基于 是否这个最好的索引跨越了这个表的30%,但是这个固定的百分比不再决定使用索引还是扫描的选择。这个优化器现在更加复杂而且以它在附加因素比如表大小,表行数,I/O 块大小的估计作为基础。

  • In some cases, MySQL can read rows from the index without even consulting the data file. If all columns used from the index are numeric, only the index tree is used to resolve the query.

  • Before each row is output, those that do not match the HAVING clause are skipped.
    在每一行被输出之前,这些不匹配 HAVING 子句的将会被跳过。

Some examples of queries that are very fast:

SELECT COUNT(*) FROM tbl_name;

SELECT MIN(key_part1),MAX(key_part1) FROM tbl_name;

SELECT MAX(key_part2) FROM tbl_name  
  WHERE key_part1=constant;

SELECT ... FROM tbl_name  
  ORDER BY key_part1,key_part2,... LIMIT 10;

SELECT ... FROM tbl_name  
  ORDER BY key_part1 DESC, key_part2 DESC, ... LIMIT 10;

MySQL resolves the following queries using only the index tree, assuming that the indexed columns are numeric:
MySQL 解决一下查询只使用索引树,假设索引列是数值的:

SELECT key_part1,key_part2 FROM tbl_name WHERE key_part1=val;

SELECT COUNT(*) FROM tbl_name  
  WHERE key_part1=val1 AND key_part2=val2;

SELECT key_part2 FROM tbl_name GROUP BY key_part1;  

The following queries use indexing to retrieve the rows in sorted order without a separate sorting pass:

SELECT ... FROM tbl_name  
  ORDER BY key_part1,key_part2,... ;

SELECT ... FROM tbl_name  
  ORDER BY key_part1 DESC, key_part2 DESC, ... ;

Related Article