Query optimization is a key feature in all databases. In this blog, I try to offer my own experiences with query optimization. This blog explains the methods for achieving query optimization in MySQL along with directions on how one can do it.

Topics:

1. Introduction
2. importance of MySQL Query Optimization.
3. How to Optimize MySQL Queries for Better Performance?

  • Indexes
  • Avoid ‘*’ in select query
  • Joins
  • Limit
  • Use UNION Instead of OR
  • Avoid Using Wildcards at the Start of LIKE Queries
  • Avoid Using Functions in WHERE

Conclusion

References

1. Introduction:

Query optimization is a feature of many relational database management systems and other databases such as No SQL and graph databases. The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans. Generally, the query optimizer cannot be accessed directly by users:- once queries are submitted to the database server, and parsed by the parser, they are then passed to the query optimizer where optimization occurs.

MySQL query refers to an SQL statement that directs the database to execute specific operations such as retrieving, inserting, updating, or deleting data.

Query optimization involves improving the execution speed of the SQL queries to ensure faster response times and efficient resource utilization.

2. Importance of MySQL Query Optimization

MySQL Query optimization is crucial for enhancing data retrieval speed and efficiency, directly impacting the application’s overall performance and success.

  • Improved Performance: Optimized queries execute faster, reducing response times for your applications. This enhanced performance leads to a smoother user experience and higher customer satisfaction.
  • Scalability: As your application grows and handles larger data volumes, optimized queries ensure that the database can efficiently handle the increased load without sacrificing performance.
  • Resource Utilization: Efficient queries consume fewer server resources, such as CPU and memory, which lowers infrastructure costs.
  • Reduced Downtime: Enhancing queries minimizes the risk of performance bottlenecks and potential crashes, leading to improved system stability and reduced downtime.
  • Faster Development: Efficient queries lead to shorter development cycles, as developers spend less time troubleshooting slow queries and can focus on building new features and functionalities.
  • Improved User Experience: Faster data retrieval and processing times lead to a more responsive application, keeping users engaged and reducing bounce rates.
  • Database Maintenance: Well-designed queries simplify database maintenance tasks, making it easier to manage and monitor the MySQL database.
  • Cost Savings: Efficient queries can lead to cost savings, as they reduce hardware requirements, optimize server usage, and improve overall system performance.
  • Competitive Advantage: In a highly competitive market, faster application performance can give your business a competitive edge, attracting and retaining customers.
  • Handling High Traffic: For web applications facing heavy user traffic, optimization ensures that the system can handle a high number of concurrent queries without compromising performance.
  • Future-Proofing: Optimized queries can adapt to changing data patterns and growing workloads, ensuring that your application remains responsive and reliable in the long run.

3. How to Optimize MySQL Queries for Better Performance?

Explain:

The EXPLAIN command can help you identify performance issues in your queries by showing you how MySQL is executing them. This can help you identify slow queries and optimize them.

See the below query with an explain output

mysql> explain select id,name from employee_table where name = ‘test2’;
+—-+————-+——————-+————+——-+—————–+———+———+————+——+———-+————-+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+—-+————-+——————-+————+——-+—————–+——+————-+———-+——+———-+—————+
| 1 | SIMPLE | employee_table | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where |
+—-+————-+———————————+——-+——————+——–+———+——+——+———-+——————–+


Explanation of each column in the EXPLAIN plan:

id – The query ID.

select_type – The type of query (e.g., SIMPLE, PRIMARY, DERIVED).

select_typeDefinition
SimpleA Simple query without any sub queries or unions
PrimaryThe primary query in a query with sub queries.
DerivedA derived table or sub query in the FROM clause.

Table – The table being accessed.

Type – The join type (e.g., ALL, eq_ref, ref, range, index, const).

Type
Definition

Description

Performance
ALLFull table scan.The database scans every row in the table.Poor (especially for large tables).
eq_refEquality joins using a primary key or unique index.Equality joins using a primary key or unique index.Excellent.
refNon-unique index join.The database uses a non-unique index to join tables.Good.
rangeIndex scan using a range of values.The database scans an index within a specific range.Good.
indexFull index scan.The database scans every entry in the index.Fair.
constConstant value lookup.The database uses a constant value to access a single row.Excellent.

possible_keys – The indexes that can be used.

Key – The index used.

key_len – The length of the index used.

Ref – The columns or constants being used with the index.

Rows – The estimated number of rows to be examined.

Indexes:

Indexes can significantly improve the performance of your queries by allowing MySQL to locate and retrieve the data more quickly.

mysql> explain select id,name from employee_table where name = ‘test2’;

+—-+————-+—————-+————+——+—————+——+———+——+——+———-+————-+

| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |

+—-+————-+—————-+————+——+—————+——+———+——+——+———-+————-+

| 1 | SIMPLE | employee_table | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where |

+—-+————-+—————-+————+——+—————+——+———+——+——+———-+————-+

See the above output; the type column is ALL, i.e., the query was scanned to the full table. So, we need to create an index in the where condition column, reducing the query cost. See the below output. After we created the index, the type column shows the output ref, and the key also choose the proper index.

mysql> explain select id,name from employee_table where name = ‘test2’;

+—-+————-+—————-+————+——+—————+———-+———+——-+——+———-+————-+

| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |

+—-+————-+—————-+————+——+—————+———-+———+——-+——+———-+————-+

| 1 | SIMPLE | employee_table | NULL | ref | idx_name | idx_name | 182 | const | 1 | 100.00 | Using index |

+—-+————-+—————-+————+——+—————+———-+———+——-+——+———-+————-+

So, the query avoids the full table scan after choosing the proper index.

Avoid ‘*’ in select query :

Instead of using SELECT *, select only the columns that you need. ( * indicates getting the full table.)

select query with *

mysql> explain select * from employee_table;

+—-+————-+—————-+————+——+—————+——+———+——+——+———-+————-+

| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |

+—-+————-+—————-+————+——+—————+——+———+——+——+———-+————-+

| 1 | SIMPLE | employee_table | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where |

+—-+————-+—————-+————+——+—————+——+———+——+——+———-+————-+

Thus, the type column is null and the key column is also null; there is no need to create an index because this query-designed output came in a full table. As the result, the query will increase the query cost.

But when your table has large set of data, It is good to avoid retrieving whole data, try to use columns specifically so we avoid max query cost. See the attached output for your reference

mysql> explain select id, name from employee_table;

+—-+————-+—————-+————+——-+—————+———-+———+——+——+———-+————-+

| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |

+—-+————-+—————-+————+——-+—————+———-+———+——+——+———-+————-+

| 1 | SIMPLE | employee_table | NULL | index | NULL | idx_name | 182 | NULL | 3 | 100.00 | Using index |

+—-+————-+—————-+————+——-+—————+———-+———+——+——+———-+————-+

The type column is an index and the key also chooses the proper index. Hence, the query was run smoothly, and the output also came with a minimum execution time.

Joins:

To link rows from two or more tables according to the relevant column between them, use a join clause.
MySQL is better at optimizing joins than sub queries, so if you find the load averages on your MySQL server hitting unacceptably high levels, examine your application code and try rewriting your sub queries as joins or sequences of joins.

Using a sub query

mysql> explain SELECT r.RouteID, f.FlightID FROM route AS r, flight AS f WHERE r.RouteID = f.RouteID AND r.Status = 71 AND f.AircraftID IN (SELECT AircraftID FROM aircraft WHERE AircraftTypeID = 007);
+—-+————-+———-+————+——+—————+——+———+——+——+———-+———————————————————–+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+—-+————-+———-+————+——+—————+——+———+——+——+———-+———————————————————–+
| 1 | SIMPLE | aircraft | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where; Start temporary |
| 1 | SIMPLE | r | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where; Using join buffer (hash join) |
| 1 | SIMPLE | f | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where; End temporary; Using join buffer (hash join) |
+—-+————-+———-+————+——+—————+——+———+——+——+———-+———————————————————–

Using join

explain SELECT r.RouteID, f.FlightID FROM route AS r JOIN flight AS f ON r.RouteID = f.RouteID JOIN aircraft AS a ON f.AircraftID = a.AircraftID WHERE r.Status = 71 AND a.AircraftTypeID = 007;
+—-+————-+———-+————+——+—————+——+———+——+——+———-+———————————————————–+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+—-+————-+———-+————+——+—————+——+———+——+——+———-+———————————————————–+
| 1 | SIMPLE | aircraft | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where |
| 1 | SIMPLE | r | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where; Using join buffer (hash join) |
| 1 | SIMPLE | f | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where; Using join buffer (hash join) |
+—-+————-+———-+————+——+—————+——+———+——+——+———-+———————————————————–


compared with the above output “Extra” column presence of Start temporary and End temporary suggests that the first query is using a temporary table or derived table, likely due to a sub query. The second query might be optimized to avoid this temporary table.

Generally speaking, joins are faster than subqueries in terms of performance because they can use indexes and other optimization techniques. Subqueries, on the other hand, may require more processing and memory, especially if they return large or complex results.

LIMIT:

For large tables, retrieving all rows can put a significant load on the server. Using LIMIT can help limit the number of rows processed, reducing memory usage and execution time.This is especially useful for queries without restrictive conditions (WHERE clause) and when you only need a sample of the data.

Without limit.

EXPLAIN SELECT * FROM orders WHERE total_amount > 100;

+—-+————-+——–+——-+—————+——-+———+——+——+——-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+——–+——-+—————+——-+———+——+——+——-+

| 1 | SIMPLE | orders | range | total_amount | NULL | NULL | NULL | 1005 | Using where |

+—-+————-+——–+——-+—————+——-+———+——+——+——-+

With limit :

EXPLAIN SELECT * FROM orders WHERE total_amount > 100 LIMIT 10;

+—-+————-+——–+——-+—————+——-+———+——+——+——-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+——–+——-+—————+——-+———+——+——+——-+

| 1 | SIMPLE | orders | range | total_amount | NULL | NULL | NULL | 10 | Using where; Using filesort |

+—-+————-+——–+——-+—————+——-+———+——+——+——-+

Key Differences Between the Two Queries:

The query without LIMIT scans 1005 rows, while the query with LIMIT scans only 10 rows.

The query with LIMIT uses file sort, indicating that MySQL needs to sort the rows to satisfy the LIMIT clause.

Use UNION Instead of OR:

To improve query efficiency, use UNION instead of several OR criteria.

Using OR

Explain SELECT country FROM customers WHERE country =’USA’ OR country = ‘canada’;

+—-+————-+———–+——-+—————+——-+———+———-+——–+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+———–+——-+—————+——-+———+———-+——–+————-+

| 1 | SIMPLE | customers | ALL | NULL | NULL | NULL | NULL | 10000 | Using where |

+—-+————-+———–+——-+—————+——-+———+———-+——–+————-+

The index does not significantly reduce the number of rows scanned.

The OR condition has multiple columns, making it harder for the database to use the index.

Using UNION

Explain SELECT country FROM customers WHERE country =’USA’ UNION SELECT country FROM customers WHERE country = ‘canada’;

+—-+————-+———–+——-+—————+——-+———+———-+——–+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+———–+——-+—————+——-+———+———-+——–+————-+

| 1 | UNION

| 2 | SIMPLE | customers | ALL | idx_country | 72 | const | 5000 | | Using where |

| 3 | SIMPLE | customers | ALL | idx_country | 72 | const | 5000 | | Using where |

+—-+————-+———–+——-+—————+——-+———+———-+——–+————-+

Each SELECT statement uses the index separately.

Fewer rows scanned overall.

UNION can lead to faster execution (with small datasets).

Note: Check before and after query data because sometimes a few queries will give different data.

FYI – However, using UNION in this case is not the most efficient approach, especially when dealing with large datasets. This is because UNION removes duplicate rows, which requires additional processing.

A better approach would be to use a single SELECT statement with OR conditions or IN operator as you’ve already done.

Using IN

Explain SELECT country FROM customers WHERE country IN (‘USA’, ‘Canada’);

+—-+————-+———–+——-+—————+———-+———+———-+——–+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+———–+——-+—————+———-+———+———-+——–+————-+

| 1 | SIMPLE | customers | ref | idx_country | idx_country | 52 | const | 100 | Using where |

+—-+————-+———–+——-+—————+———-+———+———-+——–+————-+

Avoid Using Wildcards at the Start of LIKE Queries:

With wildcard (%)

explain select name from customers where name LIKE ‘%Genex%’;

+—-+————-+———–+——-+—————+——-+———+——+——+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+———–+——-+—————+——-+———+——+——+————-+

| 1 | SIMPLE | customers | scan | NULL | NULL | NULL | NULL | 1005 | Where clause|

+—-+————-+———–+——-+—————+——-+———+——+——+————-+

Leading wildcards (%) prevent the database from utilizing indexes on the column, leading to slower query performance.

The database must perform a full table scan to find matches, resulting in increased query execution time.

Without wildcard (%)

explain select name from customers where name LIKE ‘Genex%’;

+—-+————-+———–+——-+—————+——-+———+——+-+———-+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |

+—-+————-+———–+——-+—————+——-+———+——+-+———-+————-+

| 1 | SIMPLE | customers | range | NULL | name | 767 | NULL | 10 | 100.00 | Using where |

+—-+————-+———–+——-+—————+——-+———+——+-+———-+————-+

The query uses the Proper index so query cost were reduced.

Avoid Using Functions in WHERE Clause:

There are two reasons to avoid utilizing a function on a column inside a WHERE clause.

  1. Always have a second idea of adding Functions in where clause.The usage of FUNCTIONS results in Heavy CPU load because the function gets called for each record in your result set.
  2. It stops the database from using an index, sort key, or cluster key. As a result, the database must read additional data from the storage layer. This can significantly affect query performance.

The query with function

explain select * from customers where UPPER(country) = ‘USA’;

+—-+————-+———–+——-+—————+——-+———+——+——+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+———–+——-+—————+——-+———+——+——+————-+

| 1 | SIMPLE | customers | ALL | NULL | NULL | NULL | NULL | 1003 | Using where |

+—-+————-+———–+——-+—————+——-+———+——+——+————-+

Functions can prevent the database from using indexes. it requires additional computation, slowing down queries.

Look at the type and key column in the explain output; the type column shows ALL and the index was not chosen, so the query will take a full table scan.

The query without function

explain select * from customers where country = ‘USA’;

+—-+————-+———–+——————–+—————+——-+———+——+——+————-+

| id | select_type | table |partitions | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+———–+——————–+—————+——-+———+——+——+————-+

| 1 | SIMPLE | customers | ref | range | country_idx |country | 3 | const | 100 | Using index |

+—-+————-+———–+——————–+—————+——-+———+——+——+————-+

Queries without functions use indexes, while queries with functions do not.

Additional advantage of this it uses more efficient scan types (e.g., ref, range), while queries with functions use ALL scans and also scan fewer rows.

Conclusion:

These are the few ways can help you to optimize the MySQL queries and improve overall database performance. Keep in mind that optimization depends on the specific requirements and characteristics of your database and queries, so always validate the impact of your changes. We will come up with more optimization strategies in our upcoming blogs.

References:

https://dev.mysql.com/doc/refman/8.4/en/where-optimization.html

https://medium.com/@priyankgondaliya/mastering-mysql-query-optimization-15-proven-tricks-for-lightning-fast-database-performance-and-b863a8a00524

https://www.linkedin.com/advice/0/how-do-you-decide-when-use-subquery-join-sql#:~:text=Generally%20speaking%2C%20joins%20are%20faster,return%20large%20or%20complex%20results.

Discover more from Genexdbs

Subscribe now to keep reading and get access to the full archive.

Continue reading