Skip to content

Optimising queries

Optimising queries is essential for improving database performance and ensuring efficient data retrieval. This document outlines best practices and techniques for optimising queries in your database.

General guidelines

  1. Use indexes: Ensure that the columns used in WHERE clauses, JOIN conditions, and ORDER BY clauses are indexed.
  2. Avoid SELECT *: Specify only the columns you need in the SELECT statement to reduce the amount of data transferred.
  3. Use LIMIT: When possible, use the LIMIT clause to restrict the number of rows returned by a query.
  4. Optimise joins: Use appropriate join types and ensure that the join conditions are indexed.
  5. Avoid subqueries: Use JOINs instead of subqueries where possible, as they are generally more efficient.
  6. Use caching: Cache the results of frequently run queries to reduce the load on the database.
  7. Analyse query plans: Use the EXPLAIN command to analyze query execution plans and identify performance bottlenecks.

Using EXPLAIN

The EXPLAIN command provides insight into how a query is executed. It helps identify which parts of the query are causing performance issues.

IMPORTANT: When using EXPLAIN ANALYSE, the query is actually executed. If the query modifies data (e.g., UPDATE, DELETE, INSERT), use EXPLAIN instead.

Example

EXPLAIN ANALYSE
SELECT
   column1,
   column2
FROM
   table_name
WHERE
   condition
ORDER BY
   column1;

Output

The output of the EXPLAIN command provides information about the query execution plan, including the order of operations, the estimated cost of each operation, and the number of rows processed at each step. Some important operations to look out for include:

  • Seq scan: A sequential scan of the table, which can be inefficient for large tables.
  • Index scan: An index scan, which is generally more efficient than a sequential scan.
  • Sort: Sorting operations, which can be costly for large result sets.
  • Join: Join operations, which can be optimised using appropriate join types and conditions.
  • Aggregate: Aggregate operations, which can be resource-intensive for large datasets.
  • Limit: Limit operations, which restrict the number of rows returned by the query.
  • Filter: Filtering operations, which apply conditions to the result set.
  • Nested loop: Nested loop join operations, which can be inefficient for large datasets.
  • Materialize: Materialize operations, which store intermediate results in memory.