Working With Queries

With pgDash, you can examine how your SQL queries are performing, identify reasons for slow queries, track query stats over time and examine query execution plans.

Setting Up

pgDash relies on the popular extensions pg_stat_statements and auto_explain to collect query and plan information. These extensions are part of the core PostgreSQL distribution, and are typically available already in your deployment (including AWS RDS and Aurora). Documentation for the extensions can be found in the PostgreSQL docs here and here.

Tip: You can use the psql command \dx to list all the currently installed extensions in a database.

pg_stat_statements

Follow these steps to install the pg_stat_statements extension:

  • Update the configuration setting shared_preload_libraries (docs) to include "pg_stat_statements".

  • The PostgreSQL server needs to be restarted for this change to take effect. (Yes, this is required and cannot be avoided. This is the case for AWS RDS also.)

  • Connect to the database and run CREATE EXTENSION pg_stat_statements; you'll need to connect as a superuser or at least as the database owner.

  • Verify that regular users are able to run SELECT * FROM pg_stat_statements LIMIT 1 and see the results without any errors. In particular, the "query" column should not say "<insufficient privilege>".

Note that even if you install the extension in only one database, it will collect and report queries for all the databases in the server.
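Condensed into a sketch, the installation steps above look like this (the shared_preload_libraries value shown assumes no other libraries are already configured there):

```sql
-- In postgresql.conf (a server restart is required after this change):
--   shared_preload_libraries = 'pg_stat_statements'

-- Then, connected as a superuser or the database owner:
CREATE EXTENSION pg_stat_statements;

-- Sanity check as a regular user; the query column should contain SQL text,
-- not "<insufficient privilege>":
SELECT query, calls FROM pg_stat_statements LIMIT 1;
```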

auto_explain

Follow these steps to install the auto_explain extension:

  • Update the configuration setting shared_preload_libraries (docs) to include "auto_explain".

  • Add auto_explain related settings to your PostgreSQL configuration. Refer to the auto_explain documentation for the full list. Here is a starting point:

    • auto_explain.log_min_duration = 2000

    • auto_explain.log_analyze = 1

    • auto_explain.log_timing = 0

    • auto_explain.log_format = json

    • With the settings above, any SQL statement that executes for more than 2 seconds will have its execution plan logged to the PostgreSQL log file in JSON format. Be sure to go through all the available settings in the documentation and tweak them to suit your deployment.

  • For the changes to take effect without a server restart, a superuser can run LOAD 'auto_explain'; to load the extension (this affects only the current session) and use ALTER SYSTEM SET to change the values of the configuration settings. (You can also reload PostgreSQL for the configuration changes to take effect.)

  • The auto_explain extension is independent of any database, and the above steps need to be done only once for the entire server.
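Putting the steps above together, a minimal postgresql.conf fragment might look like the following (the values are just the starting points suggested above; the shared_preload_libraries line assumes pg_stat_statements is also in use):

```
# postgresql.conf -- changing shared_preload_libraries requires a restart;
# the auto_explain.* settings can be applied with a configuration reload
shared_preload_libraries = 'pg_stat_statements,auto_explain'
auto_explain.log_min_duration = 2000   # log plans of statements running > 2s
auto_explain.log_analyze = 1
auto_explain.log_timing = 0            # skip per-node timing to limit overhead
auto_explain.log_format = json
```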

Using pgmetrics to collect query and plan information

Query information (via pg_stat_statements)

The query information is collected by connecting to PostgreSQL and querying the pg_stat_statements view.

By default, pgmetrics will attempt to collect query information from pg_stat_statements without requiring any explicit action. A few command-line parameters let you control this behavior:

  • --sql-limit=500 By default, pgmetrics collects only the first 500 characters of each SQL query, which is usually enough to identify the query so you can then locate it in your code. If 500 is too few for you, increase the value using this option, like so: pgmetrics --sql-limit=1000 {other.args}. Also note that pg_stat_statements itself tracks only a limited number of distinct statements, governed by the setting pg_stat_statements.max (5000 by default); to track more statements, change that setting (more details here).

  • --statements-limit=100 By default, pgmetrics collects the 100 slowest queries (as ordered by total_time). You can change this number using this parameter, like so: pgmetrics --statements-limit=200 {other.args}.

Plan information (via PostgreSQL log files)

The plan information is collected from PostgreSQL log files. pgmetrics has to run on the same system as the PostgreSQL server so that it can read and parse the log files. Reading AWS RDS and Aurora log files is currently not supported. Ensure that you have pgmetrics v1.8.0 or above and pgdash CLI v1.4.0 or above. Links for the latest releases: pgmetrics and pgdash CLI.

When pgmetrics is run from the machine that hosts both the Postgres server and its log files, pgmetrics will attempt to automatically locate and read the last few minutes' worth of log files to collect auto_explain output. The following command-line arguments can help tweak the exact behavior:

  • --log-file=path/to/log/file If the default logic does not locate the correct log file, use this option to specify the exact location of the log file.

  • --log-span=MINUTES By default, pgmetrics scans only the log file entries made during the last 5 minutes. (Note that pgmetrics is meant to be run periodically, for example every 5 minutes.) You can change the default value of 5 using this option.
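For periodic collection, the options above can be combined in a cron entry. The following fragment is a hypothetical sketch: the paths and database name are placeholders, and the --no-password, --format and --output flags shown are assumptions to verify against pgmetrics --help for your version.

```
# /etc/cron.d/pgmetrics -- runs every 5 minutes, matching the default --log-span
*/5 * * * * postgres /usr/local/bin/pgmetrics --no-password \
    --sql-limit=1000 --statements-limit=200 \
    --log-file=/var/log/postgresql/postgresql.log \
    --format=json --output=/tmp/pgmetrics.json mydb
```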

Collecting Disk I/O Times

PostgreSQL can track the amount of time it spends doing disk I/O. This is controlled by the configuration setting "track_io_timing" (docs), which is off by default because the measurement can incur an overhead. However, it should be safe to turn it on on nearly any modern Linux system; you can check the actual overhead using the pg_test_timing tool.

When track_io_timing is enabled, pg_stat_statements can report the time spent by each query in performing disk reads and writes. In practice, this information can be quite useful in analyzing query performance.
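For example, you could enable the setting and then look at the most read-heavy queries with something like the following sketch (the blk_read_time and blk_write_time column names are those used by pg_stat_statements up to PostgreSQL 16; later versions rename them):

```sql
-- Enable I/O timing; a configuration reload is enough, no restart needed:
ALTER SYSTEM SET track_io_timing = on;
SELECT pg_reload_conf();

-- Queries that spend the most time reading from disk:
SELECT query, calls, blk_read_time, blk_write_time
FROM pg_stat_statements
ORDER BY blk_read_time DESC
LIMIT 10;
```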

Resetting The Statistics

pg_stat_statements reports aggregate values (average, standard deviation, sum) of the actual query execution statistics. These values accumulate indefinitely (even across server restarts) until manually reset. Because of this, the stats can take a while to reflect current behavior. Resetting is especially helpful after a round of query optimization or a significant change in workload patterns.

You can use the function pg_stat_statements_reset() to clear the accumulated statistics, like so: SELECT pg_stat_statements_reset(); Note that you'll need superuser privileges to run this function.

Postgres 12 and later are more flexible: pg_stat_statements_reset() can discard just a subset of the accumulated statistics. For more information see the docs.
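For example, on Postgres 12 and later, pg_stat_statements_reset() accepts optional userid, dbid and queryid arguments (0 meaning "all"), so a single query's statistics can be cleared by its queryid. A sketch, with an illustrative queryid value:

```sql
-- Clear everything (works on all versions):
SELECT pg_stat_statements_reset();

-- Postgres 12+: clear statistics for one query only, by queryid:
SELECT pg_stat_statements_reset(0, 0, 123456789);
```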