PostgreSQL Database Maintenance Operations
PostgreSQL Database Maintenance Operations
To keep a PostgreSQL database fast and healthy, regular maintenance operations are essential. Here are some of these maintenance operations:

1. VACUUM Operations (vacuum, vacuum full)
Regularly performed maintenance operation in PostgreSQL. This operation:
- Reclaims disk space occupied by updated and deleted rows.
- Updates statistics used by the query planner.
- Updates Visibility Map files, allowing faster data access.
Autovacuum
The vacuum operation can be automated with autovacuum. Autovacuum regularly examines statistics and detects tables with a large number of dead rows. It creates workers within the limits of the autovacuum_max_workers parameter to perform maintenance on these tables. These workers apply VACUUM and ANALYZE operations to the necessary tables. If the number of workers is insufficient for all tasks, the tables are queued.
VACUUM FULL
- Uses a more effective cleaning algorithm compared to normal VACUUM.
- Rebuilds tables in a new copy, using more disk space temporarily. Once the operation is complete, this space is released.
- Locks the table and prevents access until the operation is finished.
- Useful for tables experiencing significant data changes.
VACUUM Examples
VACUUM; -- no locks, removes dead rows and marks for reuse
VACUUM FULL; -- locks, compacts, more time
VACUUM FULL VERBOSE emp;
-- \h vacuum (check detail vacum options)
VACUUM (index_cleanup true, verbose true, analyze true) customers;
VACUUM (full true, index_cleanup true, verbose true, analyze true) customers;
VACUUM (index_cleanup true, verbose true, analyze true, parallel 4) customers; -- with full no parallel
Monitoring Vacuum Status
SELECT * FROM pg_stat_progress_vacuum;
2. CLUSTER Operations
The CLUSTER command physically reorganizes a table according to an index. Subsequent data entries may not follow this order, and the operation locks the table, preventing access until complete.
CLUSTER Examples
create table testcluster (id int, name text);
CREATE TABLE
insert into testcluster values(2,'B'),(1,'A'),(3,'A'),(5,'C'),(4,'B');
INSERT 0 5
select * from testcluster;
id | name
----+------
2 | B
1 | A
3 | A
5 | C
4 | B
(5 rows)
create index idx_id on testcluster(id);
CREATE INDEX
postgres=# cluster testcluster using idx_id;
CLUSTER
postgres=# select * from testcluster;
id | name
----+------
1 | A
2 | B
3 | A
4 | B
5 | C
(5 rows)
Monitoring CLUSTER Status
SELECT * FROM pg_stat_progress_cluster;
3. REINDEX Operations
REINDEX rebuilds and updates old indexes. It is useful in the following situations:
- If index data is corrupted or damaged.
- To clean up indexes with too many invalid and empty entries.
- To apply parameter changes related to the index.
- To correct a previously failed index operation.
REINDEX Examples
REINDEX INDEX index_name;
REINDEX TABLE table_name;
REINDEX SCHEMA schema_name;
REINDEX DATABASE db_name;
REINDEX TABLE CONCURRENTLY cs_agreement;
REINDEX (verbose, CONCURRENTLY, TABLESPACE pg_default) DATABASE db_name;
REINDEX (verbose) SYSTEM;
4. Updating Statistics (ANALYZE)
As data in tables changes, these statistics can become outdated. The ANALYZE command is used to update these statistics.
ANALYZE Examples
ANALYZE table_name;
ANALYZE;
ANALYZE VERBOSE;
Statistical Data
Gathers information about database objects, such as access frequency, how often the data block is found in the cache, and the frequency of disk reads. This data is stored in relevant statistics tables.
pg_stat_activitypg_stat_replicationpg_stat_user_tablespg_stat_user_indexespg_statio_user_tablespg_statio_user_indexes
Table Statistics
Table statistics are used by the query planner to generate the most efficient plan. Database content statistics are stored in the pg_statistic catalog. The ANALYZE command updates these statistics.
Lock Mechanism
The lock mechanism ensures data integrity during concurrent read/write operations in the database. PostgreSQL uses page-level locking to control access to pages in the shared_buffer.
Starting a Transaction
postgres=# create table emp (salary int , number int);
CREATE TABLE
postgres=# insert into emp (salary, number) values (1000, 12);
INSERT 0 1
postgres=# Begin;
BEGIN
postgres=*# update emp set salary=10 where number=12;
UPDATE 1
postgres=*# commit;
COMMIT
Monitoring Locks
SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;
SELECT a.datname, c.relname, l.transactionid, l.mode, l.GRANTED, a.usename, a.query, a.query_start, age(now(), a.query_start) AS "age"
FROM pg_stat_activity a
JOIN pg_locks l ON l.pid = a.pid
JOIN pg_class c ON c.oid = l.relation
ORDER BY a.query_start;
-- Also, you can see idle in transaction
watch -n 2 "psql -c \"SELECT pid, datname, usename, state, query, state_change FROM pg_stat_activity WHERE state = 'idle in transaction';\""
Session Kill
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid() -- don't kill your connection!
AND datname = 'exampledb'; -- don't kill the connections to other databases
These maintenance operations will make your PostgreSQL database more efficient and secure. For more detailed and technical articles like this, keep following our blog on Medium. If you have any questions or need further assistance, feel free to reach out in the comments below and directly.
← PostgreSQL Blog