Forget Everything You Know About PostgreSQL Maintenance

A practical PostgreSQL maintenance guide for production environments, covering instance, database, and table-level operations with…

Forget Everything You Know About PostgreSQL Maintenance

A practical PostgreSQL maintenance guide for production environments, covering instance, database, and table-level operations with real-world commands.

1. Why PostgreSQL Maintenance Must Be Thought in Layers

2. Instance-Level Maintenance

vacuumdb
reindexdb
clusterdb
CHECKPOINT

3. Database-Level Physical Operations

4. Table-Level Maintenance (Now It Actually Makes Sense)

VACUUM vs VACUUM FULL
Advanced VACUUM options
ANALYZE

5. pg_repack: The Minimal-Lock Power Tool

6. CLUSTER: Physical Reordering with Intent

7. Index Maintenance Deep Dive

8. Observability Before Action

pg_stat_statements
EXPLAIN / EXPLAIN ANALYZE

9. The Peak: Measuring Bloat Like a Professional

Table Bloat Analysis
Index Bloat Analysis

10. Final Thoughts: Thinking Like a PostgreSQL DBA

1. Why PostgreSQL Maintenance Must Be Thought in Layers

PostgreSQL maintenance operates on three distinct layers:

Instance (cluster) level
Database level
Table / index level

Mistaking one for another leads to:

unnecessary locks
unexpected I/O storms
why did everything slow down? moments

Great DBAs don’t run commands.
They choose the correct scope.

2. Instance-Level Maintenance

These commands operate at the PostgreSQL instance (cluster) level.

That means:

They affect ALL databases inside the same PostgreSQL instance.

This is where real power and real danger lives.

vacuumdb (Cluster-Aware by Nature)

vacuumdb -U postgres --all --analyze-only --verbose

Connects to every database
Runs ANALYZE only
Zero table rewrite
Zero blocking
Planner statistics refreshed instance-wide

vacuumdb -U postgres --all --full --analyze --verbose

This is VACUUM FULL across ALL databases.

Table rewrites
Exclusive locks
Massive I/O
Never run casually on production

Key insight:
vacuumdb is not just a wrapper.
It is an instance-level orchestrator.

reindexdb (Instance-Wide Index Surgery)

reindexdb --all --concurrently --jobs 4 --echo

Rebuilds indexes in all databases
CONCURRENTLY minimizes blocking
Parallel jobs distribute CPU load

Without CONCURRENTLY:

writes block
sessions wait
alerts start screaming

clusterdb (The Most Misunderstood Tool)

clusterdb -U postgres -d sales_db \
  --table orders \
  --index idx_orders_timestamp \
  --echo

Even though it targets a table,
clusterdb itself is an instance-level utility.

Why this matters:

It connects from outside
It respects instance-level resource limits
It behaves differently than in-session CLUSTER

CHECKPOINT (Silent but Brutal)

CHECKPOINT;

Forces dirty buffers to disk
Flushes WAL
Can create sudden I/O spikes

Used carefully:

before backups
during controlled maintenance windows

Used blindly:

latency explosions

3. Database-Level Physical Operations

Now we zoom in.

Database-level commands still affect many objects, but stay inside one database.

Examples:

REINDEX DATABASE
CLUSTER DATABASE
VACUUM DATABASE

This is the sweet spot for controlled maintenance.

4. Table-Level Maintenance (Now It Actually Makes Sense)

Only now does it make sense to talk about VACUUM.

VACUUM

VACUUM; -- non-blocking, removes dead tuples

Marks space reusable
No OS-level space return
No exclusive locks
Everyday hygiene

VACUUM FULL

VACUUM FULL;
VACUUM FULL VERBOSE emp;

Full table rewrite
Disk space returned
ACCESS EXCLUSIVE LOCK
Equivalent to controlled demolition

Advanced VACUUM Usage

VACUUM (index_cleanup true, verbose true, analyze true) customers;

VACUUM (full true, index_cleanup true, verbose true, analyze true) customers;

VACUUM (index_cleanup true, verbose true, analyze true, parallel 4) customers;
-- FULL does NOT support parallelism

Parallel VACUUM:

reduces wall-clock time
increases CPU usage
perfect for large tables on modern hardware

ANALYZE (Planner’s Oxygen)

ANALYZE;
ANALYZE VERBOSE;

No ANALYZE = blind planner.
Blind planner = bad execution plans.

5. pg_repack: The Minimal-Lock Power Tool

This is where PostgreSQL becomes elegant.

pg_repack -a

Reorganizes all tables
Uses triggers + shadow tables
Near-zero blocking
Ideal for production

pg_repack -t tablename -d employee

Targeted precision.

pg_repack -s newtablespace -d employee

This does three things at once:

Removes bloat
Rewrites data
Moves objects to a new tablespace

All with minimal lock time.

Fillfactor is respected.
Indexes are rebuilt cleanly.
Downtime is measured in seconds.

6. CLUSTER: Physical Reordering with Intent

CLUSTER testcluster USING idx_id;

Physically reorders table
Optimizes range scans
One index defines the layout

But:

Table rewrite
Blocking
Needs re-run after data churn

CLUSTER is powerful but situational.

7. Index Maintenance Deep Dive

REINDEX INDEX index_name;

REINDEX TABLE table_name;

REINDEX SCHEMA schema_name;

REINDEX DATABASE db_name;

REINDEX TABLE CONCURRENTLY cs_agreement;

REINDEX (verbose) SYSTEM;

REINDEX SYSTEM:

touches system catalogs
rarely needed
always dangerous without planning

8. Observability Before Action

pg_stat_statements

Shows you:

what hurts
how often
how expensive

EXPLAIN / EXPLAIN ANALYZE

Theory vs reality.

Always trust ANALYZE.

9. The Peak: Measuring Bloat Like a Professional

Feelings end here.
Numbers begin.

Table Bloat Analysis

-- This query inspects bloat in tables.
SELECT current_database(), schemaname, tblname, bs*tblpages AS real_size,
  (tblpages-est_tblpages)*bs AS extra_size,
  CASE WHEN tblpages &amp;amp;gt; 0 AND tblpages - est_tblpages &amp;amp;gt; 0
    THEN 100 * (tblpages - est_tblpages)/tblpages::float
    ELSE 0
  END AS extra_pct, fillfactor,
  CASE WHEN tblpages - est_tblpages_ff &amp;amp;gt; 0
    THEN (tblpages-est_tblpages_ff)*bs
    ELSE 0
  END AS bloat_size,
  CASE WHEN tblpages &amp;amp;gt; 0 AND tblpages - est_tblpages_ff &amp;amp;gt; 0
    THEN 100 * (tblpages - est_tblpages_ff)/tblpages::float
    ELSE 0
  END AS bloat_pct, is_na
FROM (
  SELECT ceil(reltuples / ((bs-page_hdr)/tpl_size)) + ceil(toasttuples / 4) AS est_tblpages,
         ceil(reltuples / ((bs-page_hdr)*fillfactor/(tpl_size*100))) + ceil(toasttuples / 4) AS est_tblpages_ff,
         tblpages, fillfactor, bs, tblid, schemaname, tblname, heappages, toastpages, is_na
  FROM (
    SELECT (4 + tpl_hdr_size + tpl_data_size + (2*ma)
            - CASE WHEN tpl_hdr_size%ma = 0 THEN ma ELSE tpl_hdr_size%ma END
            - CASE WHEN ceil(tpl_data_size)::int%ma = 0 THEN ma ELSE ceil(tpl_data_size)::int%ma END
           ) AS tpl_size, bs - page_hdr AS size_per_block, (heappages + toastpages) AS tblpages, heappages,
           toastpages, reltuples, toasttuples, bs, page_hdr, tblid, schemaname, tblname, fillfactor, is_na
    FROM (
      SELECT tbl.oid AS tblid, ns.nspname AS schemaname, tbl.relname AS tblname, tbl.reltuples,
             tbl.relpages AS heappages, coalesce(toast.relpages, 0) AS toastpages,
             coalesce(toast.reltuples, 0) AS toasttuples,
             coalesce(substring(array_to_string(tbl.reloptions, &amp;amp;#x27; &amp;amp;#x27;) FROM &amp;amp;#x27;fillfactor=([0-9]+)&amp;amp;#x27;)::smallint, 100) AS fillfactor,
             current_setting(&amp;amp;#x27;block_size&amp;amp;#x27;)::numeric AS bs,
             CASE WHEN version()~&amp;amp;#x27;mingw32&amp;amp;#x27; OR version()~&amp;amp;#x27;64-bit|x86_64|ppc64|ia64|amd64&amp;amp;#x27; THEN 8 ELSE 4 END AS ma,
             24 AS page_hdr,
             23 + CASE WHEN MAX(coalesce(s.null_frac,0)) &amp;amp;gt; 0 THEN (7 + count(s.attname)) / 8 ELSE 0::int END
               + CASE WHEN bool_or(att.attname = &amp;amp;#x27;oid&amp;amp;#x27; and att.attnum &amp;amp;lt; 0) THEN 4 ELSE 0 END AS tpl_hdr_size,
             sum((1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 0)) AS tpl_data_size,
             bool_or(att.atttypid = &amp;amp;#x27;pg_catalog.name&amp;amp;#x27;::regtype) OR sum(CASE WHEN att.attnum &amp;amp;gt; 0 THEN 1 ELSE 0 END) &amp;amp;lt;&amp;amp;gt; count(s.attname) AS is_na
      FROM pg_attribute AS att
      JOIN pg_class AS tbl ON att.attrelid = tbl.oid
      JOIN pg_namespace AS ns ON ns.oid = tbl.relnamespace
      LEFT JOIN pg_stats AS s ON s.schemaname=ns.nspname AND s.tablename=tbl.relname AND s.inherited=false AND s.attname=att.attname
      LEFT JOIN pg_class AS toast ON tbl.reltoastrelid = toast.oid
      WHERE NOT att.attisdropped
      AND tbl.relkind in (&amp;amp;#x27;r&amp;amp;#x27;,&amp;amp;#x27;m&amp;amp;#x27;)
      GROUP BY 1,2,3,4,5,6,7,8,9,10
      ORDER BY 2,3
    ) AS s
  ) AS s2
) AS s3
ORDER BY schemaname, tblname;

This query answers:

Is the table actually bloated?
Or just large?
Does VACUUM help, or is rewrite needed?

Index Bloat Analysis

-- This query inspects bloat in indexes.
SELECT current_database(), nspname AS schemaname, tblname, idxname, 
  bs*(relpages)::bigint AS real_size,
  bs*(relpages-est_pages)::bigint AS extra_size,
  100 * (relpages-est_pages)::float / relpages AS extra_pct,
  fillfactor,
  CASE WHEN relpages &amp;amp;gt; est_pages_ff THEN bs*(relpages-est_pages_ff) ELSE 0 END AS bloat_size,
  100 * (relpages-est_pages_ff)::float / relpages AS bloat_pct,
  is_na
FROM (
  SELECT coalesce(1 + ceil(reltuples/floor((bs-pageopqdata-pagehdr)/(4+nulldatahdrwidth)::float)), 0) AS est_pages,
         coalesce(1 + ceil(reltuples/floor((bs-pageopqdata-pagehdr)*fillfactor/(100*(4+nulldatahdrwidth)::float))), 0) AS est_pages_ff,
         bs, nspname, tblname, idxname, relpages, fillfactor, is_na
  FROM (
    SELECT maxalign, bs, nspname, tblname, idxname, reltuples, relpages, idxoid, fillfactor,
           (index_tuple_hdr_bm + maxalign - CASE WHEN index_tuple_hdr_bm%maxalign = 0 THEN maxalign ELSE index_tuple_hdr_bm%maxalign END
            + nulldatawidth + maxalign - CASE WHEN nulldatawidth = 0 THEN 0 WHEN nulldatawidth::integer%maxalign = 0 THEN maxalign ELSE nulldatawidth::integer%maxalign END
           )::numeric AS nulldatahdrwidth, pagehdr, pageopqdata, is_na
    FROM (
      SELECT n.nspname, i.tblname, i.idxname, i.reltuples, i.relpages, i.idxoid, i.fillfactor,
             current_setting(&amp;amp;#x27;block_size&amp;amp;#x27;)::numeric AS bs,
             CASE WHEN version() ~ &amp;amp;#x27;mingw32&amp;amp;#x27; OR version() ~ &amp;amp;#x27;64-bit|x86_64|ppc64|ia64|amd64&amp;amp;#x27; THEN 8 ELSE 4 END AS maxalign,
             24 AS pagehdr,
             16 AS pageopqdata,
             CASE WHEN max(coalesce(s.null_frac,0)) = 0 THEN 8 ELSE 8 + ((32 + 8 - 1) / 8) END AS index_tuple_hdr_bm,
             sum((1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 1024)) AS nulldatawidth,
             max(CASE WHEN i.atttypid = &amp;amp;#x27;pg_catalog.name&amp;amp;#x27;::regtype THEN 1 ELSE 0 END) &amp;amp;gt; 0 AS is_na
      FROM (
        SELECT ct.relname AS tblname, ct.relnamespace, ic.idxname, ic.attpos, ic.indkey, ic.indkey[ic.attpos], ic.reltuples, ic.relpages, ic.tbloid, ic.idxoid, ic.fillfactor,
               coalesce(a1.attnum, a2.attnum) AS attnum, coalesce(a1.attname, a2.attname) AS attname, coalesce(a1.atttypid, a2.atttypid) AS atttypid,
               CASE WHEN a1.attnum IS NULL THEN ic.idxname ELSE ct.relname END AS attrelname
        FROM (
          SELECT idxname, reltuples, relpages, tbloid, idxoid, fillfactor, indkey, pg_catalog.generate_series(1,indnatts) AS attpos
          FROM (
            SELECT ci.relname AS idxname, ci.reltuples, ci.relpages, i.indrelid AS tbloid, i.indexrelid AS idxoid,
                   coalesce(substring(array_to_string(ci.reloptions, &amp;amp;#x27; &amp;amp;#x27;) from &amp;amp;#x27;fillfactor=([0-9]+)&amp;amp;#x27;)::smallint, 90) AS fillfactor,
                   i.indnatts, pg_catalog.string_to_array(pg_catalog.textin(pg_catalog.int2vectorout(i.indkey)),&amp;amp;#x27; &amp;amp;#x27;)::int[] AS indkey
            FROM pg_catalog.pg_index i
            JOIN pg_catalog.pg_class ci ON ci.oid = i.indexrelid
            WHERE ci.relam=(SELECT oid FROM pg_am WHERE amname = &amp;amp;#x27;btree&amp;amp;#x27;) AND ci.relpages &amp;amp;gt; 0
          ) AS idx_data
        ) AS ic
        JOIN pg_catalog.pg_class ct ON ct.oid = ic.tbloid
        LEFT JOIN pg_catalog.pg_attribute a1 ON ic.indkey[ic.attpos] &amp;amp;lt;&amp;amp;gt; 0 AND a1.attrelid = ic.tbloid AND a1.attnum = ic.indkey[ic.attpos]
        LEFT JOIN pg_catalog.pg_attribute a2 ON ic.indkey[ic.attpos] = 0 AND a2.attrelid = ic.idxoid AND a2.attnum = ic.attpos
      ) i
      JOIN pg_catalog.pg_namespace n ON n.oid = i.relnamespace
      JOIN pg_catalog.pg_stats s ON s.schemaname = n.nspname AND s.tablename = i.attrelname AND s.attname = i.attname
      GROUP BY 1,2,3,4,5,6,7,8,9,10,11
    ) AS rows_data_stats
  ) AS rows_hdr_pdg_stats
) AS relation_stats
ORDER BY nspname, tblname, idxname;

This is how you decide:

REINDEX?
pg_repack?
Ignore and move on?

Final Thoughts: Thinking Like a PostgreSQL DBA

Maintenance is not a checklist.
It’s scope awareness.

Great DBAs:

think in layers
measure before acting
avoid locks by design
respect the instance

Wisdom is knowing where and how wide to use them.