Write Heavy Tables in PostgreSQL

Before diving into performance tuning, you need to know where the pressure is. In PostgreSQL, high DML activity, meaning constant INSERT, UPDATE, and DELETE operations, is the leading cause of table bloat and IO bottlenecks.

However, the counters in PostgreSQL's statistics views can lag behind the current state of your tables, so it pays to refresh the statistics before drawing conclusions.

Step 1 Refreshing the Statistics

Before querying the metadata, we must ensure the statistics are up to date. Running ANALYZE across all databases refreshes PostgreSQL's row-count estimates and data-distribution statistics.

Run this command in your terminal before starting your analysis:

vacuumdb -U postgres --all --analyze-only --verbose
  • --all: Processes every database in the cluster.
  • --analyze-only: Skips the vacuum phase, which is resource intensive; only the statistics are updated.
  • --verbose: Prints each table as it is analyzed, so you can follow the progress in real time.
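To confirm the refresh took effect, you can check the analyze timestamps that pg_stat_user_tables records for each table. A quick sanity check, run inside any database you care about:

```sql
-- When was each table last analyzed, manually or by autovacuum?
SELECT relname,
       last_analyze,      -- set by manual ANALYZE / vacuumdb --analyze-only
       last_autoanalyze   -- set by the autovacuum daemon
FROM pg_stat_user_tables
ORDER BY GREATEST(last_analyze, last_autoanalyze) DESC NULLS LAST
LIMIT 10;
```

If last_analyze is recent for the tables you are about to inspect, the counters you read next are worth trusting.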

Step 2 Tracking DML Activity Across the Cluster

Once your statistics are refreshed, you can use the following bash script to iterate through every database and find the top 20 noisiest tables.

for db in $(psql -At -c "SELECT datname FROM pg_database WHERE datname <> 'postgres' AND NOT datistemplate ORDER BY datname");
do
  echo "--- Database: $db ---";
  psql -d "$db" -c "
    SELECT
        relname AS table_name,
        n_tup_ins AS inserts,
        n_tup_upd AS updates,
        n_tup_del AS deletes,
        (n_tup_ins + n_tup_upd + n_tup_del) AS total_dml
    FROM
        pg_stat_user_tables
    WHERE
        (n_tup_ins + n_tup_upd + n_tup_del) > 0
    ORDER BY
        total_dml DESC LIMIT 20;
  "; 
done
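If you only care about a single database, the inner query can be run directly in psql. A variant that also pulls in n_live_tup (an estimated current row count) helps put the DML totals in proportion to the table's size:

```sql
SELECT schemaname,
       relname AS table_name,
       n_live_tup AS est_live_rows,                -- planner's row estimate
       n_tup_ins + n_tup_upd + n_tup_del AS total_dml
FROM pg_stat_user_tables
WHERE n_tup_ins + n_tup_upd + n_tup_del > 0
ORDER BY total_dml DESC
LIMIT 20;
```

A small table with an enormous total_dml is a very different problem from a huge table with the same number.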

Step 3 Verifying the Timeline

How long has it taken to reach these numbers? To calculate the DML per day or DML per hour, you need to know the last time these counters were reset.

SELECT datname, stats_reset
FROM pg_stat_database 
ORDER BY stats_reset DESC;

If stats_reset was months ago, the numbers might look scarier than they actually are. Context is key.
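As a back-of-the-envelope calculation, dividing the cumulative counter by the time elapsed since stats_reset gives you an average rate. A minimal sketch in Python; the figures below are made-up inputs for illustration, not real measurements:

```python
from datetime import datetime, timezone

def dml_rates(total_dml: int, stats_reset: datetime, now: datetime) -> dict:
    """Average DML per hour and per day since the counters were reset."""
    hours = (now - stats_reset).total_seconds() / 3600
    return {
        "per_hour": total_dml / hours,
        "per_day": total_dml / hours * 24,
    }

# Example: 90 million DML operations accumulated over 30 days
reset = datetime(2024, 1, 1, tzinfo=timezone.utc)
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
rates = dml_rates(90_000_000, reset, now)
# 30 days = 720 hours -> 125,000 per hour, 3,000,000 per day
```

The same arithmetic turns a scary-looking lifetime counter into a workload figure you can actually plan capacity around.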

Step 4 Why This Matters

Once you identify your top 20 write heavy tables, here is your checklist:

  • Index Audit: If a table has 10 billion inserts but 30 indexes, every row written must also be added to 30 index structures, and your disk IO is struggling. Remove unused indexes.
  • Autovacuum Tuning: These tables are bloat magnets. You should likely lower autovacuum_vacuum_scale_factor specifically for these tables so cleanup triggers more frequently.
  • HOT Updates: Check n_tup_hot_upd in pg_stat_user_tables. If few of your updates are HOT (Heap Only Tuple) updates, you may have indexes on frequently updated columns, forcing every update to touch every index.
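Each item on the checklist maps onto the statistics views. The queries below are a sketch: my_busy_table is a hypothetical table name, and the 0.01 scale factor is an illustrative value to tune for your own workload, not a universal recommendation.

```sql
-- HOT update ratio and index count for the busiest updated tables
SELECT s.relname,
       s.n_tup_upd,
       s.n_tup_hot_upd,
       round(100.0 * s.n_tup_hot_upd / NULLIF(s.n_tup_upd, 0), 1) AS hot_pct,
       (SELECT count(*) FROM pg_index i WHERE i.indrelid = s.relid) AS index_count
FROM pg_stat_user_tables s
ORDER BY s.n_tup_upd DESC
LIMIT 20;

-- Candidate unused indexes (idx_scan counts since the last stats reset)
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;

-- Make autovacuum fire earlier on one specific write-heavy table
ALTER TABLE my_busy_table
  SET (autovacuum_vacuum_scale_factor = 0.01);
```

Remember the caveat from Step 3: an index with idx_scan = 0 may simply not have been used since the counters were last reset, so check stats_reset before dropping anything.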

Conclusion

PostgreSQL performance tuning is not magic; it is about asking the right questions. By forcing an ANALYZE and then inspecting pg_stat_user_tables, you gain a clear map of where your database is working the hardest.