Overview
Part 1 looks at some principles and introduces some of the tools in your arsenal.
Part 2 (this post) covers the dreaded PostgreSQL bloat issue as well as how TOAST can help performance, or hinder it.
Part 3 dives deeper into identifying, investigating and mitigating slow queries.
Part 4 covers indexing.
Part 5 covers rewriting queries.
TL;DR
- Bloat occurs because PostgreSQL does not update in place, it creates a new row and marks the old one as obsolete
- VACUUM fixes bloat, well, it marks obsolete rows as re-usable space
- VACUUM FULL fixes bloat by compactly rewriting the data file and rebuilding indexes
- You should never run VACUUM FULL because it requires a full table lock
- pg_repack is an option … possibly
- Use pg_repack with care - it can have problematic side effects
- TOAST compresses wide columns so that the resulting row is < 2kB
- If the table is still too wide, then compressed wide columns are split into 2kB chunks and move to a TOAST table
- A 2kB (ish) wide table can be problematic
Bloat
What is bloat?
Strictly speaking, bloat is the unused space left behind by a VACUUM that is not a VACUUM FULL.