PostgreSQL Performance Tuning Lessons from Production

PostgreSQL's default configuration is conservative for broad compatibility. Key parameters like shared_buffers, effective_cache_size, and work_mem demand attention in production.

Shared_buffers should be around 25% of RAM on dedicated servers. Effective_cache_size tells the planner how much memory is available for caching—set it to 50-75% of RAM. Work_mem controls memory per sort operation but increases risk when multiplied by active queries and connections.

For instance, I've seen systems with 64GB of RAM allocate 16GB to shared_buffers and 32GB to effective_cache_size, resulting in significant performance gains. However, this also means that work_mem must be carefully managed to avoid memory exhaustion. A good rule of thumb is to start with a low work_mem value, such as 4MB, and gradually increase it based on the actual workload and available memory.

Max_connections eats RAM per connection. At scale, use PgBouncer to manage connection pooling. Each PostgreSQL backend process costs memory; hundreds of concurrent queries degrade performance rapidly.

Use EXPLAIN ANALYZE to dissect query plans. Seq Scans on large tables? Missing indexes. Hash Joins on big datasets? Missed join optimisation chances. Sort nodes chewing up time? Index ORDER BY and GROUP BY columns. Big gaps in estimated vs actual rows? Run ANALYZE. I've seen cases where a single missing index on a large table caused query execution times to increase by a factor of 10, resulting in significant performance degradation and user complaints.

In one production system I worked on, we used a combination of B-tree and GIN indexes to improve query performance. We created a B-tree index on the primary key column and a GIN index on a JSONB column that contained full-text data. This allowed us to efficiently query the data using both equality and full-text searches. However, we also had to carefully manage the index maintenance overhead, as the GIN index required periodic reindexing to maintain its efficiency.

PostgreSQL supports B-tree (default for range/equality), GIN (full-text/JSONB), BRIN (ordered data like timestamps), and partial indexes (for selective queries). BRIN's compact and efficient for append-only tables. Partial indexes shrink size by filtering data upfront.

B-tree is default for most queries but not all. GIN handles JSONB containment and full-text. GiST supports geospatial. Hash indexes are rarely useful. BRIN is tiny but only works on naturally ordered data—timestamps in logs, for example. In my experience, the choice of index type depends on the specific use case and data distribution. For instance, if you have a table with a large number of unique values, a B-tree index may be the best choice, while a GIN index may be more suitable for tables with a large number of duplicate values.

PostgreSQL connections cost RAM and CPU. Each spawns a backend process. At scale, use PgBouncer in transaction pooling mode. It lets apps share a small pool of PostgreSQL connections (20–100 typical). Transaction pooling assigns a backend for each app transaction, avoiding process churn. For example, in a system with 1000 concurrent users, using PgBouncer to manage connections can reduce the number of backend processes from 1000 to around 50, resulting in significant memory and CPU savings.

Monitor pg_stat_statements for slow queries. Set track_activity_query_size to capture full query text. Use pgbadger for log analysis. Track down long-running transactions and lock contention. Set idle_in_transaction_session_timeout to kill stalled sessions. I've found that monitoring pg_stat_statements and pgbadger logs on a regular basis helps to identify and fix performance issues before they become critical, and that setting idle_in_transaction_session_timeout to a reasonable value, such as 30 minutes, can help prevent long-running transactions from causing performance problems.