NextGenBeing Founder
Last year, our team hit a wall. We'd scaled our SaaS platform to about 50 million database queries per day, and our PostgreSQL instance was starting to buckle. Average query response time had crept up to 2.1 seconds for our most critical endpoints. Our CTO, Maria, gave us two weeks to fix it before we'd need to start sharding—a complexity nightmare we wanted to avoid.
I spent those two weeks diving deep into PostgreSQL's documentation, not the getting-started guides, but the actual internals docs that most developers never read. What I found changed everything. PostgreSQL has dozens of features that can dramatically improve performance, but they're either poorly documented, hidden in advanced sections, or just not widely known.
After implementing what I learned, we cut our average query time to 580ms—a 73% improvement. We handled our traffic spike during a major product launch without adding a single database server. More importantly, I learned that PostgreSQL's "hidden" features aren't actually hidden—they're just buried under layers of conventional wisdom and cargo-cult optimization.
Here's what I discovered, complete with the benchmarks, gotchas, and production lessons that the documentation glosses over.
The Partial Index Revelation That Saved Us $40k/Month
I'll be honest: I thought I understood indexes. I'd been using B-tree indexes for years, throwing them on foreign keys and frequently queried columns like everyone else. But I was completely wrong about how to use them effectively at scale.
Our biggest performance problem was a user_events table with about 180 million rows. We tracked every user action—clicks, page views, API calls, everything. The table looked roughly like this:
CREATE TABLE user_events (
id BIGSERIAL PRIMARY KEY,
user_id INTEGER NOT NULL,
event_type VARCHAR(50) NOT NULL,
event_data JSONB,
is_processed BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT NOW(),
processed_at TIMESTAMP
);
CREATE INDEX idx_user_events_user_id ON user_events(user_id);
CREATE INDEX idx_user_events_created_at ON user_events(created_at);
Our analytics dashboard needed to query unprocessed events constantly:
SELECT * FROM user_events
WHERE is_processed = false
ORDER BY created_at DESC
LIMIT 100;
This query was taking 1.8 seconds on average. The EXPLAIN output showed it was scanning millions of rows:
Limit (cost=1847291.23..1847291.48 rows=100 width=584) (actual time=1823.445..1823.467 rows=100 loops=1)
-> Sort (cost=1847291.23..1892456.89 rows=18066264 width=584) (actual time=1823.443..1823.453 rows=100 loops=1)
Sort Key: created_at DESC
Sort Method: top-N heapsort Memory: 95kB
-> Seq Scan on user_events (cost=0.00..1345678.90 rows=18066264 width=584) (actual time=0.034..1456.789 rows=18234567 loops=1)
Filter: (NOT is_processed)
Rows Removed by Filter: 161765433
Planning Time: 0.234 ms
Execution Time: 1823.512 ms
See that? It was doing a sequential scan of the entire table, filtering out 161 million processed rows to find the 18 million unprocessed ones. And note what's missing from the schema above: there was no index on is_processed at all. Even if we had added a plain B-tree index on it, the planner would rarely have used one: a boolean column offers almost no selectivity, and fetching 18 million matching rows through an index is usually more expensive than just scanning the heap.
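You can see the statistics driving this decision yourself. The pg_stats view exposes the per-column numbers the planner uses for selectivity estimates (a quick diagnostic, assuming the table has been ANALYZEd recently; not part of our original fix):

```sql
-- Column statistics the planner consults for selectivity estimates.
-- Run ANALYZE user_events first so they reflect the current data.
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'user_events'
  AND attname = 'is_processed';
```

For a table like ours, most_common_freqs would show roughly 0.9 for true and 0.1 for false. A 10% match rate sounds selective until you multiply it out: 10% of 180 million rows is still 18 million rows, which is why the planner prefers a sequential scan.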
Then my colleague Jake mentioned partial indexes. I'd seen them in the docs but never understood when to use them. Here's what changed everything:
CREATE INDEX idx_user_events_unprocessed ON user_events(created_at)
WHERE is_processed = false;
This index only includes rows where is_processed is false. Instead of indexing all 180 million rows, it only indexed the 18 million unprocessed ones. The size difference was massive:
SELECT
    schemaname,
    relname AS tablename,
    indexrelname AS indexname,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'user_events';
Results:
| indexname | index_size |
|----------------------------------|------------|
| idx_user_events_user_id | 3845 MB |
| idx_user_events_created_at | 3821 MB |
| idx_user_events_unprocessed | 387 MB |
The partial index was 10x smaller. But the real win was query performance:
Limit (cost=0.43..12.89 rows=100 width=584) (actual time=0.034..0.156 rows=100 loops=1)
-> Index Scan Backward using idx_user_events_unprocessed on user_events (cost=0.43..2267834.56 rows=18234567 width=584) (actual time=0.033..0.145 rows=100 loops=1)
Planning Time: 0.123 ms
Execution Time: 0.189 ms
From 1823ms to 0.189ms. That's a 9,647x improvement. No, that's not a typo.
But here's the gotcha that bit us in production: a partial index is only considered when the planner can prove that the query's WHERE clause implies the index's predicate, and even then the index still has to cover the columns the query filters and sorts on. This query still performed badly:
-- This can't use the partial index efficiently
SELECT * FROM user_events
WHERE is_processed = false AND user_id = 12345
ORDER BY created_at DESC;
Why? The is_processed = false predicate is satisfied, but the partial index contains only created_at. To find a single user's rows, PostgreSQL would have to walk the entire partial index and discard every other user's events one by one, so the planner falls back to a worse plan. We needed a composite partial index:
CREATE INDEX idx_user_events_unprocessed_by_user ON user_events(user_id, created_at)
WHERE is_processed = false;
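Whenever you add a partial index like this, it's worth confirming the planner actually picks it up rather than assuming it will. EXPLAIN (ANALYZE, BUFFERS) shows the chosen plan along with real row counts and buffer hits (a verification step we'd suggest, not output from our original benchmarks):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM user_events
WHERE is_processed = false AND user_id = 12345
ORDER BY created_at DESC
LIMIT 100;
-- You want to see an Index Scan (or Index Scan Backward) using
-- idx_user_events_unprocessed_by_user. If a Seq Scan still appears,
-- re-run ANALYZE user_events and double-check that the query's WHERE
-- clause implies the index predicate (is_processed = false).
```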
After implementing partial indexes across our schema, we reduced our index storage from 47GB to 23GB and cut query times by an average of 68% for filtered queries. Our AWS RDS instance dropped from db.r5.4xlarge to db.r5.2xlarge, saving us about $3,400/month. Over a year, that's $40k just from understanding partial indexes.
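If you want to do a similar sweep across your own schema, a good starting point is finding indexes that are large but never scanned; those are the candidates for dropping or converting to partial indexes. The pg_stat_user_indexes view tracks scan counts (an illustrative audit query, not the exact one we ran):

```sql
-- Indexes that have never been scanned since stats were last reset,
-- largest first: candidates for removal or partial-index conversion.
SELECT schemaname,
       relname AS tablename,
       indexrelname AS indexname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```

One caveat: idx_scan counts reset when statistics are reset, so check the counter has had time to accumulate before dropping anything.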
BRIN Indexes: The Time-Series Secret Weapon Nobody Talks About
Here's something that surprised me: B-tree indexes aren't always the answer for time-series data. I learned this the hard way when our metrics table hit 500 million rows.
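The core idea is that a BRIN (Block Range Index) stores only a summary, by default the min/max values per 128 heap pages, instead of one entry per row, so for naturally ordered, append-only data like timestamps it can be orders of magnitude smaller than a B-tree. A minimal sketch using the documented syntax (the metrics table and recorded_at column here are hypothetical stand-ins, and this is not from the article's benchmarks):

```sql
-- BRIN keeps one min/max summary per block range rather than one
-- entry per row, so it stays tiny even on very large tables.
CREATE INDEX idx_metrics_recorded_at_brin
    ON metrics USING BRIN (recorded_at);

-- pages_per_range trades index size for scan precision: smaller
-- ranges mean a bigger index but fewer heap pages scanned per query.
CREATE INDEX idx_metrics_recorded_at_brin_fine
    ON metrics USING BRIN (recorded_at) WITH (pages_per_range = 32);
```

BRIN only pays off when the column's values correlate with physical row order (as with insert-time timestamps); on randomly ordered data the summaries cover everything and the index becomes useless.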