Don’t trust the query planner!
This is an experiment in cross-posting something from my internal blog at my employer, Atlassian. I thought it might be of general interest, so I’ve edited it a bit and shared it here.
We recently did a regular zero-downtime upgrade of our order processing system. This is a fairly routine event: we use Bamboo deployments to push from git master to our dev, staging and then production environments every day. The vast majority of these upgrades go smoothly, without serious regressions or problems. On this particular day the application did not come back properly: we had an outage! We treat these very seriously because the ordering system is a big, monolithic, do-all-the-things type of application. An outage means customers cannot make new purchases, view their license keys, sign up for OnDemand, view prices on Marketplace, or do many other things.
We eventually tracked the problem down to an updated SQL query which performed significantly worse than the previous version. After running for about a minute, all the database connections would be busy running this query, so no other work could get done! Rolling back the application resolved the problem.
What happened?
So what made this query so much worse than before? The query in question is called AccountDao.findOverviewsByContactEmail(). Its purpose is to find all the accounts for a particular customer, along with some basic account information like expiry dates and what they purchased. This drives the front page of my.atlassian.com, which gets fairly high traffic.
The original query had not been highly optimised, but it performed reasonably well. As part of some project work I needed to fetch additional data from some new tables. I added them to the from clause. Something like this:
select ...
from account,
     othertable1,
     othertable2,
     othertable3, -- added this
     othertable4, -- added this
     contactrelationship,
     contact
where contact.email = 'foobar@example.com'
Now, if I were to plan this query myself, I’d start from the contact table and use the email index to get a contact id. Then I’d use the indexes on contactrelationship and account to join to those tables. That leaves a fairly short list of just that person’s accounts, which we can join to othertable1 and friends for the extra account details. Looking at the query plan, this is exactly what the original query did. Unfortunately, adding the extra join tables caused Postgres to start at the opposite end! It instead did a sequential scan over every account in the database and then filtered by contact email. Oh dear!
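Reading the join conditions off the plans further down, the query has roughly this shape. This is only a sketch: the select list is elided in the original and the othertableN names are stand-ins, so treat the details as illustrative.

select ...
from account a,
     accountdata d1,
     contactrelationship cr,
     contact c
where cr.contact_id = c.id
  and cr.account_id = a.id
  and d1.account_id = a.id
  and c.email = 'foobar@example.com'

With only a few joins the planner reliably starts from the contact.email filter. Every table added to the from clause multiplies the number of join orders the planner has to consider, which is presumably what eventually pushed it onto the bad plan.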
This change went all the way through our peer review process. At least three people, including myself, saw this query and didn’t think to question it. That’s a problem, because we can’t let issues like this into production. The code passed all our automated testing and some quick manual verification by me: it returned the correct results. Unfortunately we don’t have any automated performance tests, which would hopefully have caught this!
Can I try this myself?
I’m glad you asked! I put together a stripped-down example so you can see this in action yourself. Note that I did all of this on Ubuntu 12.04 using Postgres 9.1.13; your results may of course differ on other OSes and database versions.
- schema.sql – Extremely stripped-down version of the database schema we use, including indexes (a rough sketch of it follows this list).
- gendata.py – Simple Python script to put dummy data into the schema.
- reset.sh – I used this to recreate my test database whenever I changed the schema or gendata.py.
- query-1-original.sql – Approximates the original query, which performed fairly well.
- query-2-bad.sql – The same query with extra joins added, which triggers the bad plan.
- query-3-fixed.sql – The fixed query, which uses a subquery to guide the planner.
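If you just want the gist without downloading the files, the schema is roughly this. It’s a sketch I’ve reconstructed from the query plans below, so the column types are assumptions and the real schema.sql has more columns:

create table contact (
    id bigint primary key,
    email varchar(255)
);

create table account (
    id bigint primary key
);

-- Links contacts to accounts; "main" flags the primary contact
-- for an account (it shows up as a filter in the plans below).
create table contactrelationship (
    contact_id bigint references contact (id),
    account_id bigint references account (id),
    main boolean
);

-- Wide per-account detail rows.
create table accountdata (
    account_id bigint references account (id)
);

create index contactrelationship_account on contactrelationship (account_id);
create index accountdata_account on accountdata (account_id);
-- The production email index mentioned above (name assumed):
create index contact_email on contact (email);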
The query plans
Original query
query-1-original.sql approximates the original query, which performed fairly well.
I’m not sure why it’s doing a sequential scan on the contact table here; probably because it only has 500 rows. The rest of the plan looks pretty much the same as the original’s.
Nested Loop  (cost=7.45..15.87 rows=1 width=623)
  ->  Nested Loop  (cost=7.45..15.55 rows=1 width=606)
        ->  Nested Loop  (cost=7.45..15.15 rows=1 width=82)
              ->  Nested Loop  (cost=7.45..14.63 rows=1 width=24)
                    ->  Hash Join  (cost=7.45..14.24 rows=1 width=8)
                          Hash Cond: (query_cr.contact_id = query_c.id)
                          ->  Seq Scan on contactrelationship query_cr  (cost=0.00..5.75 rows=275 width=16)
                          ->  Hash  (cost=7.44..7.44 rows=1 width=8)
                                ->  Seq Scan on contact query_c  (cost=0.00..7.44 rows=1 width=8)
                                      Filter: ((email)::text = 'foobar@example.com'::text)
                    ->  Index Scan using contactrelationship_account on contactrelationship main_cr  (cost=0.00..0.38 rows=1 width=16)
                          Index Cond: (account_id = query_cr.account_id)
                          Filter: main
              ->  Index Scan using contact_pkey on contact main_c  (cost=0.00..0.51 rows=1 width=74)
                    Index Cond: (id = main_cr.contact_id)
        ->  Index Scan using accountdata_account on accountdata d1  (cost=0.00..0.39 rows=1 width=524)
              Index Cond: (account_id = main_cr.account_id)
  ->  Index Scan using account_pkey on account a  (cost=0.00..0.30 rows=1 width=41)
        Index Cond: (id = main_cr.account_id)
Bad query
For query-2-bad.sql I added dummy joins that confuse the query planner in the same way as the change that caused the outage.
Note the sequential scan on account! Ouch! In production that table is huge!
Nested Loop  (cost=81.36..108.32 rows=1 width=623)
  Join Filter: (query_cr.account_id = a.id)
  ->  Hash Join  (cost=7.45..14.24 rows=1 width=8)
        Hash Cond: (query_cr.contact_id = query_c.id)
        ->  Seq Scan on contactrelationship query_cr  (cost=0.00..5.75 rows=275 width=16)
        ->  Hash  (cost=7.44..7.44 rows=1 width=8)
              ->  Seq Scan on contact query_c  (cost=0.00..7.44 rows=1 width=8)
                    Filter: ((email)::text = 'f39b503a908441cb85d3c3a2f97b0a43'::text)
  ->  Hash Join  (cost=73.91..92.83 rows=100 width=671)
        Hash Cond: (a.id = d5.account_id)
        ->  Hash Join  (cost=60.76..78.31 rows=100 width=663)
              Hash Cond: (a.id = d4.account_id)
              ->  Hash Join  (cost=47.61..63.78 rows=100 width=655)
                    Hash Cond: (a.id = d3.account_id)
                    ->  Hash Join  (cost=34.46..49.26 rows=100 width=647)
                          Hash Cond: (a.id = main_cr.account_id)
                          ->  Hash Join  (cost=17.43..30.35 rows=100 width=573)
                                Hash Cond: (d2.account_id = a.id)
                                ->  Seq Scan on accountdata d2  (cost=0.00..11.40 rows=140 width=8)
                                ->  Hash  (cost=16.18..16.18 rows=100 width=565)
                                      ->  Hash Join  (cost=3.25..16.18 rows=100 width=565)
                                            Hash Cond: (d1.account_id = a.id)
                                            ->  Seq Scan on accountdata d1  (cost=0.00..11.40 rows=140 width=524)
                                            ->  Hash  (cost=2.00..2.00 rows=100 width=41)
                                                  ->  Seq Scan on account a  (cost=0.00..2.00 rows=100 width=41)
                          ->  Hash  (cost=15.78..15.78 rows=100 width=74)
                                ->  Hash Join  (cost=7.00..15.78 rows=100 width=74)
                                      Hash Cond: (main_c.id = main_cr.contact_id)
                                      ->  Seq Scan on contact main_c  (cost=0.00..6.75 rows=275 width=74)
                                      ->  Hash  (cost=5.75..5.75 rows=100 width=16)
                                            ->  Seq Scan on contactrelationship main_cr  (cost=0.00..5.75 rows=100 width=16)
                                                  Filter: main
                    ->  Hash  (cost=11.40..11.40 rows=140 width=8)
                          ->  Seq Scan on accountdata d3  (cost=0.00..11.40 rows=140 width=8)
              ->  Hash  (cost=11.40..11.40 rows=140 width=8)
                    ->  Seq Scan on accountdata d4  (cost=0.00..11.40 rows=140 width=8)
        ->  Hash  (cost=11.40..11.40 rows=140 width=8)
              ->  Seq Scan on accountdata d5  (cost=0.00..11.40 rows=140 width=8)
Fixed query
For query-3-fixed.sql I left the dummy joins in place and moved the important narrowing tables into a subquery to encourage the query planner to behave. Just moving the narrowing tables higher up in the from clause did not seem to help.
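The restructuring looks roughly like this. It’s a sketch inferred from the aliases in the plan below (the subquery alias “query” is my guess); query-3-fixed.sql has the real thing:

select ...
from (
         -- Resolve the email to a short list of account ids first.
         select query_cr.account_id
         from contactrelationship query_cr,
              contact query_c
         where query_cr.contact_id = query_c.id
           and query_c.email = 'f39b503a908441cb85d3c3a2f97b0a43'
     ) query,
     account a,
     contactrelationship main_cr,
     contact main_c,
     accountdata d1
where a.id = query.account_id
  and main_cr.account_id = query.account_id
  and main_cr.main
  and main_c.id = main_cr.contact_id
  and d1.account_id = query.account_id

Because the subquery can only ever produce a handful of account ids, the planner is strongly encouraged to evaluate it first and reach everything else via index scans on account_id, whatever it decides about the outer join order.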
This plan is much nicer! It still has those sequential scans on contact, but that doesn’t happen in the production database; once again I think it’s because the demo table only has a few hundred rows. You can also see the estimated cost is much lower, although costs are in arbitrary units so I’m not sure direct comparisons are valid.
Nested Loop  (cost=7.45..33.24 rows=1 width=623)
  ->  Nested Loop  (cost=7.45..24.96 rows=1 width=671)
        ->  Nested Loop  (cost=7.45..16.68 rows=1 width=663)
              ->  Nested Loop  (cost=7.45..16.28 rows=1 width=655)
                    ->  Nested Loop  (cost=7.45..15.87 rows=1 width=647)
                          ->  Nested Loop  (cost=7.45..15.46 rows=1 width=123)
                                ->  Nested Loop  (cost=7.45..14.94 rows=1 width=65)
                                      ->  Nested Loop  (cost=7.45..14.63 rows=1 width=24)
                                            ->  Hash Join  (cost=7.45..14.24 rows=1 width=8)
                                                  Hash Cond: (query_cr.contact_id = query_c.id)
                                                  ->  Seq Scan on contactrelationship query_cr  (cost=0.00..5.75 rows=275 width=16)
                                                  ->  Hash  (cost=7.44..7.44 rows=1 width=8)
                                                        ->  Seq Scan on contact query_c  (cost=0.00..7.44 rows=1 width=8)
                                                              Filter: ((email)::text = 'f39b503a908441cb85d3c3a2f97b0a43'::text)
                                            ->  Index Scan using contactrelationship_account on contactrelationship main_cr  (cost=0.00..0.38 rows=1 width=16)
                                                  Index Cond: (account_id = query_cr.account_id)
                                                  Filter: main
                                      ->  Index Scan using account_pkey on account a  (cost=0.00..0.30 rows=1 width=41)
                                            Index Cond: (id = query_cr.account_id)
                                ->  Index Scan using contact_pkey on contact main_c  (cost=0.00..0.51 rows=1 width=74)
                                      Index Cond: (id = main_cr.contact_id)
                          ->  Index Scan using accountdata_account on accountdata d1  (cost=0.00..0.39 rows=1 width=524)
                                Index Cond: (account_id = query_cr.account_id)
                    ->  Index Scan using accountdata_account on accountdata d2  (cost=0.00..0.39 rows=1 width=8)
                          Index Cond: (account_id = query_cr.account_id)
              ->  Index Scan using accountdata_account on accountdata d3  (cost=0.00..0.39 rows=1 width=8)
                    Index Cond: (account_id = query_cr.account_id)
        ->  Index Scan using accountdata_account on accountdata d4  (cost=0.00..8.27 rows=1 width=8)
              Index Cond: (account_id = query_cr.account_id)
  ->  Index Scan using accountdata_account on accountdata d5  (cost=0.00..8.27 rows=1 width=8)
        Index Cond: (account_id = query_cr.account_id)
What I learned
While investigating I learned that Postgres has a genetic query optimiser. That seems really cool, but it was potentially the cause of this problem. Disabling it did change the query plan, but the plan still did a full scan of the account table.
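If you want to poke at this yourself, the genetic optimiser can be toggled per session; geqo and geqo_threshold are standard Postgres settings:

-- Turn off the genetic query optimiser for this session only,
-- falling back to the normal exhaustive join-order search.
set geqo = off;

-- GEQO only kicks in for queries with at least this many items
-- in the from clause (the default is 12); simpler queries always
-- get the exhaustive search.
show geqo_threshold;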
What have I learned from this?
- Different versions of Postgres can sometimes execute the same query in dramatically different ways!
- The author of an SQL change must always run an EXPLAIN on the query against the production database (see the sketch after this list).
- Reviewers should check the query plan as well as the code diff.
- We need automated performance regression testing against our staging environment.
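For reference, getting a plan is a one-liner; the query here is just a placeholder. Note that EXPLAIN ANALYZE actually executes the query, so be careful pointing it at production with expensive statements:

-- Show the planner's estimated plan without running the query.
explain
select * from account where id = 42;

-- Run the query for real and report actual row counts and timings.
explain analyze
select * from account where id = 42;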
As an aside, we should stop using ORMs like Hibernate; who knows what plans the queries they generate end up with!
Postgres usually does a very good job of executing queries, but I will be a lot more careful in the future! :)
Comments

Jonathan: Interesting, I don’t think I’ve read much on query optimisation and had no idea Postgres used genetic programming. In general genetic programming isn’t the best solution to a problem (it’s usually just an okay one), so it’s surprising to see it used in such a major piece of software.
@Jonathan, yeah I was quite surprised too!