Oracle and Postgres Redux
12/10/2008

So, my little rant about Oracle got a lot more attention than I'd expected.
While I was mainly focused on the price of Oracle, now I'd like to explore my favorite alternative a little more: PostgreSQL (PG). I've been busy the last few days using the excellent Ora2Pg to convert my old 8i database. What I've done is a quick and dirty off-hours conversion for a project that I've been working on for several months just to see how PG would work.
The project is a dashboard, a business reporting tool of the sort that is all the rage these days. It includes every business indicator they could think of, from the hours a deal spent waiting for approval, to projected totals of deals not yet approved, to charts and graphs of deals long ago approved. Gone are the days when you could give the users a copy of Crystal Reports and let them at it. They expect much more.
In the beginning I was saddled with an enormous spec that delineated only some of what the application should do, plus a number of existing reports. Unfortunately, not one of them ran in under 10 minutes, and each was a knotted mess of SQL depending on an untold number of views and procs. It took months to streamline the SQL and create a new data warehousing project. When all was said and done, 51 queries dominated the main screen of the dashboard.
And it still took 30 seconds to draw for simple searches.
That was when I began caching like mad. There were hundreds of thousands of possible search term combinations, meaning I couldn't simply populate the cache ahead of time, but at least each query only needed to run every so often. Terracotta, which I'll write about soon, was a huge help. It's ten pounds of awesome in a five pound sack.
Caching is probably the only reason my users haven't gone running into the hands of Oracle, thrusting dollars at them. It alleviates the load and allows only queries that haven't yet been run to actually touch the database. I can get away with a whole lot less horsepower than I would need otherwise.
Porting
This isn't meant to be a benchmark, because that would be fantastically unfair to Postgres. I've spent an enormous amount of effort on Oracle optimization. It's a big job filled with expensive consultants who post weird pictures of themselves.
The schema contains materialized views built on partitioned tables, which greatly cuts down the amount of data Oracle must sort through to fetch rows. The indexing has been deliberate and extensively tested, sometimes only to gain a 0.10-second difference. The schema has been denormalized as much as possible, but the structure of the data made that difficult; a couple of queries still touch as many as five tables. The queries themselves have also been written and rewritten with Oracle in mind, using inline views, subselects, or whatever Oracle happened to fetch faster. I also made some use of optimizer hints like /*+ RULE */ and /*+ JUST_USE_MY_INDEX_ALREADY(table) */. I think of them as an implicit admission that the optimizer is broken if I have to tell it what to use.
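For the curious, Oracle hints are just specially formatted comments placed immediately after the SELECT keyword. The second hint above is a joke; the real hint it stands in for is INDEX. A sketch, with made-up table and index names:

```sql
-- An Oracle hint rides in a comment right after SELECT.
-- "deals" and "deals_approved_idx" are hypothetical names.
SELECT /*+ INDEX(deals deals_approved_idx) */
       deal_id, approved_at
FROM   deals
WHERE  approved_at >= DATE '2008-01-01';
```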
But for Postgres I generally let Ora2Pg do its thing. I didn't even bother changing the types on many of the columns, some of which defaulted to the slower arbitrary-precision numeric type, even though Postgres supports more sensible integral types like int2 and int4. I also didn't bother partitioning the tables to match my Oracle setup, and I didn't even compile Postgres myself for my architecture for the best performance. I pulled 8.3 using apt-get from the Ubuntu repository.
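Tightening those types up later would be a one-statement fix per table. A sketch, with hypothetical table and column names, assuming the columns only ever hold integral values:

```sql
-- Narrow Ora2Pg's default numeric columns to proper integer types.
-- Postgres applies its numeric-to-integer cast automatically here.
ALTER TABLE deals
  ALTER COLUMN status_code TYPE int4,
  ALTER COLUMN retry_count TYPE int2;
```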
In short, I did about as little as possible. The only changes I made to the queries were the simple syntax differences between the two, such as Postgres requiring the AS keyword when aliasing columns in the SELECT list, or a few type casts here and there. I found that a few of my indexes depended on Oracle's trunc function, so I created my own function that simply returns the result of a ::date cast and indexed that. There's a lot of room to improve these results.
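Roughly what that looked like, with made-up table and column names. The one wrinkle is that the wrapper must be declared IMMUTABLE, or Postgres will refuse to use it in an index:

```sql
-- A stand-in for Oracle's trunc(date): just a ::date cast.
-- IMMUTABLE is required for use in an expression index.
CREATE OR REPLACE FUNCTION trunc(timestamp) RETURNS date AS
  'SELECT $1::date'
LANGUAGE sql IMMUTABLE;

-- Rebuild the old function-based index on top of it.
CREATE INDEX deals_trunc_approved_idx ON deals (trunc(approved_at));
```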
Results
So without further ado, here's how an unoptimized PG fared:
That's pretty impressive.