How much can you scale incremental approaches?

I can still remember a seminar I’ve been to during my master’s studies. It was about project management, and the speaker was a very experienced project manager from defence industry. When you are building a submarine, or a fighter jet, you don’t get to do many iterations! We are quite used to engineering projects in defence industry that takes billions of dollars, and years to complete.

When you speak to engineers, project managers and other people in these projects, you see that they all have the same approach: “of course this is the way to do it!”. How else can you run something as big as this? In theory, as time and billions of dollars go by, these engineering practices should have become perfect, with all the lessons learned.

There are some signs however, that I can’t ignore. Consider the circumstances and technology when NASA managed to send a man to the moon. 50 years after that ( almost), and NASA has just announced that another launch to moon would be delayed to 2020. The innovation in space shuttles now comes from small private companies, which do not have the billions to spend. Look at NHS, look at other gigantic engineering projects. There is something wrong here, but people are perfect in ignoring the pink elephant in the middle of the room.

So I’m thinking, we should find a better way of ensuring agility, to the max we can do, we need to change a couple of things, even for larger projects. I’m not a core project manager, though I’ve done my share of it, but I’d like see some people with experience of both formal, and less formal and agile projects to try to join the two approaches.

Why am I drinking at 9 AM in the morning?

Because Oracle has just bought Sun Microsystems. Worst mergers news I’ve heard in the last 5 years. If it had been IBM, I would be dancing around the CHIME, since that would put all Sun’s assets in to hands of a capable company that knows what they are doing.

But Oracle? I mean Oracle???? Come on! Now let’s see Java become a headless chicken, while MS eats Oracle’s DB shares every day. Back to depression now.

Things we forget about the database

Like many of you out there, I’ve been involved in distributed, multi layer software development for some time. Starting with the first ever asp based web site I’ve developed in 96 or 97, I’ve built a lot of multilayer applications.

In the process, I’ve gotten used to switching to new tools and approaches as long as they were tested and approved by the industry, and the developers I trust and respect. As a result, I’ve developed habits, I’ve become used to reaching for my trusted tools in many of the projects.  I’ve made some friends like llblgen and hibernate for db access, and they’ve been with me for a long time now. However, a group of friends of mine, did not switch to these widely accepted solutions, and I’ve been giving them a lot of critisim for not doing so. Especially Ufuk, an old friend of mine, who has been developing his product for over 10 years now, has been a proponent of db level optimization. Where I defended the idea of portability of databases, he defended performance, and choosing a platform for db and sticking with it.

In the past years, he has proven that my argument did not hold so well. In all the projects I’ve taken part in the last 8 years or so, how many times do you think the db layer changed? I can only think of one case, and even that one did not go beyond a proof of concept. I can’t help remember being forced to resorting to stored procedures and db level code when my team was challenged with sorting and printing out financial records of almost 10 companies from a central db, that spans a year’s transactions. Would you like to do that using hibernate? Good luck with that.

While I was investing in the future configuration and product changes in future, Ufuk kept optimizing his product, and investing into stored procedures in db, in his chosen db layer product: MS Sql Server.  I’ve seen his software perform exceptionally fast on quite moderate hardware. So for the last two years I’ve started to pay attention to the things I’ve seen in the past. The things done by experienced DBAs working on Oracle or MS Sql Server, the things done by developers who avoid ORM, and focus on squeezing out every bit of performance from their DB layer. Looking back at the last five years or so, I have to admit that I’ve been wrong in some of my choices, and I intend to not to make the same mistakes again. Especially for the persistence layer I’m testing for openEHR reference implementation extensions I’m developing.

I’ve chosen an open source db, postgresql, and I’m still using hibernate, but this time, I’ll put some effort into stored procedures (functions in postgresql speak), and make sure that I am using temporary tables, sets covering custom types etc. After all, there is no doubt that healthcare data is going to be large in size, so it is time to give the respect to the db layer, which has been the most mature layer in today’s solutions. DB products has been evolving for the last 20 years or more, and we have been neglecting them, focusing on throwing more hardware to the middle layer(s). This seems to be the classic hammer and nail symptom, where a man with only a hammer sees all problems as nails.  I’ll put some serious effort into exploiting postgresql, and this time, I’ll even drop the assumption that people would like to use mysql. If they do want to use MySql indeed, the whole thing is open source, so they can port my db layer code. If they can’t do it just because MySql does not have some feature, than I won’t sacrifice performance by establishing a lower common denominator,  just to make an uncertain future porting operation easier. As long as I’m using a trusted open source DB solution, this should be acceptable, since all the code is open source, and that’s an opportunity for others to change it into their favorite architecture or configuration.

Let’s see how it goes, I have a couple of plans for generating stored procedures automatically from archtype wrappers I’ve written, and if it works, it may be really fast. I’ll write down the results.