Pentaho, does it work?

I have a project in which I need to come up with a basic data warehouse implementation. I have to cover all the basics (ETL, cube design, and so on), and on top of that I intend to build a naive Bayesian classifier generator for decision support. (I might consider ID3 or C4.5, but I’m not sure they are free for this kind of use.) Developing all of this from scratch is out of the question; after all, why should I if I don’t have to? Having a decent UI, at least for some tasks, would be nice though, and Pentaho might be the answer.

I have been following Pentaho for quite some time, and now I finally need exactly what they provide, for a consultancy job. I guess we’ll see whether they live up to the claims they make. Most parts of their product portfolio are based on well-known tools like Weka or Mondrian, but they have been building solutions that use Eclipse RCP to wrap these tools, so a lot might be doable with their existing solutions. I’ll write a detailed summary of my experience, but for the moment Pentaho seems to be the only vendor that offers a free, open source solution.

If I can reuse their work, it would be a very important base for my future plans, because I’ve always believed that business intelligence and analysis tools require knowledge in areas like data mining and machine learning to provide a real benefit. So the money paid for any of these tools should really go to the expert, not the tool, since I can hardly imagine an off-the-shelf tool delivering the real benefit of these concepts. Well, I guess we’ll see about that.
