I've recently been working with an internal home-brewed research support system at work at TCD. The system aggregates data from different sources, gives members of staff an interface to publish metadata about themselves and, most importantly, automates some of the processes for publishing information into a digital repository called TARA - Trinity's Access to Research Archive. It's actually not a bad system: it works, and the users and the university get what they need out of it. The system can also export some of its data as an XML file, and I was the unlucky soul charged with loading this exported XML and rendering it on a different system.

Talend Open Studio, JasperServer and iReport came to my rescue. With much help from http://www.frau-klein.de it was possible to extract the data we were interested in and load it into a useful destination, in our case a MySQL database. Once the data was in MySQL, it was possible to generate reports with iReport using that database as a data source.

As Frau Klein suggested:

  1. Find out what data you need to present.
  2. Find out where to get your data from your data source.
  3. Decide what your data should look like in your target DB.
  4. Get Talend Open Studio or JasperETL and design your ETL/ELT process, and make sure you have a small, valid dataset to test with. Note: I had to use Talend Open Studio because JasperETL had a bug in its multiple schema XML component.
  5. Use iReport to generate the reports you need.
  6. Profit! - well not really.
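To make steps 2 to 5 a bit more concrete, here is a minimal sketch of the same extract, load and report flow in plain Python. It is not what Talend or iReport do internally and it is not the job I built. The XML layout, the `publication` table and the function names are all made up for illustration, and sqlite3 stands in for MySQL so the example runs without a database server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical export format; the real XML export is not shown in this post.
SAMPLE_XML = """
<export>
  <publication id="1">
    <title>First paper</title>
    <author>A. Researcher</author>
    <year>2008</year>
  </publication>
  <publication id="2">
    <title>Second paper</title>
    <author>B. Scholar</author>
    <year>2009</year>
  </publication>
</export>
"""


def extract(xml_text):
    """Step 2: pull the fields we care about out of the XML export."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for pub in root.findall("publication"):
        rows.append((
            int(pub.get("id")),
            pub.findtext("title", default="").strip(),
            pub.findtext("author", default="").strip(),
            int(pub.findtext("year")),
        ))
    return rows


def load(conn, rows):
    """Steps 3 and 4: create the (assumed) target table and load the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS publication ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
        "author TEXT NOT NULL, year INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO publication VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


def report(conn):
    """Step 5: the kind of query a report would sit on top of."""
    return conn.execute(
        "SELECT year, COUNT(*) FROM publication GROUP BY year ORDER BY year"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # stand-in for the MySQL target
    load(connection, extract(SAMPLE_XML))
    for year, count in report(connection):
        print(year, count)
```

Even at this size the small, valid test dataset from step 4 earns its keep: it is the only thing that tells you whether the mapping from XML to table is right.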

What did I learn from the process of doing this?

  1. Working in academia is different from the enterprise. There are lots of tools out there for this kind of job, so don't reinvent the wheel or do a half-assed job. I was going to hack up a Perl script to do the above task. I am glad I didn't.
  2. Make people tell you what they want, write those targets down and get everyone to agree on them, so that they cannot be changed without good reason.
  3. If there is a standard (in my case, the input XML data file), make sure your source is validated against the schema, and make people provide data in the proposed format. Deviating from this will cause pain, misery and lots of work. I haven't quite experienced this yet, but I can see it coming my way soon.
  4. Learn how to say no to feature requests halfway through the project.
  5. Make sure there is plenty of time to do the project!
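On point 3, a cheap pre-flight check on the input file catches a lot before the ETL job ever runs. The sketch below is an assumption-laden illustration: the element names and required fields are hypothetical, and because the Python standard library has no XSD validator, this is a structural check only and not real schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical required fields; the real export schema is not shown here.
REQUIRED_FIELDS = ("title", "author", "year")


def find_problems(xml_text):
    """Return a list of human-readable problems found in the export."""
    try:
        root = ET.fromstring(xml_text.strip())
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    for index, record in enumerate(root.findall("publication"), start=1):
        # Every record must carry a non-empty value for each required field.
        for field in REQUIRED_FIELDS:
            value = record.findtext(field)
            if value is None or not value.strip():
                problems.append(f"record {index}: missing <{field}>")
        # If a year is present, it has to be numeric.
        year = record.findtext("year")
        if year and year.strip() and not year.strip().isdigit():
            problems.append(
                f"record {index}: year is not a number: {year.strip()!r}"
            )
    return problems
```

An empty list means the file passed these checks. Anything else can go straight back to whoever supplied the data.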

I would highly recommend that anyone who needs to build an ETL/ELT process take a look at Talend Open Studio, and that anyone wanting to generate reports look at the Jasper range of open source tools.
