Apache FOP: formatting objects is fun

I’ve been working on a tricky problem at work this week (and last week as well; it’s been really tricky, in fact). We need to be able to output a form in both PDF (Portable Document Format) and PCL (Printer Command Language), because our fax system can only handle PCL.

Ghostscript

I had a look at using Ghostscript. It’s been around a while and is widely used, freely available and, by all accounts, stable. I had some trouble getting it working initially, but I think it would have done the job.
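For reference, here is a rough sketch of how a PDF-to-PCL conversion with Ghostscript could be scripted. It is not the setup I actually tested: the file names are made up, it assumes a `gs` executable on the PATH, and it assumes that build includes the `ljet4` (LaserJet 4, monochrome PCL) output device.

```python
import shlex


def ghostscript_pcl_command(pdf_path, pcl_path, device="ljet4"):
    """Build a Ghostscript command line that renders a PDF as PCL.

    Assumptions: `gs` is on the PATH and was built with the chosen
    device; `ljet4` is only one of several PCL-capable devices.
    """
    return [
        "gs",
        "-dNOPAUSE",                 # do not pause between pages
        "-dBATCH",                   # exit once the input is processed
        "-dSAFER",                   # restrict file access from the input
        f"-sDEVICE={device}",        # output device (PCL here)
        f"-sOutputFile={pcl_path}",  # where to write the result
        pdf_path,
    ]


# Hypothetical file names, for illustration only.
cmd = ghostscript_pcl_command("form.pdf", "form.pcl")
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` to run the conversion for real.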

Apache FOP

The Apache Software Foundation has a project called FOP (Formatting Objects Processor), which is part of their XML Graphics project. It’s a tool that takes a particular XML vocabulary called XSL Formatting Objects (a W3C Recommendation, usually known as XSL-FO), which represents the content of a document along with information about how it should be presented.

Since XSL-FO is a recognised standard, it’s a great format to choose as the basis for the conversions to PCL and PDF. Other output formats are also available, with more on the way, so it’s an application that can be adapted to meet other needs as they arise.
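To make the "one source, two outputs" idea concrete, here is a small sketch that builds the two FOP command lines from a single formatting-objects file. It assumes the `fop` launcher script from the FOP distribution is on the PATH; the file names are hypothetical.

```python
def fop_commands(fo_path, basename):
    """Return FOP command lines that render one FO file as PDF and as PCL.

    Assumption: the `fop` launcher script is on the PATH.
    """
    return {
        "pdf": ["fop", "-fo", fo_path, "-pdf", f"{basename}.pdf"],
        "pcl": ["fop", "-fo", fo_path, "-pcl", f"{basename}.pcl"],
    }


# Hypothetical file names, for illustration only.
commands = fop_commands("form.fo", "form")
for fmt, cmd in commands.items():
    print(fmt, " ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`; the same input file is used for both, which is the whole appeal of the approach.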

XSL transformations

Since XSL-FO is a standard and it’s XML, it should be possible to get any number of XML formats (including Open Office or Word XML) transformed into it using an XSLT stylesheet (XSLT being the transformation part of XSL, the eXtensible Stylesheet Language). I tried out a couple of these from http://www.antennahouse.com/. Although they worked well with the sample files, I had trouble with the XSL-FO they produced from my own XHTML files.

Antenna House clearly have a lot of knowledge in this area though, and their site is well worth a visit for background reading on this topic. I suspect that part of the problem was that FOP only implements part of the XSL-FO specification, so although I was feeding it valid XSL-FO, it didn’t know what to do with all of it. There is a rewrite in progress, so I expect that newer versions will be much more robust.
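FOP can run the stylesheet step itself, which makes this kind of problem easier to poke at. The sketch below builds two command lines: one that transforms and renders in a single pass, and one that only saves the intermediate formatting objects so they can be inspected by hand. The file names are hypothetical, and the `-foout` option is taken from the usage notes of recent FOP releases, so it may not exist in older versions.

```python
def xslt_fop_command(xml_path, xsl_path, out_path, fmt="pdf"):
    """Build a FOP command that applies a stylesheet and renders the result.

    Assumption: the `fop` launcher script is on the PATH. `fmt` is the
    output option name without the dash, e.g. "pdf" or "pcl".
    """
    return ["fop", "-xml", xml_path, "-xsl", xsl_path, f"-{fmt}", out_path]


def xslt_only_command(xml_path, xsl_path, fo_path):
    """Build a FOP command that only runs the transform and saves the FO.

    Assumption: `-foout` is available in the installed FOP version.
    """
    return ["fop", "-xml", xml_path, "-xsl", xsl_path, "-foout", fo_path]


# Hypothetical file names, for illustration only.
render = xslt_fop_command("form.xhtml", "xhtml-to-fo.xsl", "form.pcl", fmt="pcl")
inspect = xslt_only_command("form.xhtml", "xhtml-to-fo.xsl", "form.fo")
print(" ".join(render))
print(" ".join(inspect))
```

Saving the intermediate file first makes it possible to see which elements and properties the stylesheet emits before working out which of them FOP is ignoring.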

Final Solution

In the end (since I only wanted a simple one-page form), I settled on writing the XSL-FO by hand, which produced really great results in both formats, with images as well. I’ve also been asked to look into programs that generate this output. They’re mostly commercial, but if I come across anything interesting I’ll add it here. Apache FOP is a great project and I hope it doesn’t lose its momentum!
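To give a flavour of what hand-written formatting objects look like, here is a minimal sketch of a one-page document with a heading and an image, wrapped in a little Python so it can be checked for well-formedness. This is not my actual form: the page size, margins, function name and file names are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

FO_NS = "http://www.w3.org/1999/XSL/Format"  # the XSL-FO namespace


def simple_form_fo(title, image_path):
    """Return a minimal one-page XSL-FO document as a string.

    Assumptions: A4 page with 20mm margins; `image_path` contains no
    single quotes, since it is placed inside url('...').
    """
    safe_title = escape(title)
    safe_image = escape(image_path, {'"': "&quot;"})
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="{FO_NS}">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="form-page"
        page-height="297mm" page-width="210mm"
        margin-top="20mm" margin-bottom="20mm"
        margin-left="20mm" margin-right="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="form-page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="16pt" font-weight="bold">{safe_title}</fo:block>
      <fo:block>
        <fo:external-graphic src="url('{safe_image}')"/>
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
"""


# Hypothetical title and image, for illustration only.
document = simple_form_fo("Fax cover sheet", "logo.png")
ET.fromstring(document.encode("utf-8"))  # raises ParseError if not well-formed
print(document)
```

Writing `document` to a file such as `form.fo` gives something that can be fed to FOP once for each output format.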
