Thursday, October 2, 2008

Geospatial Live DVD

In the three weeks leading up to the FOSS4G conference, I had the opportunity to work on LISAsoft's Live DVD efforts. These initially started as an investigation into getting some of the Java packages onto an Ubuntu Live DVD image, but after a quick consultation with the OSGeo community, became a collaborative push to develop a live demo CD for FOSS4G.

We made one big mistake early on: we started too late. Let me give a quick overview of the process we used. Ominiverdi had already produced a Live DVD with some key C/C++ based packages on it, so we used this as our starting image. This image was then mounted into the file system and the root folder changed, essentially giving us a terminal in the image. From here, we were able to install and configure applications to our hearts' content. There are various emulators and virtual machines that can also be used to boot into the image, but I stuck mostly with the command line. We knew from the start that much of the work we did in this manner would not be replicable or upgradable. We simply didn't have the time to do things right from the start.
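For anyone curious what "mounting the image and changing the root" looks like, here is a rough sketch in Python. It only prints the commands rather than running them, and it assumes the image follows the usual Ubuntu live layout with a compressed root file system at casper/filesystem.squashfs. The paths and the function name are made up for illustration, and these are not necessarily the exact commands we used.

```python
import shlex


def chroot_commands(iso_path, mount_dir, root_dir):
    """Build (but do not run) the commands to get a shell inside a live image."""
    # Assumption: Ubuntu-style live layout with a squashfs root under casper/.
    squashfs = f"{mount_dir}/casper/filesystem.squashfs"
    return [
        ["mount", "-o", "loop", iso_path, mount_dir],  # loop-mount the ISO image
        ["unsquashfs", "-d", root_dir, squashfs],      # unpack a writable copy of the root
        ["chroot", root_dir, "/bin/bash"],             # change root into that copy
    ]


# Hypothetical paths; the real commands would need to be run as root.
for cmd in chroot_commands("live.iso", "/mnt/live", "/tmp/live-root"):
    print(shlex.join(cmd))
```

Anything installed from that shell lands in the unpacked copy, which then has to be squashed back into a new image. That manual round trip is why this kind of work is hard to replicate.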

Since this effort was intended to be collaborative, we created a new image and uploaded it to the OSGeo server every night. This nearly proved our undoing. The image creation process is not a fast one, and it was made slower by Stefan's use of a virtual machine to provide an Ubuntu environment in which to operate. This gave him, at times, a Xubuntu fakeroot mounted into a Xubuntu virtual machine running on Windows XP. It's not hard to guess that performance wasn't optimal. This meant that the last hour and a half of each day was spent creating and verifying the image. On one evening, Stefan ran into trouble and finally gave up at 9:30pm. Even after the image was produced, it took 2-3 hours to upload before being copied into the correct directory. Since Stefan is Melbourne based, and I'm in Sydney, that also meant that I spent 2-3 hours every morning downloading a new image. All in all, a huge drain on time.

The other impact of this approach was that it greatly reduced the ability of others to contribute. Lorenza Becchi managed to get in some MapServer configuration, which gave us something to show off. Other than that, most people were taken out of the equation by the 2GB download required.

Despite these problems, the community was keen to offer support and assistance to the effort, and we ultimately had a stable and reasonably feature-rich image available in time to burn a stack of disks before the conference.

Yesterday, there was a Birds of a Feather (BOF) session at FOSS4G. We had half a dozen people in a room in Cape Town, as well as half a dozen more scattered across the globe. It was more productive than I had expected, with Tim Bowden ensuring that the live discussion in Cape Town was sufficiently transcribed into the IRC channel to keep us involved. While a full transcript would significantly, and negatively, affect the popularity of this blog, I can provide a quick summary.

There was almost unanimous agreement that a better packaging system, using Debian (.deb) style packages, was required. My proposal, which was vaguely accepted, was to break the packaging of everything down to a fairly fine granularity. Essentially we would first package all the applications that are not yet packaged, in particular the Java ones we at LISAsoft added. Next would be a packaging of the sample datasets that were added. In particular, there is a dataset of basic Australian features. Next would come the default configurations, each of which would depend on both the package of the application being configured and the package of the data it links the application to. For example, we create a package to install uDig to the standard location and drop an icon on the desktop. We can then create a package to load some shapefiles into a common spot on the file system, or into a database. Finally, yet another package can add a default uDig workspace that has been configured with connection details for the data and appropriate styling to leave a pleasant first impression. The image below shows an overly-simplified, incomplete and possibly misleading example showing expected package dependencies. The advantage of this system is that you can create a Demo DVD from an existing Ubuntu or Debian Live DVD by simply adding the Demo DVD package.
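To make the layering concrete, here is a small Python sketch of the uDig example. The package names are placeholders I've invented for illustration, not real package identifiers, and a real package manager would of course do this resolution for us. The point is only that asking for the top-level package pulls in everything beneath it, in a workable order.

```python
# Hypothetical package names, each mapped to the packages it depends on.
DEPENDS = {
    "udig": [],                                      # the application itself
    "data-australia": [],                            # sample shapefiles
    "udig-demo-config": ["udig", "data-australia"],  # preconfigured workspace
    "demo-dvd": ["udig-demo-config"],                # top-level meta-package
}


def install_order(target, depends=DEPENDS):
    """Return packages so that every dependency precedes what depends on it."""
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in depends[name]:
            visit(dep)
        order.append(name)

    visit(target)
    return order


print(install_order("demo-dvd"))
# ['udig', 'data-australia', 'udig-demo-config', 'demo-dvd']
```

This sketch does no cycle detection or version handling, both of which a real packaging system takes care of.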



This system of packaging is very powerful. We have already identified three broad use cases for these Live DVDs: demo DVDs, educational DVDs and installers. The demo DVD is a true live DVD. It will be run from RAM, and thus should be kept small. It will have as many basic applications as we can manage, but will lack the developer tools, script bindings and such. Basic tutorials and default configurations will need to be included, and it should be fully functional off-line. The educational DVDs will be more involved. There is an Education Initiative in OSGeo that has the potential to create large amounts of high-quality instructional content. Unlike the demo DVD, the educational DVD is less self-driven, and is likely to be used as a train-the-instructor tool as well as a classroom resource. Finally, the install DVD is useful in areas, like Cape Town apparently, where downloading a large disk image is not practical. This disk will need to be able to install a new operating system and individual packages, and to carry Windows and Mac installers. It will need to be completely off-line, and include as much of the product documentation as we can handle. Sample data and curriculum materials are secondary. By modularising with this granularity, we can quickly produce special purpose Live DVD images using existing tools simply by pointing them at our repository and selecting the packages we desire.
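As a rough illustration of how the three DVD flavours could share the same building blocks, here is a Python sketch. The module groups and package names are invented for the example, and the exact contents of each flavour are my assumption rather than anything agreed at the BOF.

```python
# Hypothetical module groups, each standing in for a set of packages.
MODULES = {
    "applications": {"udig", "mapserver"},
    "demo-configs": {"udig-demo-config"},
    "sample-data": {"data-australia"},
    "tutorials": {"basic-tutorials"},
    "curriculum": {"edu-curriculum"},
    "product-docs": {"product-docs"},
    "installers": {"windows-installers", "mac-installers"},
}

# Assumed composition of each DVD flavour.
PROFILES = {
    "demo": ["applications", "demo-configs", "sample-data", "tutorials"],
    "education": ["applications", "demo-configs", "sample-data",
                  "tutorials", "curriculum"],
    "install": ["applications", "product-docs", "installers"],
}


def package_list(profile):
    """Return the sorted list of packages that make up one DVD flavour."""
    return sorted(set().union(*(MODULES[m] for m in PROFILES[profile])))


for name in PROFILES:
    print(name, package_list(name))
```

The same applications group feeds all three images, so a fix to one package shows up in every flavour the next time it is built.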

That raises the next issue: a package repository. I've done a small amount of research into the Debian package repository structure, and it seems fairly easy to do poorly. Fortunately, Tim Bowden has volunteered to get one up and running for us, using infrastructure supplied by OSGeo and guidance supplied by Chris Schmidt. Having the repository will provide us with a central location to share our work in a much more efficient manner than disk image uploads.

The final bridge to cross is content. This is the sticking point at the moment. Our BOF session was fairly tech-heavy. The instructional content, be it tutorials, documentation, or full-blown curricula, needs to be developed by people who know what they're doing, both from a literary and a product standpoint. The Edu group will be invaluable here for broad content, and the product communities for application-specific content. If they can produce modular content, we can package it fairly easily. This is where the real value of the Live DVD will be found. Simply loading up an application and poking at it is great. I spend a great deal of time doing that. But having well thought out, structured material to walk people through the capabilities of an application is worth far more.

I'm excited about the direction this project is taking. We've got interest from some smart and dedicated people, and we've got enough direction to keep us going for some time. I think the versatility of this approach will make these DVDs ubiquitous at conferences, workshops and hopefully universities. Getting the word out is valuable. Getting the product out is better.
